According to a recent article in Wired UK, as supercomputers grow more powerful and the tools for using them become less abstruse, whole new academic disciplines are beginning to feel the benefits of crunching data.
Believe it or not, some people even think we can forecast the future with big data. Predicting world-changing events is a possibility, some claim, if you treat society and history like a big data problem. It’s how big data analyst Kalev Leetaru found where Osama bin Laden had been hiding, in a way.
Leetaru’s work made the news in 2011, when it claimed to have picked up early clues to where the al-Qaeda leader had been living in Pakistan, using only publicly available sources of information. This was after the fact, of course, but the point was that the method could perhaps have found him if someone had thought to look. The same method could, possibly, pick up where the next bout of social unrest will appear in the Middle East, or reveal a new history of the US Civil War. Or at least, that’s the claim.
His research involves taking vast quantities of data, usually on the scale of millions, if not billions, of individual data points, and running algorithms on supercomputers that look for the connections between them. This is the essence of big data, a field whose name summarizes the problem but says nothing about what it actually involves. One possible definition is how humanity copes with all the information it produces, and the web and social media mean that there is a lot of information out there to look through. Exabytes upon exabytes.
“The thing about the reality about everything that happens on Earth, a hundred years ago we only understood the tiniest fraction of that,” Leetaru explained on the day I met him. “Political scientists, over the last half century, have largely studied political unrest in other countries through the New York Times. Literally, paying teams of graduate students to open the New York Times each day, read through it, and clip articles or check boxes on what they’re seeing. The New York Times is a wonderful paper, but it’s not necessarily the best paper to understand the Liberian civil war through.”
For scientists and mathematicians, working with supercomputers makes sense: their information is numerical. It already exists in a language that machines can read. The interesting thing here for historians, sociologists, literary critics and everyone else who works with language and the vagaries of the human condition is that we’ve reached a point where supercomputers are fast enough to crunch that data just as easily as anything else.
The big data approach to intelligence gathering allows an analyst to get the full resolution on worldwide affairs. Nothing is lost by looking too closely at one particular section of data, and nothing is lost by taking so wide a perspective that the fine detail disappears. The algorithms find the patterns and the hypothesis follows from the data. The analyst doesn’t even have to bother proposing a hypothesis any more. Her role switches from proactive to reactive, with the algorithms doing the contextual work.
For science, it makes sense to see big data as a revolution. Algorithms will spot patterns and generate theories, so there’s a decreasing need to worry about inventing a hypothesis first and then testing it with a sample of data. But the thing with Leetaru’s work is that it isn’t working with numerical data. It’s working with words, with jokes, with sarcasm and sincerity. Those are the kinds of things that we have humans for, because they are what makes us human. Right? Read the rest of the article on Wired UK.