In the previous post we defined data as “A form of information that can exist without an observer”. It’s information in raw or unorganized form, and it represents values attributed to parameters. For example, a human heart beats all day long. The number of times it beats per minute is a parameter of the cardiovascular system, and the value of that parameter can give us some information. Every heartbeat, or the lack of one, generates data, and it does so whether we are looking or not. This is because data is limitless and present everywhere in the universe. It does not require an observer.
Practically, however, unobserved data without context is not very useful. Only when we measure, record, and analyze it does knowledge emerge. Consider again a human heart beating. By itself, that is not terribly interesting: if it’s beating you’re alive, and if it’s not you’re dead. However, when we measure a heart rate over the span of a day, week, or month, some information starts surfacing. We can calculate an average or median value, and use it as a baseline to investigate the periods in which the heart rate deviates from it. This only tells half the story, however, as the image below demonstrates. It’s a graphical representation of a single person’s average heart rate over the period of one month. Try to make sense of what is happening to this person without context. What could explain this graph?
It’s pretty much guesswork. Now let’s add context. The heart rate in the graph belongs to Reddit user Jimmy, an engineering student who sat his professional engineering exam on the 22nd of October 2015. Suddenly, the graph looks completely different. We can deduce that two things are responsible for the rapid change in his average resting heart rate. The first is his physiological response to the buildup of anticipation before the exam. The second is stimulant use as a study aid, in the form of caffeine and Adderall.
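For the curious, the calculation described earlier (take an average or median, then look at the days that stray from it) can be sketched in a few lines of Python. This is only an illustration: the readings are invented rather than Jimmy’s actual data, and the function name `find_deviations` and the 8 bpm threshold are assumptions made for the example.

```python
from statistics import mean, median

# Hypothetical daily resting heart rates in beats per minute.
# These values are invented for illustration only.
daily_bpm = [62, 61, 63, 62, 64, 66, 70, 75, 79, 63, 62, 61]


def find_deviations(readings, threshold=8):
    """Return (day, bpm) pairs that differ from the median by more than threshold.

    Days are numbered from 1. The threshold is an arbitrary assumption.
    """
    baseline = median(readings)
    return [
        (day, bpm)
        for day, bpm in enumerate(readings, start=1)
        if abs(bpm - baseline) > threshold
    ]


if __name__ == "__main__":
    print(f"Average: {mean(daily_bpm):.1f} bpm")
    print(f"Median:  {median(daily_bpm):.1f} bpm")
    for day, bpm in find_deviations(daily_bpm):
        print(f"Day {day}: {bpm} bpm stands out")
```

The median is used as the baseline here because a handful of unusually high days pull it around less than they pull the average. Note that the code can only tell us which days stand out. Why they stand out is exactly the part that needs context.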
Now, this is very much data at a small scale. This data set is unique to Jimmy, and any conclusions drawn from it apply only to Jimmy the individual engineering student. However, what if we had similar data sets for all engineering students at Jimmy’s school? Or all engineering students in America? Or the world? What kind of conclusions could we draw then? Enter the domain of Big Data.
Big Data for Big Stories
If personal data sets can help us paint a better picture of our own lives, Big Data sets can help us tell the story of all of us. Imagine a world where average heart rate tracking is commonplace, and the stories you could tell by looking at data sets from crowds watching sports events or elections. Speaking of elections, voting patterns can tell some interesting stories as well. Consider the still below, taken from an interactive data visualisation of exit polls during the 2008 and 2012 US presidential elections. You can click on the image to check out the interactive version.
This shows a strong move towards the left in the 2008 election, which helped Obama secure the victory. In the 2012 election a general shift towards the right is observable, but swing voters came through for Obama where it counted.
A far more inspiring example is the mapping of crisis data. During the 2010 earthquake in Haiti, a non-profit organisation called Ushahidi used its data crowdsourcing tools to scrape social media posts and create a real-time map of the disaster as it unfolded. The map was made available to the public, which empowered people to organize and respond to events as they happened.
Perhaps the greatest thing about Big Data is that it allows us to look at the world through an objective lens, and to make better decisions as a result. There’s no one better at it than Swedish data visionary Hans Rosling, founder of the Gapminder Foundation. The video below is a fantastic demonstration of that ability, and it dispels many myths about developing countries in the process.
Now consider that, due to the exponential growth of digital technologies, ever more data is being captured as time progresses. Everyone with a smartphone is generating and capturing incredible amounts of data through the many sensors embedded in the device. Matters get even more interesting as we look towards the Internet of Things. It’s estimated that by 2020, over 50 billion devices will come online, most of them equipped with sensors and capable of transmitting the data they capture in real time. Just a few years from now, we’ll have Big Data sets that make current ones look like child’s play. Imagine all the things we can learn about ourselves and our world, and all the good we can do. This writer, for one, is buzzing with anticipation.
Next time we’ll take a closer look at what the cloud is obscuring. In the meantime, make sure to check out the fantastic documentary by Rick Smolan called The Human Face of Big Data. Incredibly inspiring stuff.