Big Data Is Not Just Size And Speed Of Complex Data – It Is Moving Us From “Information” To “Knowledge”

We discussed this in our Why we focus on spatial data science article. The progress of knowledge fields, such as history, math, engineering, science and philosophy, or the individual pursuit of knowledge, is based on moving from experiments to hypotheses to computation. As explained in The Fourth Paradigm: Data-Intensive Scientific Discovery, Big Data is another sort of knowledge field that moves us from information to knowledge. These progressions have happened over the course of human history and are now abstracting themselves on the internet.

When You Are Ready To Move Into Big Data, It Means You Want To Answer New Questions.

That said, the Big Data phenomenon is not about the input of the raw data. It is not about the data explosion that the Internet of Things is touted to be. And it is not just about answering new questions. It is really about the knowledge.

It is about what we can do with the technology on the cheap. Not just on the cheap, but without the supercomputer clusters that only the big boys had. And now, with the cloud, the internet, and plenty of standards, if we have good and improving data, we all have the environment to answer complicated questions while sifting through the noise. It is about enabling the initial phase of knowledge discovery, the phase everyone is complaining about on the “web” right now as “too much information” or “drowning in data”.

The article on Throwing a Lifeline to Scientists Drowning in Data discusses how we need to be able to “sift through the noise” and make search faster. That is the roadblock, the tall pole in the tent, the showstopper.

Parallelizing The Search Is The Killer App – This Is The Big Deal, We Should Call It Big Search

If you have to search billions of records and map them to another billion records, doing it all at once can be a problem. You need to shorten the time it takes to sift through the noise. That is why Google became such an amazing search engine success out of nowhere. They did, and still do, “searching” better than anyone else: sifting through the noise (or data).
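To make the idea of parallelizing the search a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not how Google or anyone else actually does it. The function names (chunk, match_chunk, parallel_match) and the tiny sample records are made up for this example. The idea: split one set of records into slices, index the other set once, and let several workers match their slices at the same time.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk(records, n_chunks):
    """Split a list into at most n_chunks roughly equal slices."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def match_chunk(left_chunk, right_index):
    """Match each (key, value) pair in one slice against the lookup table."""
    return [(key, value, right_index[key])
            for key, value in left_chunk
            if key in right_index]

def parallel_match(left, right, workers=4):
    """Map records in `left` to records in `right` that share a key."""
    # Index the second data set once, so each probe is a dictionary lookup
    # instead of a scan over every record.
    right_index = dict(right)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker handles one slice; map() returns results in slice order.
        parts = pool.map(lambda c: match_chunk(c, right_index),
                         chunk(left, workers))
        return [row for part in parts for row in part]

# Hypothetical sample data: (key, value) pairs
left = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
right = [(2, "x"), (4, "y"), (5, "z")]
matches = parallel_match(left, right, workers=2)
print(matches)
```

Threads are used here only to keep the sketch short. For real CPU-heavy work in Python you would more likely reach for separate processes or a cluster, but the shape stays the same: partition, search each partition independently, then merge the results.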

Furthermore, the United States’ amazing growth came from two things: 1) We have resources and 2) We found out how to get to them faster. In other words, knowledge is power and we used that knowledge to answer the question, “how can we do this better?” much faster than everyone else. Each growth phase of the United States was based on that fact alone. Some softball examples out of hundreds:

    • Westward expansion dramatically exploded after trains arrived, which allowed for regional foraging and mining
    • Manufacturing dramatically exploded production output, which allowed for city growth
    • Engines shortened time between towns and cities, which allowed for job explosion
    • Highway systems shortened time between large cities, which allowed for regional economies
    • Airplanes shortened time between the legacy railroad time zones, which allowed for national economies
    • Internet shortened access to national resources internationally, which allowed for international economies
    • Computing shortened processing time of information, which allows for micro-targeted economies worldwide

So What?

Each historical “technology age” has resulted in shortening the distance from A to B. Google is sifting through data in a very similar manner. Scientists are trying to sift as well: they take data from defined sensors, link it together and ask very targeted simulated or modeled questions. And it all comes back to Big Data. See, it is a lot like the highway systems, airplanes and trains. Big Data shortens the distance from A to B with the knowledge it provides on how to solve these issues faster than before. Now, with all that being said, there are limits and barriers. We need to address the barriers that limit an entity’s success in order to properly use Big Data.