To do BigData, address Data Quality – People and Processes – Tech Access to Information

As a follow-on to the “cliffhanger” that BigData is a big deal because it can help answer questions fast, there are three top limitations right now: Data Quality, People and Processes, and Tech Access to Information.

Let’s jump right in.

Number One and by far the biggest – Data Quality

Climate Change isn’t a myth, but it is the first science to ever be presented on a data premise. And in doing so, climate scientists prematurely presented models that didn’t take into account the driving variables. Their models have changed over and over again. The resolution of their source data has increased. Their simulations on top of simulations have proven countless theories across various models that can only be demonstrated simply by Hollywood blockbusters. Point being, we are dealing with inferior data for a world-scale problem, and we jump into the political, emotionally driven world with a data report? We will be the frog in slowly warming water, and we will hit that boiling point late. All because we started with a data justification approach using low-quality data. Are they right that the world is warming? Yes. Do they have enough data to prove the right mitigation, mediation, or policy adjustments? No, and not until we either increase the data quality or take a non-data tack.

People and Processes are a generation away.

Our processes in IT have been driven by Defense and GSA business models from the fifties. Put anyone managing 0s and 1s technology in the back. They are nerds, look goofy, can’t talk, don’t understand what we actually do here and, by the way, they smell funny. That has been the approach to IT since the 50s. Nothing has changed, with the exception that there are a few baker’s dozens of hoodie-wearing, Mountain Dew-drinking late-night owls who happen to be loaded now, and there is a pseudo-culture of geek chic. We have not matured our people talent investment to balance maturity of service, data, governance, design, and product lifecycle, or to embrace that engine culture as core to the business. This means more effective information sharing processes to get the right information to the right people. This also means investing in the right skills – not just feeding Doritos and free soda to hackers – to manage the information sharing and data lifecycle. I am not as worried about this one. As the baby boomer generation retires, it will leave a massive vacuum, as Generation X is too small and we’ll have to groom Generation Y fast. That said, we will mess up a lot and lose a lot to brain drain, but the market will demand relevancy, which will, albeit slowly, create this workforce model in 10-15 years.

Access to Environments

If you had asked about this before hosting environments or the cloud, access would have been limited to massive corporations, defense, intel, and some of the academics co-investing with those groups. Now, if you can manage the strain of shifting to a big data infrastructure, this barrier should be the least of your problems. If you can allow your staff to get the data they need at the speed they need, so they can process it in parallel without long wait times, you are looking good. Get a credit card, or if Government, buy off a Cloud GWAC, and get your governance and policies moving, as they are likely behind and not ready. Left as they are, they will likely prolong the siloed information phenomenon. Focus on the I in IT, and let the CTO respond to the technology stack.

Focus on data quality, have a workforce investment plan, and continue working your information access policies

The tipping point that moves you into Big Data is where these three, combined, require you to deal with complicated enormity at speed, not just to produce MIS and reports, but to help answer questions. If you can focus on those things in that order (likely solving them in reverse), you will be able to implement parallelized data discovery.

This will shorten the distance from A to B and create new economies, new networks, and enable your customer or user base to do things they could not before. It is the train, plane, and automobile factor all over again.

And to throw the shameless plug in, this is what we do. This is Why we focus on spatial data science and Why is change so fundamental.