Are Supercomputers Really That Super? - Large Data Program Consulting

The Human Brain Is No Longer The Fastest Computer In The World – What Does That Mean?

Computers are fast and are getting faster at an unfathomable rate. But they have yet to approach the capabilities of the most capable human brain on the planet... until now. That’s right, humanity is about to have a baby, and not the Manchester Baby computer from 1948. That baby is supercomputing that processes and stores as fast and as much as a human brain!

Now, this computing will be limited by the bias in much of the data it receives:

Hearing and Sight (audio and video) are a compilation of biased real-time and collected views, creating an abundance of confirmation bias that still takes a complicated ruleset to discern.
Smell has not yet been scientifically proven to work in big data, as it is quite subjective to the persona, their experiences, and trends we aren’t certain of.
Sentiment will be relative to the culture, philosophy, nationality, race, creed, neighborhood, period, language, and vocabulary context in which it is provided.
Spatial accuracy is impacted by how and when we group patterns and how they are mapped to time (boundaries change, and as boundaries change, sentiment changes).

Where We Are Heading

These 200,000 years of evolution, processing the foundational, thematic, and dynamic data with pre-wired cognitive response – some autonomic – have yet to be modeled. And while the human brain and computers are now a close match in raw terms, physicists have long speculated that the human brain might be a quantum computer rather than a digital one. So will we get there before 2040, or closer to 2100? Either way, the next 20 years will be amazing. Without this data and these rulesets, supercomputers would melt down today if they tried to do even the computational part of what our brains do, let alone everything else.

So how do we prepare for this new arrival? Like any good parent, it doesn’t start when the baby is born. We need to nest. This blog hits on the facts behind “this baby” supercomputing. Then it hits on the 4 data things we are doing to prepare data to teach and train it before it arrives.

Computers Are Fast. And Get Faster Really Quickly.

This is generally referred to as ‘exponential growth’. In short, wicked fast. Electrical and computer engineers and computer scientists beat themselves up that Moore’s Law is now more of a principle and that things are slowing down. As a refresher, Moore’s Law, which Gordon Moore first stated in 1965, is often summarized this way: “The logic density of silicon integrated circuits has closely followed the curve (bits per square inch) of doubling” every 18 months.

Another freaky fact: don’t forget about storage! Even freakier than this computing talk is the storage. Dynamic data is growing so fast that, by a commonly cited estimate, 90% of the world’s data was created in the last two years.

The storable amount in 1 square inch of silicon has doubled roughly every 18-24 months since 1962. That is about 30 doublings in 60 years. That is the kind of growth we’re dealing with. Moving on to the applicability of this large amount of data, computer and data science face issues like understanding big data, data growth, confusion over which big data tools are needed, a lack of data professionals, data security, and data integration.

To Clarify: Since 1962, We Mean Doubled 30 Times, Not 30 Times Faster

2^30 is about 1.07 billion, so 30 doublings means roughly a billion times more: about 1,000,000,000 times more transistors, which in turn means many more 0 and 1 calculations per second. This means that the supercomputers of 20 years ago are now personal devices, and the supercomputers of 10 years ago are now on power-user desktops. It also means that data centers and cloud solutions are only 5 years behind supercomputers. In short, whatever supercomputing exists today will, in 10 iterations or 10-15 years, be accessible to the average nerd on their big power-user computer at home or work. And in another 10 years, that power-user device will be the average device.
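The difference between "doubled 30 times" and "30 times faster" is easy to check with a few lines of Python. This is a minimal sketch that assumes one doubling every 24 months (the slow end of the 18-24 month range) over the 60 years from 1962; the function names are only illustrative.

```python
# Assumption: one doubling every 24 months, the slow end of the 18-24 month range.
START_YEAR = 1962
END_YEAR = 2022  # roughly 60 years later
MONTHS_PER_DOUBLING = 24


def doublings(start_year: int, end_year: int, months_per_doubling: int) -> int:
    """Number of complete doublings between two years."""
    return (end_year - start_year) * 12 // months_per_doubling


def growth_factor(n_doublings: int) -> int:
    """Total multiplier after n doublings."""
    return 2 ** n_doublings


n = doublings(START_YEAR, END_YEAR, MONTHS_PER_DOUBLING)
print(n)                        # 30
print(f"{growth_factor(n):,}")  # 1,073,741,824
```

At the faster 18-month pace the same 60 years would hold 40 doublings, so a billion-fold increase is the conservative end of the estimate.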

Take ‘97, for example: Deep Blue, the chess-winning computer, did 11 GFLOPS (giga floating-point operations per second). Smartphones are clocking above that as of 2017, and power users on Alienware are at 10-40 GFLOPS.

PetaFlops Moving To Exaflops Supercomputer In 2021 – We Are Exascale!

Let’s keep with the idea that supercomputers are 10-15 years, or about 10 generations, ahead (2^10 on the 2^n curve, or roughly 1,000 times more powerful). Meanwhile, personal devices are in the 10s and power users in the 100s of GFLOPS. At the top end, supercomputers are pushing 100s of PFLOPS and even single-digit exaflops. In 2011, Japan’s K computer broke 10 PFLOPS, and Japan’s Fugaku now runs at 442 PFLOPS. The Cray supercomputer at Oak Ridge going live in 2021 is there: 1.5 exaflops, or 1,500 PFLOPS. More are coming soon at other national labs, and that is just the U.S.
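As a rough sanity check, the two Japanese figures quoted above (10 PFLOPS in 2011, 442 PFLOPS now) can be turned into an implied doubling period. The sketch below assumes about 10 years between the two measurements, which is our assumption rather than a figure from the sources, and the helper name is hypothetical.

```python
import math

PFLOPS = 1e15  # floating-point operations per second in one petaflop


def implied_doubling_months(start_flops: float, end_flops: float, years: float) -> float:
    """Months per doubling implied by growing from start_flops to end_flops over `years`."""
    n_doublings = math.log2(end_flops / start_flops)
    return years * 12 / n_doublings


# Assumption: roughly 10 years elapsed between the two systems.
months = implied_doubling_months(10 * PFLOPS, 442 * PFLOPS, years=10)
print(f"{months:.1f} months per doubling")
```

Under that assumption the result lands at roughly 22 months, inside the 18-24 month range discussed earlier.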

So in 10 years, that’ll be on power-user desktops for amazing visualization tools, and in 20 years, it’ll be on personal devices under $1,000 each. The ORNL supercomputer will support massive science modeling with raster tempo-spatial datasets at universe and geological time scales. Now, the exciting metric, and we are going for scary too, is how this compares to the human brain.

With TOP500 tracking, even if Moore’s law is slowing down (graph below), or even if that is arguable, data center deployment has certainly slowed in what it offers, which usually implies that capability can keep moving to personal devices:

Exascale Means Computing Has Exceeded Human Brains Processing

So what? 1.5 exaflops is more powerful than the 1 exaflop estimate for a human brain. Meanwhile, the human brain trumps computing on prioritization, creativity, and energy use (Tianhe-1A, for example, uses 4.04 megawatts of power), while processing, logic, and math are won by the computer. For instance, the human brain cannot handle the 4 V’s of data. Computing at this level can truly handle the volume and velocity issues of sensors and embedded devices. There are over 30 billion IoT devices, and this doesn’t count non-internet embedded devices that could be bridged in.

It’s a good thing we went from IPv4 to v6 as we only had 4.3 billion IPs then. These devices will be able to do more than basic 8 or 32-bit computing to clean up dirty data coming in from fleets, cars, buildings, and video. That signal cleaning will only enhance the training data needed to take advantage of the computing.
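The address-space point is simple arithmetic: IPv4 addresses are 32 bits and IPv6 addresses are 128 bits. The sketch below compares both against the "over 30 billion" IoT device figure quoted above; the device count is taken as exactly 30 billion for illustration.

```python
IPV4_BITS = 32
IPV6_BITS = 128
IOT_DEVICES = 30_000_000_000  # "over 30 billion", rounded down for illustration

ipv4_addresses = 2 ** IPV4_BITS
ipv6_addresses = 2 ** IPV6_BITS

print(f"{ipv4_addresses:,}")                  # 4,294,967,296
print(IOT_DEVICES > ipv4_addresses)           # True
print(f"{IOT_DEVICES / ipv4_addresses:.1f}")  # 7.0
print(f"{ipv6_addresses / IOT_DEVICES:.2e}")  # 1.13e+28
```

In other words, there are about seven IoT devices for every possible IPv4 address, while IPv6 leaves on the order of 10^28 addresses per device.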

The Targeted Year and Why

So in 2040, or 3-5 computing generations from now, we’ll have computing with 20 times the power of a human brain on supercomputers. That means we’ll have the computing power of a human brain on your device. Yet, if we stick with digital computing, energy consumption and miniaturization may be a roadblock, and this doesn’t take into account how quantum computing will play in, as that completely changes the curve. As a side note, quantum computing is the use of quantum phenomena such as superposition and entanglement to perform computation. This means that a quantum computer can be in multiple “states” at once, kind of like how the human brain can consider multiple things at once. Quantum computers (which so far exist only as small, experimental laboratory machines) also differ from digital computers because “in theory, they can answer questions that weren’t asked.”
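The "3-5 generations" figure can be checked the same way, by counting how many doublings separate today's machine from the 2040 target. This sketch assumes we start from the 1.5 exaflop Oak Ridge figure in 2021 and use the 1 exaflop brain estimate from earlier; treating a "generation" as one doubling is our assumption.

```python
import math

BRAIN_EXAFLOPS = 1.0           # brain estimate used earlier in the text
START_EXAFLOPS = 1.5           # Oak Ridge figure quoted earlier
START_YEAR = 2021              # assumption: counting from the 2021 go-live
TARGET_YEAR = 2040
TARGET_MULTIPLE_OF_BRAIN = 20  # "20 times the power of a human brain"


def doublings_needed(start: float, target: float) -> float:
    """How many doublings it takes to grow from start to target."""
    return math.log2(target / start)


target_exaflops = TARGET_MULTIPLE_OF_BRAIN * BRAIN_EXAFLOPS
generations = doublings_needed(START_EXAFLOPS, target_exaflops)
years_per_generation = (TARGET_YEAR - START_YEAR) / generations

print(f"{generations:.1f} doublings")
print(f"{years_per_generation:.1f} years per doubling")
```

Under these assumptions the target is a little under 4 doublings away, which sits inside the 3-5 range. It also works out to about 5 years per doubling, so 2040 reads as a cautious date next to the 18-24 month pace discussed earlier.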

Our article “AI Ain’t Taking Over the World Yet, So They Want You to Think” alluded to this. This is called exascale computing: computing as powerful as a human brain, and as many human brains put together, tied to the internet and all those sensors.

*Take a breath, the matrix is not here yet*

However, There Is a Problem

The problem with hardware moving too fast is that software has tried to keep up by releasing poorly tuned code. So even if all this hardware arrives, the software isn’t there with tuned logic. The cool thing is that if Moore’s law should fail, we can tune the software to achieve higher-quality instructions instead of simply more instructions!

This moves us from Knowledge to Wisdom Solutions:

Deep Learning Analytics – we can get to multidimensional data analytics: Content, temporal, spatial, semantic, sentiment/persona, etc. Not quite sentient or conscious with exascale – that’s something for the year 2100 folk.
Workflow Steering and RPA – We can get wisdom on big data exhaust of our processes to fine-tune our robotic process automation, adjust factories, adjust triggers and ranges in markets, and other workflow steering. We may solve the traveling salesman problem once and for all!
Self-healing IT – We can have AI-driven cloud elasticity for data centers to monitor and adjust themselves and even more self-healing IT to keep up with vulnerabilities and other architectural qualities in the data centers and in the network and the internet itself.
Human-Computer Interaction – We can have marketing, advertising, and other reaction and experience data adjust the way humans adjust to non-verbal cues, and find opportunities to improve active learning programs and teaching.

We can start getting into knowledge and wisdom solutions. However, this requires more than harvesting unstructured data feeds, spatial datasets, financial data, and MIS feeds.

Like A Good Parent Guiding A Baby’s Knowledge Growth, The Data Community Must Be Involved To Feed Good Foundational Data

Unless you want chatbots to be racist like in 2016, the computing needs to be nurtured. There are 3 major feeds:

Foundational Datasets, Maps, and Irrefutable Rules – History, Math, Science, and Philosophies that will have a national or industry bias
Thematic Data – Group-based Maps and Irrefutable Rules. Usually produced by education, companies, and other Governed groups which will have an enterprise bias
Dynamic Dataset – Ungoverned, unstructured, raw collections of images, comments, earth sensors, and social feeds with an independent bias.

What Can We Do Now As We Nest And Get Ready For This New Baby Superpower In Computing

What is joining your family of tools is computing and storage. It’s coming. The bun is in the proverbial oven. Based on 50-plus years of precedent, this computing may even be in your pocket. However, without the data, like a baby’s brain, it won’t know anything. That makes it even more critical for the interconnected foundational, thematic, and dynamic data to be network-centric. The ECE and CS folk are examining how to improve the supercomputer’s interactions with the massive amounts of data in the world. In other words, to use the phrase we at Xentity love, put the ‘I’ back in ‘IT.’

So, we have to be prepared to teach the computing brain. We have to prepare to instill it with data, models, algorithms, signals, functional reactions, sentiment concepts, and possibly sentience that go beyond the niche question answering Watson does with Jeopardy answers.

Our Preparations

So, to nest, here is how we prepare. We are focused on staging the data to be training data usable in the future:

Expose Dataset for improved discovery – Have improved metadata for use outside the enterprise in situations like this. This focuses on the F in the FAIR Data principles to help the humans who will be feeding the baby (modelers, data scientists, planners, language experts, subject matter experts, data managers, etc.) feed it higher-quality data. Not quite the organic, non-GMO level of quality. However, it provides a diet rich in connective constructs and lowers confirmation bias.
Connect Features with Language – Have improved semantic linkages for increased contextual use of thematic data so that siloed datasets can have connective tissue. When structures people, waterway people, transportation people (a dam with a road), and engineers all refer to a dam, the questions we can ask about that specific dam or set of dams can go cross-domain.
Target datasets that have high re-use – Working with foundational geospatial, statistics from national and international program datasets, we can improve the core base of re-usable data – topographical, hypsographic, common core educational data basis (topical progression roadmaps), financial (weighting value of resources), temporal and historical contextual data (encyclopedic), organizational and governmental units, raster pixel data (surface, subsurface, atmospheric, oceanic, universal).
Modernize data to increase the rest of FAIR – Beyond findability, focus on the AIR in FAIR (accessible, interoperable, reusable) through improved architectures and solutions that move us forward to feed these wisdom solutions: improved data supply chains, real-time streaming of IoT data and sensors into datasets, information workflows, API patterns, data lakes, and graph theory constructs to enable the knowledge and wisdom architectures that supercomputing will open up.
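To make the "connect features with language" idea a little more concrete, here is a minimal sketch of linking siloed records about the same dam through a shared feature identifier. Everything in it is hypothetical: the `DomainRecord` class, the `feature_id` values, and the attributes are made up for illustration and do not describe any real dataset or tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DomainRecord:
    domain: str      # which siloed dataset the record came from
    local_term: str  # what that domain calls the feature
    feature_id: str  # hypothetical shared identifier for the real-world feature
    attributes: dict = field(default_factory=dict)


def link_by_feature(records):
    """Group records from different domains under their shared feature_id."""
    linked = defaultdict(list)
    for record in records:
        linked[record.feature_id].append(record)
    return dict(linked)


def cross_domain_view(linked, feature_id):
    """Collect each domain's attributes for one feature, keyed by domain."""
    return {r.domain: r.attributes for r in linked.get(feature_id, [])}


# Made-up records describing the same dam from four viewpoints, plus one other dam.
records = [
    DomainRecord("structures", "dam", "dam-001", {"material": "concrete"}),
    DomainRecord("waterways", "dam", "dam-001", {"impounds": "reservoir"}),
    DomainRecord("transportation", "dam with a road", "dam-001", {"carries_road": True}),
    DomainRecord("engineering", "dam", "dam-001", {"inspection_due": True}),
    DomainRecord("waterways", "dam", "dam-002", {"impounds": "mill pond"}),
]

linked = link_by_feature(records)
view = cross_domain_view(linked, "dam-001")
print(sorted(view))  # ['engineering', 'structures', 'transportation', 'waterways']
```

Once the records share an identifier, a single question about "dam-001" can draw on all four domains at once, which is the cross-domain questioning described above. A real implementation would more likely use a graph store or linked-data vocabulary than an in-memory dictionary.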