Big Data Mysticism

In 2000, I compiled an internal Sun Microsystems document. We noticed that the share prices of Silicon Valley companies increased with the number of CPUs they had in their grids. I called this the "Faraone curve", after the first CEO of Gridware Inc., the company I co-founded and which Sun acquired.

Fig. 1: Faraone Curve
When we came out with this curve, many company men laughed: Hah, Hah, Hah.

"Haughtiness is a terrible trait that we must flee from."
(Rabbi Nachman, Likutey Moharan I, 10)
Great mystical thinkers say we should always focus on the inner intelligence of every matter, and
"we must bind ourselves to the wisdom and intelligence that is to be found in each thing. The inner intelligence is a great light that clarifies all our decisions and illuminates our actions and deeds."
These are words from "Rebbe Nachman, great-grandson of the holy sage The Baal Shem Tov in Likutei Moharan, printed in Ostrog the Province of Volhyn, under the rule of our master, the exalted and pious Czar Alexander Pavlovitch in the year 1808."

David Ungar is a seeker of inner intelligence. Investigating how to program manycore CPUs, he put it this way:
The obstacle we shall have to overcome, if we are to successfully program manycore systems, is our cherished assumption that we write programs that always get the exactly right answers. This assumption is deeply embedded in how we think about programming. The folks who build web search engines already understand, but for the rest of us, to quote Firesign Theatre: Everything You Know Is Wrong!
Those who know are not surprised that Google, Facebook, Amazon and Yahoo became the largest Silicon Valley companies; they perfected the technology of managing and extracting value from the largest number of processors available.

This generated the hunger for storage and Big Data. According to the latest gurus, Viktor Mayer-Schönberger and Kenneth Cukier:
At its core, big data is about predictions. Though it is described as part of the branch of computer science called artificial intelligence, and more specifically, an area called machine learning, this characterization is misleading. Big data is not about trying to “teach” a computer to “think” like humans.
So what is it?
  1. Big data gives us an especially clear view of the granular: subcategories and submarkets that samples can’t assess.
  2. It permits us to loosen up our desire for exactitude, the second shift. We don’t give up on exactitude entirely; we only give up our devotion to it.
  3. A move away from the age-old search for causality. As humans we have been conditioned to look for causes, even though searching for causality is often difficult and may lead us down the wrong paths. In a big-data world, by contrast, we won’t have to be fixated on causality; instead we can discover patterns and correlations in the data that offer us novel and invaluable insights. The correlations may not tell us precisely why something is happening, but they alert us that it is happening.
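To make the third point concrete, here is a minimal sketch of measuring a correlation without asking why it exists. It is only an illustration: the function name `pearson`, the two series, and their numbers are all made up for this example and do not come from the book.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical, made-up numbers: weekly searches for a product and its sales.
searches = [10, 20, 30, 40, 50]
sales = [12, 19, 33, 41, 48]

r = pearson(searches, sales)
print(f"r = {r:.3f}")
```

A value of r close to 1 says the two series move together. It says nothing about which one drives the other, which is exactly the trade the authors describe.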
Google has a white paper by Kazunori Sato, An Inside Look at Google BigQuery. Why does Google use BigQuery rather than Hadoop / MapReduce?
MapReduce was only a partial solution, capable of handling about a third of my problem. I couldn’t use it when I needed nearly instantaneous results because it was too slow. Even the simplest job would take several minutes to finish, and longer jobs would take a day or more. ...
So to discover the inner intelligence of things, scientists should not have to go through the torture of setting up a Hadoop installation. This is not simple. This is not trivial. It is an obstacle to creative analysis.

No wonder Cloudera, the dean of Hadoop companies, came out with Cloudera Search. Unlike Google, "Cloudera has contributed its innovations and IP around the integration of Apache Solr and Apache Lucene with CDH back to the respective upstream projects."

Other tools that must change are the resource managers. Right now they all assume they know every node they run on, but soon they will not.

Once these easy-to-use tools are accessible, Big Data will explode. Others will have the technology ready for us to use, the same way Google has everything we need to send email and start an IT center.

When Rebbe Nachman dictated one of his books of secrets to his favorite student, Reb Noson, he stopped and said: "If only you knew what you are writing." Reb Noson replied: "I really have no idea at all." Rebbe Nachman then said to him: "You don't know what it is that you don't know."

"By eliminating haughtiness, our wisdom is repaired."
(Rabbi Nachman, Likutey Moharan I, 10)
