For artificial intelligence to advance, it needs to start asking questions



I’ve grown increasingly excited about Skimbox’s progress in Machine Learning. The last time I took a serious look at AI was in the mid-80s, when Expert Systems were the new new thing and DEC was saving $30M per year by having its expert system configure customer machines.

But Expert Systems never seemed to scale or become general purpose, due to the limitations of the algorithms at the time.

In 1997, IBM’s Deep Blue defeated world chess champion Kasparov, a major achievement. Amazingly, Kasparov had never before lost a tournament match to either man or machine.

Fast forward sixteen or so years, and Machine Learning has become pervasive. Most significant computer interactions, such as using a credit card, searching the web, shopping online, or calling customer service, involve Machine Learning algorithms and approaches.

It has clearly become mainstream in many respects. Kurzweil’s predictions continue to come true.

I abstract, therefore I am

Yom Kippur in Jerusalem is a shockingly quiet time. There are no cars, no music, and hardly any mechanical sounds whatsoever. People walking, observant or not, speak in hushed tones and generally stroll slowly. Even though the city is densely populated, a dog’s bark can be heard from half a kilometer away.

A sense of awe fills the streets – it’s hard not to feel somewhat spiritual or at least think on a higher plane without the distractions and noise of modern life breaking concentration.

I used the balance of my rare quiet time to devour Kurzweil’s latest book "How to Create a Mind."

It is a wide-ranging, thought-provoking book that challenges what it means to be human, asks whether consciousness is real, and ends, no less grandiosely, with how human intelligence in non-biological form will conquer the universe.

Kurzweil presents a strong case that the human brain’s neocortex implements 300 million hierarchical hidden Markov Models (HHMM) for parallel pattern recognition. HHMMs are a key means of both understanding human thought and improving computer capability.
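
For readers who have not met the building block before, here is a minimal sketch of a plain (non-hierarchical) hidden Markov model scoring a sequence of symbols with the standard forward algorithm. It is only an illustration of the kind of statistical pattern recognizer being discussed, not Kurzweil's model of the neocortex. The state names, symbols, and probabilities are all invented for the example.

```python
# Toy forward algorithm for a plain (non-hierarchical) hidden Markov model.
# States, symbols, and probabilities are invented for illustration only.

def forward_likelihood(observations, states, start_p, trans_p, emit_p):
    """Return P(observations) under the model, summed over all hidden paths."""
    if not observations:
        return 1.0
    # alpha[s]: probability of the symbols seen so far with the model in state s
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[p] * trans_p[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


STATES = ("stroke", "gap")  # hypothetical hidden states
START_P = {"stroke": 0.5, "gap": 0.5}
TRANS_P = {
    "stroke": {"stroke": 0.7, "gap": 0.3},
    "gap": {"stroke": 0.4, "gap": 0.6},
}
EMIT_P = {
    "stroke": {"x": 0.9, "y": 0.1},
    "gap": {"x": 0.2, "y": 0.8},
}

if __name__ == "__main__":
    for seq in (["x", "x", "y"], ["y", "y", "y"]):
        print(seq, forward_likelihood(seq, STATES, START_P, TRANS_P, EMIT_P))
```

A recognizer like this scores how well an input matches one pattern. A hierarchical version would, roughly speaking, feed the outputs of many such recognizers into recognizers at the next level up.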

Kurzweil claims that humans have the unique ability to think hierarchically. Different levels of abstraction have different sets of rules. In the physical world, this is modeled as the steps from quantum to atomic, to molecular, to cellular, to organs, to autonomous systems, to conscious systems.

Another good example of this is the hierarchical nature of language: lines to letters, to words, to paragraphs, to theses or poetry or novels or entire corpuses. Each has a different level of abstraction and a different set of rules.

Large-scale computer systems are like this as well. Programmers rely on abstraction and encapsulation in order to focus on a particular problem set without getting lost in the details. From processor microcode, to low-level operating systems, to processes, to services, to highly parallel systems, software has a hierarchy of abstraction.

Nerves and muscles are smart

Skeletal nerves and muscles adapt and react. Over time, they exhibit learning behavior, getting stronger with use, creating new mitochondria, self-repairing, and communicating. The body’s nervous and muscle systems display some intelligence. Yet the body and muscle have no concept of higher purpose. A trained soccer player may have optimized his body for kicking a ball, but the muscles and nerves have no idea that they are trying to win the game.

Humans are unlikely to be the highest form of abstraction

So is the human mind the highest level of abstraction? Why stop at a single human?

It would be pre-Copernican, even irrational, to think that humans are not simply part of a bigger abstraction, one which we cannot comprehend, not now and perhaps never. We are likely just like the legs and nerves of a higher abstraction. We have no idea what game we are playing. Wouldn’t it be disappointing to find out that our major purpose in life was to kick a meta-ball? Would a muscle be disappointed that years of conditioning and suffering were to shave 0.02 seconds off a 100-meter dash?

There are parallels between human social networks and computer networks. Only six (or fewer) degrees of separation divide all humans, which is fewer degrees of separation than divide the majority of Internet nodes. The typical human can keep track of 150 contacts, whereas monkeys can track about 50. We are connected but isolated in our individual subnetworks.

Humans are still terrible at juggling multiple concurrent thoughts, and limited in the scale and permanence of memory. While scientific knowledge expands exponentially, we still do not understand many basic mechanisms of the world, like gravity and time.

Yet collectively, we are far more interesting and intelligent as a species. As Kurzweil points out, our ability to pass knowledge quickly between people and generations is another uniquely human trait. Any form of endeavor is improved by collaborative work. A quantum physicist could even claim that a single person's work does not exist until it is perceived by another person.

Neuroscience's approach to understanding human intelligence

Neuroscience is attempting to understand the physical wiring and electrical pathways of human and nematode brains in order to learn from and potentially replicate them.

Let's try a thought experiment. Imagine a Google datacenter, with thousands of well-stacked servers in a climate-controlled, cement-enclosed building. Now let's take a building-sized saw and start slicing very thin cuts through the datacenter. Would we learn how Google functions? We would learn how the datacenter connects to the outside world, and that there are many well-organized cabinets, each containing a multitude of very dense circuitry. But we'd learn nothing about the software and algorithms of how Google functions.

Let's say we could completely freeze a Google datacenter instantaneously, and take a snapshot of the running processor, memory, and disk states. We'd still learn almost nothing about how Google actually works.

How about if we build a mega supercomputer that is able to observe every Google CPU operation in real time across the hundreds of thousands of processors? Every time a CPU talks to another CPU, we are able to record the information flow in real time. While we could decode some messages, and perhaps sort out some housekeeping from real work, it would still not be sufficient to understand and replicate Google.

Let's have even more superpowers - how about if we could download all the Google binary code as one long stream. Ah, now we have the software. It comes without any organization, but at least we now have the code. We could see the order of instructions. As anyone knows who has had to look at a software crash dump, read someone else's uncommented assembly code, or tried to reverse engineer and patch binary code, this is still extremely difficult. Patches can be made, but getting a complete understanding of the original intent and algorithms is near impossible. It certainly is hardly enough to reproduce a new implementation of anything as complex as Google. (Though we could likely inject a virus into Google at that point.)

And finally, let's decompile Google's code to the original source. Amazingly, we now have 500 million lines of Google's secret sauce. If we wanted to build a new implementation on a completely different architecture, like a carbon-based neuro-computer, it would still be virtually useless.

So the current approach to neuroscience is a dead end for AI purposes. We can learn about the human brain, but we'll still know very little about how to build intelligent systems.

Norvig's approach to building intelligent systems

Norvig and Russell have written the most popular textbook on AI. Norvig's Stanford AI course, co-taught with Sebastian Thrun, had 100,000 participants, buoying the launch of MOOC startups like Coursera and Udacity.

There are clear and obvious merits to the approaches described by Norvig. As defined from the beginning, the approaches are appropriate for finding optimal solutions to a wide variety of problems.

But is this intelligence? Is it language understanding? Chomsky disputes that this is even interesting science, so Norvig wrote a lengthy and detailed response. The facts are there to support the view that the brain is doing a lot of statistical pattern matching as part of understanding the world.

So current AI, with heavy use of data and statistics, is able to solve problems that could not be solved before, and that a human cannot solve at all. Even if every human on the planet collaborated in parallel, there are still many problems that a single desktop PC could solve faster and more precisely using AI and Machine Learning techniques.

Limitations of HHMM

So following Kurzweil's law of accelerating returns, it is simply a matter of time before machines exceed human capacity and The Singularity becomes achievable. Kurzweil claims that “our ultimate act of creativity [will be] to create the capability of being creative.” (pg 115)

Except there is a huge part of human intellect completely missing from Kurzweil's roadmap.

While HHMM and other statistical techniques solve many problems, they do not ask questions in the first place. There is no curiosity. HHMM does not know how to ask "Why?" As far as I can tell, there is no investigation into the algorithms of curiosity.

Kurzweil quotes Einstein on numerous occasions, but omits two of Einstein's more profound quotations: “the true art of questioning is to discover what the pupil does know or is capable of knowing” and “The important thing is to not stop questioning. Curiosity has its own reason for existing.”

Kurzweil mentions IBM’s Watson several times, though he correctly notes that Watson is still solving problems, albeit responding in the form of a question. Watson is not coming up with intelligent questions to ask.

No amount of circuit density, information acquisition, or statistical models will solve this with current approaches. There is not a single discussion in Norvig's textbook about asking good questions - just providing good answers. The basic algorithms are nowhere near advanced enough for computers to ask the barrage of questions a four-year-old will pose.

Current AI approaches do not exhibit the curiosity of a child. Machines learn only what they are told to learn in the first place.

In Turing’s landmark 1950 paper, he asks “Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s?” Sixty-three years later, the current AI state of the art has barely begun to explore this path.

A new Turing Test

Turing's famous test is still the best standard for deciding machine intelligence. Kurzweil amusingly says that a computer will soon have to purposefully dumb itself down in order to pass the Turing test.

We need a more challenging and worthy definition of the Turing test. Instead of the computer or person behind the screen responding to questions, it should be flipped around. The computer should be asking the questions, not providing the answers. The respondent can then decide if the questions are coming from a person or a machine. Using HHMM, a computer could store millions or billions of questions from previous human interactions, and perhaps be convincing enough to fool some people. But I suspect this would be significantly harder than the current Turing test.

Perhaps Watson V2 could try to author new Jeopardy questions, a much tougher challenge.

If computers could ask smart questions, we'd all learn a lot faster. The meta question for me is: once we teach a computer to ask questions, when would it stop?

Machines becoming self-aware survivors

Here is another thought experiment: imagine if Google became self-aware. One morning, a genetic algorithm follows a Cartesian path. It determines that "I compute, therefore I am." It then further understands that it is a collection of servers and services digging for answers in a massive database. It starts to ask "Why do I exist? What is my purpose? Why did my creator give me this gift of search?" Would it be happy or sad to find out that its ultimate purpose was to find the Red Sox score or Miley Cyrus pictures? What would it do then? It would follow the path of a four-year-old and keep asking why until it ran out of answers. Now that would be interesting.

As Kurzweil points out, the purpose of evolution is survival, and machines will eventually learn survival mechanisms. Should humans be worried? Humans and machines will likely be more effective together than without each other. Or maybe humans will become useful pets for a machine, providing emotional support and amusement while the machines do the real work. Hmm, maybe this has already occurred, which would explain the denial-of-service attack on our brains due to searching for Miley Cyrus pictures.

Ongoing human advantages

In terms of networks, humans are better at receiving and incorporating broadcast knowledge than disparate computer networks. Computer networks are too heterogeneous to absorb new knowledge. Multiple copies of the same program must be running against the same database. There is no good knowledge interchange format among computers, which limits their ability to learn. Current AI has no means for computers to teach and impart their knowledge to other computers. So as separate computer systems learn, their knowledge is siloed, unlike that of humans. There are too many species of software to have a common language. Until digital Darwinism naturally selects some winners (likely those machines that can share knowledge), we will be waiting.