Why We’re Always On The Cusp Of Artificial General Intelligence

The following post is my argument for why we can’t know how long it will take to develop AGI. It is based on various assumptions, so I’d be interested to read any feedback you have on it!

 

Explanation of Terms: AGI & ANI

There are three broad forms of artificial intelligence: artificial narrow intelligence (ANI), artificial general intelligence (AGI) and artificial super intelligence (ASI). This post is about AGI, but ANI is also needed to explain where we are, so I will briefly explain both… by copying and pasting the explanation used on WaitButWhy:

AI Caliber 1) Artificial Narrow Intelligence (ANI): Sometimes referred to as Weak AI, Artificial Narrow Intelligence is AI that specializes in one area. There’s AI that can beat the world chess champion in chess, but that’s the only thing it does. Ask it to figure out a better way to store data on a hard drive, and it’ll look at you blankly.

AI Caliber 2) Artificial General Intelligence (AGI): Sometimes referred to as Strong AI, or Human-Level AI, Artificial General Intelligence refers to a computer that is as smart as a human across the board—a machine that can perform any intellectual task that a human being can. Creating AGI is a much harder task than creating ANI, and we’re yet to do it. Professor Linda Gottfredson describes intelligence as “a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience.” AGI would be able to do all of those things as easily as you can.

 

The Explosive Growth

Knowledge – and access to that knowledge – is a key component of any AGI, and the internet is used for precisely that reason. The internet has completely changed how humans interact with their daily environment compared to twenty years ago. Back then, no one imagined our newfound connectivity would change culture so much. No doubt it will be fundamental to how AGIs interact with their environment in the future.

We’ve also made ANIs that can drive cars, beat humans at Go, and exhibit creativity in a specific field by creating music or art – even an ANI that can learn, through neural networks, how to play a game of Mario.

There is even a computer that has passed the Turing test (well, it convinced 33% of the judges that it was human): Eugene Goostman. And tech companies like Google are getting into serious bidding wars over experts in the field of AI. That must mean we’re close, right?

 

The Naive Optimism

In the 1950s, thanks to scientific advances from Turing, von Neumann and others, the scientific community started looking more and more into the prospect of AI, and the field of AI was born.

There was huge optimism about the rate of growth within AI development. In 1958, H. A. Simon and Allen Newell stated: “within ten years a digital computer will be the world’s chess champion” and “within ten years a digital computer will discover and prove an important new mathematical theorem.” We know, with hindsight, that the former prediction took four times longer than expected – with Deep Blue beating Kasparov in 1997 rather than 1968 – and the latter prediction is still yet to be realised.

Then, with even more optimism, the following claims – none of which have yet come true – were made:

In 1965, H. A. Simon stated: “machines will be capable, within twenty years, of doing any work a man can do.”
In 1967, Marvin Minsky said: “Within a generation … the problem of creating ‘artificial intelligence’ will substantially be solved.”
And in 1970, Marvin Minsky claimed (in Life Magazine): “In from three to eight years we will have a machine with the general intelligence of an average human being.”

Why were these great scientists so wide of the mark? Can we predict just how long it will take us to achieve AGI?

 

Task Breakdowns

“When eating an elephant take one bite at a time.” – Creighton Abrams

A standard technique when faced with a large task is to break it down into smaller chunks. Many British readers will remember this strategy from BBC Bitesize, which helped us use it to study for our GCSEs. The technique has become so ubiquitous that it shapes the way we view tasks: when we see a large task, many of us unconsciously break it up into smaller, more manageable chunks. This lens is constantly reinforced, because examples of its application are everywhere: the categorisation of academic subjects, the division of science into disciplines, even the hierarchical structure of companies is founded on this idea. Large tasks are split into smaller chunks, the chunks are categorised by their nature, and each is handed to experts in the related field: legal tasks to lawyers, the understanding and creation of complex objects to engineers, the selling of products and services to the sales team, and the job of keeping people and workloads on track, prioritised and organised to managers.

So how does this all affect our perception of how close we are to AGI? Well, naturally, scientists will break the task down into easier chunks. In their minds, these chunks might look something like this:

1. Create an artificial neuron (= the transistor). Check. (But not really – we’ll revisit this later.)

2. Connect millions of these neurons together so that they form something similar to a neural network (= a modern-day computer, which contains more transistors than a fruit fly has neurons). Check.

3. Write the artificial equivalent of the “software of nature”. This field is growing today and is called Machine Learning. No check yet.
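To give a flavour of what step 3 actually involves, here is a minimal sketch (my own toy example, not from any real system) of the kind of “neuron” machine learning works with – a weighted sum pushed through a threshold, whose weights would normally be learned from data:

```python
# A toy perceptron (my own sketch): the machine-learning notion of an
# "artificial neuron" is a learned weighted sum plus a threshold, which
# already looks nothing like a bare transistor.

def perceptron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, then a hard threshold.
    activation = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if activation > 0 else 0

# Hand-picked weights implementing logical AND, to keep the example tiny;
# in real machine learning these weights would be learned from examples.
weights, bias = [1.0, 1.0], -1.5
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", perceptron([a, b], weights, bias))
```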

So we’re pretty close, right? We’ve got 2 out of 3 tasks complete. Well, no, not really. At this stage, many people could be – and probably generally are – assuming that each task is roughly the same size in terms of complexity, and thus will take roughly the same time to complete. But we just don’t know. The problem is even bigger than that: we don’t know what we don’t know about the rest of the puzzle. Let me explain…

 

The Unknown

“Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones.” – Donald Rumsfeld

What on earth is Rumsfeld talking about? Well, actually, he’s talking about something very valid. Psychologists would call it metacognition: our ability to perceive and make judgments about our own knowledge and skills. Many models have been built on metacognition, such as the Four Stages of Competence and the Dunning-Kruger effect. If we take the phrase ‘unknown unknowns’, the first ‘unknown’ relates to our perception of our knowledge, and the second ‘unknown’ relates to whether we actually hold that specific packet of knowledge.

The Unknown Unknown Matrix
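The matrix itself isn’t reproduced here, but from the description above it can be reconstructed roughly as follows (my sketch – the first word of each term reflects our perception, the second whether we actually hold the knowledge):

```
                          We hold the knowledge    We lack the knowledge
We perceive it            Known knowns             Known unknowns
We don’t perceive it      Unknown knowns           Unknown unknowns
```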

 

There are a few attributes associated with each quadrant, but the main one of interest here is that unknown unknowns, by their very nature, cannot be measured: we have no way of gauging how much knowledge they contain. I think we’re starting to see the problem.

 

Two Paths To Innovation

There are two main paths to take when creating something new.

1.  The joining of two already existing sub-components – which had never been joined before – to create something new

The wheel has been around for a while. The suitcase has also been around for a while. But a wheely suitcase? It took us humans until the 1970s – dragging heavy suitcases around on every holiday – before one clever inventor finally came up with the novel idea of putting the wheel and the suitcase together.

In this situation, all we need to do is get the expert on each component part in to talk about how the relationship between the sub-components will work. If it’s a simple amalgamation, maybe not even that. For example, to build a wheely suitcase, we might need a designer – to make it aesthetically pleasing – and an engineer – to make it structurally and physically sound (and to add the wheels). Maybe also a separate production engineer to find the correct tools and develop a production line for the product. All of the knowledge required already exists, which makes it easier to predict how long development might take. We just need to find people with the knowledge already in their heads and combine it in the novel way.

2. Research and development: applying new science to create new technology

This is the trickier route to new products. But where the barrier to entry is greater, the incentive may also be larger: in monetary terms, but also in prestige and personal sense of achievement. It is also the form of invention AGI development takes when it tries to replicate the brain.

Replicating the brain to create an AGI requires a two-pronged strategy. The first prong is further research into the mechanisms of the brain, so we can better understand how it works; this will create a blueprint for building an artificial brain. The second is further development of artificial brain parts (computer chips), so we have better and better devices with which to emulate a brain. The two paths start at opposite ends of the room, but as each side develops they get closer and closer together… until finally, at some point in the future, they meet: the point at which we have the knowledge and skills to create an AGI by building an artificial brain. However, the distance left to cover will remain unknown until it has been covered. It is like seeing light at the end of a tunnel: we can see the end, but no matter how far we travel, the light seems just as far away as before. Then, before we realise it, we can make out shapes beyond the end of the tunnel, we keep walking, and we suddenly find ourselves in the outside world with the wind on our faces.

 

The Hindsight Bias

Just a quick one before moving on: it is worth touching upon the hindsight bias. This is usually applied to history, where it seems very obvious, looking back, which factors contributed to the events that unfolded. It seems clear, in hindsight, that certain financial instruments and sub-prime mortgages contributed massively to the financial crash of 2008. Was an imminent crash obvious at the time? No, of course not. To look back at history is to remove all the noise and see pure cause and effect.

In the same way that the hindsight bias applies to history, ‘development bias’ (I just made this one up) applies to the production of a new product. In hindsight, it might be clear how much complexity was involved in steps 1 and 2 of the AGI task list, but the path ahead for step 3 remains very unclear. It is easy to see, in hindsight, how to create an object once the path has been set out. During product development, however, there is a multitude of attractive-looking paths, and we never know which one will be successful until we have walked it.

 

Create an artificial neuron (= the transistor). Uncheck.

We humans work by analogy. An electric circuit is a water pump: the water represents electrons being pumped around the circuit. The brain’s neurons are logic gates: they can be either on or off.

All analogies and models have thresholds at which they break down, however. Analogies of the brain are arguably getting stronger: from clay and water in ancient Greece, to a mechanical machine during the industrial revolution, to a computer chip in our current era. But we still underestimate the complexity of the brain by over-simplifying it to an analogy we can understand, and this has tricked us into believing we can seamlessly replace a neuron with a transistor. A neuron can be turned ‘on’ by an electrical pulse greater than a certain voltage, but this ‘on’ state is temporary. A neuron has no way of storing information in an on-off state the way a transistor does: the two components use completely different mechanisms to operate. A computer, as we know them today, will never be able to operate like a brain. This means we only have 1 of the 3 boxes checked, and we’ll have to traverse the lands of unknown unknowns in search of our artificial neurons.
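To make the difference concrete, here’s a minimal sketch (my own illustration, with arbitrary parameter values) contrasting a transistor-style latch, which holds a bit indefinitely, with a leaky integrate-and-fire neuron – a standard simplified neuron model – whose ‘on’ state is only ever momentary:

```python
# My own toy contrast (not from the post): a transistor-based latch stores a
# bit indefinitely, while a leaky integrate-and-fire neuron's "on" state is a
# momentary spike that decays back to rest.

def latch(state: bool, set_pulse: bool) -> bool:
    """A flip-flop-style latch: once set, it stays set (until reset)."""
    return state or set_pulse

def lif_step(v: float, input_current: float, leak: float = 0.1,
             threshold: float = 1.0) -> tuple[float, bool]:
    """One time step of a leaky integrate-and-fire neuron.

    The membrane potential v integrates input but constantly leaks away;
    crossing the threshold fires a spike and resets v. Being 'on' is an
    event, not a stored bit.
    """
    v = v * (1 - leak) + input_current
    if v >= threshold:
        return 0.0, True  # spike, then reset: the neuron does not stay on
    return v, False

# The latch remembers a single pulse forever...
state = latch(False, set_pulse=True)
print(state, latch(state, set_pulse=False))  # True True

# ...whereas the neuron's potential decays back to rest once input stops.
v = 0.0
for t in range(8):
    v, spiked = lif_step(v, input_current=0.4 if t < 3 else 0.0)
    print(f"t={t} v={v:.2f} spiked={spiked}")
```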

 

The Application of Unknown Unknowns to the AGI Task Hierarchy

“Fail fast, fail forward” – Silicon Valley Mantra

Tasks that have never been undertaken before are unknown unknowns. We know roughly how long tasks that others have achieved before will take, because we can gauge how long they took. With unknown unknowns, however, all we can do is fail forward: work in a way that moves us from not perceiving what we don’t know to perceiving what we don’t know. We become cartographers of our own map of knowledge, sketching in ever more detail the areas we’ve acquired and the areas we haven’t. The areas yet to be discovered are dark and potentially limitless in size.

This brings us to task no. 3 for creating AGI. We just don’t know how long it will take us to replicate a human brain, because of what we have yet to achieve. We still understand only a tiny fraction of the brain’s complexity, and machine learning is still in its relative infancy. We might argue that, based on our current understanding of quantum mechanics and our ability to manipulate matter at the nanoscale, we’re fairly close to understanding the brain – but that assumes we already have the broad strokes filled in. There might be a whole scientific discipline that we have yet to master before we understand the inner mechanics of the human brain, and we don’t even realise we don’t know it yet.

It seemed like a small step, back in 1960, from transistors to AGI. With hindsight, we can clearly see that there was still so much to learn that the scientists of the time hadn’t even perceived yet: the path to AGI has been far longer than anticipated. It might be tempting to think, again, that the step from where we are now to AGI must be small – our success feels almost palpable. But in reality we just don’t know how far we still have to go to make an AGI perform the way the human brain does.

Yet just because our brain performs as an AGI, that doesn’t mean all AGIs need to behave like a brain. To induce, from seeing one white swan, that all swans are white is a fallacy. Are there other ways to create an AGI?

 

More Than One Way To Skin A Cat

We can use task breakdown again, but this time to break down the steps an AGI takes to perform tasks. If we break down each task an AGI (a human) performs far enough, we start to see that even the most complex tasks are just combinations of many simple processes – sub-tasks that individual ANIs can each perform.

This bottom-up ANI amalgamation is the strategy Google employs in hopes of achieving an AGI. A single ANI is fairly limited by itself in scope, but it can perform a specific task. Maybe it can defeat the best chess players, or maybe it can just tell you what time it is. Hook this ANI up to the internet, though, and it can be used when needed. If a second layer is placed over the top of all these ANIs – a layer which is able to assess the task, then access and apply the appropriate ANIs to complete this task – we have just made our first AGI.
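As a rough sketch of that layered idea (entirely my own toy model, not Google’s actual architecture), the “second layer” is essentially a dispatcher that classifies a task and routes it to whichever narrow specialist it has registered:

```python
# A toy model (my own, not any real system) of the "layer over many ANIs"
# idea: a coordinator that assesses an incoming task and routes it to the
# one narrow specialist that can handle it.

from typing import Callable

class Coordinator:
    """The hypothetical top layer: assess the task, pick an ANI, run it."""

    def __init__(self) -> None:
        self._anis: dict[str, Callable[[str], str]] = {}

    def register(self, task_type: str, ani: Callable[[str], str]) -> None:
        # Each ANI is registered against the single task it specialises in.
        self._anis[task_type] = ani

    def handle(self, task_type: str, payload: str) -> str:
        ani = self._anis.get(task_type)
        if ani is None:
            # The gap the post describes: a task with no matching ANI yet.
            return f"no ANI available for '{task_type}'"
        return ani(payload)

coordinator = Coordinator()
coordinator.register("chess", lambda position: "play e2e4")  # stand-in ANI
coordinator.register("time", lambda timezone: "12:00 UTC")   # stand-in ANI

print(coordinator.handle("chess", "starting position"))  # play e2e4
print(coordinator.handle("poetry", "write a sonnet"))    # no ANI available
```

Of course, the hard part this sketch waves away is the assessment step itself: deciding which category a messy real-world task falls into may turn out to be the bulk of the problem.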

With this form of AGI, we may be able to class invention as the joining of many objects that already exist – ANIs – to form something new: the AGI. The top-layer part of the AGI, which judges the task and accesses each ANI, would still need to be developed, but even this would be more an adaptation of existing technology. A prediction could not be made with complete accuracy – there would still be areas outside the analyst’s view. Yet if the number of ANIs necessary to form a functioning AGI can be counted, and the speed at which the missing ANIs can be created can be estimated, a prediction could be made. Perhaps the next question is: is it possible to calculate how many ANIs are needed before an AGI emerges?

 

To Be Continued…?

So maybe the first AGI won’t pop into existence fully formed. It will crawl slowly into view, assimilating more and more ANIs until it can do any task a human can. No doubt, in time, it will continue to grow different ANI limbs in all directions, until it has completely surpassed humanity.

 

Intrinsic Value vs Book Value

We’ve all had moments when we need to buy a product and just turn to the best-known name in the industry. Or maybe we instantly reach for the most prestigious name. We don’t look at what the product can do on paper; we just trust that it beats its competitors because of the badge attached to it.

I’ve been looking at buying a Ducati recently. I put a few tentative bids down on eBay and bought a book describing the development of the specific bike I’ve been looking at. I got really engrossed in how they created the bike: how everything they do is derived from two principles, handling and power; how they pretty much build the whole bike around the engine; and the heritage they have with L-twin engines.

And then I had a weird realisation: I was buying the brand more than I was buying the product. I was buying the story of the bike and the association with Ducati more than its technical ability. I had shifted from my product-focused philosophy to a brand-focused one.

Up until now, I’ve rarely cared about a brand. I’ve always judged a product’s merits on its ability alone, untethered from where it came from. This is the process of making a decision using logic: quantify the product’s specification against its cost, and the product with the most “bang for its buck” wins (i.e. the best specs-to-price ratio). And I still think this is the correct approach if you’d like to be rational. On paper, the Ducati is seriously overpriced relative to bikes with similar stats from other manufacturers.
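As a trivial sketch of that specs-to-price decision rule (all numbers invented purely for illustration):

```python
# Invented numbers, purely to illustrate the specs:price ("bang for buck")
# rule described above.
bikes = {
    "Ducati":     {"horsepower": 150, "price_gbp": 16000},
    "Competitor": {"horsepower": 150, "price_gbp": 11000},
}

for name, bike in bikes.items():
    ratio = bike["horsepower"] / bike["price_gbp"] * 1000
    print(f"{name}: {ratio:.1f} hp per £1000")
# On pure specs:price, the cheaper bike with identical stats wins every time.
```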

But other people judge a product’s merits less on the actual product and more on where that product came from. They have “brand focus”. This isn’t necessarily a bad thing – though it’s not a completely rational thing to do, in my opinion – and there is definitely an intangible value in owning a specific brand, a bit like the intangible value of art. You can’t rationally derive that value from specifications (speed, braking ability, durability, etc.); instead you start deriving value from how the product makes you feel. And that, in my opinion, is a slippery slope.

Famed investor Warren Buffett calls these two ways of valuing ‘intrinsic value’ (product focus) and ‘book value’ (brand focus: how much the stock is actually selling for, regardless of its calculated worth).

Regardless of all that, Ducatis seem to hold their value remarkably well. A Ducati’s perceived worth isn’t dented by any lack of specs: it holds the price it originally sold for simply because it is still a Ducati, and people continue to perceive it as valuable. The same is apparent when buying stocks. There’s the worth of the company based purely on stats – its assets – and then there’s public opinion on how much the company will grow, which is pure speculation and creates the perceived worth of each stock.

So, in the end, does it really matter where the value is derived from, as long as it’s stable and predictable? To some extent, I think it still does. Brands can fall from grace with the public: people might start to perceive Ducati as less luxurious. And public perception of a company’s growth can change overnight – and with it, the perceived worth of its stock. Stats are less fickle: horsepower won’t change overnight unless someone takes a wrench to the bike (or mistreats it). Will that stop me from buying a Ducati in the future? Only time will tell…

First Post!

Hello. Welcome to Solidifying Nebulous Ideas. Come in. Take a look around.

I’ve written quite a few posts up to this point, so I’ll populate this blog with the ones already written in the next few days. Before I do that, though, I figured that a fresh, new, first post would be good to set the scene. But what new content can I talk about to represent what’s in store for this blog?

I thought I’d take the time to explain what this blog’s favicon (the icon that sits in your browser’s tab for this site) is – and why I chose it – as it holds quite a bit of meaning to me. You’ll probably see it popping up around the site in due course.

The favicon is a hypercube:

To me, it’s a symbol for ideas that are beyond our comprehension. This blog can be summed up as a collection of ideas that have developed from chaos into understandable order throughout my life. But, in my opinion, the hypercube will never be comprehensible. It represents the limit to our understanding.

“Okay, okay, the hypercube symbolises something for you,” I hear you say. “But what physically is a hypercube?” A hypercube is the 4D analogue of a cube. But as we can’t see 4D objects, what you’re looking at is a 3D shadow of the 4D shape – the best we can achieve in our meagre 3 dimensions. Actually, because you’re looking at it on a 2D screen, it’s a 2D representation of a 3D shadow of a 4D shape. The picture below performs the same translation from 3D down to 2D that our hypercube image performs from 4D down to 3D:

Every edge of the hypercube is either perpendicular or parallel to every other edge. So the edges that look like they’re moving into the centre of the cube are actually moving away from the cube, in a direction perpendicular to it, in the 4th dimension. A bit of a mind-fuck, right?
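If you’d like to poke at this yourself, here’s a small sketch (my own illustration) that generates the hypercube’s 16 vertices and squashes away the 4th coordinate with a simple perspective divide – the cube at w = -1 shrinks towards the centre and the cube at w = +1 expands, giving the familiar cube-within-a-cube picture:

```python
# My own illustration: project the 16 vertices of a hypercube from 4D to 3D
# with a simple perspective divide on the 4th coordinate, w.

from itertools import product

def project_4d_to_3d(vertex, viewer_distance=3.0):
    x, y, z, w = vertex
    scale = viewer_distance / (viewer_distance - w)  # perspective divide on w
    return (x * scale, y * scale, z * scale)

# Every combination of +-1 coordinates gives the 16 corners of the hypercube.
for v in product((-1.0, 1.0), repeat=4):
    projected = tuple(round(c, 2) for c in project_4d_to_3d(v))
    print(v, "->", projected)
# The w = -1 corners come out scaled by 0.75 (the inner cube), and the
# w = +1 corners scaled by 1.5 (the outer cube).
```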

I challenge anyone to be able to conceptualise the hypercube. Personally, I think it’s beyond any human understanding, because imagining a 4D world is beyond our perception; nothing in everyday life equips us to imagine how something would behave in 4D. However, with new games coming out that let us explore 4-dimensional environments, maybe we can become more intuitive about the idea. The 2D-to-3D equivalent of such a game is Fez (one of my favourites). In Fez, the world is laid out in 3D, but you only ever traverse it in 2D: you can rotate the angle from which you perceive each 3D environment to bring platforms closer or further away in your 2D plane. Go one step further – a game that lets you explore a 4D environment while perceiving a 3D space (Fez + 1D) – and you get Miegakure.

Just like apes, who look at skyscrapers and can only fathom the geometric shape as something that exists within the rest of nature – they can’t imagine the careful material selection, the stress calculations on the supporting framework, the aesthetic design process – we can’t grasp the hypercube. Maybe in the future, when we create superintelligent AI (ASI), computers will be able to imagine a hypercube intuitively, and we will be the apes. But for now, the hypercube is also a symbol of how far we’ve come. When we are born, we are like the ape; as we grow in the world, we move more and more towards the ASI. There will always be things beyond our comprehension, but these posts contain the success stories that create order out of chaos: the concepts that have been solidified from the nebulous.