In 1739, Parisians flocked to see an exhibition of automata by the French inventor Jacques de Vaucanson performing feats assumed impossible for machines. In addition to human-like flute and drum players, the collection contained a golden duck, standing on a pedestal, quacking and defecating. It was, in fact, a digesting duck. When offered pellets by the exhibitor, it would pick them out of his hand and consume them with a gulp. Later, it would excrete a gritty green waste from its back end to the amazement of audience members.
Vaucanson died in 1782 with his reputation as a trailblazer in artificial digestion intact. Sixty years later, the French magician Jean-Eugène Robert-Houdin gained possession of the famous duck and set about repairing it. Taking it apart, however, he realized that the duck had no digestive tract. Rather than being broken down, the pellets the duck was fed went into one container, and pre-loaded green-dyed breadcrumbs came out of another.
The field of artificial intelligence is currently exploding, with computers able to perform at near- or above-human level on tasks as diverse as video games, language translation, trivia and facial identification. Like the French exhibit-goers, any observer would be rightly impressed by these results. What might be less clear, however, is how these results are being achieved. Does modern AI achieve these feats by functioning the way that biological brains do, and how can we know?
In the realm of replication, definitions are important. An intuitive response to hearing about Vaucanson’s cheat is not to say that the duck is doing digestion differently but rather that it’s not doing digestion at all. But a similar trend appears in AI. Checkers? Chess? Go? All were considered formidable tests of intelligence until they were solved by increasingly complex algorithms. Learning how a magic trick works makes it no longer magic, and discovering how a test of intelligence can be solved makes it no longer a test of intelligence.
So let’s look to a well-defined task: identifying objects in an image. Our ability to recognize, for example, a school bus, feels simple and immediate. But given the infinite combinations of individual school buses, lighting conditions and angles from which they can be viewed, turning the information that enters our retina into an object label is an incredibly complex task — one out of reach for computers for decades. In recent years, however, computers have come to identify certain objects with up to 95 percent accuracy, higher than the average individual human.
Like many areas of modern AI, the success of computer vision can be attributed to artificial neural networks. As their name suggests, these algorithms are inspired by how the brain works. They use as their base unit a simple formula meant to replicate what a neuron does. This formula takes in a set of numbers as inputs, multiplies them by another set of numbers (the “weights,” which determine how much influence a given input has) and sums them all up. That sum determines how active the artificial neuron is, in the same way that a real neuron’s activity is determined by the activity of other neurons that connect to it. Modern artificial neural networks gain abilities by connecting such units together and learning the right weight for each.
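The formula for a single artificial neuron is short enough to write out. What follows is a minimal sketch in Python rather than anything taken from a real vision system; the function name `neuron_activity` and the example numbers are made up for illustration.

```python
def neuron_activity(inputs, weights):
    """Multiply each input by its weight and add the results together."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Illustrative numbers only: three inputs and the influence each one has.
inputs = [1.0, 0.5, 0.0]
weights = [2.0, -1.0, 3.0]

activity = neuron_activity(inputs, weights)
print(activity)  # 1.5
```

A large weight lets its input dominate the sum, a negative weight lets an input suppress the neuron, and an input of zero contributes nothing however large its weight. Learning, in these networks, means adjusting the weights.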
The networks used for visual object recognition were inspired by the mammalian visual system, a structure whose basic components were discovered in cats nearly 60 years ago. The first important component of the brain’s visual system is its spatial map: Neurons are active only when something is in their preferred spatial location, and different neurons have different preferred locations. Different neurons also tend to respond to different types of objects. In brain areas closer to the retina, neurons respond to simple dots and lines. As the signal gets processed through more and more brain areas, neurons start to prefer more complex objects such as clocks, houses, and faces.
The first of these properties — the spatial map — is replicated in artificial networks by constraining the inputs that an artificial neuron can get. For example, a neuron in the first layer of a network might receive input only from the top left corner of an image. A neuron in the second layer gets input only from those top-left-corner neurons in the first layer, and so on.
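This constraint on inputs can be sketched as a neuron that reads only a small patch of the image. The tiny 4-by-4 "image", the patch size and the function name `patch_neuron` below are assumptions chosen for illustration, not details of any particular network.

```python
def patch_neuron(image, top, left, size, weights):
    """A neuron that sees only the size-by-size patch starting at (top, left)."""
    total = 0.0
    for r in range(size):
        for c in range(size):
            total += image[top + r][left + c] * weights[r][c]
    return total


# Hypothetical 4x4 grayscale image, stored as rows of pixel values.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
]
weights = [[1, 1], [1, 1]]

top_left = patch_neuron(image, 0, 0, 2, weights)
bottom_right = patch_neuron(image, 2, 2, 2, weights)
print(top_left, bottom_right)  # 4.0 18.0

# Changing a pixel outside the top-left patch leaves that neuron untouched.
image[3][3] = 100
print(patch_neuron(image, 0, 0, 2, weights))  # 4.0
```

The top-left neuron is blind to the rest of the image, just as a neuron in the brain's spatial map responds only to its preferred location.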
The second property — representing increasingly complex objects — comes from stacking layers in a “deep” network. Neurons in the first layer respond to simple patterns, while those in the second layer — getting input from those in the first — respond to more complex patterns, and so on.
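A toy two-layer stack shows how complexity can build up. In this sketch, which assumes a three-pixel strip as input, two first-layer neurons each detect a simple edge and one second-layer neuron responds only when both edges occur together, that is, to a bright "bump". The bias term and the threshold at zero (`relu`) are common additions that the description above does not mention, and the weights are hand-picked rather than learned.

```python
def relu(x):
    """Threshold at zero: a neuron cannot be less than silent (an assumed addition)."""
    return max(0.0, x)


def layer(inputs, weight_rows, biases):
    """Each row of weights defines one neuron: weighted sum, plus bias, then threshold."""
    outputs = []
    for row, bias in zip(weight_rows, biases):
        total = sum(x * w for x, w in zip(inputs, row)) + bias
        outputs.append(relu(total))
    return outputs


# Layer 1: a rising-edge detector and a falling-edge detector (hand-picked weights).
LAYER1_WEIGHTS = [[-1.0, 1.0, 0.0], [0.0, 1.0, -1.0]]
LAYER1_BIASES = [0.0, 0.0]
# Layer 2: active only when both edges are present, i.e. a bright bump.
LAYER2_WEIGHTS = [[1.0, 1.0]]
LAYER2_BIASES = [-1.0]


def bump_detector(pixels):
    edges = layer(pixels, LAYER1_WEIGHTS, LAYER1_BIASES)
    return layer(edges, LAYER2_WEIGHTS, LAYER2_BIASES)[0]


print(bump_detector([0.0, 1.0, 0.0]))  # 1.0: dark, bright, dark is a bump
print(bump_detector([0.0, 1.0, 1.0]))  # 0.0: a single edge is not a bump
```

The second-layer neuron never sees the pixels directly. It knows only what the edge detectors report, and that is how each added layer comes to prefer something more complex than the layer before it.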
These networks clearly aren’t cheating in the way that the digesting duck was. But does all this biological inspiration mean that they work like the brain? One way to approach this question is to look more closely at their performance. To this end, scientists are studying “adversarial examples” — real images that programmers alter so that the machine makes a mistake. Very small tweaks to images can be catastrophic: Changing a few pixels on an image of a teapot, for example, can make the network label it an ostrich. It’s a mistake a human would never make and suggests that something about these networks is functioning differently from the human brain.
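Why tiny tweaks can matter so much is easiest to see in a deliberately simplified stand-in. The sketch below is not a deep network: it is a single weighted sum over 100 made-up "pixels", with hypothetical weights and the labels borrowed from the example above. Each pixel is nudged by the same small step in whichever direction lowers the score.

```python
def score(pixels, weights):
    return sum(p * w for p, w in zip(pixels, weights))


def label(pixels, weights):
    # Toy decision rule: a positive score means "teapot", otherwise "ostrich".
    return "teapot" if score(pixels, weights) > 0 else "ostrich"


def adversarial_tweak(pixels, weights, step):
    """Move every pixel by at most `step` in the direction that lowers the score."""
    tweaked = []
    for p, w in zip(pixels, weights):
        direction = 1 if w > 0 else -1 if w < 0 else 0
        tweaked.append(p - step * direction)
    return tweaked


# Hypothetical classifier: 100 pixels with weights alternating between +1 and -1.
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
# Hypothetical image: pixels with positive weights are very slightly brighter.
original = [0.52 if i % 2 == 0 else 0.50 for i in range(100)]

tweaked = adversarial_tweak(original, weights, step=0.02)
largest_change = max(abs(a - b) for a, b in zip(original, tweaked))

print(label(original, weights))  # teapot
print(label(tweaked, weights))   # ostrich
print(largest_change)            # about 0.02 per pixel
```

No single pixel changes by more than about 0.02 on a 0-to-1 scale, but because every change pushes the sum the same way, the small nudges add up and flip the label. Attacks on real networks are more sophisticated, but they exploit a similar accumulation of coordinated small changes.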
Studying networks this way, however, is akin to the early days of psychology. Measuring only environment and behaviour, in other words input and output, is limited without direct measurements of the brain connecting them. But neural-network algorithms are frequently criticized (especially among watchdog groups concerned about their widespread use in the real world) for being impenetrable black boxes. To overcome the limitations of this techno-behaviourism, we need a way to understand these networks and compare them with the brain.
An ever-growing population of scientists is tackling this problem. In one approach, researchers presented the same images to a monkey and to an artificial network. They found that the activity of the real neurons could be predicted by the activity of the artificial ones, with deeper layers in the network more similar to later areas of the visual system. But, while these predictions are better than those made by other models, they are still not 100 percent accurate. This is leading researchers to explore what other biological details can be added to the models to make them more similar to the brain.
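One simple way to ask whether artificial activity "predicts" real activity is to fit a straight line from one to the other and measure how much of the variation it explains. The sketch below assumes that approach rather than reproducing the study's actual method, and every number in it is a made-up placeholder, not recorded data.

```python
def fit_line(xs, ys):
    """Ordinary least squares: the line y = slope * x + intercept that best fits the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predicted):
    """Fraction of the variation in ys explained by the predictions (1.0 is perfect)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Placeholder values, one per image: the activity of one artificial neuron and
# the firing rate of one real neuron. These are invented for illustration.
artificial = [0.0, 1.0, 2.0, 3.0]
recorded = [1.0, 3.5, 4.5, 7.0]

slope, intercept = fit_line(artificial, recorded)
predicted = [slope * a + intercept for a in artificial]
fit_quality = r_squared(recorded, predicted)
print(round(fit_quality, 3))
```

A score of 1.0 would mean the artificial neuron accounts for everything the real one does; anything lower leaves part of the real neuron's behaviour unexplained. A careful comparison would also fit the line on one set of images and test it on another, a step omitted here for brevity.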