AI is hard – I hope we can all agree about that.

AI is hard for many reasons, but one of the most important is that it is a subject which is hard to approach directly. We have evidence that intelligent behaviour is possible – we provide that evidence – but the processes that lead to intelligent behaviour are hidden from view, inside our brains. We can’t examine these processes directly, and so when we try to create intelligent behaviour, we have to start from a blank slate.

So, how do AI researchers go about building systems capable of intelligent behaviour?

There are basically two types of approach, and one of these has been shown to be dramatically successful over recent years.

Let’s suppose we want to write a program that can translate texts from English to French. Not very long ago, programs that could do this competently were firmly in the realm of science fiction, and progress in automated translation was so slow that it was something of a cruel inside joke for the AI community. 

Famously – and possibly apocryphally – an early English to Russian translation program translated the sentence ‘The spirit was willing but the flesh was weak’ as ‘The vodka was good but the meat was bad’. Whether or not the story is true (it isn’t), it has an inner truth: machine translation programs were plagued with problems, routinely making blunders in translation that a child would not make. 

The classical approach to machine translation, dominant until this century, was what we might call model-based. With this approach, what we do is try to come up with a model of the behaviour we are trying to reproduce, and to give that model to a computer so that it can use it directly. For English to French translation, the models in question would be models of the English and French languages. 

So, with this approach, we would try to come up with a model of English and French sentences and texts, and use this model to do the translation. First, we would define the structure of sentences (technically, the ‘syntax’ – the rules that determine what counts as a grammatically acceptable English or French sentence or text). We would then use that syntax to analyse the structure of the text to be translated, and from that structure we would hope to derive the meaning of the text (the ‘semantics’). Once we have that meaning, we can go back to our model of the target language and construct a corresponding text from the meaning. 
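To make the pipeline concrete, here is a deliberately toy sketch of the idea in Python. The grammar, vocabulary and rules are all invented for illustration – real model-based systems used vastly larger grammars and lexicons – but the three stages the text describes (parse the syntax, extract a meaning, generate from the meaning) are all there.

```python
# Toy model-based translation: hand-written syntax rules parse a simple
# English sentence into a meaning representation, and hand-written rules
# generate French from that meaning. Everything here is invented for
# illustration; it is nothing like a production system.

ENGLISH_LEXICON = {
    "the": ("DET", "def"),
    "cat": ("NOUN", "cat"),
    "dog": ("NOUN", "dog"),
    "sees": ("VERB", "see"),
    "eats": ("VERB", "eat"),
}

FRENCH_WORDS = {
    ("DET", "cat"): "le",   # the determiner must agree with the noun's gender
    ("DET", "dog"): "le",
    ("NOUN", "cat"): "chat",
    ("NOUN", "dog"): "chien",
    ("VERB", "see"): "voit",
    ("VERB", "eat"): "mange",
}

def parse(sentence):
    """Syntax: accept only the pattern DET NOUN VERB DET NOUN, and
    return a meaning representation (semantics): who does what to whom."""
    tags = [ENGLISH_LEXICON[w] for w in sentence.lower().split()]
    if [t[0] for t in tags] != ["DET", "NOUN", "VERB", "DET", "NOUN"]:
        raise ValueError("sentence not covered by the grammar")
    return {"agent": tags[1][1], "action": tags[2][1], "patient": tags[4][1]}

def generate(meaning):
    """Generation: build a French sentence from the meaning."""
    return " ".join([
        FRENCH_WORDS[("DET", meaning["agent"])],
        FRENCH_WORDS[("NOUN", meaning["agent"])],
        FRENCH_WORDS[("VERB", meaning["action"])],
        FRENCH_WORDS[("DET", meaning["patient"])],
        FRENCH_WORDS[("NOUN", meaning["patient"])],
    ])

print(generate(parse("the cat sees the dog")))  # le chat voit le chien
```

Even this five-word grammar hints at the trouble ahead: any sentence outside the pattern is rejected outright, and every quirk of either language has to be anticipated by hand.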

This approach to natural language understanding requires us to come up with rules defining the grammar, rules for extracting the meaning from the structure of a text, and rules for generating a text from a meaning. So, researchers busily worked on all of these problems – for decades. Ever more elaborate ways of capturing text structure and meaning were developed, and there was progress, but translation using these approaches never came anywhere near human-level performance. 

The problem is, human languages are complicated and messy – they simply resist precise attempts to define their syntax and semantics, and are so full of subtleties, quirks and exceptions that they are seemingly impossible to nail down. 

In the 1990s, another idea began to gain prominence, called statistical machine translation. Remarkably, with this approach there is no attempt to construct any kind of model or understanding of the language in question. Instead, what we do is start with a large number of examples of what we are trying to do (translated texts), and we use statistical methods to learn the probability of particular translations. The basic maths behind this approach is simple, but to make the approach work in practice required lots of data, and lots of processing time to compute the statistical associations. 
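The statistical idea can be sketched in a few lines. The miniature parallel corpus below is invented for illustration, and real systems (such as the IBM alignment models) are far more sophisticated – but the principle is the same: no grammar, no semantics, just counts of which words tend to appear opposite which.

```python
# A crude sketch of statistical machine translation: count how often
# each French word co-occurs with each English word across example
# translations, then translate word by word using the most probable
# pairing. The corpus and the method are toy illustrations only.
from collections import Counter, defaultdict

parallel_corpus = [
    ("the cat", "le chat"),
    ("the dog", "le chien"),
    ("a cat", "un chat"),
]

# Count co-occurrences of (English word, French word) across the pairs.
cooc = defaultdict(Counter)
for english, french in parallel_corpus:
    for e in english.split():
        for f in french.split():
            cooc[e][f] += 1

def prob(f, e):
    """Estimated probability that English word e translates as French word f."""
    total = sum(cooc[e].values())
    return cooc[e][f] / total if total else 0.0

def translate(sentence):
    """Pick, for each English word, the French word it co-occurs with most."""
    return " ".join(max(cooc[e], key=lambda f: prob(f, e))
                    for e in sentence.split())

print(translate("the cat"))  # le chat
```

Notice what the counts buy us: ‘the’ appears opposite ‘le’ in two sentences but opposite ‘chat’ or ‘chien’ only once each, so ‘le’ wins – and nobody had to write a rule saying so. Scale the corpus up to millions of sentence pairs and this is, in spirit, why the approach needed so much data and processing time.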

Statistical machine translation achieved remarkable successes very quickly, and the field was turned on its head. The same ideas began to be applied in other areas, and other learning techniques were investigated and found to work – one of them being deep learning, which is driving much of the present excitement about AI.

So there, in a nutshell, are the two basic approaches to AI. 

With the first, you aim to create a model of the thing you are trying to achieve, and give that model to a computer. This has the advantage of being transparent – we can look at the model and see what is going on. But for complex, real-world problems, coming up with a practically useful model may be impossible.

With the second, you don’t worry about a model – you simply throw lots of examples of the behaviour you are trying to create at a machine learning program, and hope that the program learns the right thing to do. And the exciting thing is, at present, there is a lot of progress with this approach. The very big disadvantage of this approach is that it is opaque. Your program may learn to do its task better than a human, but it can’t tell you how it does it, and ultimately the expertise of the program is encoded in (very long) lists of numbers – and nobody has any idea how to tell what those numbers mean.

This conundrum is at the heart of contemporary AI. We can have transparency, or we can have competence – but at present, it seems, we can’t have both. And that is one of the biggest challenges for AI research today. 

Professor Mike Wooldridge, Professor of Computer Science, Oxford University