It’s been a busy period here at the Big Data Institute, or BDI, Oxford’s recent addition to the rapidly growing biomedical research campus.

Over just a few days we’ve hosted a board meeting for a major pharmaceutical company, held a kick-off meeting for a collaboration with another, given a tour to representatives from HM Treasury and BEIS, participated in a NICE expert working group on real-world evidence, and been part of a successful Oxford-led bid to establish a hub for AI in biomedical imaging.

It seems that everyone wants to know about AI, machine learning and big data in health research. And it’s not surprising. The dramatic advances we’ve seen in the ability of algorithms to identify and use complex patterns in images, documents, streams of financial data and other data-rich domains are beginning to transform the way in which biomedical and health data-related research can be carried out. 

From solving mundane but critical tasks, such as maximising the efficiency of healthcare delivery, to the holy grails of automated drug design or individualised therapy, AI is being deployed across the world with enthusiasm, hype and occasional success.

Within Oxford, we’ve been fortunate enough to have in place many of the pieces we need to make real the promise of biomedical AI: an incredible history of population health research, stretching back to Richard Doll and the British doctors’ study on smoking, with an emphasis on clinical trials and population-scale longitudinal measurement; huge strength in the statistical underpinnings of AI, often referred to as machine learning; a community of clinician-scientists who have the insight and drive to understand the need and to help facilitate and shape data-driven research programmes; and a university’s worth of fantastic engineers, informaticians, epidemiologists, genomicists and so on, excited by collaborative research and hungry to see their insights make a difference to patients. 

The BDI acts as a hub for such activity, supporting the necessary training, computational infrastructure and information exchange, while also leading research programmes ranging from mapping the burden of antimicrobial resistance across the world, to developing mobile apps for measuring the parts of memory that are fastest to decline in dementia. 

To a large extent the needs of an AI-driven research programme in healthcare are not so different from any other data-driven problem. Take, for example, the challenge of automated feature prioritisation from imaging modalities, such as pathology or radiology. We want the computer to help the clinician spot features of importance, building on sets of expert-curated training data, coupled with learning algorithms that improve with experience. This requires a closed loop between the engineers, clinicians and algorithm developers, together with access to the critical high-quality data sources. 
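The expert-in-the-loop pattern described above can be sketched in a few lines of code. This is purely illustrative: the feature vectors, labels and the nearest-centroid classifier here are hypothetical stand-ins for the far richer image features and deep-learning models used in real pathology or radiology work, but the shape of the loop is the same — experts label examples, a model is trained on them, and the model then flags new cases.

```python
# Toy sketch of learning from expert-curated training data.
# A nearest-centroid classifier stands in for a real imaging model.

def centroid(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(labelled):
    """labelled: list of (feature_vector, label) pairs from expert review."""
    by_label = {}
    for features, label in labelled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}

def predict(model, features):
    """Assign the label whose centroid is closest (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist(model[label], features))

# Hypothetical "image features" (say, texture and density scores)
# curated by a clinician into two classes.
expert_labels = [
    ([0.9, 0.8], "flag-for-review"),
    ([0.8, 0.9], "flag-for-review"),
    ([0.1, 0.2], "normal"),
    ([0.2, 0.1], "normal"),
]
model = train(expert_labels)
print(predict(model, [0.85, 0.75]))  # → flag-for-review
```

Each new batch of expert labels simply extends `expert_labels` and retrains the model — the "improves with experience" half of the loop.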

This is the type of problem AI has proven hugely competent at solving – it’s a game, like chess or Go, where the rules are set and the machine has to learn the best strategies. Clearly, issues such as repeatability, reproducibility and generalisability are important, but we don’t particularly require the machine to explain why a particular decision has been made. We just need a good decision, fast.

But many of the core problems in biomedicine are fundamentally different from this class of task. Consider the problem of investigating whether some patients respond better to one type of drug than another. Resources, such as the UK Biobank, which are measuring vast amounts of biological, clinical, behavioural and medical data on hundreds of thousands of people, give unprecedented power to find complex patterns. So if we were to use AI to ask whether there are differences in the medical trajectories between those patients given drug A or drug B, the answer would almost certainly be yes. Put another way, by looking at the entirety of a person’s data, I can probably work out whether they were given drug A or drug B with reasonable confidence. 

But that doesn’t necessarily mean that these differences were the result of taking the different drugs. Perhaps drug A is more often given to those who are likely to do well because they have fewer other diseases, or because its use just happens to be preferred in a couple of hospitals that have particularly good specialists and care pathways for the disease. 

In this and many other medical problems the critical intelligence we need is an understanding of causality – the health benefit likely to arise from a particular intervention. And this doesn’t fall naturally from AI. Rather, it is something that only clinical trials, despite their cost and time, can assess. 

So what is the role of AI in such work? There are two key areas, both of which the BDI is pursuing. First, we can use AI to make us much smarter about generating therapeutic hypotheses to take to trials, building on a growing wealth of data types that give us clues to causality (such as genomics, longitudinal data, experimental screens and high-resolution biological measurement). Second, we use AI to make trials themselves better, by finding the patients most likely to benefit, the readouts able to measure impact the fastest, and by analysing the clinical data arising to refine hypotheses and iterate. 

AI is ultimately just a tool, but it’s one that allows us to do science better and get the benefits out into the real world faster. 

Professor Gil McVean, Professor of Statistical Genetics and Director of the Big Data Institute