SCW_AUGSEPT17

laboratory informatics ➤

analysis, associated biological molecules and compounds, together with full-text scientific literature and clinical trials data, potential drug candidates, and all associated drug-related chemical and structural data,’ explained Jabe Wilson. ‘The challenge was for the team to exploit this broad set of diverse information through the application of a machine learning algorithm that could extract the relevant features to provide answers to the challenge.’

By applying machine learning approaches, including ‘guilt by association’ – essentially drawing inferences about one drug’s activity based on its association with the same biological target/s as a second drug – the FA challenge team devised a huge network of genes and their associations with potential drug compounds.

They identified two small molecule drugs, deferiprone and deferoxamine, as possible therapeutic candidates for treating the genetic disorder. Both drugs are currently marketed for indications that are not related to FA. ‘The results have now been passed to Findacure to investigate further,’ Wilson said.

The Hackathon, Lynch believes, exemplifies the power and versatility of deep learning to help address some of the primary issues in the drug discovery and development arena.
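As a rough illustration of the ‘guilt by association’ idea described above, the sketch below ranks drugs by how strongly their biological targets overlap with those of a reference drug. It is a minimal example under stated assumptions, not the method the challenge team used: the drug and gene names are hypothetical placeholders, and the Jaccard overlap score is one simple choice of association measure among many.

```python
# Hypothetical toy data: drug -> set of biological targets it is associated with.
# None of these names refer to real drugs or genes.
DRUG_TARGETS = {
    "drug_A": {"gene_1", "gene_2", "gene_3"},
    "drug_B": {"gene_2", "gene_3"},
    "drug_C": {"gene_4"},
    "drug_D": {"gene_1", "gene_5"},
}


def jaccard(a, b):
    """Overlap of two target sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_by_association(reference, drug_targets):
    """Rank the other drugs by target overlap with the reference drug."""
    ref_targets = drug_targets[reference]
    scores = {
        drug: jaccard(ref_targets, targets)
        for drug, targets in drug_targets.items()
        if drug != reference
    }
    # Keep only drugs that share at least one target, highest score first.
    return sorted(
        ((drug, score) for drug, score in scores.items() if score > 0),
        key=lambda item: item[1],
        reverse=True,
    )


ranking = rank_by_association("drug_A", DRUG_TARGETS)
for drug, score in ranking:
    print(f"{drug}: {score:.2f}")
```

In this toy network, drugs that share more targets with the reference drug rank higher, which is the sense in which activity is inferred by association. A real analysis would work over a far larger gene and compound network and would need experimental follow-up.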


It demonstrated to industry, regulators and other healthcare stakeholders the potential power of deep learning in diverse R&D and clinical settings. It will now be up to all these stakeholders to work together to fit machine learning technologies and approaches into regulatory-compliant clinical and pharmaceutical discovery development settings, and for uncovering new biological pathways and processes, Lynch believes.

‘It’s that balance between caution and excitement. We sit in a time of opportunity and change, and I think that it’s great to have the excitement. What the Pistoia Alliance is trying to do is to address the hurdles and barriers, so we can continue to build on what has already been accomplished,’ Lynch stated. Widespread adoption of AI is not going to be achieved by a click of a mouse, Lynch admits. ‘It’s relatively early days yet, and we are going to be making a lot of comparisons between human and AI-driven discoveries and decision making.’

In a pharmaceutical discovery and development setting, target and drug discovery will be areas in which the benefits of AI will be most evident during the next few years. ‘AI will enable scientists to identify and map molecular pathways that underlie diseases, to accelerate the development of personalised medicine. The use of AI to support the clinical workflow has good potential, given the scale of the datasets and the tasks performed within this area,’ Lynch said.

22 SCIENTIFIC COMPUTING WORLD

The first portfolio of purpose-built AI supercomputers are powering incredible advances in AI and healthcare

Addressing the data bottleneck

Data is still a bottleneck, Halabi believes. ‘Not so much the lack of it, because in the healthcare sector at least, our partners, including the Mayo Clinic, or the Massachusetts General Hospital, have tons of data. The issue is that much of this data isn’t labelled to say what we want to do with it. A huge amount of research is now ongoing to allow programs to automatically extract relevant information from image reports, for example, and label the image with pertinent information, to expand the dataset.’

This approach will take the main burden off the physician, he suggests. ‘Sure, you need the physician to label, perhaps 10 per cent of the data to start with, then you run the algorithm, and the algorithm learns from the labels, and pre-labels new images. The physician then corrects these, and feeds the results back into the algorithm, and you re-run it again, and keep correcting, and feeding back in, incrementally. The system learns at each round.’

You also need the expertise to understand the domains, or subject areas, and to extract semantic information from the scientific literature and patents, Lynch adds. ‘It’s critical to have the right taxonomies and indexing, so that you can derive the information that you are looking for in specified areas, whether that be chemistry, biology, or drug-related.’

It’s imperative to have clean, curated data to plug into your algorithm, Elsevier’s Wilson stresses. ‘That requires the correct dictionaries and taxonomies. If you are going through scientific literature and pulling out statements of facts, it’s important to keep to the correct taxonomy, so that you always use the right terms.’

Elsevier is involved in a Pistoia Alliance ontology mapping group that is working to ensure that the community has the right dictionaries so that they can apply natural language processing and derive the right semantic information to use as features in machine learning systems.

The hardware and software required to even tentatively springboard machine learning into mainstream healthcare and drug discovery and development may already be available, but it is critical that we marry enthusiasm going forward with the encouragement to attract and train people with the right skillsets, or we will end up with a wide skills gap, Lynch stresses: ‘That’s going to be a rate-limiting step if we don’t plough much more investment into training people globally.’

‘There is a range of skillsets involved in maximising the utility of AI in life sciences,’ Wilson adds. ‘You need people with computing expertise and the ability to apply the statistical analytics and algorithms that underpin machine learning. You need people who can accumulate clean, accurate and reliable data. And you need people with the domain knowledge, for example in chemistry and biosciences. Then you need to bring them all together.’

So what is the future for machine learning and AI in life sciences and healthcare? It will probably be a hybrid of human expertise and flavours of AI, Wilson suggests. ‘Machine learning has the potential to create knowledge and generate predictions that have not been achievable in other ways. There’s a virtuous cycle as the more that we start to apply machine learning to create predictions in different areas, the more that we will identify new opportunities.’
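For readers who want to see the label, correct and retrain cycle that Halabi describes in the data bottleneck section in concrete terms, the sketch below walks through it on toy data. It is only an illustration under stated assumptions: the ‘physician’ is simulated by a hypothetical `true_label` function, each ‘image’ is reduced to a single number, and the model is a one-parameter threshold rather than anything resembling a real image classifier.

```python
import random


def true_label(x):
    """Stand-in for the physician: the 'correct' label for a toy 1-D feature."""
    return 1 if x >= 0.6 else 0


def train(labelled):
    """Fit a one-parameter threshold model: midpoint between the class means."""
    pos = [x for x, y in labelled if y == 1]
    neg = [x for x, y in labelled if y == 0]
    if not pos or not neg:
        return 0.5  # fallback when one class has not been seen yet
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Pre-label an item using the current threshold."""
    return 1 if x >= threshold else 0


def labelling_loop(pool, seed_fraction=0.1, batch_size=20):
    """Seed with expert labels, then pre-label, correct and retrain in rounds."""
    n_seed = max(1, int(len(pool) * seed_fraction))
    labelled = [(x, true_label(x)) for x in pool[:n_seed]]  # expert labels ~10%
    unlabelled = pool[n_seed:]
    corrections_per_round = []
    while unlabelled:
        threshold = train(labelled)  # re-run the algorithm on all labels so far
        batch, unlabelled = unlabelled[:batch_size], unlabelled[batch_size:]
        corrections = 0
        for x in batch:
            proposed = predict(threshold, x)  # algorithm pre-labels
            final = true_label(x)             # expert reviews and corrects
            corrections += proposed != final
            labelled.append((x, final))       # corrected label is fed back in
        corrections_per_round.append(corrections)
    return train(labelled), labelled, corrections_per_round


rng = random.Random(0)  # fixed seed so the toy run is repeatable
pool = [rng.random() for _ in range(200)]
threshold, labelled, corrections = labelling_loop(pool)
print("corrections needed per round:", corrections)
```

The point of the design is that the expert only reviews proposed labels after the initial seed set, and every corrected batch enlarges the training data for the next round.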

@scwmagazine | www.scientific-computing.com
