already surfaced as the primary factor underlying the success of Benevolent AI, a company that has identified 24 drug candidates in only four years, advancing two promising prospects for treatment of the perennially intractable Alzheimer’s Disease8. Detailed technical reviews of DL and AEB are available elsewhere6,7, but for the purposes of this article it is important to focus on the conditions under which such methods may best excel. Specifically, algorithmic capacity for discriminative logic that mimics human intuition requires access to extensive data and information in the following classes:

Chemical: Comprehensive annotated databases of bioactive chemical entities are essential for using DL and AEB to discover new drug scaffolds and optimise known ones. Such resources are available and growing, though they are not without flaws. For example, mineable structural representations such as SMILES strings and fingerprints contain chemical ambiguities that can cloud predictions. Bioactivity data in the available databases are also sparser than they need be, largely because of proprietary restrictions on many assays. The latter is unfortunate: while corporate entities remain justifiably hesitant to unmask promising leads, there may be immense value in reporting data on mid- to low-quality hits that have long since fallen out of consideration. Such data can be very productively applied to enhancing models for toxicity and off-target effects, and to intuiting new target SAR. Furthermore, valuable insight can be extracted from just knowing the inactives from screens, with implications for lead search efficiency and accurate prediction of drug safety.
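As a rough illustration of the kind of ambiguity mentioned above, the sketch below compares two valid SMILES strings for the same molecule (ethanol) using a toy text-level "fingerprint" built from character pairs. The functions `bigram_set` and `tanimoto` are hypothetical helpers written for this example; they are not a real cheminformatics fingerprint, and in practice a toolkit's canonical SMILES would normally be generated before any comparison.

```python
from __future__ import annotations

# Toy illustration only: a character-bigram "fingerprint" computed on the raw
# SMILES text. This is NOT a real cheminformatics fingerprint.


def bigram_set(smiles: str) -> frozenset[str]:
    """Return the set of adjacent character pairs in a SMILES string."""
    return frozenset(smiles[i:i + 2] for i in range(len(smiles) - 1))


def tanimoto(a: frozenset[str], b: frozenset[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two valid SMILES strings that both describe ethanol.
ethanol_a = "CCO"
ethanol_b = "OCC"

similarity = tanimoto(bigram_set(ethanol_a), bigram_set(ethanol_b))
print(f"Same molecule, text-level similarity: {similarity:.2f}")
```

Although both strings encode one molecule, the text-level similarity is well below 1, which is the sort of artefact a model trained on uncanonicalised strings could mistake for a chemical difference.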

Biological: Molecular interrogation data (OMICS), microscopy and many forms of sub-organismal physiological data provide a wealth of insight suitable for algorithmic use in discovering biomedical implications far broader than anticipated in the original study. Again, the power of computational predictions is magnified with greater data sharing, although some bioanalyses produce prohibitively large data sets. In the future, this hurdle may be overcome through sandbox experiments that help to strategically filter large data entries down to key features of demonstrable value to physiology and pharmacology.

Clinical: Anonymised data, metadata and unstructured annotations are immensely valuable to medical insight when combined with chemical and biological data. A protocol for ensuring that all studies can be stripped of identifying information and made centrally available would be a tremendous boon to medicine, with potential benefits extending far beyond the original motivation in a given clinical study.
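A minimal sketch of what one step of such a protocol might look like is given below: direct identifiers are dropped and the patient ID is replaced with a keyed pseudonym. The field names, the `deidentify` helper and the key are all assumptions made for illustration; this is not a complete de-identification procedure, which would also need to address indirect identifiers and the applicable regulations.

```python
import hashlib
import hmac

# Hypothetical field names; a real study would define its own schema.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace the patient ID with a keyed pseudonym."""
    # Copy every field except the direct identifiers; the input is not modified.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    patient_id = str(cleaned.pop("patient_id"))
    # A keyed hash (HMAC) gives a stable pseudonym that cannot be recomputed
    # without the study's secret key.
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    cleaned["pseudonym"] = digest.hexdigest()[:16]
    return cleaned


# Invented example record.
record = {
    "patient_id": "P-0001",
    "name": "Jane Doe",
    "address": "1 Example Street",
    "phone": "555-0100",
    "age_band": "60-69",
    "diagnosis": "Alzheimer's disease",
}
shared = deidentify(record, secret_key=b"study-specific-secret")
print(shared)
```

Because the pseudonym is derived deterministically from the ID and the key, records from the same patient can still be linked within a study without exposing who that patient is.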

Literature: Collectively, scientific publications, e-books, patents, funding proposals, clinical reports, blogs and social media comprise a wild blend of fact, informed speculation, rigorously attained errata, unverified information and unverifiable conjecture. Nonetheless, modern data modelling has demonstrated that, when the material is mined properly, careful delineation of known truth versus conjecture is not crucial9. The mere fact that a given chemical, for example, is being debated as a potential cancer therapeutic (proven or not) may inform discussions of whether the broader chemotype family may have prospective cytotoxicity. That said, the predictive capacity of text mining does grow most quickly through expanded access to the most reliable information sources, such as high impact journals. Thus, even if top publishers feel a need to maintain pay-per-human-view access, there are synergistic benefits to opening their full (non-quarantined) electronic holdings to text mining. Mining access improves scientific AI inferences for all, and such access tends to flag relevant papers for actual human (ie paid) consumption.
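The idea that a mere mention can carry signal may be sketched as a simple co-mention count: how often a compound name appears in the same sentence as a topic term, regardless of whether the claim is proven. The vocabularies, sentences and the `co_mentions` helper below are invented for illustration; real literature mining would rely on far richer entity recognition and statistics.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical vocabularies and sentences, for illustration only.
COMPOUNDS = {"compound-a", "compound-b"}
TOPICS = {"cancer", "cytotoxicity"}

SENTENCES = [
    "Compound-A is being debated as a potential cancer therapeutic.",
    "Some blogs speculate that compound-A shows cytotoxicity.",
    "Compound-B was mentioned in an unrelated patent.",
]


def co_mentions(sentences, compounds, topics):
    """Count sentences in which a compound and a topic term appear together."""
    counts = Counter()
    for sentence in sentences:
        # Lower-case and split into simple word tokens (hyphens kept).
        tokens = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
        # Every (compound, topic) pair found in this sentence counts once.
        for compound, topic in product(compounds & tokens, topics & tokens):
            counts[(compound, topic)] += 1
    return counts


counts = co_mentions(SENTENCES, COMPOUNDS, TOPICS)
for (compound, topic), n in sorted(counts.items()):
    print(f"{compound} + {topic}: {n}")
```

Note that the count treats a debated claim and an established one identically, which mirrors the point above that strict separation of truth from conjecture is not always required for a useful signal.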

AI in drug discovery

Recent expectations are that AI approaches can play an important role in the future of drug discovery, particularly for increasing productivity and R&D innovation. For instance, AI is expected to alleviate the numbers barrier in drug development. According to Mullard, 2017, there are an estimated 10^60 possible compounds that have drug-like characteristics, which gives rise to the problem of categorising the properties of each chemical10. The enormity of the data set makes drug discovery even more expensive and protracted; thus, researchers are hoping to reverse this trend by combining the lessons learned from previous drug discovery projects with the vast amounts of experimental data that have already been produced by the scientific community to drive AI-powered drug design. It is expected that AI will enable predictions towards molecular dynamics, resulting in: (i) focused sets of compounds for screening; or (ii) new uses for previously tested compounds towards treating diseases; or (iii) the creation of therapeutics targeted towards patients harbouring specific molecular markers, such as harmful mutations which give phenotype selective advantages; or

Drug Discovery World Summer 2018
