Informatics


Figure 1


be demonstrated and documented is that the error rate is within acceptable limits.


Analyses and results – sharing and reproducibility – repositories, containers and workflows

Workflows can be deployed to automate analyses so that they run faster, more reproducibly and at scale. For example, it had been calculated that running a virtual drug docking simulation on a laptop computer would theoretically take 8.5 years (not useful), but that the same simulation could be run in the cloud with a workflow using 40,000 CPUs in just four hours. Workflow manager software comes in a variety of packages. A comprehensive list is available on GitHub¹¹. The current emphasis on the deployment of the FAIR data principles (Findable, Accessible, Interoperable, Reusable) in bioinformatics was noted, as was the Pistoia Alliance project¹² and its multi-author paper on the ‘Implementation of FAIR Data Principles for Pharma and Life Sciences’¹³. Using workflow managers helps to address the reproducibility of these analyses, and sharing the code through repositories such as GitHub or ContainerHub allows other users to run exactly the code that was used to generate the initial results.
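The docking figures quoted in this section (8.5 years on a laptop against four hours on 40,000 CPUs) can be sanity-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function name is hypothetical, a year is assumed to be 365 days, and nothing is assumed about how the original estimate was derived.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: leap years ignored


def docking_scale_up(laptop_years, cloud_cpus, cloud_hours):
    """Compare a serial laptop run with a parallel cloud run.

    Returns the laptop wall-clock time in hours, the wall-clock
    speed-up achieved in the cloud, and the CPU-hours consumed there.
    """
    laptop_hours = laptop_years * HOURS_PER_YEAR
    speedup = laptop_hours / cloud_hours
    cpu_hours = cloud_cpus * cloud_hours
    return laptop_hours, speedup, cpu_hours


laptop_hours, speedup, cpu_hours = docking_scale_up(8.5, 40_000, 4)
print(f"Laptop wall-clock time: {laptop_hours:,.0f} hours")  # 74,460 hours
print(f"Wall-clock speed-up:    {speedup:,.0f}x")            # 18,615x
print(f"Cloud CPU-hours used:   {cpu_hours:,.0f}")           # 160,000
```

The speed-up is in elapsed time, not total compute: the cloud run consumes roughly twice as many CPU-hours as the laptop run would take in wall-clock hours, which is consistent with a multi-core laptop or with some overhead from parallelisation.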


Drug Discovery World Summer 2019


Provenance was a crucial issue to be addressed: who ran it, when, where did they run it, which workflow manager was used, etc. The Common Workflow Language (CWL) makes an important contribution to this challenge, and the paper published in 2018 entitled ‘Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv’¹⁴ focused specifically on provenance in this environment.
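To make the provenance questions concrete (who, when, where, which workflow manager), the Python sketch below assembles a minimal run record using only the standard library. It is an illustration under stated assumptions, not the CWLProv format: the function names, the field names and the example workflow manager name are all hypothetical.

```python
import getpass
import hashlib
import json
import platform
from datetime import datetime, timezone


def current_user():
    """Return the login name, or 'unknown' if it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        return "unknown"


def provenance_record(workflow_text, workflow_manager, manager_version):
    """Build a small provenance record for one workflow run.

    All field names are hypothetical and chosen for illustration;
    they do not follow CWLProv or any other published schema.
    """
    return {
        "who": current_user(),                          # who ran it
        "when": datetime.now(timezone.utc).isoformat(),  # when (UTC)
        "where": platform.node(),                       # host it ran on
        "workflow_manager": workflow_manager,           # which manager
        "manager_version": manager_version,
        # A hash of the workflow definition identifies exactly what was run.
        "workflow_sha256": hashlib.sha256(
            workflow_text.encode("utf-8")
        ).hexdigest(),
    }


record = provenance_record("steps: []", "example-runner", "0.0")
print(json.dumps(record, indent=2))
```

Storing such a record alongside the results lets a later reader check that a re-run used the same workflow definition, by comparing the hashes.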


Scaling bioinformatics – high performance computing and the cloud

The ability to scale bioinformatics solutions is important. A good example is provided by Genomics England. In the 100,000 Genomes Project, DNA is sequenced by Illumina. As analyses scale, the underlying platforms need to change: for example, individual applications might require SaaS (Software as a Service) such as GATK* and Dragen*, through genomics platforms (PaaS [Platform as a Service]) such as BaseSpace Sequence Hub*, SevenBridges* and DNAnexus*, and up to infrastructure platforms (IaaS [Infrastructure as a Service]) such as AWS*, Google Cloud* and Microsoft Azure*. Key business considerations for investing in bioinformatics include:

