Cheminformatics
researchers should follow to achieve the regulatory acceptance of QSAR models. The need to curate the primary data from which the models are derived was not mentioned. The Journal of Chemical Information and Modeling published a special editorial highlighting the requirements for QSAR papers that should be followed by authors considering publishing their results in the journal [20] and recent publications addressing common mistakes and criticising faulty practices in the QSAR modelling field [21-23] have appeared, yet none of these sources have explicitly described and discussed the importance of chemical record curation for developing robust QSAR models.

There is an obvious trend within the community of QSAR modellers to develop and follow the standardised guidelines for developing statistically robust and externally predictive QSAR models [24]. The importance of developing best practices for data preparation prior to initiating the modelling process is obvious. There is therefore a pressing need to amend the five OECD principles by adding a sixth rule that would request careful data preparation prior to model development. There is a need to develop and systematically employ standard chemical record curation protocols that should be helpful in the pre-processing of any chemical dataset and these could be automated using existing software packages (many of which are free for academic investigators). The essential procedures include the removal of inorganic compounds, counterions and mixtures (because for the most part the current chemical descriptors do not account for such molecular records), ring aromatisation, normalisation of specific chemotypes, curation of tautomeric forms and the deletion of duplicates.

Data analytical studies are impossible without trusting the original data sources. It is important, whenever possible, to verify the accuracy of the primary data before developing any model. We believe that this approach could be summarised by a famous proverb 'Trust, but verify' that was frequently used by the late president Ronald Reagan during the cold war era and that traces back to the founder of the Russian KGB Felix Dzerzhinsky who invented it almost 100 years ago (http://en.wikipedia.org/wiki/Trust_but_Verify). Our hope is that other experts will also contribute their expertise and best practices to this effort.

Improving the quality of putative hits and leads

Hits or leads in rare, orphan and neglected diseases (or for that matter many pharmaceutically relevant targets) can arise from phenotypic or mechanistic screening against commercially available screening libraries. Often the screening efforts arise in an academic setting. Because of the disconnect between academic biology and expert medicinal chemistry it is essential to carry out a medicinal chemistry annotation of putative hits or leads before expenditure of significant drug discovery effort. The early stages of the annotation process can be done using known filters and guidelines for acceptable chemistry functionality. A more detailed analysis asking questions about the chemistry of the hit or lead, and what is known biologically and chemically about substructures and similar compounds to the hit or lead currently requires a medicinal chemistry expert and takes on average about 20 minutes per compound. The in-depth data available through CAS SciFinder was used in the annotation of 64 putative tools and probes from the NIH Roadmap MLSCN effort [25].

Progress towards public sector tools for chemistry annotation might allow for a more affordable and accessible process in the future. For example, many companies have instituted filters (usually SMARTS queries) to remove undesirable molecules, false positives and frequent hitters from their HTS screening libraries or to filter vendor compounds. Early examples include REOS from Vertex [26], basic, hard and soft filters from GSK [27] and functional group compound filters from BMS [28]. These are in addition to the many proprietary filters at companies. A particular issue is chemical reactivity towards protein thiol groups. A group from Abbott reported a sensitive assay to detect reactive molecules by NMR (ALARM NMR) [29,30]. A follow up study used 8,800 compounds with data from this assay to create a Bayesian classifier model with extended connectivity fingerprints (ECFP_6) with good classification accuracy to predict reactivity [31]. This also identified 175 substructures that were likely of interest as potentially causing reactivity. Currently there is no freely accessible automated method for filtering compounds or alerting users to reactivity issues. If we were to take this further, how could we encode the knowledge of many medicinal chemists with drug discovery expertise into a piece of software or database that would identify chemical 'trash' or undesirable molecules for biologists? There is certainly some scope here to influence the quality of hits and leads that are published and annotate such molecules in public databases.

Discussion

Freely available databases and tools supporting drug discovery and chemistry in particular are

Continued from page 36

14 Zhu, H et al (2008). Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis. J Chem Inf Model 48 (4), 766-784.
15 Young, D et al (2008). Are the chemical structures in your QSAR correct? QSAR Comb Sci 27, 1337-1345.
16 Oprea, TI et al (2007). WOMBAT and WOMBAT-PK: Bioactivity Databases for Lead and Drug Discovery, Chemical Biology: From Small Molecules to Systems Biology and Drug Design. Schreiber, SL, Kapoor TM and Wess, G (Eds), Wiley-VCH, New York, 2007, pp. 760-786.
17 Oprea, TI et al (2003). On the propagation of errors in the QSAR literature in EuroQSAR 2002 – Designing drugs and crop protectants: Processes, problems and solutions. Ford, M, Livingstone, D, Dearden, J and Van de Waterbeemd H (Eds), New York, Blackwell Publishing, 2003, 314-315.
18 Dearden, JC et al (2009). How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR). SAR QSAR Environ Res 20 (3-4), 241-266.
19 Group, QE (2004). The report from the expert group on (Quantitative) Structure-Activity Relationships [(Q)SARs] on the principles for the validation of (Q)SARs. OECD Series on Testing and Assessment No. 49. ENV/JM/MONO(2004)24. Organization for Economic Cooperation and Development, Paris, France. 206 pp.
20 Jorgensen, WL (2006). QSAR/QSPR and proprietary data. J Chem Inf Model 46, 937.
21 Maggiora, GM (2006). On outliers and activity cliffs – why QSAR often disappoints. J Chem Inf Model 46 (4), 1535.

Continued on page 38
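As an illustration of the record curation steps named in the text (removal of inorganic compounds, counterions and mixtures, and deletion of duplicates), the following is a minimal sketch only. It assumes records arrive as SMILES strings and works purely at the string level; the names `has_carbon` and `curate` are hypothetical, carbon-free fragments are treated as counterions, and duplicates are detected by exact string match. Ring aromatisation, chemotype normalisation, tautomer curation and structure-aware duplicate detection are not attempted here, since they would need a cheminformatics toolkit.

```python
import re

# Atom tokens in a SMILES string: bracket atoms, two-letter halogens,
# then the single-letter organic-subset and aromatic atoms.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")
# Element symbol inside a bracket atom, after an optional isotope number.
BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]+)")


def has_carbon(fragment):
    """Return True if the SMILES fragment contains at least one carbon atom."""
    for token in ATOM_TOKEN.findall(fragment):
        if token.startswith("["):
            match = BRACKET_SYMBOL.match(token)
            symbol = match.group(1) if match else ""
        else:
            symbol = token
        if symbol in ("C", "c"):
            return True
    return False


def curate(smiles_list):
    """Split records into (kept, dropped) using simple string-level rules.

    Assumptions (not from the article): fragments without carbon are
    counterions and are stripped; a record with more than one
    carbon-containing fragment is a mixture; duplicates are exact matches.
    """
    kept, dropped, seen = [], [], set()
    for smiles in smiles_list:
        organic = [f for f in smiles.split(".") if f and has_carbon(f)]
        if not organic:
            dropped.append((smiles, "inorganic"))
        elif len(organic) > 1:
            dropped.append((smiles, "mixture"))
        elif organic[0] in seen:
            dropped.append((smiles, "duplicate"))
        else:
            seen.add(organic[0])
            kept.append(organic[0])
    return kept, dropped


records = ["CCO", "CCO", "[Na+].[Cl-]", "CC(=O)[O-].[Na+]", "c1ccccc1.CCO", "Cl"]
kept, dropped = curate(records)
```

Returning the dropped records together with a reason, rather than discarding them silently, keeps the process in the spirit of 'Trust, but verify': a chemist can review what was removed and why.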
Drug Discovery World Winter 2009/10 37
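The Bayesian reactivity classifier mentioned in the text can be illustrated in outline. The sketch below is a generic Bernoulli naive Bayes model with Laplace smoothing, not the method of the cited study. It assumes each compound has already been reduced to a set of integer fingerprint bit identifiers; the bit values, labels and function names (`train`, `predict`) are hypothetical placeholders, and generating real ECFP_6 fingerprints would require a cheminformatics toolkit.

```python
import math
from collections import Counter


def train(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on sets of 'on' fingerprint bits."""
    classes = sorted(set(labels))
    vocab = set().union(*fingerprints)          # every bit seen in training
    class_sizes = Counter(labels)
    bit_counts = {c: Counter() for c in classes}
    for fp, label in zip(fingerprints, labels):
        bit_counts[label].update(fp)
    model = {"classes": classes, "vocab": vocab, "prior": {}, "p_on": {}}
    for c in classes:
        model["prior"][c] = class_sizes[c] / len(labels)
        # Smoothed probability that each bit is set within class c.
        model["p_on"][c] = {
            bit: (bit_counts[c][bit] + alpha) / (class_sizes[c] + 2 * alpha)
            for bit in vocab
        }
    return model


def predict(model, fp):
    """Return the class with the highest log-posterior for one fingerprint."""
    scores = {}
    for c in model["classes"]:
        score = math.log(model["prior"][c])
        for bit in model["vocab"]:
            p = model["p_on"][c][bit]
            score += math.log(p if bit in fp else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data: bit 1 stands in for a reactive substructure, bit 2 for a benign one.
train_fps = [{1, 3}, {1, 4}, {2, 3}, {2, 4}]
train_labels = ["reactive", "reactive", "unreactive", "unreactive"]
model = train(train_fps, train_labels)
prediction = predict(model, {1})
```

A model of this kind also exposes, through its per-bit probabilities, which features most separate the classes. That is one plausible route to a list of suspect substructures, although the article does not say how the 175 substructures were derived.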