Multivariate assessment of virtual screening experiments

Publication Type: Journal Article


Journal of Chemometrics, Volume 24, Number 11-12, p.757–767 (2010)


Discovering molecules with a desired biological function is one of the great challenges in drug research. To discover new lead molecules, virtual screens (VS) are often conducted, in which databases of molecules are screened for potential binders to a specific protein using molecular docking. The choice of docking software, and of parameter settings within that software, can significantly influence the outcome of a VS. In this study, we applied chemometric methods such as design of experiments, principal component analysis and partial least-squares projections to latent structures (PLS) to simulated VS experiments, in order to find and compare suitable conditions for performing VS against six protein targets selected from the DUD databases. The docking parameters in FRED, and the scoring functions in both the FRED and GOLD docking software, were varied according to a statistical experimental design, and a PLS model was calculated to correlate the experimental setup with the VS outcome. The study revealed that the choice of scoring function has the greatest influence on VS outcome, and that the influence of the other parameters varies with the protein target. We also found that substantial bias can be introduced when the databases used in the screening lack variation in molecular properties. Our results indicate that docking experiments could be tailored to the protein target to obtain satisfactory VS results.
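
The idea of varying docking settings according to an experimental design and relating them to the screening outcome can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the factor names, their levels and the `toy_enrichment` response are hypothetical and are not taken from the study, and the analysis is a plain main-effects calculation on a two-level full factorial design, a much simpler stand-in for the PLS model described in the abstract.

```python
from itertools import product
from statistics import mean

# Hypothetical two-level factors; names and levels are illustrative only.
factors = {
    "scoring_function": ["score_A", "score_B"],
    "num_poses": [1, 10],
    "refinement": ["off", "on"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


def main_effects(runs, responses, factors):
    """Mean response at the second level minus mean response at the first."""
    effects = {}
    for name, levels in factors.items():
        low = mean(r for run, r in zip(runs, responses) if run[name] == levels[0])
        high = mean(r for run, r in zip(runs, responses) if run[name] == levels[1])
        effects[name] = high - low
    return effects


def toy_enrichment(run):
    """Made-up response standing in for a VS outcome metric (not real data)."""
    value = 0.5
    if run["scoring_function"] == "score_B":
        value += 0.25
    if run["num_poses"] == 10:
        value += 0.0625
    return value


runs = full_factorial(factors)                      # 2 x 2 x 2 = 8 runs
responses = [toy_enrichment(run) for run in runs]   # one response per run
effects = main_effects(runs, responses, factors)

# Rank factors by the size of their effect on the toy response.
for name, effect in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {effect:+.4f}")
```

In a real study the toy response would be replaced by a measured screening metric for each run, and a multivariate method such as PLS would be preferable when several correlated responses or many factors are involved.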