A challenging task in early-stage lead discovery is the triaging of chemical series from high-throughput screening (HTS) assays. Here, triaging is a multi-objective problem that seeks a balance among a number of often competing goals such as potency, tractability, selectivity, and novelty. Depending on the type of assay used (e.g., biochemical, cell-based), the number of identified chemical series can easily be in the hundreds. Sifting through that much data to identify only a handful of promising series for follow-up can be quite overwhelming.
One effective strategy for triaging is to use well-curated databases to provide the necessary context around each chemical series. In particular, given a chemical series, we’d often like to address the following two basic questions:
- What, if anything, is known about this chemical series with respect to the intended and/or related target(s)?
- What compound classes are known to modulate the intended and/or related target(s) and how similar are they to the underlying chemical series?
Recently, together with our colleague Rajarshi Guha, we developed a profiling tool to help with the triaging task. The tool is built around our molecular framework and uses a subset of the ChEMBL database (release 5), namely records of the IC50 activity type, as its backend. To facilitate activity comparisons across assays, activity values are normalized using a robust variant of the Z-score (i.e., the median and MAD are used in place of the mean and standard deviation, respectively). We hope the tool also serves as an effective way to browse ChEMBL. Please feel free to let us know how we can make the tool more useful. If you’d like to have the tool available in-house with your own version of ChEMBL, please drop us a note. We’ll be happy to help you set it up.
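For readers curious about the normalization step, here is a minimal sketch of a robust Z-score in Python. It is not the tool's actual code: the function name `robust_z_scores`, the `scale` parameter, and the sample values are all hypothetical. We assume the values passed in belong to a single assay and are already on whatever scale is used for comparison; whether IC50 values are log-transformed first is not covered here.

```python
from statistics import median


def robust_z_scores(values, scale=1.0):
    """Return robust Z-scores: (x - median) / (scale * MAD).

    MAD is the median absolute deviation from the median.
    `scale` is a hypothetical knob: 1.0 uses the raw MAD, while
    1.4826 is the usual consistency factor that makes the MAD
    comparable to a standard deviation for normal data.
    """
    values = list(values)
    if not values:
        raise ValueError("no values to normalize")
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median, so the score is undefined.
        raise ValueError("MAD is zero; robust Z-score is undefined")
    return [(v - med) / (scale * mad) for v in values]


# Made-up activity values from one assay (illustration only).
assay_values = [5.0, 6.0, 7.0, 8.0, 9.0]
print(robust_z_scores(assay_values))  # [-2.0, -1.0, 0.0, 1.0, 2.0]
```

Because the median and MAD are much less sensitive to a few extreme values than the mean and standard deviation, a handful of outlying measurements in an assay will not distort the scores of the remaining compounds.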