Recently, Bruns & Watson published a paper describing a set of rules to filter out (potentially) reactive and promiscuous small molecules. There are 275 rules addressing a variety of possible reasons to reject a molecule: general reactivity, assay interference (such as fluorescence), instability and so on. Their filtering protocol also includes a set of explicit compounds specifically marked as interfering (even though they may have passed the structural rules).
The authors have also made their code available on GitHub; it's C++ code and compiles easily on Unix systems. While it's quite straightforward to use, we've put up a simple web page where you can paste a collection of SMILES (which should include molecule titles) and retrieve a summary report (in HTML, plain text or Excel formats). You can also interact with the page via POST requests to http://tripod.nih.gov/py/filter. An example of doing this in Python 2 (it relies on urllib2) is shown below:
import urllib, urllib2
format = 'text' # can be 'excel' or 'html'
url = 'http://tripod.nih.gov/py/filter'
values = {'format':format, 'queryids':'''[NH2+](C(C)c1ccccc1)CCC(c1ccccc1)c1ccccc1 85273758
O(C(=O)C(C1CCCCC1)c1ccccc1)CC[NH+](CC)CC 85273739
O(C(c1ccccc1)c1ccccc1)CC[NH+](C)C 85273734
n1c(N)c(N=Nc2ccccc2)ccc1N 85273730
S(=O)(=O)(N1CC[NH+](CC1)CC(=O)Nc1c(n[nH]c1C)C)c1cc(OC)c(OC)cc1 85273684
S1(=O)(=O)CC([NH+](CC(=O)Nc2cc(cc(c2)C(OC)=O)C(OC)=O)C)CC1 85273683
FC(F)(F)c1ccc(NC(=O)C([NH+](CC(=O)N2CC[NH+](CC2)Cc2ccccc2)C)C)cc1 85273682'''}
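# URL-encode the form fields; supplying data makes urllib2 send a POST
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
# Read the whole report (plain text here, since format is 'text')
page = response.read()
print page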
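For readers on Python 3, where urllib2 no longer exists, the same request can be sketched roughly as follows. This is a minimal sketch rather than anything provided by the service: the helper names build_filter_request and run_filter are hypothetical, the 30-second timeout is an arbitrary choice, and it assumes the endpoint still accepts the same 'format' and 'queryids' form fields used in the example above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint and form field names are taken from the example in this post.
FILTER_URL = 'http://tripod.nih.gov/py/filter'
ALLOWED_FORMATS = ('text', 'excel', 'html')


def build_filter_request(records, fmt='text'):
    """Build (but do not send) the POST request.

    records is a list of (smiles, title) pairs; each pair becomes one
    'SMILES title' line in the 'queryids' field.
    """
    if fmt not in ALLOWED_FORMATS:
        raise ValueError('format must be one of %r' % (ALLOWED_FORMATS,))
    queryids = '\n'.join('%s %s' % (smi, title) for smi, title in records)
    # urlencode percent-escapes the fields; Request needs the body as bytes.
    data = urlencode({'format': fmt, 'queryids': queryids}).encode('ascii')
    # Passing data makes urllib issue a POST instead of a GET.
    return Request(FILTER_URL, data=data)


def run_filter(records, fmt='text', timeout=30):
    """Send the request and return the raw report as bytes."""
    with urlopen(build_filter_request(records, fmt), timeout=timeout) as resp:
        return resp.read()


# Example call (requires network access and a reachable service):
# records = [('n1c(N)c(N=Nc2ccccc2)ccc1N', '85273730')]
# print(run_filter(records).decode('utf-8', errors='replace'))
```

Splitting the request construction from the network call makes it possible to inspect the encoded form body without contacting the server. The report is returned as bytes because the Excel format is presumably binary; decode it only for the text and HTML formats.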