AViDD Filters: Compound Filtering SOP for Hit to Lead

High throughput screening (HTS) is a powerful technique for identifying potential drug candidates from large libraries of compounds. However, many compounds that show activity in HTS are false positives or may not be suitable for further development. Therefore, it is important to apply various filters to eliminate compounds that have undesirable properties, such as poor solubility, promiscuous activity, or interference with the HTS assay. In this article, we review some of the computational and experimental filters we have used to triage HTS results and identify small molecules with activity on viral targets, and we discuss their pros and cons. Finally, we present the accepted limit for each filter for hit confirmation and hit progression in the AViDD drug discovery effort.

  1. Physicochemical Filters

Physicochemical filters are based on the molecular properties of the compounds, such as molecular weight, lipophilicity, polarity, and hydrogen bonding. These properties may affect the drug-likeness and pharmacokinetics of the compounds. One of the most widely used physicochemical filters is Lipinski’s rule of five (Lipinski et al., 2001), which states that most orally active drugs have no more than one violation of the following criteria: molecular weight ≤ 500 Da, logP ≤ 5, number of hydrogen bond donors ≤ 5, and number of hydrogen bond acceptors ≤ 10. Other physicochemical filters include the Veber rule (Veber et al., 2002), which limits the number of rotatable bonds to ≤ 10 and the polar surface area to ≤ 140 Å², and the Ghose filter (Ghose et al., 1999), which defines ranges for molecular weight (160-480 Da), logP (-0.4 to 5.6), number of atoms (20-70), and molar refractivity (40-130).
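As a minimal sketch of how these rules can be applied, the Python below checks one compound against the Lipinski, Veber, and Ghose criteria quoted above. It assumes the descriptors have already been calculated with a cheminformatics toolkit. The `Descriptors` container, the function names, and the two example compounds are hypothetical and for illustration only. Lipinski's rule is applied here with the usual tolerance of at most one violation.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float          # Da
    logp: float
    h_donors: int
    h_acceptors: int
    rotatable_bonds: int
    tpsa: float                # polar surface area, in square angstroms
    n_atoms: int
    molar_refractivity: float


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_lipinski(d, max_violations=1):
    """Tolerate at most `max_violations` rule-of-five violations."""
    return lipinski_violations(d) <= max_violations


def passes_veber(d):
    """Veber rule: rotatable bonds <= 10 and polar surface area <= 140."""
    return d.rotatable_bonds <= 10 and d.tpsa <= 140


def passes_ghose(d):
    """Ghose filter: all four properties must fall inside their ranges."""
    return (160 <= d.mol_weight <= 480
            and -0.4 <= d.logp <= 5.6
            and 20 <= d.n_atoms <= 70
            and 40 <= d.molar_refractivity <= 130)


# Illustrative values only, not measurements of real compounds.
drug_like = Descriptors(350.0, 2.5, 2, 5, 4, 80.0, 45, 95.0)
oversized = Descriptors(650.0, 6.2, 6, 12, 14, 180.0, 90, 170.0)

for name, d in [("drug_like", drug_like), ("oversized", oversized)]:
    print(name, lipinski_violations(d), passes_lipinski(d),
          passes_veber(d), passes_ghose(d))
```

Counting violations rather than returning a single pass/fail makes it easy to tighten or relax the tolerance for a given target class.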

The advantages of physicochemical filters are that they are based on empirical data from known drugs and are easy to calculate computationally. They can help to reduce the size of the compound library and eliminate hit compounds that may be difficult to optimize into drug leads. However, a key disadvantage of physicochemical filters is that they are not specific for the HTS target and may exclude some novel chemotypes that can be optimized for activity on viral enzymes.

2. Pan Assay Interference Filters

Pan Assay Interference (PAINS) filters are based on the structural features of the compounds that are known to cause promiscuous off-target activity across multiple HTS assays (Baell et al., 2010). These features include reactive groups (such as aldehydes, thiols, or Michael acceptors), protein aggregators (such as aromatic amines or sulfonamides) (Irwin et al., 2015), fluorescent compounds (such as coumarins or rhodamines), and metal chelators (such as catechols or hydroxamic acids). These types of compounds can interfere with the assay by covalently modifying the target or other proteins, forming aggregates that sequester the target or other molecules, quenching or enhancing the signal of the assay readout, or binding to metal ions that are essential for the enzymatic activity of the target.

The advantage of PAINS filters is that they can be applied computationally to rapidly eliminate potential false positives arising from non-specific interactions or assay artifacts. However, a major disadvantage of PAINS filters is that they may exclude compounds that have genuine activity against viral targets through mechanisms such as covalent inhibition or metal chelation. They also fail to flag interfering chemotypes that have not been widely reported in the literature.

3. Data Interpretation Filters

Data interpretation filters are based on the statistical analysis of the primary HTS data to eliminate outliers and false positives (Moffat et al., 2017). These filters include methods such as Z’-score normalization, selectivity index, dose-response curve fitting, Hill slope, and cluster analysis. These methods can help to correct for systematic errors, such as plate effects or edge effects, and to flag poorly soluble compounds or protein aggregators, which helps to distinguish true hits from false positives or negatives.
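To make the plate-level statistics concrete, the sketch below computes a plate quality statistic and a normalized response from control wells. It assumes that the Z’-score named above corresponds to the widely used Z'-factor, Z' = 1 - 3(SD_pos + SD_neg) / |mean_pos - mean_neg|, and that the negative control is the uninhibited signal while the positive control is full inhibition. The function names and well values are hypothetical.

```python
from statistics import mean, stdev


def z_prime(pos_controls, neg_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(pos_controls) - mean(neg_controls))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    spread = stdev(pos_controls) + stdev(neg_controls)
    return 1 - 3 * spread / separation


def percent_inhibition(signal, pos_controls, neg_controls):
    """Scale a raw well signal so the negative-control mean is 0 %
    and the positive-control mean is 100 %."""
    neg = mean(neg_controls)
    pos = mean(pos_controls)
    return 100 * (neg - signal) / (neg - pos)


# Illustrative raw signals from one plate (arbitrary units).
pos = [10, 12, 11, 9, 13]       # fully inhibited wells
neg = [100, 104, 98, 102, 96]   # uninhibited wells

print(f"Z' = {z_prime(pos, neg):.2f}")
print(f"inhibition = {percent_inhibition(55.5, pos, neg):.1f} %")
```

Normalizing each well against the controls on its own plate is one simple way to reduce plate-to-plate effects before hits are compared across the screen.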

The advantages of data interpretation filters are that they can be used to triage the primary HTS screening data to remove many of the false positives and to prioritize the most promising hits for further validation. However, the disadvantages of data interpretation filters are that they require careful selection and optimization of parameters and algorithms, and they may depend on the quality and quantity of the HTS data. They may also introduce biases or artifacts if not applied properly.

Figure 1: Hit Confirmation Process in AViDD

4. Hit filtering criteria and hit confirmation

In the READDI AViDD Center, all primary hits from virtual and high-throughput screening are filtered through the set of physicochemical, experimental, and data interpretation filters depicted in Figure 1. The accepted limit for each parameter is presented in Table 1 for hit confirmation and progression.

Table 1: READDI AViDD Center Hit Selection Parameter Limits

| Tier | Filter | Parameter | Limit |
|---|---|---|---|
| Computational | STOPLIGHT | Composite score | ≤1 |
| | | Molecular weight | ≤500 Da |
| | | clogP | ≤5 |
| | | tPSA | <140 Å² |
| | | Number of rotatable bonds | <10 |
| | | FS3 | >0.3 |
| Assay Liabilities | PAINS | Reactive groups | Thiols, furans, warheads, hydrazones, polyphenols, curcumin |
| | | Aggregators | Absent |
| | | Fluorescent compounds | Absent |
| | | Metal chelators | Absent |
| Data Analysis | Hit Selection | Z’-score | >0.5 |
| | | Curve fitting | Satisfactory |
| | | Hill slope | 0.5 to 2 |
| | | Selectivity index (tox/potency) | ≥10 |
| | | Filtered hit validation | Reproducibility |
| | | Synthetic feasibility | Priority to easy access to library generation |
| | | Promiscuity (active in unrelated HTS) | Avoid |
| Hit to Lead | Experimental | IC50 | Reproducible data |
| | | Solubility (kinetic) | 10X of IC50 |
| | | Aggregation by DLS | 0% at 50 μM |
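The numeric limits in Table 1 can be encoded as a simple checklist, sketched below in Python. This is an illustration under stated assumptions rather than the Center's actual pipeline: the record field names are hypothetical, the selectivity index is assumed to be CC50 divided by IC50, "10X of IC50" is read as kinetic solubility of at least ten times the IC50, and only limits with a numeric threshold are included.

```python
# Limits transcribed from Table 1; field names are hypothetical.
CHECKS = [
    ("STOPLIGHT composite score <= 1", lambda h: h["stoplight_score"] <= 1),
    ("Molecular weight <= 500 Da", lambda h: h["mol_weight"] <= 500),
    ("clogP <= 5", lambda h: h["clogp"] <= 5),
    ("tPSA < 140", lambda h: h["tpsa"] < 140),
    ("Rotatable bonds < 10", lambda h: h["rotatable_bonds"] < 10),
    ("FS3 > 0.3", lambda h: h["fs3"] > 0.3),
    ("Hill slope 0.5 to 2", lambda h: 0.5 <= h["hill_slope"] <= 2),
    # Assumption: selectivity index = CC50 / IC50 (toxicity over potency).
    ("Selectivity index >= 10", lambda h: h["cc50_um"] / h["ic50_um"] >= 10),
    # Assumption: "10X of IC50" means solubility of at least 10 x IC50.
    ("Kinetic solubility >= 10x IC50",
     lambda h: h["solubility_um"] >= 10 * h["ic50_um"]),
    ("No aggregation by DLS at 50 uM",
     lambda h: h["dls_aggregation_pct"] == 0),
]


def failed_checks(hit):
    """Return the labels of every Table 1 limit that this hit fails."""
    return [label for label, passes in CHECKS if not passes(hit)]


# Illustrative records only, not real screening data.
good_hit = {
    "stoplight_score": 1, "mol_weight": 420.0, "clogp": 3.1, "tpsa": 95.0,
    "rotatable_bonds": 6, "fs3": 0.35, "hill_slope": 1.1,
    "cc50_um": 80.0, "ic50_um": 2.0, "solubility_um": 50.0,
    "dls_aggregation_pct": 0,
}
steep_insoluble_hit = dict(good_hit, hill_slope=3.2, solubility_um=10.0)

print(failed_checks(good_hit))
print(failed_checks(steep_insoluble_hit))
```

Returning the list of failed limits, instead of a single pass/fail flag, keeps the reason for deprioritizing a hit visible when results are reviewed.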


Compound filtering from HTS is a crucial step in hit identification for small molecule drug discovery. For the viral enzyme targets in the READDI AViDD Center, it has been used to eliminate primary hits that have undesirable physicochemical properties or potential for assay interference, and to select compounds that have the best potential for on-target activity. However, no single filter is perfect or universal. Therefore, it is important to apply multiple filters in a rational and balanced way, taking into account the specific characteristics of the target protein and the assay format. By doing so, one can increase the chances of finding progressible hits for optimization into effective antiviral drugs.

To know more:

Lipinski Rules: Advanced Drug Delivery Reviews, 46 (2001) 3–26

STOPLIGHT: Ref. Holli et al., 2023

Aggregators: J Med Chem. 2015 Sep 10; 58(17): 7076–7087.

Solubility models: https://practicalcheminformatics.blogspot.com/2023/06/getting-real-with-molecular-property.html?m=1
