CSNAP accepts two types of inputs representing the structure of small molecule query ligands:
Molecular fingerprints represent each query ligand as a binary sequence of structural features. This facilitates chemical structure comparisons against large compound databases. The fingerprints are used in both the "search" and "cluster" steps. Different fingerprint types encode ligands as binary sequences in different ways:
In CSNAP, the chemical similarity between ligand molecular fingerprints is quantified by two scoring functions: the Tanimoto coefficient (Tc), which ranges from 0 to 1, and a Z-score (>2.5) based on the Tc score distribution of the top 100 ranked compounds. A Tc cutoff of 0.85 and a Z-score cutoff of 2.5 generally indicate that two ligands are highly similar and may share common bioactivities.
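The two scoring functions above can be sketched in a few lines of Python. This is a minimal illustration, not CSNAP's own implementation: fingerprints are assumed to be given as sets of "on" bit positions, the function names `tanimoto` and `z_scores` are hypothetical, and the choice of the population standard deviation for the Z-score is an assumption.

```python
from statistics import mean, pstdev


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient (0-1) of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Two empty fingerprints share no features; treated as dissimilar here (assumption).
        return 0.0
    return len(a & b) / union


def z_scores(tc_values):
    """Z-score of each Tc relative to the distribution of the supplied Tc values.

    In the setting described above, tc_values would hold the Tc scores of the
    top 100 ranked compounds. Population standard deviation is an assumption.
    """
    mu = mean(tc_values)
    sigma = pstdev(tc_values)
    if sigma == 0:
        # All scores identical: no compound stands out from the distribution.
        return [0.0 for _ in tc_values]
    return [(tc - mu) / sigma for tc in tc_values]


# Two fingerprints sharing 2 of 4 distinct on-bits.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

A hit would then be judged against the cutoffs quoted above (Tc of 0.85, Z-score of 2.5).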
A confidence score is assigned to each target annotation for ChEMBL compounds as part of the manual curation process from the literature. The confidence score reflects both the type of target assigned to a particular assay and the confidence that the assigned target is the correct target for that assay. The confidence scores range from 0, for as yet uncurated data entries, to 9, where a single protein target has been assigned with a high degree of confidence. Assays assigned a non-molecular target type, for example a cell line or an organism, receive a confidence score of 1, while assays with assigned protein targets receive a confidence score of at least 4.
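As a small illustration of how these scores might be used, the sketch below keeps only annotations whose confidence score marks an assigned protein target (at least 4). The record layout, the `confidence_score` key, and the function name are hypothetical and do not reflect ChEMBL's or CSNAP's actual data format.

```python
def protein_target_annotations(annotations, min_confidence=4):
    """Keep annotations whose confidence score indicates an assigned protein target.

    annotations: list of dicts with a hypothetical 'confidence_score' key (0-9).
    min_confidence: 4 is the lowest score described above for protein targets.
    """
    return [a for a in annotations if a["confidence_score"] >= min_confidence]


# Hypothetical records: uncurated (0), non-molecular target (1), protein targets (4, 9).
records = [
    {"target": "entry A", "confidence_score": 0},
    {"target": "entry B", "confidence_score": 1},
    {"target": "entry C", "confidence_score": 4},
    {"target": "entry D", "confidence_score": 9},
]
kept = protein_target_annotations(records)
print([r["target"] for r in kept])  # ['entry C', 'entry D']
```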
Three types of target annotations are retrieved from the ChEMBL database in CSNAP analyses:
The CSNAP results page consists of five panels:
Both query and reference compounds are organized into network similarity graphs in which nodes represent compounds and edges represent chemical similarity between ligands. To differentiate query from reference compounds, query compounds are shown as red nodes and reference compounds as gray nodes. Nodes and edges can be selected to reveal additional information about the selected compounds.
Right-click to open the context menu and toggle node and edge labels.
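The network layout described above can be sketched as follows. This is a simplified illustration under stated assumptions: compounds are given as name-to-fingerprint dictionaries (fingerprints as sets of on-bit indices), an edge is drawn whenever the Tanimoto coefficient meets a cutoff (0.85 by default), and the names `build_network`, `role`, and `color` are hypothetical. How CSNAP itself decides which edges to draw is not specified here.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_network(query_fps, reference_fps, tc_cutoff=0.85):
    """Return (nodes, edges) for a chemical similarity network.

    nodes: name -> {'role', 'color'}; query compounds red, reference compounds gray.
    edges: (source, target, similarity) for pairs with Tc >= tc_cutoff (assumed rule).
    """
    fingerprints = {**reference_fps, **query_fps}
    nodes = {name: {"role": "reference", "color": "gray"} for name in reference_fps}
    nodes.update({name: {"role": "query", "color": "red"} for name in query_fps})

    edges = []
    for source, target in combinations(sorted(fingerprints), 2):
        tc = tanimoto(fingerprints[source], fingerprints[target])
        if tc >= tc_cutoff:
            edges.append((source, target, tc))
    return nodes, edges


# Hypothetical compounds: Q1 is identical to R1 and unrelated to R2.
nodes, edges = build_network(
    {"Q1": {1, 2, 3, 4}},
    {"R1": {1, 2, 3, 4}, "R2": {7, 8}},
)
print(edges)  # [('Q1', 'R1', 1.0)]
```

Each edge tuple carries a source, a target, and a similarity value, so the similarity between any two connected compounds can be read directly from the edge.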
The node information consists of 7 fields:
Specifically, CSNAP_Target_Scores consists of two scoring functions:
The predicted targets are ranked according to S-scores in descending order:
The target scores can be used to measure the confidence of the predicted targets of the selected compounds.
The edge information consists of 3 fields:
The similarity values represent the Tanimoto similarity between the source and target nodes connected by the edge.
The LTIF fingerprint maps the S-scores of all query compounds against all predicted targets. The result is a heatmap that can be used to identify off-target activities of the ligands. The color intensity is scaled according to the S-score, from low (green) to high (red).
The LTIF target spectrum sums the S-scores in each target column of the LTIF heatmap (ΣS-score). The plot can be used to determine the target consensus within the compound set and to differentiate major targets from off-targets.
Use the following mouse operations to navigate the plot:
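The column sum behind the target spectrum can be illustrated with a short sketch. The heatmap is assumed to be stored as a nested dictionary (compound, then target, then S-score); this layout, the function name `target_spectrum`, and the example values are hypothetical.

```python
def target_spectrum(ltif):
    """Sum the S-scores of each target column of an LTIF heatmap.

    ltif: dict mapping compound -> {target: S-score}; a missing target counts as 0.
    Returns (target, summed S-score) pairs, highest total first.
    """
    totals = {}
    for scores in ltif.values():
        for target, s_score in scores.items():
            totals[target] = totals.get(target, 0) + s_score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical heatmap: target T1 is shared by both compounds, T2 by only one.
ltif = {
    "compound_1": {"T1": 3, "T2": 1},
    "compound_2": {"T1": 2},
}
print(target_spectrum(ltif))  # [('T1', 5), ('T2', 1)]
```

In this toy example T1 would read as the consensus target of the set, while T2 would look like a possible off-target of a single compound.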
The Target GO Search consists of 6 fields:
The Target GO Search panel can be used after target prediction as a target validation tool, particularly for identifying targets that can induce a given phenotype through a known molecular etiology.