Chemoinformatics and Metabolism: Paula de Matos
Natural Products and Metabolomics
Indexing, searching and dissemination of chemical information
3 08.12.21
http://www.ebi.ac.uk/chebi
Dictionary
Ontology
Resource for
Nomenclature
What does ChEBI cover?
Status
ChEBI further info
• http://www.ebi.ac.uk/chebi
• Mailing lists:
• [email protected]
• [email protected]
• [email protected]
• Submitting data
• http://www.ebi.ac.uk/chebi/submissions
The Chemistry Development Kit (CDK):
An Open Source Java-Library for Structural Chemo- and Bioinformatics
http://cdk.sourceforge.net
(1) Steinbeck, C.; Hoppe, C.; Kuhn, S.; Guha, R.; Willighagen, E. L. Current Pharmaceutical Design 2006, 12, 2111-2120.
(2) Steinbeck, C.; Han, Y. Q.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. Journal of Chemical Information and Computer Sciences 2003, 43, 493-500.
The Chemistry Development Kit (CDK)
Input/Output
• I/O (CML, MDL Molfile, SDF, PDB)
• SMILES
• InChI
Visualization
• Structure-Diagram-Layout (SDG)
• 2D Rendering
• 3D Rendering
Example: Fingerprinting
[Figure: a molecular structure mapped to the bit string 0 0 1 1 0 1 0 0 1 0; labelled bits include -COOH, Alkyl, Heteroaryl N, O-Alkyl and -NH2]
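The idea behind the figure can be sketched in a few lines: each bit position stands for one structural feature and is set when the molecule contains it. This is a minimal illustration, not CDK code. The five feature names are taken from the slide labels, while their ordering, the function names and the example input are assumptions made for the sketch. Feature detection itself (the substructure matching a toolkit would perform) is skipped, so the features are passed in directly.

```python
# Hypothetical key list: one bit position per structural feature.
# The names come from the slide's bit labels; the ordering is assumed.
FEATURE_KEYS = ["-COOH", "Alkyl", "Heteroaryl N", "O-Alkyl", "-NH2"]


def fingerprint(features_present):
    """Return one 0/1 bit per key, set when that feature is present."""
    present = set(features_present)
    return [1 if key in present else 0 for key in FEATURE_KEYS]


def to_bit_string(bits):
    """Format the bits the way the slide shows them, separated by spaces."""
    return " ".join(str(b) for b in bits)


# Illustrative input only, not the molecule drawn on the slide.
example_bits = fingerprint(["Alkyl", "-NH2"])
print(to_bit_string(example_bits))  # 0 1 0 0 1
```

A real fingerprinter would typically use many more bits, and may hash paths rather than look up a fixed key list; the sketch only shows the presence/absence encoding.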
CDK in numbers
• 67 registered developers on SourceForge (SF)
• 86 people subscribed to cdk-devel list
• 111 people subscribed to cdk-user list
CDK in numbers
• Mailing list:
• [email protected]
• [email protected]
• Documentation
• http://pele.farmbio.uu.se/nightly/
OrChem
OrChem database structure
Example OrChem Queries
• Similarity search
    select * from table(
      orchem_simsearch.search(
        'OC4=C(C(=C3OC(C)(COC=1C=CC(=CC=1)CC2C(=O)NC(=O)S2)CCC3=C4C)C)C',
        'SMILES', 0.8, null, 'N')
    );
• Substructure search
    select orchem_subsearch.search(molfile, 'MOL', 50, 'Y')
    from compounds
    where molregno = 12345;
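The 0.8 in the similarity query reads like a score cutoff. The slides do not show how OrChem computes the score, so the sketch below assumes a Tanimoto coefficient over fingerprints, plus the usual fingerprint prescreen for substructure search. All function names and the toy data are hypothetical; fingerprints are modelled as sets of on-bit positions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by on-bits in either."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints: treated here as dissimilar
    return len(a & b) / union


def similarity_search(query_fp, database, cutoff):
    """Return (id, score) pairs scoring at or above cutoff, best first."""
    scored = [(cid, tanimoto(query_fp, fp)) for cid, fp in database.items()]
    hits = [(cid, score) for cid, score in scored if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def may_contain(query_fp, candidate_fp):
    """Substructure prescreen: every query bit must be set in the candidate.

    Passing the screen does not prove a match; it only rules out
    candidates that cannot contain the query.
    """
    return set(query_fp) <= set(candidate_fp)


# Toy data, assumed for illustration only.
toy_db = {"A": {1, 2, 3, 4, 5}, "B": {1, 2, 3, 4}, "C": {7, 8}}
query = {1, 2, 3, 4, 5}
print(similarity_search(query, toy_db, 0.8))  # [('A', 1.0), ('B', 0.8)]
print(may_contain({1, 2}, toy_db["B"]))       # True
```

Candidates that pass the prescreen would still need a full atom-by-atom match, which is the expensive step the fingerprint filter is meant to avoid.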
Fingerprint distribution
Parallel vs. non-parallel
Performance of substructure search on 3.5 million compounds
Substructure benchmarking
Performance of substructure search on 3.5 million compounds
Similarity Benchmarking
OrChem info
• http://orchem.sourceforge.net/
• Mailing list:
• [email protected]