Databases Projects

From Chemical Informatics and Cyberinfrastructure Collaboratory

These are individual projects, but they have a lot in common. In particular, they all need to store and search 2D chemical structure information, and to display tables showing 2D structures alongside data. The standard setup we use for this is built on PostgreSQL and CHORD.

CHORD works only with PostgreSQL and extends its functionality for chemical structures (e.g. by adding SQL commands to perform substructure and similarity searching).
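To make the idea of similarity searching concrete, here is a minimal sketch in plain Python. It assumes each structure has already been reduced to a fingerprint (a set of "on" bit positions) and ranks a small library by Tanimoto coefficient, one common similarity measure. This is not CHORD's implementation or its SQL interface; the function names, fingerprints and threshold are all made up for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints (sets of 'on' bits)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(query, library, threshold=0.7):
    """Return (name, score) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints: invented bit positions, not real structure keys.
library = {
    "cmpd_a": frozenset({1, 4, 7, 9}),
    "cmpd_b": frozenset({1, 4, 7, 12}),
    "cmpd_c": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7, 9})
hits = similarity_search(query, library, threshold=0.5)
```

In a database setting the same comparison would run inside the server, so that only the hits are returned to the client.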

Distributed Drug Discovery

Purpose: To build a database of 2D chemical structures, with associated reaction information, that have been made (or can be made) in the Distributed Drug Discovery project run by Bill Scott.

People: Kelsey Forsythe, Bill Scott, Malika Mahoui, Usha Cheemakurthi, Deepthi Jonnala, David Wild

Special Needs: Enumeration of libraries from reagents. Special kinds of searching.
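As a rough illustration of what enumerating a library from reagents involves, the sketch below fills the variable positions of a scaffold with every combination of reagents. The scaffold, the reagent fragments and the function name are hypothetical, and joining SMILES fragments as strings is a simplification; a real enumeration would apply reaction rules with a chemistry toolkit.

```python
from itertools import product


def enumerate_library(template: str, reagent_sets: dict) -> list:
    """Fill each placeholder in template with every combination of reagents."""
    names = list(reagent_sets)
    products = []
    for combo in product(*(reagent_sets[name] for name in names)):
        products.append(template.format(**dict(zip(names, combo))))
    return products


# Hypothetical amide scaffold with two variable positions, r1 and r2,
# written as SMILES fragments (illustrative only).
template = "{r1}C(=O)N{r2}"
reagents = {
    "r1": ["C", "CC", "c1ccccc1"],
    "r2": ["C", "CC"],
}
library = enumerate_library(template, reagents)
```

The library size is the product of the reagent list sizes (here 3 x 2 = 6), which is why enumerated libraries grow quickly as reagents are added.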

Local NIH DTP database

Purpose: To build a local database containing the NIH DTP data that can be used for data mining

People: Melanie Wu, Xiao Dong, Huijun Wang, David Wild

Special Needs: Ability to run similarity searches and to extract biological fingerprints and gene expression data.

Progress: Similarity Search on DTP data now available
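The page does not define "biological fingerprint". Assuming it means a compound's activity profile across a fixed set of cell lines or assays, extraction could look like the sketch below. The record layout, cell line names and values are invented for illustration and are not taken from the DTP data.

```python
def biological_fingerprints(records, cell_lines):
    """Map each compound to a vector of activity values, one slot per cell line.

    Slots with no measurement are left as None.
    """
    index = {name: i for i, name in enumerate(cell_lines)}
    fingerprints = {}
    for compound, cell_line, value in records:
        vector = fingerprints.setdefault(compound, [None] * len(cell_lines))
        vector[index[cell_line]] = value
    return fingerprints


# Hypothetical (compound, cell line, activity) records.
cell_lines = ["line_1", "line_2", "line_3"]
records = [
    ("cmpd_1", "line_1", 5.2),
    ("cmpd_1", "line_3", 6.1),
    ("cmpd_2", "line_2", 4.8),
]
fps = biological_fingerprints(records, cell_lines)
```

Keeping the cell line order fixed means two compounds' vectors can be compared position by position.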

Local PubChem database

Purpose: To build a local copy of PubChem that can be used for data mining

People: Melanie Wu, Xiao Dong, Huijun Wang, David Wild, Rajarshi Guha

Special Needs: Ability to handle complex data in PubChem, and prototype new architectures.


Progress: A slightly reduced version of PubChem is currently available locally. Web services are also provided as a front end to the various tables. See the details for more information on the schema, indices and triggers.

Docking database

Purpose: To store the results of large-scale docking. Here, large scale means the whole of PubChem, though the work currently uses a drug-like subset of PubChem (approximately 1M compounds) and multiple protein targets. The database currently holds data for one target, though families of proteins will be processed. See here for details.
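One way to picture such a store is a single table keyed by compound and target. The sketch below uses Python's built-in sqlite3 module so that it runs on its own; the project itself is based on PostgreSQL, and the table name, column names, scores and the "lower score is better" convention are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (compound, target) pair; names are hypothetical.
conn.execute(
    """
    CREATE TABLE docking_result (
        compound_id INTEGER NOT NULL,
        target      TEXT    NOT NULL,
        score       REAL    NOT NULL,
        PRIMARY KEY (compound_id, target)
    )
    """
)
# Index to support "best compounds for a target" queries.
conn.execute("CREATE INDEX idx_target_score ON docking_result (target, score)")

rows = [
    (101, "target_a", -7.4),
    (102, "target_a", -9.1),
    (103, "target_a", -6.0),
    (101, "target_b", -8.2),
]
conn.executemany("INSERT INTO docking_result VALUES (?, ?, ?)", rows)


def best_hits(conn, target, limit=2):
    """Return the lowest-scoring compounds for a target (assumes lower is better)."""
    cursor = conn.execute(
        "SELECT compound_id, score FROM docking_result "
        "WHERE target = ? ORDER BY score ASC LIMIT ?",
        (target, limit),
    )
    return cursor.fetchall()


top = best_hits(conn, "target_a")
```

With around a million compounds per target, an index on target and score keeps this kind of top-N query from scanning the whole table.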

People: Rajarshi Guha

Quantum Mechanical database (Varuna)

Purpose:

People: Melanie Wu, Mookie Baik, ... ?

Special Needs: