Research
From Chemical Informatics and Cyberinfrastructure Collaboratory
Customer Groups
The following are draft identifications of customer groups we could target.
HTS screening and chemistry follow-up scientists
Description: These are academic chemists and other scientists who want to use the MLSCN screening results in PubChem to follow up on compounds that show potential activity, to examine compounds active in screens related to their projects, or to examine compounds similar to those with which they are already working. They are most likely to work in medicinal chemistry, pharmacy, and medicine schools and departments, or in screening centers. Scientists performing this function already exist in the pharmaceutical industry, where they use tools like Tripos HTS Benchware Data Miner, Pipeline Pilot, and Spotfire to analyze the active compounds and their screening results, and to gather computational and scientific information that supports decision making.
Current collaborators: Scripps (Stephan), Lilly (Mic Lajiness, Tom Doman, Jeff Sutherland, Dan Robertson, Horst Hemmerle), IU (Faming Zhang)
Potential collaborators: Tripos (Bob Clark), Pfizer (Jack Bikker)
Suggested focus: Develop workflows that encode the computational processes currently used to handle the data (such as flagging and organizing), and also workflows that bring in useful information not usually accessible (such as literature compounds found with OSCAR). Interview customers using contextual design techniques to clarify which workflows are most useful. Develop the idea of "Augmented HTS analysis" (see talk below). Build 2-3 tools for these customers, and refine them with usability testing.
Work so far: We have worked with the Scripps MLSCN group. A detailed trip report (from Rajarshi Guha) is available from Scripps Trip Report, September 2006. We have demonstrated that we can use workflows to do a range of processing of HTS data, from the standard (filtering, flagging) to the innovative (literature mining). This work is described in David Wild's Sept 06 ACS talk. We have also demonstrated that our services and workflows can be used in the Pipeline Pilot environment employed by Scripps. We have shown that the compounds in PubChem can be organized using cluster analysis in a few hours, and have adapted toxicity prediction methods so that they can be exposed as web services. We have developed an example easy-to-use desktop .NET interface (PubChemSR) that lets chemists extract structures and data from PubChem.
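As an illustration of the "standard" processing steps mentioned above (flagging and filtering), the following is a minimal sketch in Python. It is not the project's workflow code. The record fields, the flag names, and the cutoff values (50% inhibition, molecular weight 500) are all assumptions chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ScreeningResult:
    """One compound's result in a single screen (hypothetical record layout)."""
    compound_id: str           # assumed identifier, e.g. a PubChem CID as text
    percent_inhibition: float  # assumed activity readout
    molecular_weight: float
    flags: list = field(default_factory=list)


def flag_results(results, active_cutoff=50.0, max_weight=500.0):
    """Attach simple flags to each result; cutoffs are illustrative assumptions."""
    for r in results:
        r.flags = []  # reset so that repeated calls do not duplicate flags
        if r.percent_inhibition >= active_cutoff:
            r.flags.append("active")
        if r.molecular_weight > max_weight:
            r.flags.append("heavy")
    return results


def filter_follow_up(results):
    """Keep compounds flagged active and not flagged heavy."""
    return [r for r in results if "active" in r.flags and "heavy" not in r.flags]


# Made-up example data, not taken from any real screen.
results = flag_results([
    ScreeningResult("C1", 82.0, 310.4),
    ScreeningResult("C2", 12.5, 250.1),
    ScreeningResult("C3", 91.0, 640.9),
])
follow_up = filter_follow_up(results)
print([r.compound_id for r in follow_up])
```

In a workflow setting, each function would correspond to one step, so that a flagging rule can be swapped or extended without touching the filtering step.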
Academic organic chemists
Description: These are chemists working in academic chemistry departments whose research could potentially benefit from computational methods (although they may not know it!). Their needs are likely to be specific and unlikely to fit predefined categories, which makes our flexible web service / workflow approach well suited to them.
Current collaborators: IU (Faming Zhang, Rich DiMarchi)
Potential collaborators: Other IU chemists, other university contacts through MESA collaboration
Suggested focus: Further develop the collaboration with MESA to educate academic chemists on the potential benefits of using computational tools. Work with Faming, Rich and any other chemists we can find to understand how we might produce information that would help their research. Encode any necessary workflows, and build easy-to-use tools where needs are likely to recur. Devote some effort to helping them with "one off" needs, but document these so that over time we build an understanding of what information provision could be generally useful.
Work so far: Mookie is working with Rich DiMarchi on the application of QM for his formulation project. David and Rajarshi are working with Faming to mine PubChem for selective kinase inhibitors. David and Gary are collaborating with MESA on the deployment of chemoinformatics web tutorials in a variety of university departments (including chemistry departments).
Chemoinformatics researchers and professionals
Description: These are people in universities and industry who are actively researching and/or applying chemoinformatics techniques. They may be doing basic research (developing new methods), evaluating methods, or using methods in drug discovery.
Current collaborators: Lilly (Mic Lajiness, Tom Doman, Jeff Sutherland, Dan Robertson, Horst Hemmerle), Pfizer (Jack Bikker), Michigan MACE ... probably more???
Potential collaborators: OpenEye, Digital Chemistry, Peter Rose, ChemAxon, Sheffield, Other universities, ...
Suggested focus: Deploy a robust infrastructure that allows people to wrap their code as web services, and that makes existing services and workflow tools available to them. Set standards for evaluation datasets, data formats and SOAP/WSDL interfaces. Provide education in the development of chemoinformatics techniques using our infrastructure.
Work so far: Prototype infrastructure including services and workflows, evaluation of DTP Tumor Cell Line set as standard HTS data mining test set, Lilly-IU-Michigan joint workshop, blog, general chemoinformatics education
Core Services
- Maintenance and support of Cyberinfrastructure
- Requirements and Software Design (including HCI)
- End user software development
- Basic Chemoinformatics Research
- Education
Archive
Here are the previous contents of this page
The CICC project's core research objective is to develop a Web Service infrastructure that links chemical and drug databases, cheminformatics tools, computational quantum chemistry applications, and user interface environments. Workflow tools are used to link these services into use case scenarios for distributed drug discovery.
I coughed up a quick summary of the project, available here.
Below is a list of some major project areas.
- Databases Projects - Dist Drug Discovery, NIH/PubChem local, Varuna
- Web Service Infrastructure - Cheminformatics and general tools implemented as web services
- Workflows - Taverna workflows using cheminformatics services and databases
- Visualization and End User Tools - Tools for mining and visualizing data
Additional information
- Interoperability and Standards - What standards we should use, file formats, etc
- SourceForge SVN Repository Information - Links to CICC's SourceForge page and information on SVN.
