
Archives for June 2011

Catalan scientists in chemoinformatics

Research in Spain, and especially in Catalonia, improved over the last decade, mainly thanks to budget increases. But then the crisis came and science, regarded as a luxury rather than a necessity, suffered severe cuts. Oh, short-sighted politicians! Shame on you! These drastic changes raised alarms even in Nature, which published a tough editorial against the measures.

During the last ICCS I had the chance to meet other Catalan chemoinformaticians who are doing exciting research, in the motherland and abroad. Frankly, it gave me some hope to see these young talents carrying on with their work amid the gloomy present and future for scientists in our country. Hopefully one day the muscles that move the Catalan and Spanish economies will be our brains and not the arms that pile bricks and cement.

A picture was taken of the Catalans at ICCS. Let’s see how many of us come to the next edition.

From left to right
Xavi Jalencas, Oral Presentation, A knowledge-based approach to assessing the target promiscuity of chemical fragments. Xavi is a PhD student at the ChemogenomicsLab headed by Jordi Mestres, in Barcelona, Catalonia, Spain.

Violeta I. Perez-Nueno, Oral Presentation, Identifying and quantifying drug promiscuity by correlating ligand and target shape similarities. Violeta is a post-doc at LORIA, INRIA Nancy Grand Est, Vandoeuvre-les-Nancy, France.

Julio E. Peironcely, Oral Presentation, Improving metabolite identification with chemoinformatics. I am a PhD student at Leiden University, TNO and the Netherlands Metabolomics Centre.

Laura Guasch, Poster Presentation, Key features for designing PPARgamma agonists: an analysis of ligand-receptor interaction by using a 3D-QSAR approach. Laura is also a PhD student, at the Nutrigenomics group of Rovira i Virgili University, Tarragona, Catalonia, Spain.

Miguel Rojas-Cherto, Poster Presentation, Effectiveness of fingerprint-based measures of multi-stage mass spectrometry similarity for virtual screening of chemical structures. Miguel is a fellow post-doc at Leiden University and the Netherlands Metabolomics Centre.

Not in the picture: Jordi Muñoz-Muriedas, Poster Presentation, Ames toxicity: strategies to remove Ames liability for anilines. Jordi is a scientist at GlaxoSmithKline, Stevenage, United Kingdom.

Slides for the 9th International Conference on Chemical Structures

As I said in an earlier post, I enjoy it when people share, so it is time to walk the talk. I listened to Egon and uploaded the slides of my talk at ICCS 2011. I hope you enjoy them.

Get the pdf here


J. Peironcely (1,2,3), M. Rojas-Cherto (2,3), P. Kasper (2,3), L. Coulier (1,3), R. Vreeken (2,3), T. Reijmers (2,3), A. Bender (4), JL. Faulon (5), T. Hankemeier (2,3)

1 TNO Quality of Life, Zeist, The Netherlands
2 Leiden University, Leiden, The Netherlands
3 Netherlands Metabolomics Centre, Leiden, The Netherlands
4 University of Cambridge, Cambridge, United Kingdom
5 University of Evry, Evry, France

To provide detailed information on biological phenotypes on a chemical basis, metabolomics aims at profiling all sorts of metabolites, which form a chemically diverse group of substrates and products involved in enzymatic pathways. Current analytical platforms used in metabolomics produce a large amount of complex data, which requires chemoinformatics tools to process and transform it into meaningful information. For biological interpretation, identification of metabolites (elucidating the chemical structure of the metabolites of interest) is essential. New analytical platforms and better software tools are required to advance metabolite identification. Here we present a pipeline of software tools developed to facilitate identification of metabolites measured with Liquid Chromatography – Mass Spectrometry (LC-MS).

High-resolution multi-stage MS spectra (MSn data) were acquired for metabolite standards listed in the HMDB (Human Metabolome Database). Currently no tool exists that captures all relevant information present in MSn data, so a software tool was developed, integrating the Chemistry Development Kit (CDK) and XCMS, for preprocessing the spectral data. The Multi-stage Elemental Formula (MEF) tool automatically resolves the elemental composition of the parent compound, the fragment ions, and the neutral losses. This process of elemental formula assignment and fitting also removes artifacts from the spectra. The resulting enriched MSn data of many metabolite standards are stored in XML format in an MSn database, to allow structural elucidation of unknown metabolites by comparing the MSn data of the unknowns with the MSn data in the database. The database also enables the characterization of substructures of the unknown compound by querying and matching subsets of the MSn data. A fingerprint-based similarity search for MSn data was developed to find out which trees in the database are most similar to experimentally acquired MSn data.
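
As a rough illustration of the fingerprint-based similarity search described above, the sketch below ranks database entries by their similarity to a query. It assumes each MSn tree has already been reduced to a set of features (here, made-up strings standing for fragment formulas and neutral losses). The abstract does not say which fingerprint or similarity measure was actually used, so the Tanimoto coefficient, the function names and the example data are all assumptions for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query, database, top_n=3):
    """Return the top_n (name, score) pairs, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical fingerprints: each set holds fragment / neutral-loss features.
database = {
    "standard_A": frozenset({"C6H5", "loss:H2O", "loss:CO2", "C4H4"}),
    "standard_B": frozenset({"C6H5", "loss:NH3"}),
    "standard_C": frozenset({"loss:H2O", "loss:CO2"}),
}
query = frozenset({"C6H5", "loss:H2O", "loss:CO2"})

ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

With these toy sets, standard_A shares three of four distinct features with the query and ranks first, followed by standard_C and then standard_B.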

An open source chemical structure generator was implemented to generate candidate structures using the elemental formula and substructure information obtained with the previous tools. This structure generator combines concepts of graph theory and a chemistry library, the CDK, to exhaustively generate all non-isomorphic chemical structures for the input data. This input data is an elemental formula and, optionally, one or multiple non-overlapping prescribed substructures. The output of the structure generator is a usually large list of structures, which needs to be further reduced. Therefore, models of Metabolite-Likeness were built to reject structures that do not resemble metabolites. Different molecular descriptors, fingerprints, and classifiers were evaluated, and the best combination was employed to build a final model. Only candidate structures with a high Metabolite-Likeness are kept in our Metabolite Identification pipeline.

In this work we demonstrate, using real-life samples, how this workflow of chemoinformatics tools improves the state of the art in metabolite identification and how it helps to translate experimental data into chemical data.


New publication: Chemogenomics Approaches for Receptor Deorphanization and Extensions of the Chemogenomics Concept to Phenotypic Space


You can download the pdf here

Chemogenomics Approaches for Receptor Deorphanization and Extensions of the Chemogenomics Concept to Phenotypic Space. 2011, Curr. Top. Med. Chem. (in press)
Eelke van der Horst, Julio E. Peironcely, Gerard J. P. van Westen, Olaf O. van den Hoven, Warren R. J. D. Galloway, David R. Spring, Joerg K. Wegner, Herman W. T. van Vlijmen, Ad P. IJzerman, John P. Overington and Andreas Bender.



Chemogenomic approaches, which link ligand chemistry to bioactivity against targets (and, by extension, to phenotypes), are becoming more and more important due to the increasing number of bioactivity data available both in proprietary databases and in the public domain. In this article we review chemogenomics approaches applied in four different domains. Firstly, due to the relationship between protein targets, from which an approximate relation between their respective bioactive ligands can be inferred, we investigate the extent to which chemogenomics approaches can be applied to receptor deorphanization. In this case it was found that by using knowledge about active compounds of related proteins, in 93% of all cases enrichment better than random could be obtained. Secondly, we analyze different cheminformatics analysis methods, such as subgraph mining and Bayesian models, with respect to their behavior in chemogenomics studies. Thirdly, we illustrate how chemogenomics, in its particular flavor of ‘proteochemometrics’, can be applied to extrapolate bioactivity predictions from given data points to related targets. Finally, we extend the concept of ‘chemogenomics’ approaches, relating ligand chemistry to bioactivity against related targets, into phenotypic space, which then falls into the area of ‘chemical genomics’ and ‘chemical genetics’; this is very often the desired endpoint of approaches not only in the pharmaceutical industry but also in academic probe discovery, and often the endpoint the experimental scientist is most interested in.
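
To make "enrichment better than random" a bit more concrete, here is a minimal sketch of one common way to quantify it, the enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate in the whole set, so that any value above 1 beats random picking. The abstract does not state which enrichment measure the paper uses, so this definition, the function name and the toy ranking below are assumptions for illustration only.

```python
def enrichment_factor(ranked_labels, fraction=0.25):
    """Hit rate in the top fraction divided by the overall hit rate.

    ranked_labels: list of booleans ordered best-scored first,
    where True marks a known active compound.
    """
    total = len(ranked_labels)
    actives = sum(ranked_labels)
    if total == 0 or actives == 0:
        raise ValueError("need at least one compound and one active")
    top_n = max(1, round(total * fraction))
    hits_in_top = sum(ranked_labels[:top_n])
    # Same as (hits_in_top / top_n) / (actives / total), written as a
    # single division to limit floating-point rounding.
    return (hits_in_top * total) / (top_n * actives)


# Toy ranking of 20 compounds containing 4 actives (hypothetical data).
ranked = [True, True, False, True, False] + [False] * 14 + [True]

ef = enrichment_factor(ranked, fraction=0.25)
print(f"enrichment factor in the top 25%: {ef}")
print("better than random" if ef > 1 else "no better than random")
```

In this toy case three of the four actives land in the top five positions, so the top quarter of the list is three times richer in actives than the set as a whole.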