The Computational Prediction of Cytochrome P450 Metabolites

The active site in the crystal structure of human lung CYP2A13 contains a porphyrin heme group, which catalyzes several reactions that make xenobiotics more water-soluble.

Cytochrome P450 Metabolites

Cytochrome P450s (CYPs) are a class of enzymes that metabolize xenobiotics (drugs and other compounds taken into the body) by oxidizing them, which makes them more soluble in the blood and thus more easily excreted through the kidneys. CYPs are essential in drug metabolism, as well as in the metabolism of many endogenous substrates. Different CYP forms exist in different tissues of the body and in different locations within the cell. As a result of this uneven distribution, some xenobiotics are metabolized by specific CYPs in certain organs, which raises the concentration of their metabolites in those tissues. While CYP metabolism usually yields metabolites that are rapidly removed from the body, in some cases the metabolites are harmful, and more so in the tissues where they are generated.

In a scientific collaboration with Reynolds American Incorporated, a computational approach to predicting the metabolites that cytochrome P450s produce from xenobiotics is being developed. It integrates 1) machine learning models, trained on quantum-mechanically derived molecular surface properties for a set of CYP substrates with known metabolites, to identify sites of metabolism across all CYP isoforms and 2) ensemble docking of the substrates into the binding sites of a given CYP isoform to identify low-energy poses that place a known reactive site of the substrate within a reactive distance of the heme group. Below is a graphical overview of the process.


Substrates are rendered as three-dimensional structures and geometry optimized


Quantum mechanical observables are calculated as molecular surface properties


Structures with known metabolites are used to train machine learning models to predict reactive sites


Ensembles of CYP conformers are generated from molecular dynamics simulations


Substrates are then docked into the binding sites of the ensemble, and only poses that place a reactive site of the substrate near the heme reactive center are kept
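
The final filtering step can be illustrated with a minimal Python sketch. It is not the project's actual code. It assumes that each docked pose carries a docking score and substrate atom coordinates, that the machine learning step has already supplied the indices of the predicted reactive atoms, and that "reactive distance" means a simple cutoff on the distance to the heme iron. The `Pose` class, the `filter_poses` function, the 6.0 Å cutoff, and all coordinates and energies are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass


# Hypothetical cutoff (in angstroms) standing in for "reactive distance";
# the source text does not state a value.
REACTIVE_CUTOFF = 6.0


@dataclass
class Pose:
    conformer_id: int   # index of the CYP conformer taken from the MD ensemble
    energy: float       # docking score; lower is assumed to be better
    atom_coords: dict   # substrate atom index -> (x, y, z) in angstroms


def filter_poses(poses, reactive_atoms, heme_fe, cutoff=REACTIVE_CUTOFF):
    """Keep poses that place any reactive atom within `cutoff` of the heme iron.

    Returns (pose, closest_distance) pairs sorted from lowest to highest energy.
    """
    kept = []
    for pose in poses:
        # Closest approach of any predicted reactive atom to the iron center.
        closest = min(
            (math.dist(pose.atom_coords[i], heme_fe) for i in reactive_atoms),
            default=math.inf,
        )
        if closest <= cutoff:
            kept.append((pose, closest))
    kept.sort(key=lambda pair: pair[0].energy)
    return kept


# Toy data: heme iron at the origin, three docked poses of a two-atom "substrate".
heme_fe = (0.0, 0.0, 0.0)
poses = [
    Pose(0, -7.2, {0: (3.0, 4.0, 0.0), 1: (9.0, 0.0, 0.0)}),
    Pose(1, -8.5, {0: (10.0, 0.0, 0.0), 1: (12.0, 0.0, 0.0)}),
    Pose(2, -6.1, {0: (8.0, 0.0, 0.0), 1: (0.0, 4.5, 0.0)}),
]

for pose, distance in filter_poses(poses, reactive_atoms=[0, 1], heme_fe=heme_fe):
    print(pose.conformer_id, pose.energy, round(distance, 2))
```

In this toy case the pose with the best docking score (conformer 1) is discarded because neither reactive atom comes near the iron, which is the point of combining the site-of-metabolism prediction with docking rather than ranking poses by energy alone.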