DLKcat Online Tool

Abstract

Enzyme turnover numbers (kcat) are key to understanding cellular metabolism, proteome allocation and physiological diversity, but experimentally measured kcat data are sparse and noisy. Here we provide a deep learning approach (DLKcat) for high-throughput kcat prediction for metabolic enzymes from any organism, using only substrate structures and protein sequences. DLKcat can capture kcat changes for mutated enzymes and identify amino acid residues with a strong impact on kcat values. We applied this approach to predict genome-scale kcat values for more than 300 yeast species. Additionally, we designed a Bayesian pipeline to parameterize enzyme-constrained genome-scale metabolic models from predicted kcat values. The resulting models outperformed the corresponding original enzyme-constrained genome-scale metabolic models from previous pipelines in predicting phenotypes and proteomes, and enabled us to explain phenotypic differences. DLKcat and the enzyme-constrained genome-scale metabolic model construction pipeline are valuable tools to uncover global trends of enzyme kinetics and physiological diversity, and to further elucidate cellular metabolism on a large scale.