CLEAN AI Tool Offers Unprecedented Accuracy in Enzyme Function Prediction

Researchers at the University of Illinois Urbana-Champaign have developed an AI tool called CLEAN that predicts enzyme function from amino acid sequence alone. CLEAN also works on enzymes that are poorly studied or not fully understood. Its accuracy, reliability, and sensitivity surpass those of current state-of-the-art tools, making it a valuable resource for research in genomics, chemistry, industrial materials, medicine, and pharmaceuticals.

Huimin Zhao, the study leader and a professor of chemical and biomolecular engineering at the University of Illinois Urbana-Champaign, explained how the tool works. Comparing CLEAN to the way ChatGPT generates predictive text from written language, Zhao said, “We are leveraging the language of proteins to predict their activity.” CLEAN is meant to help researchers quickly determine the functions of new protein sequences and identify suitable enzymes for chemical synthesis and material production in biology, medicine, and industry.

Scheduled for publication in the journal Science, the study highlights the shortcomings of conventional computational tools when predicting enzyme functions. Existing tools typically assign enzyme commission numbers (ID codes indicating the enzyme-catalyzed reaction) by comparing queried sequences with known enzymes, but their predictive accuracy wanes when dealing with less-studied enzymes or enzymes with multiple functions.

CLEAN sets itself apart by using a deep-learning algorithm called contrastive learning, a state-of-the-art approach that improves prediction accuracy. “We are not the first one to use AI tools to predict enzyme commission numbers, but we are the first one to use this new deep-learning algorithm called contrastive learning to predict enzyme function. We find that this algorithm works much better than the AI tools that are used by others,” Zhao emphasized.
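The article does not describe CLEAN's implementation, so the following is only a minimal Python sketch of the general idea behind contrastive learning, not the team's method. It assumes each sequence has already been turned into a numeric embedding. A triplet loss rewards embeddings in which sequences sharing an enzyme commission number sit closer together than sequences with different numbers, and a query is then assigned the number of the nearest group centre. The function names and the two-dimensional toy vectors are hypothetical and chosen purely for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Contrastive (triplet) loss for one training example.

    The loss is zero once the anchor is closer to the positive (same EC
    number) than to the negative (different EC number) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def predict_ec(query, labelled_embeddings):
    """Return the EC number whose centroid lies nearest to the query embedding."""
    centroids = {ec: centroid(vs) for ec, vs in labelled_embeddings.items()}
    return min(centroids, key=lambda ec: euclidean(query, centroids[ec]))


# Toy, made-up embeddings grouped under two example EC numbers.
labelled = {
    "1.1.1.1": [[0.0, 0.0], [0.2, 0.0]],
    "3.4.21.4": [[3.0, 4.0], [3.0, 4.2]],
}

if __name__ == "__main__":
    # A well-separated triplet incurs no loss; a badly placed one does.
    print(triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0]))
    # A query near the first group is assigned that group's EC number.
    print(predict_ec([0.1, 0.1], labelled))
```

In a real system the embeddings would come from a trained neural network and the loss would be minimized over many triplets. The sketch only shows why pulling same-function sequences together makes a distance-based lookup possible.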

The research team validated CLEAN’s performance through computational and in vitro experiments. The AI tool correctly predicted the function of previously uncharacterized enzymes and identified enzymes with multiple functions. It also corrected the labels of enzymes that existing software had misannotated.

CLEAN is accessible online for researchers around the globe who wish to characterize enzymes or determine their potential to catalyze specific reactions. The tool’s user-friendly web interface allows researchers to input enzyme sequences into a search box and promptly receive the results.

The research team plans to expand CLEAN’s capabilities. They intend to use the AI to characterize other proteins, such as binding proteins, receptors, and transcription factors. The team also aims to improve the machine-learning algorithms so that users can search for a desired reaction and have the AI identify the best-suited enzymes for the job.

Zhao envisions a future where AI can predict the functions of all proteins in a cell, propelling biotechnology and biomedical applications forward. “We want to predict the functions of all proteins so that we can know all the proteins a cell has and better study or engineer the whole cell for biotechnology or biomedical applications,” Zhao concluded.


X, S. (2023, March 30). AI predicts enzyme function better than leading tools. Retrieved April 2, 2023, from

Touchstone, L. A. (2023, March 30). AI predicts enzyme function better than leading tools. News Bureau | ILLINOIS.