Improve Protein Solubility and Activity based on Machine Learning Models
Abstract: Improving the catalytic ability of protein biocatalysts reduces the production cost of biocatalytic manufacturing processes, but the search space of possible proteins and mutants is too large to explore exhaustively through experiments. To some extent, highly soluble recombinant proteins tend to exhibit high activity. Here, we demonstrate that an optimization methodology based on a machine learning prediction model can effectively and quantitatively predict which peptide tags improve protein solubility. Using protein sequence information, a support vector machine model we recently developed was used to evaluate protein solubility after randomly mutated tags were added to a target protein. The optimization algorithm guided the tags to evolve towards variants that result in higher solubility. The optimization results were validated by adding the tags designed by our optimization algorithm to a model protein, expressing it in vivo, and experimentally quantifying its solubility and activity. For example, the solubility of a tyrosine ammonia lyase was more than doubled by adding two tags, one to its N-terminus and one to its C-terminus. Its activity was also increased nearly 3.5-fold by adding the tags. Additional experiments also indicated that the designed tags were effective for improving the activity of multiple proteins and performed better than previously reported tags. The presented optimization methodology thus provides a valuable tool for understanding the correlation between amino acid sequence and protein solubility and for engineering protein biocatalysts.
Contact: kang.zhou@nus.edu.sg, chewxia@nus.edu.sg
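The tag-evolution loop described in the abstract (mutate a tag at random, score the tagged protein with a predictor, keep variants that score higher) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_solubility_score` is a hypothetical stand-in heuristic rather than the support vector machine model, only a single N-terminal tag is optimized, and the names `mutate` and `optimize_tag`, the tag length, and the iteration count are all invented for the example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_solubility_score(sequence: str) -> float:
    """Hypothetical placeholder for a trained solubility predictor.

    Rewards charged residues and penalizes hydrophobic ones. It is NOT
    the SVM model from the abstract; swap in a real predictor here.
    """
    charged = sum(sequence.count(aa) for aa in "DEKR")
    hydrophobic = sum(sequence.count(aa) for aa in "FILVWM")
    return (charged - hydrophobic) / len(sequence)


def mutate(tag: str, rng: random.Random) -> str:
    """Return a copy of the tag with one randomly chosen position resampled."""
    pos = rng.randrange(len(tag))
    return tag[:pos] + rng.choice(AMINO_ACIDS) + tag[pos + 1:]


def optimize_tag(protein, tag_length=10, iterations=500, seed=0,
                 score=toy_solubility_score):
    """Hill-climb an N-terminal tag towards a higher predicted score."""
    rng = random.Random(seed)
    # Start from a random tag (assumed starting point for this sketch).
    tag = "".join(rng.choice(AMINO_ACIDS) for _ in range(tag_length))
    best = score(tag + protein)
    for _ in range(iterations):
        candidate = mutate(tag, rng)
        candidate_score = score(candidate + protein)
        # Keep the mutant only if the prediction does not get worse.
        if candidate_score >= best:
            tag, best = candidate, candidate_score
    return tag, best


if __name__ == "__main__":
    # Arbitrary example sequence, used only to exercise the loop.
    protein = "MKTAYIAKQRQISFVKSHFSRQ"
    tag, predicted = optimize_tag(protein)
    print(tag, round(predicted, 3))
```

Because the scoring function is passed in as a parameter, a trained regression model could replace the toy heuristic without changing the search loop. Any tag produced this way would still need experimental validation, as the abstract describes.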
Related Results
Evaluating the Science to Inform the Physical Activity Guidelines for Americans Midcourse Report
Abstract: The Physical Activity Guidelines for Americans (Guidelines) advises older adults to be as active as possible. Yet, despite the well documented benefits of physical a...
Endothelial Protein C Receptor
Introduction: The protein C anticoagulant pathway plays a critical role in the negative regulation of the blood clotting response. The pathway is triggered by thrombin, which allows ...
Determination of Saturated Solubility of Mirtazapine Using UV Visible Spectrophotometer
Solubility is an important parameter for designing new drug formulations. Many drugs possess poor aqueous solubility hence, poor bioavailability. Many pharmaceutical industries fac...
Solubility-aware protein binding peptide design using AlphaFold
Abstract: New protein–protein interactions (PPIs) are being identified, but PPIs have different physicochemical properties compared with conventional targets, making it difficult to ...
PHYSICO CHEMICAL AND FUNCTIONAL PROPERTIES OF CHICKPEA PROTEIN ISOLATE
The main purpose of this research work was to isolate the most refined form of protein from chickpea for food processing. In this research work, chickpea (Cicer arietinum L.) was c...
SOLUBILITY OPTIMIZATION OF TERBINAFINE HYDROCHLORIDE SOLID DISPERSION BY BINARY AND TERNARY METHOD
Background: In the process of drug development, the most challenging factor is poor solubility. To formulate an oral dosage form, poor solubility and bioavailability are the main c...
An Approach to Machine Learning
The process of automatically recognising significant patterns within large amounts of data is called "machine learning." Throughout the last couple of decades, it has evolved into ...
Enhancing Non-Formal Learning Certificate Classification with Text Augmentation: A Comparison of Character, Token, and Semantic Approaches
Aim/Purpose: The purpose of this paper is to address the gap in the recognition of prior learning (RPL) by automating the classification of non-formal learning certificates using d...

