
Uncovering Microbial Biosynthetic Potential with Genomic Context-aware Protein Language Model

Abstract

Microbial secondary metabolites, synthesized by biosynthetic gene clusters (BGCs), offer vast potential for biotechnological applications. Among BGC profiling techniques, computational detection methods face challenges, including time-consuming alignment and reliance on predefined profiles. To address these challenges, we present BGC-Finder, an end-to-end pipeline that uses protein language models for BGC detection and annotation from microbial genomes and metagenomes. This approach increases profiling speed by up to 100-fold and employs genomic context-aware modeling to facilitate interpretable genetic essentiality assessment and large-scale BGC clustering. BGC-Finder outperformed traditional methods, detecting 9.49% more biosynthetic-core genes and 27.70% more cytochrome P450s in 742 experimentally validated BGCs. Notably, it retrieved 31 remote biosynthetic homologs from 210 polar marine metagenomes and identified 4,585 BGCs with 6,388 core genes from 256 fungal genomes. These findings highlight BGC-Finder’s capability to illuminate “microbial biosynthesis dark matter” (sequence-unrelated, function-similar biosynthetic enzymes) and expedite natural product discovery.

Highlights

BGC-Finder is an accurate and ultrafast pipeline leveraging protein language models (pLMs) to predict and annotate biosynthetic gene clusters (BGCs) from microbial genomes and metagenomes.
The genomic context-aware model enables interpretable analysis: attention-driven identification of essential biosynthetic genes and embedding-guided BGC clustering.
BGC-Finder sensitively retrieves remote homologous BGCs from both bacterial and fungal genomes, uncovering hidden “microbial biosynthesis dark matter”.
We discovered a non-ribosomal peptide synthetase (NRPS) family, which involved into function-specific BGCs in two evolutionarily distant fungi.
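
To make the idea of embedding-guided BGC clustering concrete, here is a minimal sketch in Python. It is not BGC-Finder's implementation, which the abstract does not describe. It assumes each gene in a BGC already has a fixed-length embedding vector (in practice these would come from a protein language model; here they are made-up toy numbers), that a BGC can be summarized by mean-pooling its gene vectors, and that BGCs are grouped when their cosine similarity reaches a threshold. The function names `mean_pool`, `cosine`, and `cluster_bgcs`, and the threshold of 0.9, are all hypothetical choices for illustration.

```python
import math

def mean_pool(gene_embeddings):
    """Average per-gene vectors into one BGC-level vector (an assumed pooling choice)."""
    n = len(gene_embeddings)
    dim = len(gene_embeddings[0])
    return [sum(vec[i] for vec in gene_embeddings) / n for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster_bgcs(bgc_vectors, threshold=0.9):
    """Single-linkage grouping: two BGCs share a group if similarity >= threshold."""
    names = list(bgc_vectors)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(bgc_vectors[a], bgc_vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

# Toy stand-ins for pLM gene embeddings (hypothetical values, 3 dimensions).
toy = {
    "bgc_A": [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]],
    "bgc_B": [[0.95, 0.05, 0.05]],
    "bgc_C": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.2]],
}
pooled = {name: mean_pool(vectors) for name, vectors in toy.items()}
groups = cluster_bgcs(pooled)
print(groups)  # [['bgc_A', 'bgc_B'], ['bgc_C']]
```

The all-pairs comparison is quadratic in the number of BGCs, so a large-scale analysis like the one the abstract reports would likely need an approximate nearest-neighbor index or a dedicated clustering library instead.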

