Enhancing Graph-based Machine Learning through Lyndon Partial Words
Objectives: This study integrates the combinatorial properties of Lyndon partial words with Graph-Based Machine Learning (GBML) to develop an approach to sequence analysis. It targets fields such as bioinformatics and natural language processing (NLP), where incomplete or fragmented data often hinder effective analysis. By exploiting the minimality and primitiveness of Lyndon partial words, the study seeks to provide a robust framework for modeling and analyzing such data.
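The two properties named above, minimality and primitiveness, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the study: the hole is written as `?`, it is treated as an ordinary symbol that sorts before every letter, and "Lyndon" means strictly smaller than every proper rotation (conjugate). The study's own definition of Lyndon partial words may instead rely on compatibility between partial words. The function names are hypothetical.

```python
HOLE = "?"  # assumed hole symbol; in ASCII it sorts before all letters


def conjugates(word):
    """Return all rotations of word, starting with word itself."""
    return [word[i:] + word[:i] for i in range(len(word))]


def is_primitive(word):
    """True if no proper rotation equals word, i.e. word is not a repeated block.

    Assumption: the hole is compared as a literal symbol.
    """
    return bool(word) and all(rot != word for rot in conjugates(word)[1:])


def is_lyndon(word):
    """True if word is strictly smaller than each of its proper rotations.

    Strict minimality already implies primitiveness under this ordering.
    """
    return bool(word) and all(word < rot for rot in conjugates(word)[1:])


def lyndon_conjugate(word):
    """Return the minimal rotation of a primitive word, or None otherwise."""
    return min(conjugates(word)) if is_primitive(word) else None
```

Under these assumptions, `is_lyndon("?ab")` holds because `?ab` precedes both `ab?` and `b?a`, while `abab` is rejected because it is not primitive.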
Methods: Graphs were constructed from Lyndon partial words, with nodes representing unique partial words or their conjugates and edges representing relationships such as lexicographical proximity or shared substrings. These graphs were then analyzed with GBML techniques: community detection algorithms to uncover clusters of related patterns, and similarity analysis to measure structural and semantic relationships. Data preprocessing ensured that partial words were represented accurately while preserving their combinatorial integrity.
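The graph construction described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: edges are drawn when the Jaccard similarity of two words' length-`k` substring sets reaches a hypothetical `threshold`, the hole `?` is compared as a literal symbol, and connected components serve as a deliberately crude stand-in for a real community detection algorithm. All function names, the sample words, and the parameter values are invented for the example.

```python
from itertools import combinations


def substrings(word, k=2):
    """Set of all length-k substrings of word (hole treated as a literal symbol)."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(words, k=2, threshold=0.3):
    """Adjacency map: one node per word, an edge where similarity >= threshold."""
    graph = {w: set() for w in words}
    for u, v in combinations(words, 2):
        if jaccard(substrings(u, k), substrings(v, k)) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def components(graph):
    """Connected components, used here as a simple proxy for communities."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


if __name__ == "__main__":
    # Hypothetical sample of partial words; "?" marks a hole.
    sample = ["?ab", "aab", "aabb", "ccd", "?cd"]
    for group in components(build_graph(sample)):
        print(sorted(group))
```

With the sample words, the shared substrings `ab`, `aa`, and `cd` produce two groups, one built around the `a`/`b` words and one around the `c`/`d` words. A modularity-based or label-propagation method from a graph library would replace `components` in a realistic pipeline.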
Findings: The integration of Lyndon partial words into GBML demonstrates significant potential in pattern recognition and structural analysis, particularly for datasets characterized by fragmentation or incompleteness. The constructed graphs effectively capture underlying relationships and patterns, aiding in the discovery of meaningful insights in sequence data. This novel framework enables improved modeling of real-world scenarios, such as identifying recurring motifs in biological sequences or understanding linguistic variations in incomplete text datasets.
Novelty: By combining the theoretical elegance of Lyndon partial words with the computational power of GBML, this study introduces a novel methodology for tackling incomplete data in sequence analysis. The approach highlights the adaptability of combinatorial constructs for solving practical problems, offering new avenues for research in data-intensive domains like bioinformatics and NLP. The framework also underscores the importance of interdisciplinary solutions in advancing machine learning applications for complex and fragmented datasets.

