Enhancing Non-Formal Learning Certificate Classification with Text Augmentation: A Comparison of Character, Token, and Semantic Approaches
Aim/Purpose: This paper addresses a gap in the recognition of prior learning (RPL) by automating the classification of non-formal learning certificates with deep learning. It evaluates how well three text augmentation strategies (character-level, token-level, and semantic-level) improve the classification accuracy of these certificates, which are crucial for bridging the skills gap in the digital economy.
Background: Traditional education systems often overlook skills gained through non-formal learning, creating a gap between industry needs and academic qualifications. This paper addresses that gap by using BERT-based deep learning models to classify non-formal learning certificates, with text augmentation applied to improve the accuracy of mapping them to formal academic standards.
Methodology: This study uses Bidirectional Encoder Representations from Transformers (BERT) to classify non-formal learning certificates into seven core computer science courses. Text augmentation at the character, token, and semantic levels is applied to improve classification accuracy. A dataset of 525 certificates was collected; text was extracted from the PDF documents using Optical Character Recognition (OCR), then cleaned and augmented before training the BERT model.
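The cleaning and character-level augmentation steps described above can be illustrated with a minimal Python sketch. The abstract does not specify the cleaning rules or the character-level operations used, so the function names `clean_ocr_text` and `swap_adjacent_chars`, the whitespace and hyphenation rules, and the adjacent-swap operation are all assumptions chosen for illustration, not the authors' implementation.

```python
import random
import re


def clean_ocr_text(raw: str) -> str:
    """Assumed cleaning rules: rejoin words hyphenated across line breaks,
    drop non-printable characters, and collapse runs of whitespace."""
    text = re.sub(r"-\s*\n\s*", "", raw)  # "certi-\nficate" -> "certificate"
    text = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", text).strip()


def swap_adjacent_chars(text: str, rate: float, rng: random.Random) -> str:
    """Character-level augmentation (illustrative): with probability `rate`,
    swap two adjacent inner characters of a word, imitating a typo."""
    out = []
    for word in text.split():
        if len(word) > 3 and rng.random() < rate:
            # Pick an inner position so the first and last characters stay fixed.
            i = rng.randrange(1, len(word) - 2)
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
    return " ".join(out)
```

Passing a seeded `random.Random` keeps the augmented copies reproducible across runs. Note that the hyphen rule also merges genuine hyphenated compounds that happen to break across lines, so it would need tuning on real OCR output.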
Contribution: This paper addresses the growing need for efficient RPL at a time of rapidly advancing knowledge, particularly in the AI era, when non-formal learning is becoming increasingly important. We present a novel approach to automating the classification and validation of non-formal learning certificates using deep learning. The study systematically compares character-level, token-level, and semantic-level text augmentation to determine which best improves model performance on RPL tasks, providing new insight into optimizing deep learning models for this purpose. The approach is intended to reduce human error and improve the efficiency of RPL implementation, offering a scalable way to integrate or convert non-formal learning into formal educational systems.
Findings: Token-level augmentations, particularly word insertion and word deletion, significantly improved classification accuracy, with validation accuracies exceeding 88%. Character-level augmentations also helped, but reached slightly lower accuracy. Semantic-level augmentation via back translation had the least impact. These results indicate that token-level augmentation is the most effective of the three strategies for classifying non-formal learning certificates in the RPL context.
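To make the two token-level operations named above concrete, here is a minimal Python sketch of random word deletion and random word insertion. The abstract does not say how inserted words were chosen or what rates were used, so this simplified variant, which inserts words drawn from the text itself, and the names `random_word_deletion` and `random_word_insertion` are assumptions for illustration only.

```python
import random


def random_word_deletion(text: str, rate: float, rng: random.Random) -> str:
    """Drop each word independently with probability `rate`."""
    words = text.split()
    kept = [w for w in words if rng.random() >= rate]
    if not kept and words:
        # Never return an empty training sample: keep one word at random.
        kept = [rng.choice(words)]
    return " ".join(kept)


def random_word_insertion(text: str, n_inserts: int, rng: random.Random) -> str:
    """Insert `n_inserts` words at random positions. The inserted words are
    sampled from the original text (a simplification; other variants insert
    synonyms or words predicted by a language model)."""
    words = text.split()
    if not words:
        return text
    vocabulary = list(words)  # snapshot so inserted words do not skew sampling
    for _ in range(n_inserts):
        position = rng.randint(0, len(words))  # randint is inclusive on both ends
        words.insert(position, rng.choice(vocabulary))
    return " ".join(words)
```

Both functions leave the class label untouched, which is the working assumption behind label-preserving augmentation: a certificate text with a few words dropped or repeated should still map to the same course.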
Recommendations for Practitioners: Practitioners should focus on token-level text augmentation techniques, such as word insertion and deletion, to improve the accuracy of machine learning models that classify non-formal learning certificates, enabling better integration into formal education and employment pathways.
Recommendations for Researchers: Researchers should explore combining multiple augmentation techniques (e.g., token-level and semantic-level) and investigate models such as BERT-large or multilingual variants for improved classification accuracy. Examining the impact of different OCR tools and preprocessing strategies could further enhance non-formal learning certificate recognition.
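One way the suggested combination of semantic-level and token-level augmentation could be wired together is sketched below in Python. Everything here is hypothetical: the `Translator` callable stands in for whatever machine translation system is available, the language codes are placeholders, and `chain` and `expand_training_set` are illustrative helpers rather than any library's API.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]
Augmenter = Callable[[str], str]


def make_back_translator(translate: Translator, source: str, pivot: str) -> Augmenter:
    """Semantic-level step: translate to a pivot language and back again."""
    def augment(text: str) -> str:
        return translate(translate(text, source, pivot), pivot, source)
    return augment


def make_word_dropper(rate: float, rng: random.Random) -> Augmenter:
    """Token-level step: drop each word with probability `rate`."""
    def augment(text: str) -> str:
        words = text.split()
        kept = [w for w in words if rng.random() >= rate]
        return " ".join(kept or words[:1])  # fall back to the first word
    return augment


def chain(*steps: Augmenter) -> Augmenter:
    """Apply augmentation steps left to right."""
    def augment(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return augment


def expand_training_set(
    samples: List[Tuple[str, str]], augment: Augmenter, copies: int
) -> List[Tuple[str, str]]:
    """Return the original (text, label) pairs plus `copies` augmented
    variants of each, keeping every label unchanged."""
    expanded = list(samples)
    for text, label in samples:
        expanded.extend((augment(text), label) for _ in range(copies))
    return expanded
```

In an experiment of this kind, augmentation is normally applied to the training split only, so that validation accuracy is measured on unmodified certificates.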
Impact on Society: The findings of this study have significant implications for improving access to education and employment opportunities. By enhancing the recognition of prior learning through automated classification of non-formal learning certificates, this research supports a more inclusive and equitable education system. It can help individuals, particularly those with non-traditional educational backgrounds, gain recognition for their skills, ultimately bridging the skills gap in the workforce and promoting lifelong learning in the digital economy.
Future Research: Future research should focus on expanding the dataset to include multilingual certificates, which would enhance the model’s ability to generalize across different languages and cultural contexts. Additionally, researchers could investigate the use of hybrid models that combine BERT with other machine learning techniques to further improve classification accuracy. Exploring the integration of real-world data sources, such as employer-verified work experience and additional non-formal learning formats, could also provide a more comprehensive approach to recognizing prior learning.

