Named Entity Recognition of an Oversampled and Preprocessed Manufacturing Data Corpus
In modern manufacturing, improving the manufacturing process is of paramount importance. One area with great potential for improvement is the use and processing of maintenance data. By leveraging this data effectively, manufacturers can optimize maintenance schedules, leading to increased efficiency, reduced costs, and minimized downtime. The challenge lies in handling vast amounts of maintenance data that often come in various formats, which makes it difficult to extract valuable insights. Without proper analysis, this unprocessed data can result in unforeseen issues, costly disruptions, and extended downtime. To overcome this obstacle, manufacturing companies are turning to advanced technologies such as language modelling, text classification, machine translation, and Named Entity Recognition (NER). To the best of our knowledge, no investigation has assessed the impact of text preprocessing on NER performance. Improving the initial stage of NER, namely text preprocessing, can enhance NER performance and, in turn, the efficiency of model training. In this study, a Hidden Markov Model (HMM) is employed for NER, and oversampling and text preprocessing are used to improve its performance. The study is performed without IOB labelling and considers seven specific entities. The text preprocessing tasks include tokenization, lemmatization, punctuation removal, stop-word removal, and elimination of long and short words. As a result, the HMM for NER with oversampling and with preprocessed text outperformed the one without either by 20.10% and 27.59%, respectively, owing to the consideration of significant classes and words among the entity classes in the preprocessed factory reports. This finding highlights the importance of selecting text preprocessing methods in NER and its potential to help optimize maintenance schedules and reduce downtime.
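The preprocessing steps and the oversampling named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the toy lemma table (standing in for a real lemmatizer), the length thresholds, the entity labels, and the choice of random oversampling per labelled example are all assumptions, since the abstract does not specify them.

```python
import random
import re

# Assumed, illustrative values: the abstract gives neither the stop-word list,
# the lemmatizer, nor the length thresholds.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "was", "were", "is", "in", "on"}
LEMMAS = {"leaking": "leak", "bearings": "bearing", "replaced": "replace"}  # toy stand-in for a lemmatizer
MIN_LEN, MAX_LEN = 2, 15  # hypothetical bounds for "short" and "long" words


def preprocess(text):
    """Tokenize, remove punctuation, lemmatize, drop stop words and too short/long words."""
    # Lowercasing plus an alphanumeric regex performs tokenization and punctuation removal.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [LEMMAS.get(t, t) for t in tokens]          # lemmatization (table lookup)
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
    return [t for t in tokens if MIN_LEN <= len(t) <= MAX_LEN]


def oversample(examples, seed=0):
    """Random oversampling: duplicate minority-class examples until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for item, label in examples:
        by_label.setdefault(label, []).append((item, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical maintenance-report sentence and entity labels.
tokens = preprocess("The bearings of pump P-101 were leaking!")
balanced = oversample([("pump", "EQUIPMENT")] * 3 + [("leak", "PROBLEM")])
```

Here `preprocess` produces the cleaned token sequence that a tagger such as an HMM would be trained on, and `oversample` evens out the class counts before training. A real pipeline would likely swap the lemma table for a dictionary-based or statistical lemmatizer.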
Related Results
Žanrovska analiza pomorskopravnih tekstova i ostvarenje prijevodnih univerzalija u njihovim prijevodima s engleskoga jezika
Genre implies formal and stylistic conventions of a particular text type, which inevitably affects the translation process. This „force of genre bias“ (Prieto Ramos, 2014) has been...
Efficacy of an Extended Half-Life GlycoPEGylated rFVIII (N8-GP): Pooled Analysis of ABR (Results from Two Clinical Trials)
Abstract
Introduction
The short half-life of standard factor VIII (FVIII) products means that frequent injections (3 to 4 times/week) are needed for e...
A Phase 1b, Dose-Finding Study Of Ruxolitinib Plus Panobinostat In Patients With Primary Myelofibrosis (PMF), Post–Polycythemia Vera MF (PPV-MF), Or Post–Essential Thrombocythemia MF (PET-MF): Identification Of The Recommended Phase 2 Dose
Abstract
Background
Myelofibrosis (MF) is a myeloproliferative neoplasm associated with progressive, debilitating symptoms that ...
Concept-based and relation-based corpus navigation : applications of natural language processing in digital humanities
Navigation en corpus fondée sur les concepts et les relations : applications du traitement automatique des langues aux humanités numériques
La recherche en Sciences...
HDI Corpus: A Dataset for Named Entity Recognition for In-Context Herb-Drug Interactions
Introduction
This article proposes a new dataset for Named Entity Recognition based on PubMed articles and aiming to address the problem of Herb-Drug Interactio...
Dynamics of Mutations in Patients with ET Treated with Imetelstat
Abstract
Background: Imetelstat, a first in class specific telomerase inhibitor, induced hematologic responses in all patients (pts) with essential thrombocythemia (...
Unsupervised entity linking using graph-based semantic similarity
Nowadays, the human textual data constitutes a great proportion of the shared information resources such as World Wide Web (WWW). Social networks, news and learning resources as we...
Active learning for Named Entity Recognition in Kannada
Named Entity Recognition (NER) task aims at automatically recognising and classifying named entities in a given natural language input. Majority of the studies related to ...

