Named Entity Recognition of an Oversampled and Preprocessed Manufacturing Data Corpus

In today's manufacturing industry, improving the manufacturing process is of paramount importance. One area with great potential for improvement is the use and handling of maintenance data. By leveraging this data effectively, manufacturers can optimize maintenance schedules, leading to increased efficiency, reduced costs, and minimized downtime. The challenge lies in handling vast amounts of maintenance data that often come in various formats, which makes it difficult to extract valuable insights. Without proper analysis, this unprocessed data can result in unforeseen issues, costly disruptions, and extended downtime. To overcome this obstacle, modern manufacturing companies are turning to advanced technologies such as language modelling, text classification, machine translation, and Named Entity Recognition (NER). To the best of our knowledge, no investigation has assessed the impact of text preprocessing on NER performance. Improving the initial stage of NER, such as text preprocessing, can enhance NER performance, which in turn benefits the efficiency of the trained model. In this study, a Hidden Markov Model (HMM) is employed to improve NER performance by using oversampling and text preprocessing techniques. The study is performed without IOB labelling and considers seven specific entities; the text preprocessing tasks include tokenization, lemmatization, punctuation removal, stop word removal, and elimination of long and short words. As a result, the HMM for NER with oversampling and with preprocessed text outperformed the HMM without either by 20.10% and 27.59%, respectively, owing to the consideration of significant classes and words among the entity classes in preprocessed factory reports. This finding highlights the importance of text preprocessing method selection in NER and its potential to optimize maintenance schedules and reduce downtime.

Akademia Baru Publishing

Nurul Hannah Mohd Yusof, Nurul Adilla Mohd Subha, Nurulaqilla Khamis, Norikhwan Hamzah

Journal of Advanced Research in Applied Sciences and Engineering Technology

2023
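
The preprocessing steps and the oversampling named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop-word list, the lemma lookup table, the length thresholds (`min_len`, `max_len`), the entity labels, and the choice of simple random oversampling are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random
import re
from collections import defaultdict

# Illustrative resources only (assumptions, not taken from the paper).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is", "was", "were", "in", "on"}
LEMMAS = {"bearings": "bearing", "replaced": "replace", "leaking": "leak"}


def preprocess(text, min_len=3, max_len=15):
    """Tokenize, lemmatize, drop punctuation, stop words, and short/long words."""
    # The regex keeps alphanumeric runs, so tokenization and punctuation
    # removal happen in one step.
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    # Dictionary lookup stands in for a real lemmatizer.
    tokens = [LEMMAS.get(t, t) for t in tokens]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Eliminate words that are too short or too long (thresholds are assumed).
    return [t for t in tokens if min_len <= len(t) <= max_len]


def oversample(examples, seed=0):
    """Randomly duplicate minority-class examples up to the majority count."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for tokens, label in examples:
        by_label[label].append((tokens, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Draw extra copies with replacement until the class reaches `target`.
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced
```

Calling `preprocess` on a raw report sentence returns a list of cleaned tokens, and `oversample` takes a list of `(tokens, label)` pairs and returns a class-balanced list. The balanced, preprocessed data would then be passed to whichever HMM tagger is used for training.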



