A Lite Romanian BERT: ALR-BERT
Large-scale pre-trained language representations and their strong performance on a wide range of downstream applications have become a central area of interest in natural language processing (NLP). Much recent effort has gone into further increasing model size in order to surpass the best previously reported results. However, at some point, adding parameters leads to saturation, given the limited memory capacity of GPUs/TPUs. Moreover, such models are mostly available either in English or as part of shared multilingual models. In this paper, we therefore propose a lite BERT trained on a large corpus solely in the Romanian language, which we call A Lite Romanian BERT (ALR-BERT). Comprehensive empirical results show that ALR-BERT produces models that scale far better than the original Romanian BERT. Alongside reporting performance on downstream tasks, we analyze the training process and its parameters in detail. We also intend to release our code and model as open source, together with the downstream tasks.
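Since the abstract states that the code and model are to be released as open source, the sketch below illustrates how a masked-language model of this kind is typically queried through the Hugging Face transformers API. The model identifier, the example sentence, and the use of AutoModelForMaskedLM are illustrative assumptions, not details taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical Hub identifier -- the abstract only states that the model will be
# released as open source; substitute the actual name once published.
MODEL_ID = "alr-bert-base-romanian"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)
model.eval()

# Masked-token probe on a Romanian sentence:
# "București este capitala [MASK]." ("Bucharest is the capital of [MASK].")
text = f"București este capitala {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and print the five most likely fillers.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```

For downstream evaluations such as those reported in the paper, the same checkpoint would normally be loaded through a task-specific head (for example, a token-classification head) and fine-tuned.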
Related Results
Analisis SWOT Mobile Dictionary Pleco dan Hanping Lite (SWOT Analysis of the Pleco and Hanping Lite Mobile Dictionaries)
The study titled “SWOT Analysis of the Pleco and Hanping Lite Mobile Dictionaries” is designed as a guide to help users determine which mobile dictionary best suits their needs ...
Over-Sampling Effect in Pre-Training for Bidirectional Encoder Representations from Transformers (BERT) to Localize Medical BERT and Enhance Biomedical BERT (Preprint)
BACKGROUND
Pre-training large-scale neural language models on raw texts has made a significant contribution to improving transfer learning in natural langua...
The Collector Journal for Swedish Literature Science Research
Johan Svedjedal, Rymden och tvåkronan. Karin Boyes För lite och den författarsociala debatten. (Space and the Two-Crown Coin: Karin Boye’s För lite and the debate on authors’ socia...
Vega-Lite: A Grammar of Interactive Graphics
We present Vega-Lite, a high-level grammar that enables rapid specification of interactive data visualizations. Vega-Lite combines a traditional grammar of graphics, providing visu...
Romanian Art Historiography in the Interwar Period. Between the Search for Scholarship and Commitment to a Cause
At the end of World War I, Romania emerged as a much stronger nation, with a greatly enlarged territory. During the two world wars, the Romanian state was permanently looking for t...
A Pre-Training Technique to Localize Medical BERT and to Enhance Biomedical BERT
Abstract
Background: Pre-training large-scale neural language models on raw texts has been shown to make a significant contribution to a strategy for transfer learning in n...
Homeland in Romanian children’s literature written in the Diaspora
Romanian children’s literature has always been situated at the crossways of cultural ideologies. The Romanian texts for children lack innocence due to the implicit level of cultura...
Prevention and management of allergic-like reactions to iodine contrast media: a best practice implementation project
ABSTRACT
Introduction:
With the wide application of iodine contrast media (ICM), the occurrence of allergic-like reactions to iodine contrast med...

