
Unrestricted Character Encoding for Japanese

The glyphs of the Japanese writing system consist mainly of Chinese characters, of which there are tens of thousands. Because of the number of characters involved, glyph database creation, and character representation on computer systems in general, have been the focus of numerous research efforts and software systems. Character information is usually represented in a computer system by an encoding. Some encodings specifically target Chinese characters: this is the case, for instance, of Big-5 and Shift-JIS. There are also encodings that aim at covering several, possibly all, writing systems: this is the case, for instance, of Unicode. However, whichever solution is adopted, a significant portion of Chinese characters remains uncovered by current encoding methods. Thanks to the properties and relations they feature, Chinese characters can be classified into a database with respect to various attributes. First, the formal structure of such a database is described in this paper as a character encoding, thus addressing the character representation issue. Importantly, we show that the proposed logical structure overcomes the limitations of existing encodings, most notably the restriction on the number of glyphs and the lack of coherence in the code. This theoretical proposal is then followed by the practical realisation of the proposed database and the visualisation of the corresponding code structure. Finally, an additional experiment is conducted to measure the memory size overhead induced by the proposed encoding, compared with the memory size required by an implementation of Unicode. Once the files are compressed, the memory size overhead is significantly reduced.
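The coverage gap of a fixed code table can be observed directly for a legacy encoding. The following is a minimal Python sketch, assuming only the standard codecs shipped with CPython; the helper name `can_encode` is hypothetical and not taken from the paper. It checks Shift-JIS only and does not reproduce the encoding proposed in the paper.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of `text` has a code in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample characters (chosen for this sketch, not from the paper).
common = "\u6f22"      # a common character that has a Shift-JIS code
rare = "\U00020BB7"    # a variant character with no Shift-JIS code

for ch in (common, rare):
    # Report, for each character, whether each codec can represent it.
    print(f"U+{ord(ch):04X}",
          "shift_jis:", can_encode(ch, "shift_jis"),
          "utf-8:", can_encode(ch, "utf-8"))
```

Under these assumptions, the first character is representable in both codecs, whereas the second is rejected by the Shift-JIS codec. This illustrates the limit on the number of glyphs for one legacy encoding; it says nothing about which characters Unicode itself still lacks.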
