Dutch Parallel Corpus: A Balanced Copyright-Cleared Parallel Corpus
This paper presents the Dutch Parallel Corpus, a high-quality parallel corpus for Dutch, French and English consisting of more than ten million words. The corpus contains five different text types and is balanced with respect to text type and translation direction. All texts included in the corpus have been cleared for copyright. We discuss the importance of parallel corpora in various research domains and contrast the Dutch Parallel Corpus with existing parallel corpora. The Dutch Parallel Corpus distinguishes itself from other parallel corpora by its balanced composition and by its availability to the wider research community, thanks to its copyright clearance. All texts in the corpus are sentence-aligned and further enriched with basic linguistic annotations (lemmas and word class information). Approximately 25,000 words of the Dutch-English part have been manually aligned at the sub-sentential level. Rich metadata makes the corpus easier to navigate and enables users to select the texts that satisfy their needs. The entire corpus is released as full texts in XML format and is also available via a web interface, which supports basic and complex search queries and presents the results as parallel concordances. The corpus will be distributed by the Flemish-Dutch Human Language Technology Agency (TST-Centrale).

