Contrastive Analysis and Translation Studies Linked to Text Corpora
Principal investigators
Abstract
In the past year, the research group at CAS has compared original texts with their translations in order to develop better methods of analysis. The texts are compiled systematically in electronic form as a corpus for use in linguistic research. The first such corpus, the Brown Corpus, was established in the 1960s and consisted of approximately 1 million words of American English text.
Advances in computer technology have made it possible to create even larger text collections, such as the Bank of English, with hundreds of millions of words, and the new British National Corpus, which includes spontaneous spoken language. Over the years, it has also become more common to use electronic texts in historical language research.
Work on the English-Norwegian parallel corpus began in 1993, and the work continues at CAS. Much of the time has been spent selecting texts for the collection and obtaining permissions from authors and translators. Only after this work is done can researchers compile and analyse the texts.
By comparing texts in different languages, researchers can find similarities and differences in the languages’ vocabulary, syntax and style. This helps researchers identify characteristic features of a language that are difficult to see when working with that language alone.
In addition to developing the corpus and analysing texts, researchers on the project have been working on programs for searching and parallelising texts, intended for use with many different languages.