Anna Čermáková, Marie Kopřivová
Corpus-based research of spoken language: the state-of-the-art for Czech and English
Abstract
This article reviews corpus-based research on spoken language, with an emphasis on how the grammar of spoken language is described and conceptualized in relation to the grammar of written language. The review first looks briefly at the development of spoken corpora, from simple transcriptions without sound alignment to today’s sophisticated multi-modal corpora. The main part of the article deals with the metalanguage used to describe spoken language, the choice of its basic descriptive unit, the status of basic linguistic categories such as part of speech, and typical lexical and grammatical devices. The extensive existing research on spoken English is reviewed and, in line with it, illustrative examples from Czech spoken corpora are provided. These are then contrasted with examples from written data to highlight the inherent differences between spoken and written language and the need to adjust the descriptive metalanguage accordingly.
Key words: spoken language research, corpus linguistics, spoken Czech, basic descriptive unit for spoken language
Klíčová slova: výzkum mluveného jazyka, korpusová lingvistika, mluvená čeština, jednotka pro popis mluveného jazyka
This article is available online in the CEEOL database.
Ústav pro jazyk český AV ČR, v. v. i.
Letenská 4, 118 51 Praha 1
Slovo a slovesnost, volume 79 (2018), number 3, pp. 217-240