Journal Slovo a slovesnost

Zachycení výstavby textu v Pražském závislostním korpusu

Šárka Zikánová, Lucie Poláková, Pavlína Jínová, Anna Nedoluzhko, Magdaléna Rysová, Jiří Mírovský, Eva Hajičová

[Articles]

Annotation of discourse phenomena in the Prague Dependency Treebank

Abstract
Annotation schemes for language corpora now cover various layers of sentence description, from morphology to semantics. Annotation projects concerning phenomena beyond the sentence boundary, however, have only recently started to attract the attention of corpus linguists. In the present contribution, we describe a unified approach to the analysis of discourse phenomena, developed for large-scale annotation of the Czech empirical data in the Prague Dependency Treebank. This approach rests on two fundamental pillars: (i) it exploits the results of one of the first complex schemes for discourse annotation, proposed and realized in the Penn Discourse Treebank for English; (ii) it follows the Praguian Functional Generative Description and treebanking tradition, taking advantage of the tectogrammatical (underlying) layer of sentence analysis and extending it to a full discourse-level description. Our analysis concentrates on two major aspects of discourse coherence: (i) discourse relations (semantic relations between discourse segments) and discourse connectives as their lexical anchors; and (ii) coreference and so-called bridging anaphora. We present a detailed description of the annotation scheme and procedure, address individual problematic issues, and offer basic corpus statistics and an evaluation of the annotation.

Key words: text, discourse, phenomena beyond the sentence boundary, discourse relations, discourse connectives, coreference, bridging anaphora, Prague Dependency Treebank
Klíčová slova: text, diskurz, nadvětné jevy, diskurzní vztahy, konektory, koreference, asociační anafora, Pražský závislostní korpus

This article is available online in the CEEOL database.

Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University
Malostranské nám. 25, 118 00 Praha 1
zikanova@ufal.mff.cuni.cz
polakova@ufal.mff.cuni.cz
jinova@ufal.mff.cuni.cz
nedoluzhko@ufal.mff.cuni.cz
magdalena.rysova@ufal.mff.cuni.cz
mirovsky@ufal.mff.cuni.cz
hajicova@ufal.mff.cuni.cz

Slovo a slovesnost, volume 76 (2015), issue 3, pp. 163-197
