Časopis Slovo a slovesnost

Lexis in spoken and written language

František Čermák



Slovník v mluveném a psaném jazyce

1. Introduction.

It is difficult to say which of the two competing modes of la parole, i.e. speech and writing, is used more, should the decision about the primacy of the former or the latter depend only on its function and place in communication. Fortunately, it does not. Yet the question is raised again and again (notably by Vachek), though it is evident that any definite pronouncement in favour of one or the other has to be conditional and linked to a particular point of view. For a number of reasons, however, it seems that de Saussure’s view, which places the spoken language unequivocally in the forefront, is both valid and worth stressing again.

A major new reason for this view, one that has appeared since de Saussure’s times, is undoubtedly a disturbing loss of balanced perspective. It is due to the linguist’s everyday preoccupation with mostly written texts and, as a consequence, to his subconscious and mistaken attribution of the undisputed formal inertia and stability of written texts to spoken language as well. A linguist who never bothers first to obtain and then to analyze authentic spoken texts never becomes aware of the constant, gradual change and flow of the spoken language, either. The false impression of language which he thus gets is one of something stable and comfortably clear-cut, which enables far-reaching theory making and in which, among other things, a stable graphic (orthographic) form, often rigorously codified, is seemingly of the utmost importance. But this is not the whole language, and such a linguist is nothing more than a student of one of the language modes only, spoken language being substantially different from what he or she is preoccupied with. Let us recall that de Saussure saw the source of language change in la parole, i.e. spoken language, thus making it a bridge between synchrony and diachrony, a fact which has been understood only rarely. His rather telling warning, which he might have uttered only yesterday were he alive, that “writing is no condition for language stability” (de Saussure, 1989, p. 59), suggests that much of the current writing-prone linguistics deserves revision.

In what follows, and against the background of this situation, some of the aspects of, and differences between, spoken and written texts will be mentioned in so far as their lexis is concerned. Here, a major as well as general task is to find out what exactly, i.e. what types of factors and lexical devices, and to what extent, may be safely assumed to shape communication once its mode, written or spoken, literary or oral, has been chosen. A necessary subdivision of these factors and devices would follow the line dividing what is obligatory from what is optional in each mode.

In trying to mark out what is relevant here, it is useful to ask such questions as the following:


What are the factors influencing or predetermining choice of lexical means related to these modes?


Is the choice of lexis the same in both written and spoken mode?


If not, what is the difference and its degree?


Are there lexemes specific to one mode only?


If the answer is yes, what types belong here?


2. Factors and devices.

Factors motivating the choice of specific language devices (lexemes and some suprasegmental means in our case) are largely due to the particular type of spoken or written communication. While no exact borderline between the two may be given, it seems that a broad starting point for basically distinguishing them may be sought in the condition hic et nunc and in the factors or devices which are or are not related to that condition. Largely because written texts are much better explored, the following remarks will deal mostly with the spoken language, i.e. with some of those factors and phenomena that are positively related to the above hic et nunc condition. However, prosodic and extralinguistic factors will have to be left out. Language devices based on these factors fall into three loose types or aspects, namely situational, semiotic and linguistic proper.


3. Situational aspects.


Stereotypes.

This is an underexplored field with many phenomena and devices which have not always even been given a proper name. It is a part of lexis in so far as its devices are stable language designations committed to memory and retrieved from it when needed. Stereotypes are needed (a) in haste, for greater speed and expediency of communication, (b) in ritual situations, but also (c) when one does not want to, does not bother to or cannot call things and thoughts by their right names. These stereotypes, being largely identical with idioms (and phrasemes) and having both a sentential and a phrasal character, are basically of two types. The first are social interaction devices, the second topic and discourse devices. They include (1) greetings and closings (how are you, be seeing you), politeness routines (thank you, if you don’t mind), contact means (you know, you see) and situational responses (fair enough, sure, quite), and (2) attitude pointers (I mean, by no means, on the contrary), reinforcers (ok, and then what he did…, to come back to my question) or hedges (sort of thing). It should be noted that there is no sharp boundary between multiword stereotypes and single-word lexemes, namely particles.



Expletives.

Expletives can survive as a term, perhaps, only if taken in their original meaning of fillers of an empty space or gap within a construction, metrical line or sentence of a known length. Yet, if one looks at them more closely, most of their alleged instances are not expletives at all, as they serve a perfectly legitimate function, though perhaps not a strictly syntactic one. Their substance is thus to be seen, as with clichés, in a clearly felt overuse (in this sense only) and unnecessary occurrence of phrases, idioms and perhaps other traditional types of expression (you know). Of course, there is no avoiding a degree of subjectivity in assessing the use of expletives, especially in the spoken language.


4. Semiotic aspects.

Semiotically, there are two basic types of substitutes in language. These substitute nominations, which may take both single and compound forms, are distinguished by their function. Thus any second, third, fourth and further occurrence of, for example, the word pen in a text is signalled either by the usual, standard pronominal forms it, that, etc., if we refer back to it, or, less usually, by anything the speaker may think appropriate in the situation, starting from the thing, the instrument, the contraption and having no definite limit except his or her imagination. While these devices, i.e. basically pronouns on the one hand and implicit, indirect and sometimes vague names (designations), or interpretive labels, on the other, have the common function of substitution and deixis, it is in their relation to the particular referent (pen) that they differ.


Interpretive labels.

These are nouns which take on the role of a kind of pronoun for a short while and serve as substitutes for, i.e. as indirect designations in place of, the appropriate noun introduced at the beginning of their particular reference. Interpretive labels are never pure substitutes or indexes, as their lexical meaning is never abandoned, though it may be quite general in itself (the thing). But the interpretation of the original noun may go still further and move along two lines. The first kind of interpretation is based on systematic features of the original noun, like the one represented by the choice of the instrument instead of the original noun (pen is an instrument). The second kind of interpretation, often evaluative, is based on situational aspects, such as dys/function (piece of junk, or krám in Czech), value (treasure, or poklad in Czech), standard or nonstandard role (contraption, gadget), etc.

These substitutes differ in form, too. They may be single words but also long descriptive combinations of words, typical of an introductory description or of the need for a precise scientific term. In so far as the broad categorial and semipronominal type represented by the thing is replaced by other nouns, the nature of these labels changes from (more) implicit to (more) explicit, the explicit standard being, of course, that of the original noun. Moreover, many of these labels are rather vague in their reference, for example idioms like just what the doctor ordered or a shot in the dark. In a text, however, this vagueness is reduced or disappears, i.e. when the specific reference takes over. The explicit vs. implicit labels seem largely, though perhaps not fully, to coincide with the use of direct vs. indirect labels (designations) of the referent. How they differ from each other is still a matter for further research.

It is clear that there is a difference in the distribution of these labels, too, spanning from explicit terminological texts with hardly any implicit or vague designations to intimate dialogues, full of implicit, vague and indirect hints and pointers, firmly anchored in the situation and the mutual background knowledge.



Indexicals.

Being very general in their reference, indexicals are very few in number. They are thus part of grammar and fall, accordingly, outside the scope of these remarks. Yet some interesting, though minor, extensions might be found here, too. Thus such hedges as sort of thing both delimit something and remain comfortably vague at the same time. An extension of the personal pronouns may be seen in such examples as my humble self or Czech moje maličkost, etc.


5. Linguistic aspects.

The preceding remarks on situational and semiotic aspects (both are, in fact, semiotic, though not of the same order) were concerned with the use of the language designation and its relation to its referent. If we finally address the linguistic side as well, our attention will have to be briefly redirected to lexis in general, i.e. primarily to word classes, idioms and terms.


Word classes.

While the lexis of written discourse is generally richer, more complex and more varied, the spoken lexis tends to be economical in length and somewhat looser in its links and ways of combination, but also to resort, at times, to repetition. Leaving aside such obvious features of the spoken language lexis as a high representation of evaluative adjectives, adverbs and diminutives, I will mention two other features which seem to be typical. The first is a change in the proportion of pronouns vs. nouns. The Czech frequency dictionary (Jelínek – Bečka – Těšitelová, 1961) gives the ratio of 11 % : 28 % for the written language, while a sample from a spoken corpus shows a remarkable drop to 11 % against 10 %. While the latter figures may not be quite reliable, a substantial drop in the representation of nouns in the spoken language is to be generally expected anyway. This is basically corroborated by the Dutch frequency figures, for instance, which read 24 % : 23 % for the written language and 26 % against 13 % for the spoken language (Boogaart, 1975).
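The shift can be made easier to compare by turning each pair of percentages into a single figure, the number of pronouns per noun. The following is only a minimal sketch of that arithmetic: the percentages are those quoted above, while the names `SHARES`, `pronoun_noun_ratio` and `ratios` are hypothetical and not taken from any of the cited sources.

```python
# Percentages of (pronouns, nouns) exactly as quoted in the text above.
# The dictionary layout and all names here are illustrative assumptions.
SHARES = {
    ("Czech", "written"): (11, 28),
    ("Czech", "spoken"): (11, 10),
    ("Dutch", "written"): (24, 23),
    ("Dutch", "spoken"): (26, 13),
}


def pronoun_noun_ratio(pronouns, nouns):
    """Return the number of pronouns per noun for one pair of percentages."""
    return pronouns / nouns


def ratios(shares=SHARES):
    """Return the pronoun-per-noun ratio, rounded to two places, for each sample."""
    return {
        key: round(pronoun_noun_ratio(pronouns, nouns), 2)
        for key, (pronouns, nouns) in shares.items()
    }


if __name__ == "__main__":
    for (language, mode), ratio in ratios().items():
        print(f"{language} {mode}: {ratio:.2f} pronouns per noun")
```

On these figures the ratio rises from written to spoken language in both Czech and Dutch, which is the drop in nouns described above seen from the other side.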

The second feature to be stressed is a remarkably high representation of particles in spoken texts, as against written ones. Yet particles are basically a terra incognita, partly due to uncertainty as to what they might be and to the fact that a description of this class is yet to be made. A preliminary analysis of a spoken corpus of Czech indicates that particles may account for as much as 8 % in informal dialogues, which is a very high figure indeed.


Idioms and multiword terms.

In a sense, single and multiword lexemes lie at the opposite ends of a continuum. What is between the two extremes is largely unexplored, since any judgment here should be based on our knowledge of the real occurrence of words, and that has begun to be possible only recently, within the framework of corpus studies. It is to be expected, however, that much of what has traditionally been viewed as syntactic will be re-evaluated as lexical, since the degree of fixedness in language use might be found to be far greater than has been assumed up to now (see also Tannen, 1982, p. 6). No matter what we choose to call this no man’s land, i.e. prototypical expressions, collocations, clichés, idioms, or just valency markers, it is admittedly no product of chance resulting in fortuitous combinations; it has very little to do with freedom of syntactic choice.

Lexical idioms and multiword terms are typical stereotypes, too. These, too, tend to appear at opposite ends of a scale: idioms are much more spoken in character (in the peculiar Czech situation, perhaps over 90 %), while terms occur more in written texts, though not exclusively. Both types of devices have drawn much attention and there is a large body of literature dealing with the topic. However, at least one point might be worth adding. While there is apparently no limit to the number of terms a text might contain (this depends on the kind of text), there does seem to be a limit to the use of idioms. Although this point requires further research, it seems that a Czech text containing, on average, one idiom per one hundred words is quite rich in idioms. The question of the maximum is difficult to answer precisely, of course, but it might be safe to estimate that no text contains, on average, as many as one idiom per ten words.
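The two benchmarks just mentioned can be expressed as a simple density measure, idioms per one hundred words. The sketch below is merely illustrative: it assumes that the idioms in a text have already been identified and counted by hand, and the names `RICH_PER_100`, `CEILING_PER_100`, `idioms_per_100_words` and `describe` are hypothetical, introduced here only to restate the two estimates given above.

```python
# Benchmarks restated from the text above; both are tentative estimates
# for Czech texts, and the constant names are illustrative assumptions.
RICH_PER_100 = 1.0      # one idiom per one hundred words: "quite rich in idioms"
CEILING_PER_100 = 10.0  # one idiom per ten words: the estimated upper bound


def idioms_per_100_words(idiom_count, word_count):
    """Return the average number of idioms per one hundred words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * idiom_count / word_count


def describe(idiom_count, word_count):
    """Place a text relative to the two benchmarks quoted above."""
    density = idioms_per_100_words(idiom_count, word_count)
    if density >= CEILING_PER_100:
        return "at or above the estimated ceiling"
    if density >= RICH_PER_100:
        return "rich in idioms"
    return "below the 'rich' benchmark"


if __name__ == "__main__":
    # A hypothetical 500-word text with 5 manually identified idioms.
    print(idioms_per_100_words(5, 500), describe(5, 500))
```

A text reported as "at or above the estimated ceiling" would, on the estimate above, not be expected to occur at all, so such a result would more likely point to a counting error than to a real text.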


6. Conclusion.

Factors influencing the choice of lexis are situational, semiotic and linguistic. While the answer to the second and third questions raised above (in 1.) is negative in both cases, a major difference is to be sought in the degree of representation and use of the various devices. However, these devices seem to be formal in nature only, and any search for semantic differences (such as abstract vs. non-abstract) or topical differences seems doubtful.




Altenberg, B.: Spoken English and the dictionary. In: J. Svartvik (ed.), The London–Lund Corpus of Spoken English: Description and Research. Lund U. P., Lund 1990, p. 117–191.

Boogaart, P. C. Uit den (ed.), (Werkgroep Frequentie–Onderzoek van het Nederlands): Woordfrequenties in geschreven en gesproken Nederlands. Oosthoek Scheltema Holkema, Utrecht 1975.

Čermák, F.: Relations of spoken and written Czech (with special reference to the varying degree of acceptability of spoken elements in written language). Wiener Slawistischer Almanach, Band 20, 1987, p. 133–150.

Čermák, F.: Spoken Czech. In: Variation in Language (in print).

Derrida, J.: Of Grammatology. Johns Hopkins Press, Baltimore 1976.

Goetz, P. W. (ed.): Writing. In: The New Encyclopedia Britannica in 32 Volumes. 15th ed. Chicago etc. Encyclopedia Britannica 1989, p. 1046–1097.

Jelínek, J. – Bečka, J. V. – Těšitelová, M.: Frekvence slov, slovních druhů a tvarů v českém jazyce. SPN, Praha 1961.

Olson, D. R. – Torrance, N. – Hildyard, A. (ed.): Literacy, Language and Learning: The Nature and Consequences of Reading and Writing, 1985.

Sampson, G.: Writing Systems: A Linguistic Introduction. Hutchinson, London 1985.

Saussure, F. de: Kurs obecné lingvistiky. Odeon, Praha 1989 (Cours de linguistique générale).

Tannen, D.: The oral/literate continuum in discourse. In: D. Tannen (ed.) 1982, p. 1–16.

Tannen, D. (ed.): Spoken and Written Language. Exploring Orality and Literacy. Ablex Publ. Corp., Norwood, New Jersey 1982.

Tannen, D. (ed.): Coherence in Spoken and Written Discourse. Ablex Publ. Corp., Norwood, New Jersey 1984.

Vachek, J.: Written Language. Mouton, The Hague 1983.



Lexis in spoken and written language (summary)

Against the background of the factors influencing the choice of those lexical devices whose use differs between written and spoken language, and which are governed primarily by their relation to the needs of the hic et nunc or by its absence, the (1) situational, (2) semiotic and (3) linguistic aspects of these devices are considered. More specifically, (1) stereotypes and expletives are briefly analysed, and (2) standard and nonstandard (interpretive) forms of substitutes, as well as indexical devices other than pronominal ones, are considered. In the last area (3), the representation of lexical devices, especially spoken ones, is considered in terms of the share and proportion of word classes, idioms and terms. The main differences in the nature of the spoken and written lexicon are to be sought in the differing proportions in which the aspects considered, and possibly others, are represented, not in aspects of meaning or topic.

Filozofická fakulta Univerzity Karlovy

Slovo a slovesnost, volume 54 (1993), number 3, pp. 173-177
