Roland Hausser
Foundations of Computational Linguistics.
Man-Machine Communication in Natural Language.
Berlin, Heidelberg, New York. Springer 1999, xii + 534 pp.
ISBN 3-540-66015-1.
Reviewed by Petr Sgall
The book under review constitutes an extremely comprehensive
treatment of the domain of computational linguistics
- a domain that has been developing very quickly and has
been extended to cover computerized approaches to the most
diverse instances of language and speech. The author sees
the goal of computational linguistics as reproducing the
natural transmission of information "by
modelling the speaker's production and the hearer's
interpretation on a suitable type of computer... (which)
amounts to the construction of autonomous cognitive machines
(robots) which can communicate freely in natural language"
(p. 1). Before discussing what "freely" means here, let us
have a look at the structure of this very thoroughly equipped
book (which includes exercises for the individual chapters,
a bibliography of 18 large-format pages, a name index of 4
pp. and a subject index of 14 pp.).
The book is organized into four main parts, each of which
comprises six chapters. In Part I, Theory of Language, the
methods and applications of computational linguistics are
first illustrated by the robot CURIOUS, the functioning of
which instantiates the author's theory of Surface
Compositional Linear Internal Matching (SLIM). SLIM contains
a cognitive characterization of semantic primitives, as well
as a theory of signs and a delineation of syntax, semantics
and pragmatics with their integration in the linguistic
interaction of the speaker and the hearer. Comments on the
development of theories of language from G. Frege to C. E.
Shannon and W. Weaver and from F. de Saussure to N. Chomsky
are included. Among these, R. Hausser's remark on Frege's
principle (according to which the meaning of a complex
expression is a function of the meanings of its parts and of
the mode of composition) claims that Frege's formulation is
defensible if applied "to syntactically analyzed surfaces"
(p. 79); however, it is open to discussion whether such an
analysis brings more than what is covered by the "meanings"
of the parts and by their "modes of composition".
Part II, Theory of Grammar, concentrates, first of all, on
formal grammar and its role in the description of natural
languages, pointing out that the time-linear nature of
language can be captured by an LA-grammar, based on testing
possible continuations, rather than possible substitutions.
Natural languages, which parse in linear time, belong to the
lowest complexity class of a hierarchy defined on this
basis. Let us just remark that the author's characterization
of reference (without a background anchored in psychology)
might be seen as not fully exhausting the problem, although
his illustrations (pp. 111-114) are clear enough to point to
the need to work with a notion such as a (multistratal)
stock of shared information. Such a notion would also be
useful in connection with his well-substantiated view of
personal pronouns (including those of the 3rd person) as
indexical rather than symbolic.
Two basic parts of grammar are then scrutinized in Part III,
Morphology and Syntax, where, after specifying the basic
notions and processes, the author characterizes the issues
of automatic word form recognition and presents a procedure
for a morphological analysis of English. Within the
left-associative approach, the syntactic notions of valency
and agreement are then discussed, along with (syntactic
functions of) word order. A small subsystem of English
grammar is then formulated and later extended, step by
step, to cover e.g. cases of free word order (in a
comparison of English with German, Chapter IV, pp. 310ff),
complex noun phrases, complex verbs (328ff), interrogatives,
and so on.
In Part IV, Semantics and Pragmatics, the author compares
the semantics of languages of three kinds - logical,
programming and natural. Different types of semantics
are then examined (from Tarskian logical semantics to
systems accounting specifically for intensional contexts,
propositional attitudes and vagueness), and their
relationships to different ontologies are found to yield
different empirical results. After having found that
programming languages cannot be based on a
metalanguage-dependent Tarski semantics, the author
concludes that Tarski was right when claiming that a
complete analysis of natural languages is in principle
impossible within logical semantics (pp. 383ff). Tarski's
claim was founded on his analysis of the classical Liar
paradox, based on self-reference. However, here the word
"complete" appears to be of specific significance. As the
present reviewer has discussed elsewhere (Sgall 1994),
theoretical linguistics can handle the issues connected
with the Liar sentence by reexamining the restrictions on
the relationship between a sentence and a (Carnapian)
proposition. A systematic analysis of this
problem from the viewpoint of intensional logic can be found
in Tichý (1988:227-233).
The last two chapters of the book (23 and 24 in Part IV)
show how the author surmounts the above-mentioned
difficulties of truth-conditional semantics by means of
his SLIM-based theory. These issues are illustrated here
with the SLIM machines in the roles of both interpreter
and producer of messages. Their states of cognition are
characterized as ten activation states, classified as
recognition (contextual, commented, and
language-controlled), action (contextual,
language-controlled and commented), inference,
interpretation, production of language and cognitive
stillstand. Although the account of different cognitive
layers is very rich, systematic and convincing, the
guarantees of this classification still remain to be
checked, and with them the author's claim that these
machines can communicate freely in natural language.
If they are able to communicate as freely as human beings
can (even if emotionally marked speech and issues of
stylistics are not directly taken into consideration), then
issues of the psychological background of speech (with
the contents of memory and its structure, and so on) should
also be systematically described.
However, computational linguistics and/or computer modelling
of natural language, although offering extremely rich
possibilities of analyzing and illustrating properties of
language, still differs from theoretical linguistics itself
and from a theory of the semantico-pragmatic interpretation
of discourse. A possible handling of the Liar paradox in
the context of truth-conditional semantics has already been
mentioned. The related problem of "propositional" attitudes
can be accounted for, in a similar context, if linguistic
meaning (underlying structure) is distinguished not only
from truth conditions, but also from intension (cf. Peregrin
and Sgall 1998). Further difficulties connected with the
application of Tarskian semantics to natural language might
be overcome if the role of the topic-focus articulation (as
based on contextual boundness or on the opposition between
"given" and "new" information, cf. Hajičová, Partee and
Sgall 1998) is duly reflected as pertinent for linguistic
meaning (rather than just for contextual combinability of
sentences), and if the indistinctness of linguistic meaning
is acknowledged as one of its basic properties, see e.g.
Novák (1993). Truth-conditional semantics certainly has to
be relativized to different contexts, possible worlds or
situations, as can be done if an enriched version of H.
Kamp's Discourse Representation Theory is properly connected
with an account of topic and focus. In such a way the
Tarskian foundations of semantics, safely based on truth
conditions, can still be useful for theoretical linguistics.
These remarks on the complex relationships between
computational and theoretical linguistics should not be
understood as denying that the book under review
presents an extremely rich and highly useful system of a
computer-based approach to semantics, which allows for much
more than just experiments concerning the role of computers
in understanding and producing discourse. The very broad
picture of the structure of natural language, which Hausser's
book offers in a pedagogically appropriate way, makes it
possible to analyze the most diverse layers of natural
language in a way diversified enough to cover phenomena of
any aspect of language structure and to check the results of
these analyses in a fully effective way.
References
Hajičová Eva, Barbara H. Partee and Petr Sgall (1998).
Topic-Focus Articulation, Tripartite Structures, and
Semantic Content. Dordrecht: Kluwer.
Novák Vilém (1993). The alternative mathematical model of
linguistic semantics and pragmatics. New York: Plenum
Publishing Corporation.
Peregrin Jaroslav and Petr Sgall (1998). Meaning and
"propositional attitudes". In: J.J. Jadacki and W.
Strawiński, eds.: In the world of signs. Amsterdam/Atlanta,
GA: Rodopi, 73-80.
Sgall Petr (1994). Meaning, reference and discourse
patterns. In: Ph. Luelsdorff, ed.: The Prague School of
Structural and Functional Linguistics.
Amsterdam/Philadelphia: J. Benjamins, 277-309.
Tichý Pavel (1988). The foundations of Frege's Logic.
Berlin/New York: W. de Gruyter.