By Prof. Gerard Sabah
Langage et Cognition Group
LIMSI - CNRS
BP 133
91403 Orsay cedex
FRANCE
Published in the new journal Cognitive Processing.
Hausser, Roland 1999, Foundations of Computational Linguistics,
Springer-Verlag, Berlin.
Roland Hausser's book addresses the mechanisms of Natural Language
communication with speaking robots of the future. Although it focuses
on language, it also tries to take into account a few multi-modal
aspects of communication. The book has a very strict organisation:
four parts (language theory, grammar theory, morphology and syntax,
semantics and pragmatics), each made up of six chapters, each of
which, in turn, contains five sections.
An Introduction first presents the "SLIM" linguistic theory (meaning
Surface compositional Linear Internal Matching), stating its four
basic principles: surface compositionality (methodological principle),
derivational order's strict linearity relative to time (empirical
principle), utterance interpretation and production analysed as
cognitive processes (ontological principle), reference modelled in
terms of matching an utterance's meaning with context (functional
principle).
The first part (language theory) develops a view of language
illustrated by the description of a robot (Curious by name). This
theory (SLIM) relies upon cognitive semantic primitives and a sign
theory, along with their functional integration within generative and
interpretative processes.
More precisely, Chapter 1 immediately positions the topic within the
Human-Computer Interaction (HCI) paradigm, viewing computational
linguistics as basically pluri-disciplinary. The author distinguishes
two branches of AI (the classic, formal one, and the robotics-based
one, the difference being that, for the latter, the environment
undergoes perpetual, unpredictable changes). Two kinds of Artificial
Intelligence yield two kinds of HCI, to which he adds a third case of
communication, within the Virtual Reality framework. Following the
classic distinction between various levels (phonology, morphology,
lexicon, syntax, semantics and pragmatics), the author, who somehow
mixes computer processing and Internet aspects, presents the
innovative features of present-day publications (from CD-ROM to SGML).
Taking as examples search and indexing mechanisms in databases, along
with automatic translation (presently limited to on-line help, crude
translation or restricted languages), Chapter 2 presents the
essentials of Natural Language Processing, and underlines phenomena
which call for linguistic solutions at the various aforementioned
levels, reminding the reader of the alternative between "smart" and
"solid" solutions (which recalls Tomita's distinction between
"interesting" and "useful" problems; see the Footnote below).
Chapter 3, starting from the connections between understanding and
communication on the one hand, and between understanding and
perception on the other, leads to a proposal for an "iconic"
representation of objects, which, although it is somewhat crude, seems
to be effective in the field at hand. However, I found that the role
of lexical categories is treated in an over-simplified manner, and the
limits of this kind of representation could have been explored more
seriously (it raises such questions as: can all sorts of objects be
described in this way? How does one deal with more or less abstract
entities?).
The reference phenomenon is then considered, using a mechanism of best
match between representations of known objects in the world and
representations derived from linguistic descriptions. This means that
one elaborates on relations between literal meaning, compositionality
and pragmatic aspects. Here, other language theories are presented
(mainly Chomsky's and Grice's), but the author's own theory is also
compared with the traditional semantic theories: those of Frege,
Peirce, Saussure, Bühler, and of Shannon and Weaver, detailing their
formal and methodological foundations, and showing their connections
with the proposed one. This presentation remains very
"reference-oriented", and the study of conditions for success in HCI
could have gone deeper...
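To give the flavour of this best-match mechanism, here is a toy
sketch of my own (all feature names and the scoring rule are invented
for illustration; nothing of the sort appears in the book's
formalism): reference picks out the contextual object whose internal
representation best matches the feature description derived from the
utterance.

```python
# Toy sketch of reference as best match between a linguistic
# description and internally stored object representations.
# Feature names and the scoring rule are invented for illustration.

def match_score(description, obj):
    """Count the features of the description that the object satisfies."""
    return sum(1 for k, v in description.items() if obj.get(k) == v)

def refer(description, context):
    """Return the contextual object that best matches the description."""
    return max(context, key=lambda obj: match_score(description, obj))

# A small "context": objects the agent currently knows about.
context = [
    {"shape": "square", "colour": "red", "size": "small"},
    {"shape": "triangle", "colour": "blue", "size": "large"},
]

# Interpreting "the blue triangle" as a feature description.
print(refer({"shape": "triangle", "colour": "blue"}, context))
```

Even this naive version exposes the questions raised above: the
matching only works as long as the objects admit such feature
descriptions at all.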
Project "CURIOUS" (a virtual robot which continually observes and
analyses its changing environment), presented in Chapter 4, relies
upon the use of a formal language, whose propositions can match states
of the world one-to-one. It can be noted here that such a language
does not necessarily exist, and we regret that so crucial a point is
not more thoroughly examined within a cognitive view of language. As
to the various features of CURIOUS, we also wonder whether they can
be extended to more complex worlds; a positive answer is in no way
certain. Nonetheless, the project seems to be a good validation of
the proposed theory.
While dealing, in a rather cursory way, with Information Theory,
Chapter 5 puts more emphasis on pragmatic aspects (here seen as a
search for connections between an interpretation and a coherent
sub-context), making some fundamental principles of the domain
explicit.
Chapter 6 takes a more semiotic turn. It comes back to the
question of reference and delivers another set of fundamental
principles of pragmatics, via some aspects of pronominal reference,
and of relations between icons and symbols.
The second part (grammar theory) is about the theory of formal
grammars and its methodological, mathematical and computational role
in the description of natural languages. Chapter 7 states empirical
and mathematical problems in the description of languages, as they
arise from questions of substitution and continuation, by developing
basic notions of generative and categorial grammars. As a tentative
remedy, the author proposes Left-Associative Grammars (LAGs), that is,
grammars with a left-associative derivation order, modelling the
time-linear essence of natural languages. Here, even if the rest of
the book argues well for this choice, I find that it could have been
given better credentials. Moreover, definitions (even short ones) of
these grammars should have been given in the first lines of the
chapter, which would make understanding easier for beginners.
The essential distinction is the following: for generative and
categorial grammars, derivations rely upon the concept of
substitution, while in LAGs, they depend on possible
continuations. Chapter 8 compares the
respective complexities of those grammars, using examples from
artificial languages. Chapter 9 develops basic notions of syntactic
parsing, and its relations with the grammars under discussion. Chapter 10 gives
a more formal view of LAGs and illustrates them with examples from
artificial languages, along with some English ones (mainly involving
discontinuous constituents and ungrammatical input).
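The substitution/continuation contrast can be made concrete with a
small sketch of my own (illustrative Python, not the book's
notation): a left-associative derivation consumes the input strictly
word by word, at each step combining the "sentence start" built so
far with the next word, so that the derivation order mirrors the
time-linear order of speech.

```python
# Toy left-associative combination: the "sentence start" built so far
# is combined with the next word, one step per word, strictly left to
# right. The combination rule is an invented illustration.

from functools import reduce

def combine(sentence_start, next_word):
    """One left-associative derivation step."""
    return f"({sentence_start} + {next_word})"

def derive(words):
    """Time-linear derivation: left-associative bracketing of the input."""
    return reduce(combine, words)

print(derive(["Julia", "read", "a", "book"]))
# prints (((Julia + read) + a) + book)
```

The bracketing grows only leftward, which is exactly what
distinguishes such a derivation from a substitution-based one, where
rules may expand any non-terminal anywhere in the tree.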
The next two chapters detail the LAG hierarchy: A-LAGs (covering all
recursive languages), B-LAGs (all context-sensitive languages), and
C-LAGs (all regular and context-free languages, plus some
context-sensitive ones) - the latter being, in their turn, divided
into three subclasses. In
this hierarchy, natural languages are in a low-complexity class,
allowing for parsing in linear time. Using considerations about
ambiguities and their respective complexity, this hierarchy is
compared with that of phrase structure grammar, and the author shows
that they are non-equivalent, being indeed "orthogonal", and that a
LAG yields "better" parses of ambiguities (from my point of view, it
is rather a case of better representations of ambiguities, not
"parses", but this is open to discussion).
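A standard illustration of why this matters is a context-sensitive
language such as a^k b^k c^k, which a C-LAG handles in linear time;
the point can be sketched with a single left-to-right pass using
counters (a hand-written recognizer in that spirit, not the LAG rule
format itself).

```python
# Linear-time, single left-to-right pass recognizer for a^k b^k c^k
# (k >= 1), a context-sensitive language. This mimics the time-linear
# spirit of a C-LAG derivation, not its actual rule notation.

def accepts(s):
    counts = {"a": 0, "b": 0, "c": 0}
    order = "abc"
    phase = 0  # which letter we currently expect to be reading
    for ch in s:
        if ch not in counts:
            return False
        # letters must appear in the order a..b..c with no going back
        while phase < 2 and ch != order[phase]:
            phase += 1
        if ch != order[phase]:
            return False
        counts[ch] += 1
    return counts["a"] == counts["b"] == counts["c"] > 0

print(accepts("aabbcc"))  # True
print(accepts("aabbc"))   # False
```

No phrase-structure grammar of the context-free kind can describe
this language at all, which gives a feel for the "orthogonality" of
the two hierarchies mentioned above.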
The third part (morphology and syntax) focuses on morphological and
syntactic aspects of natural languages. It uses many examples in
English and German, and elaborate comparisons between the two
(particularly on German's free word order as opposed to English's
constrained one), to show possible morphological processes, along with
grammars one can think of implementing. Starting from small examples
(inflexions, neologisms, allomorphs in English and German), the author
systematically augments them, showing more and more intricate phrases,
and derivations that are applied in each case.
These chapters argue for the possibility of using a unified formalism
for morphology, lexicon and syntax within the LAG framework. The
author could perhaps have developed a comprehensive comparison with
more traditional approaches (particularly Winograd-like procedural
parsing, or the declarative method illustrated by Pitrat - the storage
mechanism having, by the way, a close relationship with the latter
one).
Chapter 15 is about a possible use of that morphological analysis in
corpus processing and the connected distributional analyses; Chapter
16 deals with the basic concepts of parsing (valence, agreement and
word order); to these traditional notions it adds the overarching
left-association principle. This is followed by a detailed analysis of
constraints on word order, freer in German, fixed in English (I for
one would have exchanged the two paragraphs, as I find English more
understandable).
Chapter 17 provides us with various concrete examples of English
processing, and Chapter 18 does the same for German, comparing the
examples to the English ones. Again, I was somewhat inconvenienced by
the order of exposition: Section 18.1, which presents general
thoughts about standard parsing processes, would, in my opinion, be
better placed in Chapter 16, as would Section 18.3, which develops
elaborate and interesting parallels between English and German verb
positions. This minor flaw in the text order notwithstanding, these
various elements are excellent illustrations of the observed
phenomena.
Finally, the fourth part (semantics and pragmatics) deals (as you
would expect) with semantics and pragmatics. Chapter 19 develops the
fundamental contrasts between three kinds of semantics: of formal
logic languages, of programming languages and
of natural languages. It explores in some depth the possibility of applying logic semantics to natural languages. Intensional contexts, propositional attitudes and the vagueness phenomenon make it clear that different semantics rely upon different ontologies (as illustrated by various classical paradoxes, stated in Chapter 20, and by the difference between absolute and contingent propositions in Chapter 21).
The author then delivers a rather philosophical consideration of how a
semantic interpretation in natural languages leads to increasing
complexity, and tells us how this can be avoided in the SLIM
theory. The last two chapters show detailed examples of
representations at various levels of the "SLIM machine", respectively
in the receiving and in the generating stance. Here again, the
approach is well illustrated.
Viewed from the realm of Cognitive Science, this book, which aims to
span many disciplines, falls short in some significant
regards. Nothing is said about the neurobiology of language, nothing
again on learning, on linguistic knowledge acquisition, and very
little on "error processing" (non-grammatical input, typos, syntax
errors) which does represent a fundamental aspect of HCI. Still, we
have here quite a substantive book, whose short chapters make for
pleasant reading and come complete with relevant exercises. As it
totals 534 pages in its present state, we can hardly blame it for not
taking every topic into account! Despite our minor criticisms, stated
in the above paragraphs, this book is a rich source of learning. It
clearly describes the basic elements of automatic natural language
processing, along with fundamental issues of the domain. On such
bases, it proposes an original and interesting grammatical theory, and
shows possible applications of it.
The mere length of the present review (even if it is somewhat critical in places) speaks in itself for the interest I felt in reading this book!
Gérard Sabah
----------------------
Footnote: "Interesting" problems have no obvious solution; they call
for the development of sophisticated theories, or make it possible to
exhibit general linguistic principles, and they can be treated by a
small number of general rules (e.g. relativisation, causation, group
movement). In
contrast, "useful" problems are those for which obvious solutions
exist, or those for which no general principle can be exhibited; one
just has to add specific rules to solve each one of those problems
(punctuation, expression of dates, idioms). The latter kind of problem
is difficult to anticipate, and requires very large amounts of
knowledge. Note that it is just as essential to solve both kinds of
problems if one wants working applications! However, if one is mainly
interested in theoretical aspects, it will be possible to neglect
problems of the second kind, for which simple solutions potentially
exist, but would be costly to implement. Moreover, the approach of
some linguists who are constructing a complete catalogue of all
figures of discourse (Gross, Mel'cuk) implies that all problems are of
the second kind.