Roland Hausser.
Foundations of computational linguistics:
man-machine communication in natural language.
Berlin: Springer. 1999. Pp. xii+534.
Reviewed by RUTH KEMPSON, King's College London.
Hausser 1999 sets out a detailed case for the view that all aspects of
language - language-processing, language-production, even the grammar
formalism itself - are strictly `time linear', that is, reflect
processing in real time, a view which, if it can be sustained, involves
a radical shift in our concepts of language, linguistic knowledge and
the relation between language and language use. The central part of
the book defines a `Left-Associative Grammar', this being a grammar
formalism which generates strings on a strictly left-right basis. The
evidence presented for such a model is of two major types. On the one
hand, Hausser argues that the complexity results it makes available
are a substantial advance on all other formalisms, whether
phrase-structure grammar, transformational grammar or categorial
grammar. Secondly, he sets out in detail grammars for fragments of
English and German, as a display of how it can be used to analyse both
fixed and free word order languages, with a detailed account in
addition of German morphology.
These central chapters are by no means the sole objective of the
book, however. Hausser argues that all grammar formalisms should
be nested within a comprehensive theory of cognition which encompasses
both linguistic and nonlinguistic action. Furthermore he argues that a
criterion for evaluating grammar formalisms is that grammar, parser
(modelling processing) and generator (modelling production) conform to
the strongest form of type transparency (chapter 9 section 3), namely
that both parser and generator use rules of the grammar directly and
in the same order as articulated in the grammar. Bravely, he sets out
to meet this challenge by devising a general computational model of
communication (presented in detail in part IV). He defines a
computational system which reflects a nonlinguistic concept of context
(chapter 22); he incorporates a parser of a left-associative grammar
formalism, with a mapping of the strings generated by the grammar onto
structured objects representing thoughts and their representation in
some containing context database (chapter 23); and he also defines a
generator (chapter 24) which is a reverse mapping with `autonomous
navigation through the propositions of the contextual word
bank ... simultaneously put into contextual action and into words'
(453). And all these are set out against a new approach to pragmatics
as background, with seven pragmatic principles, with an accompanying
theory of signs (chapters 5-6).
The most successful parts of the book are the central parts setting
out the formalism, and demonstrating the mathematical and linguistic
results. Hausser provides a clear and devastating critique of
orthodox constituent-based phrase-structure grammar and categorial
grammar formalisms on the grounds of their undecidability (part II),
and provides proofs (chapter 11 section 1) that a left-associative grammar
formalism generates all and only the recursive languages (in marked
contrast to phrase-structure and categorial formalisms), and thus in
principle is able to provide a complete characterisation of formal
languages and, by extension, natural languages, subject to the
constraint that every operation posited within a left-associative
formalism adds only finite complexity to the core
left-associative grammar format. The type-transparency between
grammar and parser then ensures that the impressively low complexity
results of the grammar carry over to the parser. On the basis of this
left-associative grammar formalism, Hausser defines a new hierarchy of
languages, giving rise to new complexity results (chapter 12). The
mathematical results obtained are substantial, and a major challenge
to grammar formalisms based on a concept of constituent structure and
substitution. As Hausser points out, whether these formal results can
be sustained in application to natural language depends on there being
alternative analyses of data purporting to show the necessity of
levels of complexity in natural language well above those defined in
the hierarchy Hausser presents. For example he sidesteps the normally
recognised observation that natural languages are of at least
exponential complexity as supposedly demonstrated by the systematic
ambiguity of postposed prepositional phrases as either postnominal or
adverbial modifiers (as in The man saw the girl with the telescope) by
providing an analysis purely in terms of adjacency (236), suggesting
that the ambiguity is not structural, but merely semantic/pragmatic.
He then goes on in part III to set out detailed grammars for fragments
of English and German. These are of very considerable interest in
their own right, displaying both the elegance of such grammars in
certain respects, and the extent to which
the lack of invocation of structure necessitates disjunctive statements.
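The exponential-ambiguity observation mentioned above can be made concrete with a small count. The sketch below assumes, following the usual constituent-structure illustration rather than anything stated in Hausser's text, that a verb and object followed by n postposed prepositional phrases has as many attachment readings as the Catalan number C(n+1). The function names are hypothetical.

```python
from math import comb


def catalan(n: int) -> int:
    """Return the n-th Catalan number, C(n) = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def attachment_readings(num_pps: int) -> int:
    """Count attachment readings for `num_pps` postposed PPs.

    Assumption (illustrative, not from the book under review): a verb,
    an object noun phrase and n trailing PPs have Catalan(n + 1) readings.
    """
    return catalan(num_pps + 1)


if __name__ == "__main__":
    # Print the number of readings for one to five trailing PPs.
    for n in range(1, 6):
        print(n, attachment_readings(n))
```

With one PP, as in The man saw the girl with the telescope, the count is 2, and it rises to 5, 14, 42 and 132 for two to five PPs. An adjacency-based analysis of the kind Hausser proposes would instead assign a single syntactic analysis and leave the choice of reading to semantics/pragmatics.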
Unlike phrase-structure grammars, which are based on substitution of
one constituent type by some other, left-associative grammars are
based on the principle of possible continuations, using a concept of
category covering every possible sequence of expressions. Beginning
with the first word, the grammar describes possible continuations for
each resulting category, called a new sentence-start. The rule format
is a transition from one sequence of words to another for each rule r,
together with an associated rule package. Elementary categories are
of two sorts: X, X' where X' is a requirement for a sequence of
expressions of category X, with an associated operation that cancels
out the category X' in the presence of a category X. Individual
categories can then be constructed as a sequence of other categories:
so for example, a transitive verb is of the category (N' A' V), being
a category which needs a nominative-marked sequence and an
accusative-marked sequence to yield a sequence of category V, to wit a
sentence. A simple example of an elegant solution provided by this
form of grammar is its ability to characterise languages in which
there is relatively free constituent order with a fixed verb
position. Allowing variables in the description and recursive
application of any given rule package, the system can express
straightforwardly statements such as the first constituent must be an
NP but once there is an NP and the verb is next then a sequence of NPs
may follow (chapter 16 section 5).
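The continuation-and-cancellation idea just described can be sketched in a few lines. This is a toy illustration under stated assumptions, not Hausser's own rule system: the lexicon, the tuple encoding of categories and the functions `cancel`, `combine` and `parse` are all hypothetical, and rule packages are left out entirely.

```python
# Toy illustration of left-associative combination with valency cancelling.
# Categories are tuples of segments; a segment ending in "'" is a requirement.
# The lexicon and function names are hypothetical, not Hausser's own rules.

LEXICON = {
    "Mary": ("N",),            # nominative-marked filler
    "him": ("A",),             # accusative-marked filler
    "saw": ("N'", "A'", "V"),  # transitive verb: needs N and A, yields V
}


def cancel(category, filler):
    """Remove the first requirement X' matched by filler segment X."""
    slot = filler + "'"
    i = category.index(slot)  # raises ValueError if no such requirement
    return category[:i] + category[i + 1:]


def combine(sentence_start, next_word):
    """Combine the category of the sentence start with that of the next word."""
    # Case 1: the sentence start is a bare filler and the next word needs it.
    if len(sentence_start) == 1 and sentence_start[0] + "'" in next_word:
        return cancel(next_word, sentence_start[0])
    # Case 2: the sentence start still has a requirement the next word fills.
    if len(next_word) == 1 and next_word[0] + "'" in sentence_start:
        return cancel(sentence_start, next_word[0])
    raise ValueError(f"no continuation from {sentence_start} with {next_word}")


def parse(words):
    """Process words strictly left to right, returning the final category."""
    category = LEXICON[words[0]]
    for word in words[1:]:
        category = combine(category, LEXICON[word])
    return category


if __name__ == "__main__":
    print(parse(["Mary", "saw", "him"]))
```

After "Mary saw" the sentence start has category (A' V); adding "him" cancels A' and leaves (V), a sentence. Because the sketch has no rule packages, it does not restrict which continuation may follow which, so it says nothing about word-order constraints.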
Despite the fact that the formalism generates words in left-right
sequence, it has a number of mechanisms for handling discontinuity
effects in natural language (introduced as part of the detailed
application to English and German in part III):
(i) the possible rule packages made available at each stage (central
to a left-associative grammar);
(ii) the checking off of any imposed category requirements (the
primary device for capturing discontinuity effects);
(iii) concatenating required categories in a list (358) so that
noncontiguous x' and y' can be combined as x' o y' and satisfied
together (used for German Mittelfeld constituent-order variation);
(iv) manipulating the method of adding a category to a sequence of
categories at a fixed point in that sequence, e.g. to ensure checking
of clausal adverbials identically whether that clausal sequence
precedes or follows the verb (362);
(v) a linearisation device specific to generation which ensures that
from some subpart of a semantic structure (484), the process of
linearisation can return to that subpart having generated some
subordinate sequence (used to define a linearisation procedure for
relative clauses -- 488).
Hausser claims that these do not involve more than finite extensions
of the core left-associative grammar, and sentences generated by such
grammars remain parsable in linear time. Hausser's complexity results
turn on the fact that all grammars that generate the required
string-sets are defined only as inducing operations upon strings:
there is no pairing of strings with structures defined over them. What
is less clear however is whether the system lacks any concept of
syntactic structure, terminology aside. As Hausser points out, at the
level of interpretation, a tree structure configuration is built, a
level arguably also essential to a characterisation of structural
properties of the language, for example in addressing the adjunct
attachment problem. Moreover this level has to be invoked in
production as a language-specific pragmatic level, mapping structure
in the context word base onto a linearised configuration reflecting
word order and relative-pronoun choices (484-488), and such a level
provides an essential part of the characterisation of individual
language-particular properties of relative clauses. But if this is
so, the issue of parsability of natural languages in real time, as a
reflection of properties of the grammar formalism, turns on whether
the concept of parsing for complexity results defined exclusively over
string sets is the same as that associated with parsing for the
purpose of pairing such strings with intended interpretations. For if
it is not, the significance of the complexity results Hausser
establishes for linguistic theory in general becomes much less clear.
Inevitably in such an ambitious book, some sections are much more
successful than others, and in my view, the setting out of a novel
pragmatics covering both linguistic and nonlinguistic actions, and of
language processing within that, is very much less successful than the
sections on formal properties of left-associative grammars. Here the
book suffers from apparently having been written in a vacuum, making
only token reference to two decades of relevant work. Despite the fact
that he is modelling a process of how language is interpreted in
context, Hausser makes no reference to work on the context-dependency
of natural language interpretation done within Situation Theory
(Barwise & Perry 1983), Discourse Representation Theory (Kamp & Reyle
1993), or Dynamic Predicate Logic (Groenendijk & Stokhof 1991) other
than one cursory footnote (400, n.12), even when the concepts he
defines are close to competing frameworks. To give one example, though
he provides a formal way of differentiating what he calls M-concepts
which are context-neutral, from I-concepts which are relativised to
individual contexts he does not draw out the striking parallels
between this and the concept of (parametrised) infon developed in
situation theory by Barwise, Perry and others (see Barwise & Perry
1983). He makes no reference to the more recent work in pragmatics
(Relevance Theory - see Sperber & Wilson 1995) and AI (Centring Theory
- see Walker et al. 1998 for a representative collection). He makes no
reference to parsing work in the computational linguistics field, e.g.
the work on D-Tree Grammars of Marcus and colleagues (Marcus 1980 and
subsequently), and only the most minimal reference to psycholinguistic
work on parsing or production (400). Moreover, the assumptions he
makes about concepts and their one to one correspondence with lexical
items are essentially identical to those of Fodor (first set out in
Fodor 1981, 1983, but more recently in Fodor 1998), but none of the
debate between Fodor and others in the philosophy of psychology in
this connection receives even a passing mention (see Fodor & Lepore
1992 for an evaluation of the state of the art in this area). The
problem with having set aside all such work virtually without comment
is that Hausser fails to address the central background problem to
which much of that work has been directed, namely that linguistic
content systematically under-determines interpretation in context, of
which it provides but a partial specification. He takes as his
starting point a Buehler metaphor of language as a tool (91-3),
comparing a sentence and its relation to interpretation in context as
a best-match analysis parallel to the use of a tool in a nonlinguistic
action. Using a screwdriver devised for screwing and loosening screws
for some action of stirring one's tea is, he suggests, a use that is
available for a screwdriver only if there is no better match between
tool and action in the context, such as provided by a spoon. This
characterisation of a tool and its extended uses is applied to natural
language, though, unlike Buehler, with a cognitive construal. An
expression is said to be interpreted nonliterally only if no better
match is available. This concept of a language as a tool, however, is
of limited applicability, for it fails to bring out the gap between
linguistic content and interpretation in context: unlike the case of
language, there is no sense in which a screwdriver has intrinsic
content that provides partial determination both of its role in
tightening or loosening screws and of its role in stirring one's tea. Following up on
this tool metaphor, he takes Shannon-Weaver's information theory as
his point of departure (91), but this is a code model based on the
assumption of transfer of some thought by an agent, suitably encoded,
to the hearer, a view of language which for good reason is no longer
held by others, in particular because it fails to allow either
expression of the gap between signal and interpretation or the lack of
certainty in the interpretation process (see Sperber & Wilson 1995
for extensive criticism). More than this, Hausser claims, at least in
principle, that production of language simply involves the mapping
between string and semantic structure defined in parsing set in
reverse: "in production, the elementary signs follow the time-linear
order of the underlying thought path while in interpretation the
thought path follows the time-linear order of the incoming elementary
signs" (98). Indeed he sets up a computational system which does
precisely this, relative to a suitably constrained database as
context. Despite Hausser's distinction between M-concept and I-concept
within an explicitly cognitive account (70-72), he places very little
emphasis on the mapping of an M-concept onto an associated I-concept,
rendering this distinction almost trivial, so that the difficulty in
sustaining generation as the inverse of parsing is not brought
out. Indeed, the one instance of anaphoric connection at which this
essential gap might be addressed is said by Hausser to be established
at the level of semantic structure as part of the projection of the
string onto a sequence of M-concepts (464). Hausser says (93-95) all
utterances are interpreted from a STAR-point (S - space, T - time, A -
agent, R - recipient), and claims that evaluation is invariably
relative to these contextually provided values, the star point
regulating reference to data structures already present. But
context-dependence extends well beyond these four
parameters, affecting tense, pronouns, ellipsis and scope construal, and
requiring a much more general analysis of how particular
interpretations are established in processing, or realised in
production. Production, in particular, cannot simply involve some
sequence of actions in a reverse direction. At the very least, it
involves decisions in mapping a fully specified thought onto some linear
sequence, with critical choices to be made for every aspect of
linguistic content where there is not a full matching between
grammar-internal specification of content and context-dependent
values. Even from his own formulation, the transition from some
thought to the linearisation of a string involves decisions about word
order in a so-called language-specific pragmatics module (484) that go
well beyond the steps which the LA-grammar articulates as the steps of
parsing which a hearer has to entertain in retrieving the appropriate
propositional content.
Such invocation of language-specific pragmatics buttresses the worry
that Hausser's use of familiar terms has become stretched beyond the
point for which they remain suitable. Syntax is defined to be
generation of strings, with no concept of structure. Semantics is
defined to involve the projection of structure. Pragmatics is taken to
determine the mapping between semantic structure and linear word order
in ways that are specific to individual languages (488). Moreover, it
involves classifications of expressions into discrete categories (345,
489): yet neither of these phenomena falls within the remit of
pragmatics, given the conventional assumption that pragmatics concerns
the general nonlinguistic constraints underpinning communication and
their interaction with linguistic input in communication. In order to
evaluate the strength of Hausser's claims about the essential
left-right dynamics of natural language, and the abandonment of all
concepts of constituent structure, one needs to have clear statements
about the nature of tree structure representations in the semantic
vocabulary, the lack of relevance of these to the complexity results,
and the nature of the pragmatic mapping that determines the
correspondence between these and the linear order of words in a
string; but these are lacking.
The book is presented as a textbook with exercises checking
comprehension at the end of each chapter, but it is unlikely to be
successful as such, veering as it does between mathematical results of
considerable complexity, low-level linguistic introductions, and
solutions to philosophical issues which are extremely naïve, based on
uninsightful feature-based classifications (chapters 20-21). Moreover,
as already itemised, it is disappointing in a book purporting to be a
textbook that no attempt is made to set the account against the
pragmatic/semantic/psychological/computational background that has
been developing concurrently with the development of the proposed
analysis, so that the reader is given access to alternative approaches
with which to evaluate Hausser's analysis. Furthermore, there is no
introduction to the process of devising a computational parser, no
assignment of problem sets with provided solutions, or any of
the other normal accoutrements familiar in computational linguistic
(or other) textbooks. This is surprising, given the availability of
implementations of this formalism, and the interested reader is
strongly encouraged to access:
http://www.linguistik.uni-erlangen.de/Uebungen.html for programming
exercises accompanying the four parts of the book, with sample
solutions.
Overall then, the book is both provocative and provoking. Though the
attempt by Hausser to establish a general framework for cognition is
not in my view a success, the substantive claim that natural language
grammar formalisms are time linear is a claim now receiving increasing
recognition (see Tugwell 1999, Kempson et al. 2000), and it is
Hausser's major contribution to the field to have been the first to
give this hypothesis detailed formal substance. Furthermore,
notwithstanding the only partial success of the larger cognitive
enterprise which Hausser articulates, it is clear that if the
consequences of adopting strict type correspondence between grammar,
parser and generator are followed through, then some such novel
philosophy of language and mind will have to be articulated (as also
urged by Tugwell and Kempson et al.), departing as it does from
orthodox assumptions of the complete separation of linguistic
knowledge in the form of a grammar and any implementation of it. So
the attempt by Hausser to articulate such a global view is to be
applauded for its courage, and for the provision of a starting point
for others to develop. In the meantime, setting aside this attempt at
a general computational model of cognition, the formal results
involving left-associative grammars and the application to English and
German fragments are of very considerable general interest, and well
worth serious consideration by linguists. Indeed, the formal results
achieved present a major challenge to linguists working in other
orthodoxies.
Author's address:
Philosophy Department,
King's College London,
The Strand,
London, WC2R 2LS,
U.K.
E-mail: ruth.kempson@kcl.ac.uk
REFERENCES
Barwise, J. & Perry, J. (1983). Situations and attitudes. Cambridge,
Mass: MIT Press.
Buehler, K. (1934). Sprachtheorie: die Darstellungsfunktion der
Sprache. Stuttgart: Fischer.
Fodor, J.A. (1981). Representations. Cambridge, Mass: MIT Press.
Fodor, J.A. (1983). Modularity of mind. Cambridge, Mass: MIT Press.
Fodor, J.A. (1998). Concepts: where cognitive science went
wrong. Oxford: Oxford University Press.
Fodor, J.A. & Lepore, E. (1992). Holism: a shopper's
guide. Oxford: Blackwell.
Groenendijk, J. & Stokhof, M. (1991). Dynamic predicate
logic. Linguistics and Philosophy 14, 39-100.
Kamp, H. & Reyle, U. (1993). From discourse to logic. Dordrecht:
Kluwer.
Kempson, R.M., Meyer-Viol, W. & Gabbay, D. (2000). Dynamic syntax: the
flow of language understanding. Oxford: Blackwell.
Marcus, M. (1980). A theory of syntactic recognition for natural
language. Cambridge, Mass: MIT Press.
Sperber, D. & Wilson, D. (1995). Relevance: communication and
cognition (2nd edn). Oxford: Blackwell.
Tugwell, D. (1999). Dynamic syntax. Ph.D. dissertation. Edinburgh
University.
Walker, M., Joshi, A. & Prince, E. (1998). Centering theory in
discourse. Oxford: Clarendon Press.