Interpretación tabular de autómatas para lenguajes de adjunción de árboles

Alonso, Miguel Á.

Interpretación tabular de autómatas para lenguajes de adjunción de árboles

Alonso, Miguel Á.

Dirixida por:

Manuel Vilares Ferro Director
Eric Villemonte de la Clercerie Director

Universidade de defensa: Universidade da Coruña

Fecha de defensa: 25 de setembro de 2000

Tribunal:

Josep Miro Presidente/a
Antonio Blanco Ferro Secretario/a
José Mira Mira Vogal
Pierre Boullier Vogal
Mark-Jan Nedermof Vogal

Tipo: Tese

Teseo: 80145 DIALNET RUC editor

Resumo

Tree adjoining grammars are an extension of context-free grammars that use trees instead of productions as the primary representing structure and that are considered to be adequate to describe most of syntactic phenomena occurring in natural languages. These grammars generate the class of tree adjoining languages, which is equivalent to the class of languages generated by linear indexed grammars and other mildly context-sensitive formalisms. In the first part of this dissertation, we introduce the problem of parsing tree adjoining grammars and linear indexed grammars, creating, for both formalisms, a continuum from simple pure bottom-up algorithms to complex predictive algorithms and showing what transformations must be applied to each one in order to obtain the next one in the continuum. In the second part, we define several models of automata that accept the class of tree adjoining languages, proposing techniques for their efficient execution. The use of automata for parsing is interesting because they allow us to separate the problem of the definition of parsing algorithms from the problem of their execution. We have considered the following types of automata: • Top-down and bottom-up embedded push-down automata, two extensions of push-down automata working on nested stacks. A new definition is provided in which the finite-state control has been eliminated and several kinds of normalized transition have been defined, preserving the equivalence with tree adjoining languages. • Logical push-down automata restricted to the case of tree adjoining languages. Depending on the set of allowed transitions, we obtain three different types of automata. • Linear indexed automata, left-oriented and right-oriented to describe parsing strategies in which adjuntions are recognized top-down and bottom-up, respectively, and stronglydriven to define parsing strategies recognizing adjunctions top-down and/or bottom-up. • 2-stack automata, an extension of push-down automata working on a pair of stacks, a master stack driving the parsing process and an auxiliary stack restricting the set of transitions that can be applied at a given moment. Strongly-driven 2-stack automata can be used to describe bottom-up, top-down or mixed parsing strategies for tree adjoining languages with respect to the recognition of the adjunctions. Bottom-up 2-stack automata are specifically designed for parsing strategies recognizing adjunctions bottom-up. Compilation schemata for these models of automata have been defined. A compilation schema allow us to obtain the set of transitions corresponding to the implementation of a^ parsing strategy for a given grammar. All the presented automata can be executed in polynomial time with respect to the length of the input string by applying tabulation techniques. A tabular technique makes possible to interpret an automaton by means of the manipulation of collapsed representation of configurations (called items) instead of actual configurations. Items are stored into a table in order to be reused, avoiding redundant computations. Finally, we have studied the relations among the diíferent classes of automata, the main dif%rence being the storage structure used: embedded stacks, indices lists or coupled stacks. According to the strategies that can be implemented, we can distinguish three kinds of automata: bottom-up automata, including bottom-up embedded push-down automata, bottomup restricted logic push-down automata, right-oriented linear indexed automata and bottom-up 2-stack automata; top-down automata, including (top-down) embedded push-down automata, top-down restricted logic push-down automata and left-oriented linear indexed automata; and general automata, including strongly-driven linear indexed automata and strongly-driven 2- stack automata.