Hierarchical phrase-based translation with weighted finite-state transducers

  1. Iglesias Iglesias, Gonzalo José
Supervised by:
  1. Eduardo Rodríguez Banga Director

Defense university: Universidade de Vigo

Defense date: 25 March 2010

Examination committee:
  1. José Bernardo Mariño Acebal, Chair
  2. Leandro Rodríguez Liñares, Secretary
  3. David Cabrero, Member
  4. Carmen García Mateo, Member
Department:
  1. Teoría do sinal e comunicacións (Signal Theory and Communications)

Type: Thesis

Teseo: 309981 DIALNET

Abstract

This dissertation focuses on the field of Statistical Machine Translation (SMT), particularly on hierarchical phrase-based translation frameworks. We first study and redesign hierarchical models using several filtering techniques. Hierarchical search spaces are based on automatically extracted translation rules. As originally defined, they are too large to handle directly without filtering. In this thesis we create more space-efficient models, aiming at faster decoding times without a cost in performance. We propose more refined strategies such as pattern filtering and shallow-N grammars. The aim is to reduce the search space a priori as much as possible without losing performance (or even while improving it), so that search errors will be avoided. We also propose a new algorithm in the hierarchical phrase-based machine translation framework, called HiFST. For the first time, as far as we are aware, an SMT system successfully combines knowledge from two other research areas simultaneously: parsing, and the compact representation and powerful semiring operations of weighted finite-state transducers. Combined with our findings for hierarchical grammars, we are able to build search-error-free translation systems with state-of-the-art performance.

Keywords: HiFST, SMT, hierarchical phrase-based decoding, parsing, CYK, WFSTs, transducers.