Hierarchical phrase-based translation with weighted finite-state transducers

  1. Iglesias Iglesias, Gonzalo José
Supervisor:
  1. Eduardo Rodríguez Banga Supervisor

University of defense: Universidade de Vigo

Date of defense: 25 March 2010

Committee:
  1. José Bernardo Mariño Acebal Chair
  2. Leandro Rodríguez Liñares Secretary
  3. David Cabrero Member
  4. Carmen García Mateo Member
Department:
  1. Teoría do sinal e comunicacións

Type: Thesis

Teseo: 309981 DIALNET

Abstract

This dissertation focuses on the field of Statistical Machine Translation (SMT), particularly on hierarchical phrase-based translation frameworks. We first study and redesign hierarchical models using several filtering techniques. Hierarchical search spaces are based on automatically extracted translation rules. As originally defined, they are too large to handle directly without filtering. In this thesis we create more space-efficient models, aiming at faster decoding times without a cost in performance. We propose more refined strategies such as pattern filtering and shallow-N grammars. The aim is to reduce the search space a priori as much as possible without losing performance (or even while improving it), so that search errors are avoided. We also propose a new algorithm within the hierarchical phrase-based machine translation framework, called HiFST. For the first time, as far as we are aware, an SMT system successfully combines knowledge from two other research areas simultaneously: parsing, and the compact representation and powerful semiring operations of weighted finite-state transducers. Combined with our findings for hierarchical grammars, we are able to build search-error-free translation systems with state-of-the-art performance.

Keywords: HiFST, SMT, hierarchical phrase-based decoding, parsing, CYK, WFSTs, transducers.
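The semiring operations mentioned in the abstract can be illustrated with a small sketch. It is not the HiFST implementation and uses no transducer library: each "lattice" is a plain Python dict from target strings to costs, and the function names (`union`, `concat`, `add_cost`, `best`), the toy phrases, and the rule costs are all assumptions made for illustration. It only shows how, in the tropical semiring (costs are added along a path, the minimum is taken across alternatives), the translations of two sub-spans could be combined by two hierarchical rules into one parser cell without pruning.

```python
from itertools import product

INF = float("inf")

def union(a, b):
    """Merge two hypothesis sets; a string present in both keeps the lower cost (tropical 'plus' = min)."""
    out = dict(a)
    for s, w in b.items():
        out[s] = min(w, out.get(s, INF))
    return out

def concat(a, b):
    """Concatenate every string of a with every string of b; costs add (tropical 'times' = +)."""
    out = {}
    for (s1, w1), (s2, w2) in product(a.items(), b.items()):
        s = f"{s1} {s2}"
        out[s] = min(w1 + w2, out.get(s, INF))
    return out

def add_cost(a, w):
    """Apply a (hypothetical) rule cost w to every hypothesis in a."""
    return {s: c + w for s, c in a.items()}

def best(a):
    """Return the lowest-cost (string, cost) pair, i.e. the shortest path."""
    return min(a.items(), key=lambda kv: kv[1])

# Toy translations of two adjacent source sub-spans (made-up costs).
x1 = {"house": 1.0, "home": 2.5}
x2 = {"white": 0.5, "blank": 2.0}

# Two made-up hierarchical rules over the same span:
#   monotone:  X -> <X1 X2, X1 X2>   cost 0.5
#   swapping:  X -> <X1 X2, X2 X1>   cost 0.25
monotone = add_cost(concat(x1, x2), 0.5)
swapped = add_cost(concat(x2, x1), 0.25)

# The cell keeps all hypotheses from both rules; nothing is pruned.
cell = union(monotone, swapped)
print(best(cell))
```

Because every hypothesis from both rules survives in `cell`, picking the minimum at the end is exact for this toy space, which is the sense in which keeping full lattices can avoid search errors.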