Automated Collection and Sharing of Adaptive Amino Acid Changes Data

  1. Noé Vázquez
  2. Cristina P. Vieira
  3. Bárbara S. R. Amorim
  4. André Torres
  5. Hugo López-Fernández
  6. Florentino Fdez-Riverola
  7. José L.R. Sousa
  8. Miguel Reboiro-Jato
  9. Jorge Vieira
Book:
11th International Conference on Practical Applications of Computational Biology & Bioinformatics
  1. Fernández Riverola, Florentino (ed.)

Publisher: Springer Switzerland

ISBN: 978-3-319-60815-0

Year of publication: 2017

Pages: 18-25

Conference: Practical Applications of Computational Biology & Bioinformatics (PACBB). International Conference (11th, 2017)

Type: Conference contribution

Abstract

When changes at a few amino acid sites are the target of selection, adaptive amino acid changes in protein sequences can be identified using maximum-likelihood methods based on models of codon substitution (such as codeml). Such methods have been applied many times to a variety of organisms, but the time needed to collect the data and prepare the input files means that usually only tens or a couple of hundred coding regions are analyzed. The recent availability of flexible and easy-to-use computer applications to collect the relevant data (such as BDBM) and to infer positively selected amino acid sites (such as ADOPS) makes the whole process easier and quicker than before. Until now, however, ADOPS lacked a batch option, which precluded the analysis of hundreds or thousands of sequence files; that option is reported here. Given the interest in, and the feasibility of, running such large-scale projects, we also developed a database where ADOPS projects can be stored. We therefore also present B+, which is both a data repository and a convenient interface for viewing the information contained in ADOPS projects without the need to download and unzip the corresponding ADOPS project file. The ADOPS projects available at B+ can also be downloaded, unzipped, and opened using the ADOPS graphical interface. The availability of such a database ensures the repeatability of results, promotes data reuse with significant savings in the time needed to prepare datasets, and allows effortless further exploration of the data contained in ADOPS projects.