Auto-phylo: a pipeline maker for phylogenetic studies

  1. Hugo López-Fernández (2, 6)
  2. Miguel Pinto (1, 3)
  3. Cristina P. Vieira (3, 4)
  4. Pedro Duque (1, 3, 4, 5)
  5. Miguel Reboiro-Jato (2, 6)
  6. Jorge Vieira (3, 4)
  1. Faculdade de Ciências da Universidade do Porto (FCUP), Rua do Campo Alegre, S/N, 4169-007, Porto, Portugal
  2. CINBIO, Department of Computer Science, ESEI - Escuela Superior de Ingeniería Informática, Universidade de Vigo, 32004, Ourense, Spain
  3. Instituto de Investigação e Inovação em Saúde (I3S), Universidade do Porto, Rua Alfredo Allen, 208, 4200-135, Porto, Portugal
  4. Instituto de Biologia Molecular e Celular (IBMC), Rua Alfredo Allen, 208, 4200-135, Porto, Portugal
  5. School of Medicine and Biomedical Sciences (ICBAS), Porto University, Rua de Jorge Viterbo Ferreira, 228, 4050-313, Porto, Portugal
  6. SING Research Group, Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, 36213, Vigo, Spain
Book:
Practical applications of computational biology and bioinformatics, 17th International Conference (PACBB 2023)
  1. Miguel Rocha (ed.)
  2. Florentino Fdez-Riverola (ed.)
  3. Mohd Saberi Mohamad (ed.)
  4. Ana Belén Gil-González (ed.)

Publisher: Springer Switzerland

ISBN: 978-3-031-38079-2, 978-3-031-38078-5

Year of publication: 2023

Pages: 24-33

Congress: Practical Applications of Computational Biology & Bioinformatics (PACBB). International Conference (17. 2023. Miño)

Type: Conference paper

Abstract

Inferring the evolutionary history of a gene can show whether findings made for that gene in one species can be extrapolated to other species, including humans. Such inferences can also help explain morphological evolution or account for unexpected results in gene expression suppression experiments, among other uses. The large amount of sequence data already available, which is predicted to increase dramatically in the next few years, means that life science researchers need efficient, automated ways of analyzing it. Moreover, especially when dealing with divergent sequences, inferences can be affected by the chosen alignment and tree-building algorithms, so the same dataset should be analyzed in several ways, which reinforces the need for efficient automation. We therefore present auto-phylo, a simple pipeline maker for phylogenetic studies, and provide two examples of its utility. The first uses a small, already formatted sequence dataset (41 CDS) to determine, in an automated way, the impact of using different alignment and tree-building algorithms. The second involves the automated identification and processing of the sequences of interest, starting from 16,550 bacterial CDS FASTA files downloaded from the NCBI Assembly RefSeq database, followed by alignment and tree-building inferences.