Biological phenomena and technological artifacts
Ante Turudić (Author), Zlatko Liber (Author), Martina Grdiša (Author), Jernej Jakše (Author), Filip Varga (Author), Zlatko Šatović (Author)

Abstract

The development of bioinformatic solutions is guided by biological knowledge of the subject. In some cases we can use unambiguous biological models, while in others we must rely on assumptions. A commonly used assumption is that related species have similar genome sequences. This is expected to hold even more strongly for chloroplast genomes, because they evolve slowly. We investigated whether the lengths of complete chloroplast genome sequences are closely related to the taxonomic proximity of the species. The study used all available RefSeq sequences from the asterid and rosid clades. In general, chloroplast genome length distributions are narrow at both the family and genus levels. For the families and genera that exhibit particularly wide distributions, clear biological explanations have already been reported. The main factors responsible for length variation are parasitic life forms, loss of the inverted repeat (IR), IR expansions and contractions, and polyphyly. However, outliers in the length distribution at the genus level are a strong indication of possible inaccuracies in sequence assembly.
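The idea of screening genus-level length distributions for outliers can be illustrated with a minimal sketch. The abstract does not state which outlier criterion the authors used, so the interquartile-range (Tukey) rule below is an assumption, as are the function name `flag_length_outliers`, its parameters `k` and `min_size`, and the placeholder genera, accessions, and lengths, which are invented for illustration and are not data from the study.

```python
from collections import defaultdict
from statistics import quantiles


def flag_length_outliers(records, k=1.5, min_size=4):
    """Return {genus: [accession, ...]} for sequences whose length lies
    outside [Q1 - k*IQR, Q3 + k*IQR] within their own genus.

    records: iterable of (genus, accession, length_in_bp) tuples.
    The IQR rule and the defaults for k and min_size are assumptions
    made for this sketch, not the criterion used in the article.
    """
    by_genus = defaultdict(list)
    for genus, accession, length in records:
        by_genus[genus].append((accession, length))

    outliers = {}
    for genus, items in by_genus.items():
        # Quartiles are not meaningful for very small groups; skip them.
        if len(items) < min_size:
            continue
        lengths = [length for _, length in items]
        q1, _, q3 = quantiles(lengths, n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - k * iqr, q3 + k * iqr
        flagged = [acc for acc, length in items if length < low or length > high]
        if flagged:
            outliers[genus] = flagged
    return outliers


# Hypothetical placeholder records (not real RefSeq entries).
records = [
    ("GenusA", "A1", 155_000),
    ("GenusA", "A2", 155_400),
    ("GenusA", "A3", 154_800),
    ("GenusA", "A4", 155_200),
    ("GenusA", "A5", 121_000),  # much shorter than its congeners
    ("GenusB", "B1", 160_100),
    ("GenusB", "B2", 160_300),
    ("GenusB", "B3", 159_900),
    ("GenusB", "B4", 160_200),
]

print(flag_length_outliers(records))
```

A flagged sequence is only a candidate for closer inspection: as the abstract notes, wide length distributions can also have biological causes such as parasitism or IR loss, so a flag alone does not establish an assembly error.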

Keywords

genome database;chloroplast genome;sequence length;taxonomy;bioinformatics;

Data

Language: English
Year of publishing: 2023
Typology: 1.01 - Original Scientific Article
Organization: UL BF - Biotechnical Faculty
UDC: 577.2
COBISS: 139300611
ISSN: 2223-7747

Other data

Secondary language: Slovenian
Secondary keywords: genom;baze podatkov;kloroplast;genom kloroplasta;dolžina sekvenc;taksonomija;
Type (COBISS): Article
Pages: 12 pp.
Volume: 12
Issue: 2, art. 254
Chronology: 2023
DOI: 10.3390/plants12020254
ID: 17832775