Boštjan Murovec (Author), James M. Tiedje (Author), Blaž Stres (Author)

Abstract

The exponential growth of available DNA sequences and the increased interoperability of biological information are triggering intergovernmental efforts aimed at increasing the access to, dissemination of, and analysis of sequence data. Efficient storage and processing of DNA material is an important goal that aligns well with the anticipated coding standardization. This paper proposes novel coding approaches for both the dissemination and the processing of sequences, in which DNA processing speed is shown to improve when more than the customary eight bits are used to encode a single nucleotide. Further gains are achieved by encoding each nucleotide together with its trailing alignment information as a single 64-bit data structure. The paper also proposes a slight modification to the established FASTA scheme to improve its representation of alignment information. Encouraging results from empirical tests support the significance of the proposal.
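
The idea of storing a nucleotide together with its trailing alignment information in one 64-bit word can be illustrated with a minimal sketch. The bit layout below is an assumption made purely for illustration (low 8 bits for the ASCII nucleotide code, upper 56 bits for the number of gap characters that follow it); the abstract does not specify the layout used in the paper, and the names `pack`, `unpack`, `encode_aligned`, and `decode_aligned` are hypothetical.

```python
# Hypothetical layout, NOT taken from the paper:
#   bits 0-7  : ASCII code of the nucleotide
#   bits 8-63 : number of gap characters ('-') trailing that nucleotide
BASE_BITS = 8
BASE_MASK = (1 << BASE_BITS) - 1
MAX_GAPS = (1 << (64 - BASE_BITS)) - 1
MASK64 = (1 << 64) - 1


def pack(base: str, trailing_gaps: int = 0) -> int:
    """Pack one nucleotide and its trailing gap count into a 64-bit word."""
    if len(base) != 1 or ord(base) > BASE_MASK:
        raise ValueError("base must be a single 8-bit character")
    if not 0 <= trailing_gaps <= MAX_GAPS:
        raise ValueError("gap count does not fit into 56 bits")
    return (trailing_gaps << BASE_BITS) | ord(base)


def unpack(word: int) -> tuple[str, int]:
    """Return (nucleotide, trailing gap count) stored in a 64-bit word."""
    return chr(word & BASE_MASK), word >> BASE_BITS


def encode_aligned(seq: str) -> tuple[int, list[int]]:
    """Encode a gapped sequence as (leading gap count, list of 64-bit words)."""
    leading = 0
    words: list[int] = []
    for ch in seq:
        if ch == "-":
            if words:
                base, gaps = unpack(words[-1])
                words[-1] = pack(base, gaps + 1)  # extend the trailing gap run
            else:
                leading += 1  # gaps before the first nucleotide
        else:
            words.append(pack(ch))
    return leading, words


def decode_aligned(leading: int, words: list[int]) -> str:
    """Rebuild the gapped sequence from its packed representation."""
    parts = ["-" * leading]
    for word in words:
        base, gaps = unpack(word)
        parts.append(base + "-" * gaps)
    return "".join(parts)
```

In this sketch an aligned row such as `--AC--G-T` is stored as one leading gap count plus four words, so the number of stored elements equals the number of nucleotides regardless of how many gaps the alignment introduces.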

Keywords

molecular genetics; DNA; sequences; bioinformatics

Data

Language: English
Year of publishing: 2010
Typology: 1.01 - Original Scientific Article
Organization: UL BF - Biotechnical Faculty
UDC: 575
COBISS: 2625416
ISSN: 0169-2607

Other data

Secondary language: Unknown
Type (COBISS): Not categorized
Pages: pp. 175-190
Volume: Vol. 100
Issue: no. 2
Chronology: 2010
DOI: 10.1016/j.cmpb.2010.03.014
ID: 1033714