Enum: SequenceFileFormat

Standard file formats used for storing sequence data

URI: valuesets:SequenceFileFormat

Permissible Values

| Value | Meaning | Description | Content | Extensions | Status |
|---|---|---|---|---|---|
| FASTA | EDAM:format_1929 | FASTA sequence format | sequences only | .fa, .fasta, .fna, .ffn, .faa, .frn | |
| FASTQ | EDAM:format_1930 | FASTQ sequence with quality format | sequences and quality scores | .fq, .fastq | |
| SAM | EDAM:format_2573 | Sequence Alignment Map format | aligned sequences (text) | .sam | |
| BAM | EDAM:format_2572 | Binary Alignment Map format | aligned sequences (binary) | .bam | |
| CRAM | | Compressed Reference-oriented Alignment Map | compressed aligned sequences | .cram | |
| VCF | EDAM:format_3016 | Variant Call Format | genetic variants | .vcf | |
| BCF | EDAM:format_3020 | Binary Variant Call Format | genetic variants (binary) | .bcf | |
| GFF3 | | Generic Feature Format version 3 | genomic annotations | .gff, .gff3 | |
| GTF | | Gene Transfer Format | gene annotations | .gtf | |
| BED | | Browser Extensible Data format | genomic intervals | .bed | |
| BIGWIG | | BigWig format for continuous data | continuous genomic data | .bw, .bigwig | |
| BIGBED | | BigBed format for interval data | genomic intervals (indexed) | .bb, .bigbed | |
| HDF5 | | Hierarchical Data Format 5 | multi-dimensional arrays | .h5, .hdf5 | |
| SFF | EDAM:format_3284 | Standard Flowgram Format (454) | 454 sequencing data | .sff | legacy |
| FAST5 | | Fast5 format (Oxford Nanopore) | nanopore raw signal data | .fast5 | |
| POD5 | | POD5 format (Oxford Nanopore, newer) | nanopore raw signal data (compressed) | .pod5 | |

Slots

| Name | Description |
|---|---|
| sequence_file_format | Standard file formats used for storing sequence data |
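
As an illustration of how the `sequence_file_format` slot constrains data, the sketch below checks a record's slot value against the permissible values of this enum. It is a hand-written stand-in, not the LinkML validator: `PERMISSIBLE_VALUES` and `check_record` are hypothetical names, and treating the slot as optional and the match as case-sensitive are assumptions of this example.

```python
from typing import Any, Dict, List

# The sixteen permissible values of SequenceFileFormat, as listed on this page.
PERMISSIBLE_VALUES = frozenset({
    "FASTA", "FASTQ", "SAM", "BAM", "CRAM", "VCF", "BCF", "GFF3",
    "GTF", "BED", "BIGWIG", "BIGBED", "HDF5", "SFF", "FAST5", "POD5",
})


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of error messages for the sequence_file_format slot.

    Assumptions of this sketch: the slot is optional, and values must
    match a permissible value exactly (case-sensitive).
    """
    errors: List[str] = []
    value = record.get("sequence_file_format")
    if value is None:
        return errors  # slot absent: nothing to check
    if value not in PERMISSIBLE_VALUES:
        errors.append(
            f"sequence_file_format: {value!r} is not a permissible value "
            "of SequenceFileFormat"
        )
    return errors
```

For example, `check_record({"sequence_file_format": "BAM"})` yields no errors, while a lowercase `"bam"` is reported because it does not match any permissible value exactly.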

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/linkml/valuesets

LinkML Source

name: SequenceFileFormat
instantiates:
- valuesets_meta:ValueSetEnumDefinition
description: Standard file formats used for storing sequence data
title: Sequence File Formats
from_schema: https://w3id.org/linkml/valuesets
contributors:
- orcid:0000-0002-6601-2165
- https://github.com/anthropics/claude-code
status: DRAFT
rank: 1000
permissible_values:
  FASTA:
    text: FASTA
    description: FASTA sequence format
    meaning: EDAM:format_1929
    annotations:
      extensions:
        tag: extensions
        value: .fa, .fasta, .fna, .ffn, .faa, .frn
      content:
        tag: content
        value: sequences only
  FASTQ:
    text: FASTQ
    description: FASTQ sequence with quality format
    meaning: EDAM:format_1930
    annotations:
      extensions:
        tag: extensions
        value: .fq, .fastq
      content:
        tag: content
        value: sequences and quality scores
  SAM:
    text: SAM
    description: Sequence Alignment Map format
    meaning: EDAM:format_2573
    annotations:
      extensions:
        tag: extensions
        value: .sam
      content:
        tag: content
        value: aligned sequences (text)
  BAM:
    text: BAM
    description: Binary Alignment Map format
    meaning: EDAM:format_2572
    annotations:
      extensions:
        tag: extensions
        value: .bam
      content:
        tag: content
        value: aligned sequences (binary)
  CRAM:
    text: CRAM
    description: Compressed Reference-oriented Alignment Map
    annotations:
      extensions:
        tag: extensions
        value: .cram
      content:
        tag: content
        value: compressed aligned sequences
  VCF:
    text: VCF
    description: Variant Call Format
    meaning: EDAM:format_3016
    annotations:
      extensions:
        tag: extensions
        value: .vcf
      content:
        tag: content
        value: genetic variants
  BCF:
    text: BCF
    description: Binary Variant Call Format
    meaning: EDAM:format_3020
    annotations:
      extensions:
        tag: extensions
        value: .bcf
      content:
        tag: content
        value: genetic variants (binary)
  GFF3:
    text: GFF3
    description: Generic Feature Format version 3
    annotations:
      extensions:
        tag: extensions
        value: .gff, .gff3
      content:
        tag: content
        value: genomic annotations
  GTF:
    text: GTF
    description: Gene Transfer Format
    annotations:
      extensions:
        tag: extensions
        value: .gtf
      content:
        tag: content
        value: gene annotations
  BED:
    text: BED
    description: Browser Extensible Data format
    annotations:
      extensions:
        tag: extensions
        value: .bed
      content:
        tag: content
        value: genomic intervals
  BIGWIG:
    text: BIGWIG
    description: BigWig format for continuous data
    annotations:
      extensions:
        tag: extensions
        value: .bw, .bigwig
      content:
        tag: content
        value: continuous genomic data
  BIGBED:
    text: BIGBED
    description: BigBed format for interval data
    annotations:
      extensions:
        tag: extensions
        value: .bb, .bigbed
      content:
        tag: content
        value: genomic intervals (indexed)
  HDF5:
    text: HDF5
    description: Hierarchical Data Format 5
    annotations:
      extensions:
        tag: extensions
        value: .h5, .hdf5
      content:
        tag: content
        value: multi-dimensional arrays
  SFF:
    text: SFF
    description: Standard Flowgram Format (454)
    meaning: EDAM:format_3284
    annotations:
      extensions:
        tag: extensions
        value: .sff
      content:
        tag: content
        value: 454 sequencing data
      status:
        tag: status
        value: legacy
  FAST5:
    text: FAST5
    description: Fast5 format (Oxford Nanopore)
    annotations:
      extensions:
        tag: extensions
        value: .fast5
      content:
        tag: content
        value: nanopore raw signal data
  POD5:
    text: POD5
    description: POD5 format (Oxford Nanopore, newer)
    annotations:
      extensions:
        tag: extensions
        value: .pod5
      content:
        tag: content
        value: nanopore raw signal data (compressed)
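
Only seven of the permissible values in the source above carry a `meaning`. The sketch below shows one way those CURIEs could be expanded to full URIs. The helper names are hypothetical, and the `EDAM` prefix expansion `http://edamontology.org/` is an assumption: the schema's prefix declarations are not shown on this page.

```python
from typing import Optional

# Assumption: the EDAM prefix expands to this base; the schema's own
# prefix map is not part of this page.
EDAM_PREFIX = "http://edamontology.org/"

# "meaning" entries copied from the LinkML source; values without a
# meaning (CRAM, GFF3, GTF, BED, BIGWIG, BIGBED, HDF5, FAST5, POD5) are omitted.
MEANINGS = {
    "FASTA": "EDAM:format_1929",
    "FASTQ": "EDAM:format_1930",
    "SAM": "EDAM:format_2573",
    "BAM": "EDAM:format_2572",
    "VCF": "EDAM:format_3016",
    "BCF": "EDAM:format_3020",
    "SFF": "EDAM:format_3284",
}


def meaning_uri(value: str) -> Optional[str]:
    """Expand the meaning CURIE of a permissible value to a URI.

    Returns None when the value has no meaning or uses a prefix other
    than EDAM.
    """
    curie = MEANINGS.get(value)
    if curie is None:
        return None
    prefix, local_id = curie.split(":", 1)
    if prefix != "EDAM":
        return None
    return EDAM_PREFIX + local_id
```

Returning `None` for unmapped values keeps the gap visible to callers instead of guessing an ontology term the schema does not state.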