Enum: GenomeFeatureType

Genome feature types from SOFA (Sequence Ontology Feature Annotation).

This is the subset of Sequence Ontology terms used in GFF3 files.

Organized hierarchically following the Sequence Ontology structure.

URI: valuesets:GenomeFeatureType
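
Because the values mirror the SO terms that appear in the type column (column 3) of a GFF3 file, one plausible use is normalizing GFF3 type strings to this enum. The sketch below is an illustration only, not part of the valuesets package: `to_enum_value`, `feature_types`, and the sample GFF3 text are hypothetical, and the upper-casing rule is an assumption based on how the value names on this page relate to SO term names. The two overrides come from the `title` fields in the LinkML source, where the SO name differs from the value name.

```python
# Hypothetical helper, not part of the valuesets package.
# A small subset of the permissible values listed on this page.
PERMISSIBLE = {
    "GENE", "MRNA", "EXON", "CDS",
    "FIVE_PRIME_UTR", "THREE_PRIME_UTR",
    "TFBS", "MOBILE_ELEMENT",
}

# SO term names that do not upper-case to the value name
# (taken from the `title` fields in the LinkML source).
SO_NAME_TO_VALUE = {
    "tf_binding_site": "TFBS",
    "mobile_genetic_element": "MOBILE_ELEMENT",
}


def to_enum_value(gff3_type):
    """Return the matching value name, or None if the type is not recognized."""
    key = gff3_type.strip().lower()
    if key in SO_NAME_TO_VALUE:
        return SO_NAME_TO_VALUE[key]
    candidate = key.upper()  # assumption: value name is the upper-cased SO name
    return candidate if candidate in PERMISSIBLE else None


def feature_types(gff3_text):
    """Yield (raw_type, value_name) for each feature line of GFF3 text."""
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.split("\t")
        if len(columns) >= 3:
            yield columns[2], to_enum_value(columns[2])


# Made-up sample data for illustration.
SAMPLE = (
    "##gff-version 3\n"
    "chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene1\n"
    "chr1\t.\tmRNA\t1000\t9000\t.\t+\t.\tID=mrna1;Parent=gene1\n"
)
RESULT = list(feature_types(SAMPLE))
```

Types that do not resolve come back as `None`, so a caller can decide whether to reject the record or fall back to a broader value such as `REGION`.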

Permissible Values

Value Meaning Description
REGION SO:0000001 A sequence feature with an extent greater than zero
BIOLOGICAL_REGION SO:0001411 A region defined by its biological properties
GENE SO:0000704 A region (or regions) that includes all of the sequence elements necessary to...
TRANSCRIPT SO:0000673 An RNA synthesized on a DNA or RNA template by an RNA polymerase
PRIMARY_TRANSCRIPT SO:0000185 A transcript that has not been processed
MRNA SO:0000234 Messenger RNA; includes 5'UTR, coding sequences and 3'UTR
EXON SO:0000147 A region of the transcript sequence within a gene which is not removed from t...
CDS SO:0000316 Coding sequence; sequence of nucleotides that corresponds with the sequence o...
INTRON SO:0000188 A region of a primary transcript that is transcribed, but removed from within...
FIVE_PRIME_UTR SO:0000204 5' untranslated region
THREE_PRIME_UTR SO:0000205 3' untranslated region
NCRNA SO:0000655 Non-protein coding RNA
RRNA SO:0000252 Ribosomal RNA
TRNA SO:0000253 Transfer RNA
SNRNA SO:0000274 Small nuclear RNA
SNORNA SO:0000275 Small nucleolar RNA
MIRNA SO:0000276 MicroRNA
LNCRNA SO:0001877 Long non-coding RNA
RIBOZYME SO:0000374 An RNA with catalytic activity
ANTISENSE_RNA SO:0000644 RNA that is complementary to other RNA
PSEUDOGENE SO:0000336 A sequence that closely resembles a known functional gene but does not produc...
PROCESSED_PSEUDOGENE SO:0000043 A pseudogene arising from reverse transcription of mRNA
REGULATORY_REGION SO:0005836 A region involved in the control of the process of gene expression
PROMOTER SO:0000167 A regulatory region initiating transcription
ENHANCER SO:0000165 A cis-acting sequence that increases transcription
SILENCER SO:0000625 A regulatory region which upon binding of transcription factors, suppresses t...
TERMINATOR SO:0000141 The sequence of DNA located at the end of the transcript that causes R...
ATTENUATOR SO:0000140 A sequence that causes transcription termination
POLYA_SIGNAL_SEQUENCE SO:0000551 The recognition sequence for the cleavage and polyadenylation machinery
BINDING_SITE SO:0000409 A region on a molecule that binds to another molecule
TFBS SO:0000235 Transcription factor binding site
RIBOSOME_ENTRY_SITE SO:0000139 Region where ribosome assembles on mRNA
POLYA_SITE SO:0000553 Polyadenylation site
REPEAT_REGION SO:0000657 A region of sequence containing one or more repeat units
DISPERSED_REPEAT SO:0000658 A repeat that is interspersed in the genome
TANDEM_REPEAT SO:0000705 A repeat where the same sequence is repeated in the same orientation
INVERTED_REPEAT SO:0000294 A repeat where the sequence is repeated in the opposite orientation
TRANSPOSABLE_ELEMENT SO:0000101 A DNA segment that can change its position within the genome
MOBILE_ELEMENT SO:0001037 A nucleotide region with the ability to move from one place in the genome to ...
SEQUENCE_ALTERATION SO:0001059 A sequence that deviates from the reference sequence
INSERTION SO:0000667 The sequence of one or more nucleotides added between two adjacent nucleotide...
DELETION SO:0000159 The removal of a sequence of nucleotides from the genome
INVERSION SO:1000036 A continuous nucleotide sequence is inverted in the same position
DUPLICATION SO:1000035 An insertion that copies a sequence already present in the genome
SUBSTITUTION SO:1000002 A sequence alteration where one nucleotide is replaced by another
ORIGIN_OF_REPLICATION SO:0000296 The origin of replication; starting site for duplication of a nucleic acid mo...
POLYC_TRACT None A sequence of Cs
GAP SO:0000730 A gap in the sequence
ASSEMBLY_GAP SO:0000730 A gap between two sequences in an assembly
CHROMOSOME SO:0000340 Structural unit composed of DNA and proteins
SUPERCONTIG SO:0000148 One or more contigs that have been ordered and oriented using end-read inform...
CONTIG SO:0000149 A contiguous sequence derived from sequence assembly
SCAFFOLD SO:0000148 One or more contigs that have been ordered and oriented
CLONE SO:0000151 A piece of DNA that has been inserted into a vector
PLASMID SO:0000155 A self-replicating circular DNA molecule
POLYPEPTIDE SO:0000104 A sequence of amino acids linked by peptide bonds
MATURE_PROTEIN_REGION SO:0000419 The polypeptide sequence that remains after post-translational processing
SIGNAL_PEPTIDE SO:0000418 A peptide region that targets a polypeptide to a specific location
TRANSIT_PEPTIDE SO:0000725 A peptide that directs the transport of a protein to an organelle
PROPEPTIDE SO:0001062 A peptide region that is cleaved during maturation
OPERON SO:0000178 A group of contiguous genes transcribed as a single unit
STEM_LOOP SO:0000313 A double-helical region formed by base-pairing between adjacent sequences
D_LOOP SO:0000297 Displacement loop; a region where DNA is displaced by an invading strand
MATCH SO:0000343 A region of sequence similarity
CDNA_MATCH SO:0000689 A match to a cDNA sequence
EST_MATCH SO:0000668 A match to an EST sequence
PROTEIN_MATCH SO:0000349 A match to a protein sequence
NUCLEOTIDE_MATCH SO:0000347 A match to a nucleotide sequence
JUNCTION_FEATURE SO:0000699 A boundary or junction between sequence regions
SPLICE_SITE SO:0000162 The position where an intron is excised
FIVE_PRIME_SPLICE_SITE SO:0000163 The 5' splice site (donor site)
THREE_PRIME_SPLICE_SITE SO:0000164 The 3' splice site (acceptor site)
START_CODON SO:0000318 The first codon to be translated
STOP_CODON SO:0000319 The codon that terminates translation
CENTROMERE SO:0000577 A region where chromatids are held together
TELOMERE SO:0000624 The terminal region of a linear chromosome
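
The Meaning column is not a unique key: in the table above, `GAP` and `ASSEMBLY_GAP` both map to SO:0000730, `SUPERCONTIG` and `SCAFFOLD` both map to SO:0000148, and `POLYC_TRACT` has no mapping at all. A reverse lookup from SO CURIE to value therefore has to return a list. The following is a minimal sketch over a hand-copied subset of the table; `MEANINGS` and `index_by_meaning` are illustrative names, not part of any published API.

```python
from collections import defaultdict

# Subset of the table above: value name -> SO CURIE (None where unmapped).
MEANINGS = {
    "GAP": "SO:0000730",
    "ASSEMBLY_GAP": "SO:0000730",
    "SUPERCONTIG": "SO:0000148",
    "SCAFFOLD": "SO:0000148",
    "CONTIG": "SO:0000149",
    "POLYC_TRACT": None,
}


def index_by_meaning(meanings):
    """Group value names by CURIE, skipping values with no meaning."""
    index = defaultdict(list)
    for value, curie in meanings.items():
        if curie is not None:
            index[curie].append(value)
    return dict(index)


BY_CURIE = index_by_meaning(MEANINGS)

# CURIEs shared by more than one value are ambiguous on the way back.
AMBIGUOUS = {curie: values for curie, values in BY_CURIE.items() if len(values) > 1}
```

Code that converts SO identifiers back into enum values needs an explicit tie-breaking rule for the entries in `AMBIGUOUS`.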

Slots

Name Description
genome_feature Genome feature types from SOFA (Sequence Ontology Feature Annotation)

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/linkml/valuesets

LinkML Source

name: GenomeFeatureType
description: 'Genome feature types from SOFA (Sequence Ontology Feature Annotation).

  This is the subset of Sequence Ontology terms used in GFF3 files.

  Organized hierarchically following the Sequence Ontology structure.'
from_schema: https://w3id.org/linkml/valuesets
rank: 1000
permissible_values:
  REGION:
    text: REGION
    description: A sequence feature with an extent greater than zero
    meaning: SO:0000001
  BIOLOGICAL_REGION:
    text: BIOLOGICAL_REGION
    description: A region defined by its biological properties
    meaning: SO:0001411
    is_a: REGION
  GENE:
    text: GENE
    description: A region (or regions) that includes all of the sequence elements
      necessary to encode a functional transcript
    meaning: SO:0000704
    is_a: BIOLOGICAL_REGION
  TRANSCRIPT:
    text: TRANSCRIPT
    description: An RNA synthesized on a DNA or RNA template by an RNA polymerase
    meaning: SO:0000673
    is_a: BIOLOGICAL_REGION
  PRIMARY_TRANSCRIPT:
    text: PRIMARY_TRANSCRIPT
    description: A transcript that has not been processed
    meaning: SO:0000185
    is_a: TRANSCRIPT
  MRNA:
    text: MRNA
    description: Messenger RNA; includes 5'UTR, coding sequences and 3'UTR
    meaning: SO:0000234
    is_a: TRANSCRIPT
  EXON:
    text: EXON
    description: A region of the transcript sequence within a gene which is not removed
      from the primary RNA transcript by RNA splicing
    meaning: SO:0000147
    is_a: BIOLOGICAL_REGION
  CDS:
    text: CDS
    description: Coding sequence; sequence of nucleotides that corresponds with the
      sequence of amino acids in a protein
    meaning: SO:0000316
    is_a: BIOLOGICAL_REGION
  INTRON:
    text: INTRON
    description: A region of a primary transcript that is transcribed, but removed
      from within the transcript by splicing
    meaning: SO:0000188
    is_a: BIOLOGICAL_REGION
  FIVE_PRIME_UTR:
    text: FIVE_PRIME_UTR
    description: 5' untranslated region
    meaning: SO:0000204
    is_a: BIOLOGICAL_REGION
  THREE_PRIME_UTR:
    text: THREE_PRIME_UTR
    description: 3' untranslated region
    meaning: SO:0000205
    is_a: BIOLOGICAL_REGION
  NCRNA:
    text: NCRNA
    description: Non-protein coding RNA
    meaning: SO:0000655
    is_a: TRANSCRIPT
  RRNA:
    text: RRNA
    description: Ribosomal RNA
    meaning: SO:0000252
    is_a: NCRNA
    structured_aliases:
      rRNA:
        literal_form: rRNA
        source: SO:0000252
  TRNA:
    text: TRNA
    description: Transfer RNA
    meaning: SO:0000253
    is_a: NCRNA
  SNRNA:
    text: SNRNA
    description: Small nuclear RNA
    meaning: SO:0000274
    is_a: NCRNA
  SNORNA:
    text: SNORNA
    description: Small nucleolar RNA
    meaning: SO:0000275
    is_a: NCRNA
  MIRNA:
    text: MIRNA
    description: MicroRNA
    meaning: SO:0000276
    is_a: NCRNA
  LNCRNA:
    text: LNCRNA
    description: Long non-coding RNA
    meaning: SO:0001877
    is_a: NCRNA
  RIBOZYME:
    text: RIBOZYME
    description: An RNA with catalytic activity
    meaning: SO:0000374
    is_a: NCRNA
  ANTISENSE_RNA:
    text: ANTISENSE_RNA
    description: RNA that is complementary to other RNA
    meaning: SO:0000644
    is_a: NCRNA
  PSEUDOGENE:
    text: PSEUDOGENE
    description: A sequence that closely resembles a known functional gene but does
      not produce a functional product
    meaning: SO:0000336
    is_a: BIOLOGICAL_REGION
  PROCESSED_PSEUDOGENE:
    text: PROCESSED_PSEUDOGENE
    description: A pseudogene arising from reverse transcription of mRNA
    meaning: SO:0000043
    is_a: PSEUDOGENE
  REGULATORY_REGION:
    text: REGULATORY_REGION
    description: A region involved in the control of the process of gene expression
    meaning: SO:0005836
    is_a: BIOLOGICAL_REGION
  PROMOTER:
    text: PROMOTER
    description: A regulatory region initiating transcription
    meaning: SO:0000167
    is_a: REGULATORY_REGION
  ENHANCER:
    text: ENHANCER
    description: A cis-acting sequence that increases transcription
    meaning: SO:0000165
    is_a: REGULATORY_REGION
  SILENCER:
    text: SILENCER
    description: A regulatory region which upon binding of transcription factors,
      suppresses transcription
    meaning: SO:0000625
    is_a: REGULATORY_REGION
  TERMINATOR:
    text: TERMINATOR
    description: The sequence of DNA located at the end of the transcript that
      causes RNA polymerase to terminate transcription
    meaning: SO:0000141
    is_a: REGULATORY_REGION
  ATTENUATOR:
    text: ATTENUATOR
    description: A sequence that causes transcription termination
    meaning: SO:0000140
    is_a: REGULATORY_REGION
  POLYA_SIGNAL_SEQUENCE:
    text: POLYA_SIGNAL_SEQUENCE
    description: The recognition sequence for the cleavage and polyadenylation machinery
    meaning: SO:0000551
    is_a: REGULATORY_REGION
  BINDING_SITE:
    text: BINDING_SITE
    description: A region on a molecule that binds to another molecule
    meaning: SO:0000409
    is_a: BIOLOGICAL_REGION
  TFBS:
    text: TFBS
    description: Transcription factor binding site
    meaning: SO:0000235
    is_a: BINDING_SITE
    title: TF_binding_site
  RIBOSOME_ENTRY_SITE:
    text: RIBOSOME_ENTRY_SITE
    description: Region where ribosome assembles on mRNA
    meaning: SO:0000139
    is_a: BINDING_SITE
  POLYA_SITE:
    text: POLYA_SITE
    description: Polyadenylation site
    meaning: SO:0000553
    is_a: BIOLOGICAL_REGION
  REPEAT_REGION:
    text: REPEAT_REGION
    description: A region of sequence containing one or more repeat units
    meaning: SO:0000657
    is_a: BIOLOGICAL_REGION
  DISPERSED_REPEAT:
    text: DISPERSED_REPEAT
    description: A repeat that is interspersed in the genome
    meaning: SO:0000658
    is_a: REPEAT_REGION
  TANDEM_REPEAT:
    text: TANDEM_REPEAT
    description: A repeat where the same sequence is repeated in the same orientation
    meaning: SO:0000705
    is_a: REPEAT_REGION
  INVERTED_REPEAT:
    text: INVERTED_REPEAT
    description: A repeat where the sequence is repeated in the opposite orientation
    meaning: SO:0000294
    is_a: REPEAT_REGION
  TRANSPOSABLE_ELEMENT:
    text: TRANSPOSABLE_ELEMENT
    description: A DNA segment that can change its position within the genome
    meaning: SO:0000101
    is_a: BIOLOGICAL_REGION
  MOBILE_ELEMENT:
    text: MOBILE_ELEMENT
    description: A nucleotide region with the ability to move from one place in the
      genome to another
    meaning: SO:0001037
    is_a: BIOLOGICAL_REGION
    title: mobile_genetic_element
  SEQUENCE_ALTERATION:
    text: SEQUENCE_ALTERATION
    description: A sequence that deviates from the reference sequence
    meaning: SO:0001059
    is_a: REGION
  INSERTION:
    text: INSERTION
    description: The sequence of one or more nucleotides added between two adjacent
      nucleotides
    meaning: SO:0000667
    is_a: SEQUENCE_ALTERATION
  DELETION:
    text: DELETION
    description: The removal of a sequence of nucleotides from the genome
    meaning: SO:0000159
    is_a: SEQUENCE_ALTERATION
  INVERSION:
    text: INVERSION
    description: A continuous nucleotide sequence is inverted in the same position
    meaning: SO:1000036
    is_a: SEQUENCE_ALTERATION
  DUPLICATION:
    text: DUPLICATION
    description: An insertion that copies a sequence already present in the genome
    meaning: SO:1000035
    is_a: SEQUENCE_ALTERATION
  SUBSTITUTION:
    text: SUBSTITUTION
    description: A sequence alteration where one nucleotide is replaced by another
    meaning: SO:1000002
    is_a: SEQUENCE_ALTERATION
  ORIGIN_OF_REPLICATION:
    text: ORIGIN_OF_REPLICATION
    description: The origin of replication; starting site for duplication of a nucleic
      acid molecule
    meaning: SO:0000296
    is_a: BIOLOGICAL_REGION
  POLYC_TRACT:
    text: POLYC_TRACT
    description: A sequence of Cs
    is_a: REGION
  GAP:
    text: GAP
    description: A gap in the sequence
    meaning: SO:0000730
    is_a: REGION
  ASSEMBLY_GAP:
    text: ASSEMBLY_GAP
    description: A gap between two sequences in an assembly
    meaning: SO:0000730
    is_a: GAP
    title: gap
  CHROMOSOME:
    text: CHROMOSOME
    description: Structural unit composed of DNA and proteins
    meaning: SO:0000340
    is_a: REGION
  SUPERCONTIG:
    text: SUPERCONTIG
    description: One or more contigs that have been ordered and oriented using end-read
      information
    meaning: SO:0000148
    is_a: REGION
  CONTIG:
    text: CONTIG
    description: A contiguous sequence derived from sequence assembly
    meaning: SO:0000149
    is_a: REGION
  SCAFFOLD:
    text: SCAFFOLD
    description: One or more contigs that have been ordered and oriented
    meaning: SO:0000148
    is_a: REGION
    title: supercontig
  CLONE:
    text: CLONE
    description: A piece of DNA that has been inserted into a vector
    meaning: SO:0000151
    is_a: REGION
  PLASMID:
    text: PLASMID
    description: A self-replicating circular DNA molecule
    meaning: SO:0000155
    is_a: REGION
  POLYPEPTIDE:
    text: POLYPEPTIDE
    description: A sequence of amino acids linked by peptide bonds
    meaning: SO:0000104
    is_a: REGION
  MATURE_PROTEIN_REGION:
    text: MATURE_PROTEIN_REGION
    description: The polypeptide sequence that remains after post-translational processing
    meaning: SO:0000419
    is_a: POLYPEPTIDE
  SIGNAL_PEPTIDE:
    text: SIGNAL_PEPTIDE
    description: A peptide region that targets a polypeptide to a specific location
    meaning: SO:0000418
    is_a: POLYPEPTIDE
  TRANSIT_PEPTIDE:
    text: TRANSIT_PEPTIDE
    description: A peptide that directs the transport of a protein to an organelle
    meaning: SO:0000725
    is_a: POLYPEPTIDE
  PROPEPTIDE:
    text: PROPEPTIDE
    description: A peptide region that is cleaved during maturation
    meaning: SO:0001062
    is_a: POLYPEPTIDE
    title: propeptide
  OPERON:
    text: OPERON
    description: A group of contiguous genes transcribed as a single unit
    meaning: SO:0000178
    is_a: BIOLOGICAL_REGION
  STEM_LOOP:
    text: STEM_LOOP
    description: A double-helical region formed by base-pairing between adjacent sequences
    meaning: SO:0000313
    is_a: REGION
  D_LOOP:
    text: D_LOOP
    description: Displacement loop; a region where DNA is displaced by an invading
      strand
    meaning: SO:0000297
    is_a: REGION
  MATCH:
    text: MATCH
    description: A region of sequence similarity
    meaning: SO:0000343
    is_a: REGION
  CDNA_MATCH:
    text: CDNA_MATCH
    description: A match to a cDNA sequence
    meaning: SO:0000689
    is_a: MATCH
  EST_MATCH:
    text: EST_MATCH
    description: A match to an EST sequence
    meaning: SO:0000668
    is_a: MATCH
  PROTEIN_MATCH:
    text: PROTEIN_MATCH
    description: A match to a protein sequence
    meaning: SO:0000349
    is_a: MATCH
  NUCLEOTIDE_MATCH:
    text: NUCLEOTIDE_MATCH
    description: A match to a nucleotide sequence
    meaning: SO:0000347
    is_a: MATCH
  JUNCTION_FEATURE:
    text: JUNCTION_FEATURE
    description: A boundary or junction between sequence regions
    meaning: SO:0000699
    is_a: BIOLOGICAL_REGION
    title: junction
  SPLICE_SITE:
    text: SPLICE_SITE
    description: The position where an intron is excised
    meaning: SO:0000162
    is_a: JUNCTION_FEATURE
  FIVE_PRIME_SPLICE_SITE:
    text: FIVE_PRIME_SPLICE_SITE
    description: The 5' splice site (donor site)
    meaning: SO:0000163
    is_a: SPLICE_SITE
    title: five_prime_cis_splice_site
  THREE_PRIME_SPLICE_SITE:
    text: THREE_PRIME_SPLICE_SITE
    description: The 3' splice site (acceptor site)
    meaning: SO:0000164
    is_a: SPLICE_SITE
    title: three_prime_cis_splice_site
  START_CODON:
    text: START_CODON
    description: The first codon to be translated
    meaning: SO:0000318
    is_a: BIOLOGICAL_REGION
  STOP_CODON:
    text: STOP_CODON
    description: The codon that terminates translation
    meaning: SO:0000319
    is_a: BIOLOGICAL_REGION
  CENTROMERE:
    text: CENTROMERE
    description: A region where chromatids are held together
    meaning: SO:0000577
    is_a: BIOLOGICAL_REGION
  TELOMERE:
    text: TELOMERE
    description: The terminal region of a linear chromosome
    meaning: SO:0000624
    is_a: BIOLOGICAL_REGION
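
The `is_a` entries in the LinkML source above form a single-parent tree rooted at `REGION`. As a rough sketch of how that hierarchy might be queried, the code below hard-codes a few of those `is_a` links and walks them. The names `IS_A`, `ancestors`, `is_subtype`, and `descendants` are hypothetical helpers for illustration; a real application would more likely load the full enum from the schema.

```python
# A few child -> parent links copied from the `is_a` fields above.
IS_A = {
    "BIOLOGICAL_REGION": "REGION",
    "TRANSCRIPT": "BIOLOGICAL_REGION",
    "EXON": "BIOLOGICAL_REGION",
    "MRNA": "TRANSCRIPT",
    "NCRNA": "TRANSCRIPT",
    "MIRNA": "NCRNA",
    "TRNA": "NCRNA",
}


def ancestors(value, is_a=IS_A):
    """Return the chain of parents from `value` up to the root, nearest first."""
    chain = []
    seen = {value}
    while value in is_a:
        value = is_a[value]
        if value in seen:
            raise ValueError(f"cycle detected at {value}")
        seen.add(value)
        chain.append(value)
    return chain


def is_subtype(value, parent, is_a=IS_A):
    """True if `value` equals `parent` or has it among its ancestors."""
    return value == parent or parent in ancestors(value, is_a)


def descendants(parent, is_a=IS_A):
    """Return all values below `parent`, sorted by name."""
    return sorted(v for v in is_a if parent in ancestors(v, is_a))
```

With these links, `ancestors("MIRNA")` yields `NCRNA`, `TRANSCRIPT`, `BIOLOGICAL_REGION`, `REGION`, which is the kind of roll-up needed to count every non-coding RNA feature as a transcript. Note that in this enum `EXON` sits directly under `BIOLOGICAL_REGION`, so it is not a subtype of `TRANSCRIPT`.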