Enum: GenomeFeatureType
Genome feature types from SOFA (Sequence Ontology Feature Annotation).
This is the subset of Sequence Ontology terms used in GFF3 files.
Organized hierarchically following the Sequence Ontology structure.
URI: valuesets:GenomeFeatureType
Permissible Values
| Value | Meaning | Description |
|---|---|---|
| REGION | SO:0000001 | A sequence feature with an extent greater than zero |
| BIOLOGICAL_REGION | SO:0001411 | A region defined by its biological properties |
| GENE | SO:0000704 | A region (or regions) that includes all of the sequence elements necessary to... |
| TRANSCRIPT | SO:0000673 | An RNA synthesized on a DNA or RNA template by an RNA polymerase |
| PRIMARY_TRANSCRIPT | SO:0000185 | A transcript that has not been processed |
| MRNA | SO:0000234 | Messenger RNA; includes 5'UTR, coding sequences and 3'UTR |
| EXON | SO:0000147 | A region of the transcript sequence within a gene which is not removed from t... |
| CDS | SO:0000316 | Coding sequence; sequence of nucleotides that corresponds with the sequence o... |
| INTRON | SO:0000188 | A region of a primary transcript that is transcribed, but removed from within... |
| FIVE_PRIME_UTR | SO:0000204 | 5' untranslated region |
| THREE_PRIME_UTR | SO:0000205 | 3' untranslated region |
| NCRNA | SO:0000655 | Non-protein coding RNA |
| RRNA | SO:0000252 | Ribosomal RNA |
| TRNA | SO:0000253 | Transfer RNA |
| SNRNA | SO:0000274 | Small nuclear RNA |
| SNORNA | SO:0000275 | Small nucleolar RNA |
| MIRNA | SO:0000276 | MicroRNA |
| LNCRNA | SO:0001877 | Long non-coding RNA |
| RIBOZYME | SO:0000374 | An RNA with catalytic activity |
| ANTISENSE_RNA | SO:0000644 | RNA that is complementary to other RNA |
| PSEUDOGENE | SO:0000336 | A sequence that closely resembles a known functional gene but does not produc... |
| PROCESSED_PSEUDOGENE | SO:0000043 | A pseudogene arising from reverse transcription of mRNA |
| REGULATORY_REGION | SO:0005836 | A region involved in the control of the process of gene expression |
| PROMOTER | SO:0000167 | A regulatory region initiating transcription |
| ENHANCER | SO:0000165 | A cis-acting sequence that increases transcription |
| SILENCER | SO:0000625 | A regulatory region which upon binding of transcription factors, suppresses t... |
| TERMINATOR | SO:0000141 | The sequence of DNA located at the end of the transcript that causes R... |
| ATTENUATOR | SO:0000140 | A sequence that causes transcription termination |
| POLYA_SIGNAL_SEQUENCE | SO:0000551 | The recognition sequence for the cleavage and polyadenylation machinery |
| BINDING_SITE | SO:0000409 | A region on a molecule that binds to another molecule |
| TFBS | SO:0000235 | Transcription factor binding site |
| RIBOSOME_ENTRY_SITE | SO:0000139 | Region where ribosome assembles on mRNA |
| POLYA_SITE | SO:0000553 | Polyadenylation site |
| REPEAT_REGION | SO:0000657 | A region of sequence containing one or more repeat units |
| DISPERSED_REPEAT | SO:0000658 | A repeat that is interspersed in the genome |
| TANDEM_REPEAT | SO:0000705 | A repeat where the same sequence is repeated in the same orientation |
| INVERTED_REPEAT | SO:0000294 | A repeat where the sequence is repeated in the opposite orientation |
| TRANSPOSABLE_ELEMENT | SO:0000101 | A DNA segment that can change its position within the genome |
| MOBILE_ELEMENT | SO:0001037 | A nucleotide region with the ability to move from one place in the genome to ... |
| SEQUENCE_ALTERATION | SO:0001059 | A sequence that deviates from the reference sequence |
| INSERTION | SO:0000667 | The sequence of one or more nucleotides added between two adjacent nucleotide... |
| DELETION | SO:0000159 | The removal of a sequence of nucleotides from the genome |
| INVERSION | SO:1000036 | A continuous nucleotide sequence is inverted in the same position |
| DUPLICATION | SO:1000035 | One or more nucleotides are added between two adjacent nucleotides |
| SUBSTITUTION | SO:1000002 | A sequence alteration where one nucleotide is replaced by another |
| ORIGIN_OF_REPLICATION | SO:0000296 | The origin of replication; starting site for duplication of a nucleic acid mo... |
| POLYC_TRACT | None | A sequence of Cs |
| GAP | SO:0000730 | A gap in the sequence |
| ASSEMBLY_GAP | SO:0000730 | A gap between two sequences in an assembly |
| CHROMOSOME | SO:0000340 | Structural unit composed of DNA and proteins |
| SUPERCONTIG | SO:0000148 | One or more contigs that have been ordered and oriented using end-read inform... |
| CONTIG | SO:0000149 | A contiguous sequence derived from sequence assembly |
| SCAFFOLD | SO:0000148 | One or more contigs that have been ordered and oriented |
| CLONE | SO:0000151 | A piece of DNA that has been inserted into a vector |
| PLASMID | SO:0000155 | A self-replicating circular DNA molecule |
| POLYPEPTIDE | SO:0000104 | A sequence of amino acids linked by peptide bonds |
| MATURE_PROTEIN_REGION | SO:0000419 | The polypeptide sequence that remains after post-translational processing |
| SIGNAL_PEPTIDE | SO:0000418 | A peptide region that targets a polypeptide to a specific location |
| TRANSIT_PEPTIDE | SO:0000725 | A peptide that directs the transport of a protein to an organelle |
| PROPEPTIDE | SO:0001062 | A peptide region that is cleaved during maturation |
| OPERON | SO:0000178 | A group of contiguous genes transcribed as a single unit |
| STEM_LOOP | SO:0000313 | A double-helical region formed by base-pairing between adjacent sequences |
| D_LOOP | SO:0000297 | Displacement loop; a region where DNA is displaced by an invading strand |
| MATCH | SO:0000343 | A region of sequence similarity |
| CDNA_MATCH | SO:0000689 | A match to a cDNA sequence |
| EST_MATCH | SO:0000668 | A match to an EST sequence |
| PROTEIN_MATCH | SO:0000349 | A match to a protein sequence |
| NUCLEOTIDE_MATCH | SO:0000347 | A match to a nucleotide sequence |
| JUNCTION_FEATURE | SO:0000699 | A boundary or junction between sequence regions |
| SPLICE_SITE | SO:0000162 | The position where an intron is excised |
| FIVE_PRIME_SPLICE_SITE | SO:0000163 | The 5' splice site (donor site) |
| THREE_PRIME_SPLICE_SITE | SO:0000164 | The 3' splice site (acceptor site) |
| START_CODON | SO:0000318 | The first codon to be translated |
| STOP_CODON | SO:0000319 | The codon that terminates translation |
| CENTROMERE | SO:0000577 | A region where chromatids are held together |
| TELOMERE | SO:0000624 | The terminal region of a linear chromosome |
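
The `Meaning` column is not one-to-one: `GAP` and `ASSEMBLY_GAP` both point at SO:0000730, `SUPERCONTIG` and `SCAFFOLD` both point at SO:0000148, and `POLYC_TRACT` has no mapping. A reverse lookup from SO term to enum value therefore needs to tolerate collisions. The following is a minimal Python sketch, assuming a hand-copied subset of the table; `MEANINGS`, `values_by_meaning` and `ambiguous_meanings` are illustrative names, not part of the valuesets package.

```python
from collections import defaultdict

# Hand-copied subset of the table above: enum value -> SO CURIE.
# None marks a value with no mapping (POLYC_TRACT).
MEANINGS = {
    "GENE": "SO:0000704",
    "MRNA": "SO:0000234",
    "POLYC_TRACT": None,
    "GAP": "SO:0000730",
    "ASSEMBLY_GAP": "SO:0000730",
    "SUPERCONTIG": "SO:0000148",
    "SCAFFOLD": "SO:0000148",
}


def values_by_meaning(meanings):
    """Group enum values by SO CURIE, skipping values without a mapping."""
    grouped = defaultdict(list)
    for value, curie in meanings.items():
        if curie is not None:
            grouped[curie].append(value)
    return dict(grouped)


def ambiguous_meanings(meanings):
    """Return only the CURIEs that more than one enum value maps to."""
    return {
        curie: values
        for curie, values in values_by_meaning(meanings).items()
        if len(values) > 1
    }


if __name__ == "__main__":
    for curie, values in ambiguous_meanings(MEANINGS).items():
        print(curie, values)
```

Code that converts SO terms back to enum values may want to pick one value per shared CURIE explicitly rather than rely on dictionary order.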
Slots
| Name | Description |
|---|---|
| genome_feature | Genome feature types from SOFA (Sequence Ontology Feature Annotation) |
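
A record that fills the `genome_feature` slot from the type column of a GFF3 file has to translate strings such as `mRNA` or `five_prime_UTR` into the upper-case enum values listed on this page. The sketch below is one possible approach, not the package's own API: it assumes that upper-casing the GFF3 type yields the enum name in most cases, and that the `title` entries in the LinkML source (for example `TF_binding_site` for `TFBS`) are the SO term names to special-case. `PERMISSIBLE`, `TITLE_TO_VALUE` and `to_genome_feature` are hypothetical names, and only a subset of values is included.

```python
# Subset of permissible values, copied from this page.
PERMISSIBLE = {
    "GENE", "MRNA", "EXON", "CDS", "FIVE_PRIME_UTR",
    "TFBS", "MOBILE_ELEMENT", "JUNCTION_FEATURE",
}

# `title` entries from the LinkML source, assumed here to be SO term names
# whose upper-cased form differs from the enum value.
TITLE_TO_VALUE = {
    "TF_binding_site": "TFBS",
    "mobile_genetic_element": "MOBILE_ELEMENT",
    "junction": "JUNCTION_FEATURE",
}


def to_genome_feature(gff3_type):
    """Map a GFF3 type string to an enum value, or raise ValueError."""
    if gff3_type in TITLE_TO_VALUE:
        return TITLE_TO_VALUE[gff3_type]
    candidate = gff3_type.upper()
    if candidate in PERMISSIBLE:
        return candidate
    raise ValueError(f"not a GenomeFeatureType value: {gff3_type!r}")


if __name__ == "__main__":
    for raw in ("gene", "mRNA", "five_prime_UTR", "TF_binding_site"):
        print(raw, "->", to_genome_feature(raw))
```

The titles `gap` and `supercontig` are deliberately left out of `TITLE_TO_VALUE`, because upper-casing them already yields the valid values `GAP` and `SUPERCONTIG`.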
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/linkml/valuesets
LinkML Source
name: GenomeFeatureType
description: 'Genome feature types from SOFA (Sequence Ontology Feature Annotation).
  This is the subset of Sequence Ontology terms used in GFF3 files.
  Organized hierarchically following the Sequence Ontology structure.'
from_schema: https://w3id.org/linkml/valuesets
rank: 1000
permissible_values:
  REGION:
    text: REGION
    description: A sequence feature with an extent greater than zero
    meaning: SO:0000001
  BIOLOGICAL_REGION:
    text: BIOLOGICAL_REGION
    description: A region defined by its biological properties
    meaning: SO:0001411
    is_a: REGION
  GENE:
    text: GENE
    description: A region (or regions) that includes all of the sequence elements
      necessary to encode a functional transcript
    meaning: SO:0000704
    is_a: BIOLOGICAL_REGION
  TRANSCRIPT:
    text: TRANSCRIPT
    description: An RNA synthesized on a DNA or RNA template by an RNA polymerase
    meaning: SO:0000673
    is_a: BIOLOGICAL_REGION
  PRIMARY_TRANSCRIPT:
    text: PRIMARY_TRANSCRIPT
    description: A transcript that has not been processed
    meaning: SO:0000185
    is_a: TRANSCRIPT
  MRNA:
    text: MRNA
    description: Messenger RNA; includes 5'UTR, coding sequences and 3'UTR
    meaning: SO:0000234
    is_a: TRANSCRIPT
  EXON:
    text: EXON
    description: A region of the transcript sequence within a gene which is not removed
      from the primary RNA transcript by RNA splicing
    meaning: SO:0000147
    is_a: BIOLOGICAL_REGION
  CDS:
    text: CDS
    description: Coding sequence; sequence of nucleotides that corresponds with the
      sequence of amino acids in a protein
    meaning: SO:0000316
    is_a: BIOLOGICAL_REGION
  INTRON:
    text: INTRON
    description: A region of a primary transcript that is transcribed, but removed
      from within the transcript by splicing
    meaning: SO:0000188
    is_a: BIOLOGICAL_REGION
  FIVE_PRIME_UTR:
    text: FIVE_PRIME_UTR
    description: 5' untranslated region
    meaning: SO:0000204
    is_a: BIOLOGICAL_REGION
  THREE_PRIME_UTR:
    text: THREE_PRIME_UTR
    description: 3' untranslated region
    meaning: SO:0000205
    is_a: BIOLOGICAL_REGION
  NCRNA:
    text: NCRNA
    description: Non-protein coding RNA
    meaning: SO:0000655
    is_a: TRANSCRIPT
  RRNA:
    text: RRNA
    description: Ribosomal RNA
    meaning: SO:0000252
    is_a: NCRNA
    structured_aliases:
      rRNA:
        literal_form: rRNA
        source: SO:0000252
  TRNA:
    text: TRNA
    description: Transfer RNA
    meaning: SO:0000253
    is_a: NCRNA
  SNRNA:
    text: SNRNA
    description: Small nuclear RNA
    meaning: SO:0000274
    is_a: NCRNA
  SNORNA:
    text: SNORNA
    description: Small nucleolar RNA
    meaning: SO:0000275
    is_a: NCRNA
  MIRNA:
    text: MIRNA
    description: MicroRNA
    meaning: SO:0000276
    is_a: NCRNA
  LNCRNA:
    text: LNCRNA
    description: Long non-coding RNA
    meaning: SO:0001877
    is_a: NCRNA
  RIBOZYME:
    text: RIBOZYME
    description: An RNA with catalytic activity
    meaning: SO:0000374
    is_a: NCRNA
  ANTISENSE_RNA:
    text: ANTISENSE_RNA
    description: RNA that is complementary to other RNA
    meaning: SO:0000644
    is_a: NCRNA
  PSEUDOGENE:
    text: PSEUDOGENE
    description: A sequence that closely resembles a known functional gene but does
      not produce a functional product
    meaning: SO:0000336
    is_a: BIOLOGICAL_REGION
  PROCESSED_PSEUDOGENE:
    text: PROCESSED_PSEUDOGENE
    description: A pseudogene arising from reverse transcription of mRNA
    meaning: SO:0000043
    is_a: PSEUDOGENE
  REGULATORY_REGION:
    text: REGULATORY_REGION
    description: A region involved in the control of the process of gene expression
    meaning: SO:0005836
    is_a: BIOLOGICAL_REGION
  PROMOTER:
    text: PROMOTER
    description: A regulatory region initiating transcription
    meaning: SO:0000167
    is_a: REGULATORY_REGION
  ENHANCER:
    text: ENHANCER
    description: A cis-acting sequence that increases transcription
    meaning: SO:0000165
    is_a: REGULATORY_REGION
  SILENCER:
    text: SILENCER
    description: A regulatory region which upon binding of transcription factors,
      suppresses transcription
    meaning: SO:0000625
    is_a: REGULATORY_REGION
  TERMINATOR:
    text: TERMINATOR
    description: The sequence of DNA located at the end of the transcript that
      causes RNA polymerase to terminate transcription
    meaning: SO:0000141
    is_a: REGULATORY_REGION
  ATTENUATOR:
    text: ATTENUATOR
    description: A sequence that causes transcription termination
    meaning: SO:0000140
    is_a: REGULATORY_REGION
  POLYA_SIGNAL_SEQUENCE:
    text: POLYA_SIGNAL_SEQUENCE
    description: The recognition sequence for the cleavage and polyadenylation machinery
    meaning: SO:0000551
    is_a: REGULATORY_REGION
  BINDING_SITE:
    text: BINDING_SITE
    description: A region on a molecule that binds to another molecule
    meaning: SO:0000409
    is_a: BIOLOGICAL_REGION
  TFBS:
    text: TFBS
    description: Transcription factor binding site
    meaning: SO:0000235
    is_a: BINDING_SITE
    title: TF_binding_site
  RIBOSOME_ENTRY_SITE:
    text: RIBOSOME_ENTRY_SITE
    description: Region where ribosome assembles on mRNA
    meaning: SO:0000139
    is_a: BINDING_SITE
  POLYA_SITE:
    text: POLYA_SITE
    description: Polyadenylation site
    meaning: SO:0000553
    is_a: BIOLOGICAL_REGION
  REPEAT_REGION:
    text: REPEAT_REGION
    description: A region of sequence containing one or more repeat units
    meaning: SO:0000657
    is_a: BIOLOGICAL_REGION
  DISPERSED_REPEAT:
    text: DISPERSED_REPEAT
    description: A repeat that is interspersed in the genome
    meaning: SO:0000658
    is_a: REPEAT_REGION
  TANDEM_REPEAT:
    text: TANDEM_REPEAT
    description: A repeat where the same sequence is repeated in the same orientation
    meaning: SO:0000705
    is_a: REPEAT_REGION
  INVERTED_REPEAT:
    text: INVERTED_REPEAT
    description: A repeat where the sequence is repeated in the opposite orientation
    meaning: SO:0000294
    is_a: REPEAT_REGION
  TRANSPOSABLE_ELEMENT:
    text: TRANSPOSABLE_ELEMENT
    description: A DNA segment that can change its position within the genome
    meaning: SO:0000101
    is_a: BIOLOGICAL_REGION
  MOBILE_ELEMENT:
    text: MOBILE_ELEMENT
    description: A nucleotide region with the ability to move from one place in the
      genome to another
    meaning: SO:0001037
    is_a: BIOLOGICAL_REGION
    title: mobile_genetic_element
  SEQUENCE_ALTERATION:
    text: SEQUENCE_ALTERATION
    description: A sequence that deviates from the reference sequence
    meaning: SO:0001059
    is_a: REGION
  INSERTION:
    text: INSERTION
    description: The sequence of one or more nucleotides added between two adjacent
      nucleotides
    meaning: SO:0000667
    is_a: SEQUENCE_ALTERATION
  DELETION:
    text: DELETION
    description: The removal of a sequence of nucleotides from the genome
    meaning: SO:0000159
    is_a: SEQUENCE_ALTERATION
  INVERSION:
    text: INVERSION
    description: A continuous nucleotide sequence is inverted in the same position
    meaning: SO:1000036
    is_a: SEQUENCE_ALTERATION
  DUPLICATION:
    text: DUPLICATION
    description: One or more nucleotides are added between two adjacent nucleotides
    meaning: SO:1000035
    is_a: SEQUENCE_ALTERATION
  SUBSTITUTION:
    text: SUBSTITUTION
    description: A sequence alteration where one nucleotide is replaced by another
    meaning: SO:1000002
    is_a: SEQUENCE_ALTERATION
  ORIGIN_OF_REPLICATION:
    text: ORIGIN_OF_REPLICATION
    description: The origin of replication; starting site for duplication of a nucleic
      acid molecule
    meaning: SO:0000296
    is_a: BIOLOGICAL_REGION
  POLYC_TRACT:
    text: POLYC_TRACT
    description: A sequence of Cs
    is_a: REGION
  GAP:
    text: GAP
    description: A gap in the sequence
    meaning: SO:0000730
    is_a: REGION
  ASSEMBLY_GAP:
    text: ASSEMBLY_GAP
    description: A gap between two sequences in an assembly
    meaning: SO:0000730
    is_a: GAP
    title: gap
  CHROMOSOME:
    text: CHROMOSOME
    description: Structural unit composed of DNA and proteins
    meaning: SO:0000340
    is_a: REGION
  SUPERCONTIG:
    text: SUPERCONTIG
    description: One or more contigs that have been ordered and oriented using end-read
      information
    meaning: SO:0000148
    is_a: REGION
  CONTIG:
    text: CONTIG
    description: A contiguous sequence derived from sequence assembly
    meaning: SO:0000149
    is_a: REGION
  SCAFFOLD:
    text: SCAFFOLD
    description: One or more contigs that have been ordered and oriented
    meaning: SO:0000148
    is_a: REGION
    title: supercontig
  CLONE:
    text: CLONE
    description: A piece of DNA that has been inserted into a vector
    meaning: SO:0000151
    is_a: REGION
  PLASMID:
    text: PLASMID
    description: A self-replicating circular DNA molecule
    meaning: SO:0000155
    is_a: REGION
  POLYPEPTIDE:
    text: POLYPEPTIDE
    description: A sequence of amino acids linked by peptide bonds
    meaning: SO:0000104
    is_a: REGION
  MATURE_PROTEIN_REGION:
    text: MATURE_PROTEIN_REGION
    description: The polypeptide sequence that remains after post-translational processing
    meaning: SO:0000419
    is_a: POLYPEPTIDE
  SIGNAL_PEPTIDE:
    text: SIGNAL_PEPTIDE
    description: A peptide region that targets a polypeptide to a specific location
    meaning: SO:0000418
    is_a: POLYPEPTIDE
  TRANSIT_PEPTIDE:
    text: TRANSIT_PEPTIDE
    description: A peptide that directs the transport of a protein to an organelle
    meaning: SO:0000725
    is_a: POLYPEPTIDE
  PROPEPTIDE:
    text: PROPEPTIDE
    description: A peptide region that is cleaved during maturation
    meaning: SO:0001062
    is_a: POLYPEPTIDE
    title: propeptide
  OPERON:
    text: OPERON
    description: A group of contiguous genes transcribed as a single unit
    meaning: SO:0000178
    is_a: BIOLOGICAL_REGION
  STEM_LOOP:
    text: STEM_LOOP
    description: A double-helical region formed by base-pairing between adjacent sequences
    meaning: SO:0000313
    is_a: REGION
  D_LOOP:
    text: D_LOOP
    description: Displacement loop; a region where DNA is displaced by an invading
      strand
    meaning: SO:0000297
    is_a: REGION
  MATCH:
    text: MATCH
    description: A region of sequence similarity
    meaning: SO:0000343
    is_a: REGION
  CDNA_MATCH:
    text: CDNA_MATCH
    description: A match to a cDNA sequence
    meaning: SO:0000689
    is_a: MATCH
  EST_MATCH:
    text: EST_MATCH
    description: A match to an EST sequence
    meaning: SO:0000668
    is_a: MATCH
  PROTEIN_MATCH:
    text: PROTEIN_MATCH
    description: A match to a protein sequence
    meaning: SO:0000349
    is_a: MATCH
  NUCLEOTIDE_MATCH:
    text: NUCLEOTIDE_MATCH
    description: A match to a nucleotide sequence
    meaning: SO:0000347
    is_a: MATCH
  JUNCTION_FEATURE:
    text: JUNCTION_FEATURE
    description: A boundary or junction between sequence regions
    meaning: SO:0000699
    is_a: BIOLOGICAL_REGION
    title: junction
  SPLICE_SITE:
    text: SPLICE_SITE
    description: The position where an intron is excised
    meaning: SO:0000162
    is_a: JUNCTION_FEATURE
  FIVE_PRIME_SPLICE_SITE:
    text: FIVE_PRIME_SPLICE_SITE
    description: The 5' splice site (donor site)
    meaning: SO:0000163
    is_a: SPLICE_SITE
    title: five_prime_cis_splice_site
  THREE_PRIME_SPLICE_SITE:
    text: THREE_PRIME_SPLICE_SITE
    description: The 3' splice site (acceptor site)
    meaning: SO:0000164
    is_a: SPLICE_SITE
    title: three_prime_cis_splice_site
  START_CODON:
    text: START_CODON
    description: The first codon to be translated
    meaning: SO:0000318
    is_a: BIOLOGICAL_REGION
  STOP_CODON:
    text: STOP_CODON
    description: The codon that terminates translation
    meaning: SO:0000319
    is_a: BIOLOGICAL_REGION
  CENTROMERE:
    text: CENTROMERE
    description: A region where chromatids are held together
    meaning: SO:0000577
    is_a: BIOLOGICAL_REGION
  TELOMERE:
    text: TELOMERE
    description: The terminal region of a linear chromosome
    meaning: SO:0000624
    is_a: BIOLOGICAL_REGION
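
The `is_a` entries in the source above form a single-parent hierarchy rooted at `REGION`, so questions such as "is an `RRNA` a kind of `TRANSCRIPT`?" can be answered by walking parents upward. The following Python sketch illustrates this under the assumption that the links are copied by hand into a plain dictionary; `IS_A`, `ancestors` and `is_kind_of` are illustrative names, and only a few of the links are included.

```python
# Subset of is_a links copied from the LinkML source above (child -> parent).
IS_A = {
    "BIOLOGICAL_REGION": "REGION",
    "TRANSCRIPT": "BIOLOGICAL_REGION",
    "NCRNA": "TRANSCRIPT",
    "RRNA": "NCRNA",
    "MRNA": "TRANSCRIPT",
    "SEQUENCE_ALTERATION": "REGION",
    "DELETION": "SEQUENCE_ALTERATION",
}


def ancestors(value, is_a=IS_A):
    """Return the parents of `value` up to the root, nearest first."""
    chain = []
    seen = {value}
    while value in is_a:
        value = is_a[value]
        if value in seen:
            # Guard against a malformed hierarchy with a loop.
            raise ValueError(f"cycle in is_a links at {value}")
        seen.add(value)
        chain.append(value)
    return chain


def is_kind_of(value, ancestor, is_a=IS_A):
    """True if `value` equals `ancestor` or descends from it."""
    return value == ancestor or ancestor in ancestors(value, is_a)


if __name__ == "__main__":
    print(ancestors("RRNA"))
    print(is_kind_of("RRNA", "TRANSCRIPT"))
    print(is_kind_of("DELETION", "BIOLOGICAL_REGION"))
```

Because each value has at most one `is_a` parent in this enum, a simple loop is enough; a hierarchy with multiple parents would need a graph traversal instead.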