Enum: DatasetSplitType

Standard dataset split types used in machine learning for training, validation, and testing. These splits are fundamental to ML model development and evaluation workflows.

URI: valuesets:DatasetSplitType

Permissible Values

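| Value      | Meaning | Description                                                         |
|------------|---------|---------------------------------------------------------------------|
| TRAIN      | None    | Training split used for model learning                              |
| VALIDATION | None    | Validation split used for hyperparameter tuning and model selection |
| TEST       | None    | Test split used for final model evaluation                          |
| ALL        | None    | Complete dataset without splits                                     |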
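
The permissible values above could be mirrored in application code roughly as follows. This is a minimal hand-written sketch, not the output of a LinkML code generator; the helper name `parse_split` is hypothetical and exists only for illustration.

```python
from enum import Enum


class DatasetSplitType(str, Enum):
    """Hand-written mirror of the permissible values listed above."""

    TRAIN = "TRAIN"
    VALIDATION = "VALIDATION"
    TEST = "TEST"
    ALL = "ALL"


def parse_split(label: str) -> DatasetSplitType:
    """Map a case-insensitive label to a member (hypothetical helper)."""
    try:
        # Look the member up by name after trimming and upper-casing.
        return DatasetSplitType[label.strip().upper()]
    except KeyError:
        raise ValueError(
            f"not a permissible DatasetSplitType value: {label!r}"
        ) from None
```

Calling `parse_split(" train ")` returns `DatasetSplitType.TRAIN`, while any label outside the four values raises `ValueError`, so unexpected split names fail early rather than propagating through a pipeline.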

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/linkml/valuesets

LinkML Source

name: DatasetSplitType
description: 'Standard dataset split types used in machine learning for training,

  validation, and testing. These splits are fundamental to ML model

  development and evaluation workflows.

  '
from_schema: https://w3id.org/linkml/valuesets
rank: 1000
permissible_values:
  TRAIN:
    text: TRAIN
    description: Training split used for model learning
    annotations:
      typical_size:
        tag: typical_size
        value: 60-80% of data
      purpose:
        tag: purpose
        value: model training
  VALIDATION:
    text: VALIDATION
    description: Validation split used for hyperparameter tuning and model selection
    annotations:
      typical_size:
        tag: typical_size
        value: 10-20% of data
      purpose:
        tag: purpose
        value: model tuning
      aliases:
        tag: aliases
        value: val, dev
  TEST:
    text: TEST
    description: Test split used for final model evaluation
    annotations:
      typical_size:
        tag: typical_size
        value: 10-20% of data
      purpose:
        tag: purpose
        value: model evaluation
  ALL:
    text: ALL
    description: Complete dataset without splits
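
The `typical_size` and `aliases` annotations above are descriptive metadata; nothing in the schema enforces them. As a rough illustration of how they might be applied, the sketch below partitions a dataset using fractions chosen from within the annotated ranges and normalizes the `val` and `dev` aliases to `VALIDATION`. The function names, the 70/15/15 default, and the fixed seed are assumptions made for this example, not part of the schema.

```python
import random
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")

# Aliases taken from the VALIDATION annotations in the source above.
ALIASES = {"val": "VALIDATION", "dev": "VALIDATION"}


def normalize_split(label: str) -> str:
    """Return the canonical split name for a label or alias (hypothetical helper)."""
    key = label.strip().lower()
    return ALIASES.get(key, key.upper())


def split_dataset(
    items: Sequence[T],
    train_frac: float = 0.7,        # assumed; within the 60-80% range
    validation_frac: float = 0.15,  # assumed; within the 10-20% range
    seed: int = 0,
) -> Dict[str, List[T]]:
    """Shuffle items and partition them into TRAIN, VALIDATION, and TEST."""
    if train_frac < 0 or validation_frac < 0 or train_frac + validation_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * validation_frac)
    return {
        "TRAIN": shuffled[:n_train],
        "VALIDATION": shuffled[n_train:n_train + n_val],
        "TEST": shuffled[n_train + n_val:],  # remainder goes to TEST
        "ALL": shuffled,                     # complete dataset, unsplit
    }
```

With the defaults, `split_dataset(range(100))` yields 70 training, 15 validation, and 15 test items, and the three partitions are disjoint. Stratified or time-based splitting, which many workflows need, is outside the scope of this sketch.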