Enum: DatasetSplitType

Standard dataset split types used in machine learning for training, validation, and testing. These splits are fundamental to ML model development and evaluation workflows.

Permissible Values

Value	Description	Aliases	Purpose	Typical Size
TRAIN	Training split used for model learning		model training	60-80% of data
VALIDATION	Validation split used for hyperparameter tuning and model selection	val, dev	model tuning	10-20% of data
TEST	Test split used for final model evaluation		model evaluation	10-20% of data
ALL	Complete dataset without splits
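
The values and aliases in the table above could be mirrored in application code roughly as follows. This is a minimal sketch, not the code LinkML generates for this schema: the Python class, the `_ALIASES` mapping, and the `parse_split` helper are hypothetical names introduced for illustration, and case-insensitive matching is an assumption the schema does not state.

```python
from enum import Enum


class DatasetSplitType(Enum):
    """Hypothetical Python mirror of the DatasetSplitType permissible values."""

    TRAIN = "TRAIN"
    VALIDATION = "VALIDATION"
    TEST = "TEST"
    ALL = "ALL"


# Aliases copied from the "Aliases" column; only VALIDATION lists any.
_ALIASES = {
    "val": DatasetSplitType.VALIDATION,
    "dev": DatasetSplitType.VALIDATION,
}


def parse_split(label: str) -> DatasetSplitType:
    """Resolve a value name or alias to a split type.

    Matching ignores case and surrounding whitespace (an assumption of
    this sketch, not something the schema specifies).
    """
    key = label.strip()
    alias = _ALIASES.get(key.lower())
    if alias is not None:
        return alias
    try:
        return DatasetSplitType[key.upper()]
    except KeyError:
        raise ValueError(f"unknown dataset split: {label!r}") from None
```

With this helper, `parse_split("dev")` and `parse_split("validation")` both resolve to `DatasetSplitType.VALIDATION`, while an unrecognized label raises `ValueError`.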

Identifier and Mapping Information

Schema Source

from schema: https://w3id.org/valuesets

LinkML Source

name: DatasetSplitType
instantiates:
- valuesets_meta:ValueSetEnumDefinition
description: 'Standard dataset split types used in machine learning for training,

  validation, and testing. These splits are fundamental to ML model

  development and evaluation workflows.

  '
title: Dataset Split Type
from_schema: https://w3id.org/valuesets
contributors:
- orcid:0000-0002-6601-2165
- https://github.com/anthropics/claude-code
status: DRAFT
rank: 1000
permissible_values:
  TRAIN:
    text: TRAIN
    description: Training split used for model learning
    annotations:
      typical_size:
        tag: typical_size
        value: 60-80% of data
      purpose:
        tag: purpose
        value: model training
  VALIDATION:
    text: VALIDATION
    description: Validation split used for hyperparameter tuning and model selection
    annotations:
      typical_size:
        tag: typical_size
        value: 10-20% of data
      purpose:
        tag: purpose
        value: model tuning
      aliases:
        tag: aliases
        value: val, dev
  TEST:
    text: TEST
    description: Test split used for final model evaluation
    annotations:
      typical_size:
        tag: typical_size
        value: 10-20% of data
      purpose:
        tag: purpose
        value: model evaluation
  ALL:
    text: ALL
    description: Complete dataset without splits
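
To illustrate how the `typical_size` annotations might translate into practice, the sketch below partitions record indices into the four split types. It is an assumption-laden example rather than part of the schema: the `split_indices` function, its parameters, and the default 70/15/15 proportions are hypothetical choices that merely fall inside the annotated ranges.

```python
import random


def split_indices(n, train_frac=0.7, validation_frac=0.15, seed=0):
    """Shuffle range(n) and partition it by split type.

    Hypothetical helper: the default fractions are one choice within the
    60-80% / 10-20% / 10-20% ranges given by the typical_size annotations.
    """
    if train_frac <= 0 or validation_frac < 0 or train_frac + validation_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    indices = list(range(n))
    # A seeded, local generator keeps the split reproducible.
    random.Random(seed).shuffle(indices)

    n_train = round(n * train_frac)
    n_val = round(n * validation_frac)

    return {
        "TRAIN": indices[:n_train],
        "VALIDATION": indices[n_train:n_train + n_val],
        # TEST receives whatever remains after TRAIN and VALIDATION.
        "TEST": indices[n_train + n_val:],
        # ALL is the complete dataset without splits.
        "ALL": sorted(indices),
    }
```

The three partitions are disjoint and together cover every index, so `ALL` always equals their union. Fixing `seed` makes the assignment repeatable across runs.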