A system for managing changes to data schemas and definitions using meaningful version numbers.
Semantic versioning for data (DataVer) applies the principles of software semantic versioning (SemVer) to data schemas, metric definitions, and ontologies. Just as software uses MAJOR.MINOR.PATCH version numbers to communicate the nature of changes, data semantic versioning assigns version numbers that indicate whether a change is breaking (MAJOR — changes the meaning of existing data), additive (MINOR — adds new fields or metrics), or corrective (PATCH — fixes errors without changing semantics).
As AI systems increasingly depend on specific data definitions and schemas, unversioned changes to data structures can cause silent failures and incorrect outputs. Data semantic versioning is increasingly adopted in mature data organizations because it lets AI systems declare their data dependencies precisely and receive alerts when breaking changes occur. Tools such as dbt, Atlan, and Collibra offer versioning features for data assets, although how closely each follows the MAJOR.MINOR.PATCH scheme varies by tool.
Data semantic versioning is implemented by tagging data assets (tables, metrics, ontology classes) with version numbers and maintaining a changelog. Breaking changes (renaming a column, changing a metric calculation, removing a field) increment the MAJOR version. Additive changes (adding a new column, adding a new metric) increment MINOR. Corrections increment PATCH. Downstream consumers declare version dependencies, and automated systems alert when incompatible changes are published.
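The bump rules above can be sketched in a few lines of Python. This is a minimal illustration, not the API of any particular tool: the `Change` enum and the `bump` function are hypothetical names introduced here, and versions are assumed to be plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffix.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical classification of a change to a data asset."""
    BREAKING = "major"    # e.g. rename a column, change a metric calculation, remove a field
    ADDITIVE = "minor"    # e.g. add a new column or a new metric
    CORRECTIVE = "patch"  # e.g. fix an error without changing semantics


def bump(version: str, change: Change) -> str:
    """Return the next version string for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.BREAKING:
        # A MAJOR bump resets MINOR and PATCH to zero.
        return f"{major + 1}.0.0"
    if change is Change.ADDITIVE:
        # A MINOR bump resets PATCH to zero.
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(bump("2.1.3", Change.BREAKING))    # 3.0.0
print(bump("2.1.3", Change.ADDITIVE))    # 2.2.0
print(bump("2.1.3", Change.CORRECTIVE))  # 2.1.4
```

In practice, deciding which category a change falls into is the hard part and usually needs human review; the arithmetic itself is trivial once the category is known.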
A data team changes the definition of 'revenue' from gross revenue to net revenue. This is a breaking change, so the metric version increments from 2.1.3 to 3.0.0. All downstream AI models and dashboards that declared a dependency on revenue@^2.x.x (any 2.x release) receive automated alerts. The team can see exactly which AI systems will be affected before deploying the change, preventing silent failures in production.
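The dependency check in this example could look roughly like the sketch below. It assumes npm-style caret semantics for major versions of 1 or higher (a published version satisfies `^X.Y.Z` when it has the same MAJOR number and is not older than `X.Y.Z`). The consumer names, the `DEPENDENCIES` registry, and the helper functions are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_caret(version: str, floor: str) -> bool:
    """True if `version` falls inside the caret range ^floor (assumes MAJOR >= 1)."""
    v, f = parse(version), parse(floor)
    return v[0] == f[0] and v >= f


# Hypothetical registry: consumer name -> lowest revenue version it accepts,
# read as a caret range (so "2.0.0" means revenue@^2.0.0).
DEPENDENCIES: Dict[str, str] = {
    "churn_model": "2.0.0",
    "finance_dashboard": "2.1.0",
}


def affected_consumers(new_version: str, dependencies: Dict[str, str]) -> List[str]:
    """List the consumers whose declared range excludes the new version."""
    return sorted(
        name
        for name, floor in dependencies.items()
        if not satisfies_caret(new_version, floor)
    )


print(affected_consumers("3.0.0", DEPENDENCIES))  # ['churn_model', 'finance_dashboard']
print(affected_consumers("2.2.0", DEPENDENCIES))  # []
```

Running a check like this before the change is published is what lets the team list affected systems in advance; an additive 2.2.0 release would alert no one, while 3.0.0 flags every consumer pinned to the 2.x line.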