Skip to content

Logical Meta-Data Annotations Vacabulary⚓︎

The Logical Meta-Data Annotations Vacabulary, defined as an extension of the SAS, provides commonly used metadata for annotating schemas at a logical level. The keywords in this vocabulary do not include, and will not include in the future, any information about the underlying datastore or details for schema or data validation.

The current URI for this vocabulary, known as the Logical Meta-Data Annotations Vacabulary, is: https://dpds.opendatamesh.org/specifications/sas/1.0.0-DRAFT/vocab/meta-data-logical.

The current URI for the corresponding meta-schema is: https://dpds.opendatamesh.org/specifications/sas/1.0.0/meta/meta-data-logical.

Keywords applicable to schema or properties⚓︎

summary⚓︎

The value of this keyword MUST be a string.

The summary keyword specifies a short human readable description of the element. It SHOULD be used by frontend tools to visualize the item description in lists or tooltips where there is not enough space for using the full description.

Example-1:

JSON
{
  "title": "Leg",
  "summary": "The association between Trasport Units and Trips",
  "description": "A **Leg** is the association between a `Transport Unit` and a specific `Trip`. It represents how individual `Transport Units` are moved during a particular segment of their journey. Multiple  `Leg` can exist for a `Transport Unit` if it is moved across different `Trips` before reaching its final destination. Similarly, a single `Trip` can involve `Transport Units` from multiple `Transport Orders`.",
}

modelRole⚓︎

The value of this keyword MUST be a string

The modellingStyle keyword specifies the role of a schema element within its specific modelling style, helping to clarify its function in the context of the chosen model. For example, if the modellingStyle is set to starSchema, the schema might define an entity with a modelRole of either fact or dimension. In this case, the properties of the schema can have a modelRole of either attribute or measure. On the other hand, if the modellingStyle is set to rawDataVault, the schema might define an entity with a modelRole of hub, satellite, or link. This distinction helps clarify the purpose of each element within the overall model.

tags⚓︎

The value of this keyword MUST be an array of string

The tags keyword specifies a list of tags associated to the element.

This keyword is equivalent to: - bitol.schema.tags

externalDocs⚓︎

The value of this keyword MUST be an array of object

The externalDocs keyword specifies a list of additional documentation for the given element. Each item in the list is a pointer to a specific documentation source described as follow:

Field Name Type Description
description string A description of the target resource. CommonMark syntax MAY be used for rich text representation.
mediaType string The media type of target resource. It must conform to media type format, according to RFC6838.
$href string:uri REQUIRED. The URI of the target resource. It must conform to the URI format, according to RFC3986.

Example-1:

JSON
"externalDocs": [{
  "description": "Find more info here",
  "mediaType": "text/html",
  "$href": "https://example.com"
}]

This keyword can be maped to: - bitol.schema.primaryKey

Keywords applicable only to schema⚓︎

owner⚓︎

The value of this keyword MUST be a string.

The owner keyword specifies the identifier of the subject who owns the schema. It SHOULD be a person or a team. If the schema is not shared, it MUST be equal to the owner of the dataset upon which the schema is defined.

This keyword is equivalent to: - bitol.schema.owner

domain⚓︎

The value of this keyword MUST be a string.

The domain keyword specifies the domain to which the dataset described by the schema belongs. If the schema is not shared, it MUST be equal to the domain of the dataset upon which the schema is defined.

This keyword is equivalent to: - bitol.schema.domain

schemaType⚓︎

The value of this keyword MUST be a string.

The schemaType keyword specifies the structure of the data described by the schema. It indicates whether the data is organized in a tabular format, typical of relational databases (e.g., tables with rows and columns), or in a nested document format, common in document-oriented databases, streaing platforms and RESTFUL API (e.g., JSON or XML with hierarchical relationships). The possible values for this property are:

value description
tabular When the schema doesn't contain properties of type object (i.e. the schema describe a tabular document)
document When the schema contains properties of type object (i.e. the schema describe a neasted document)

Example-1

JSON
{
    "title": "transportOrder",
  "schemaType": "tabular",
    "type": "object",
    "properties": {
      "orderId": {"type": "integer"},
      "customerName": {"type": "string"},
      "orderDate": {"type": "string"},
      "deliveryDate": {"type": "string"},
      "destination": { "type": "string"},
      "orderStatus": {"type": "string" }
    }
  }

Example-2

JSON
{
    "title": "transportOrderDetail",
  "schemaType": "document",
    "type": "object",
    "properties": {
      "orderId": {"type": "integer"},
      "customerName": {"type": "string"},
      "orderDate": {"type": "string"},
      "deliveryDate": {"type": "string"},
      "destination": { "type": "string"},
      "orderStatus": {"type": "string" },
    "transportUnits": {
      "type": "array",
          "description": "List of transport units associated with the transport order.",
          "items": {
            "type": "object",
            "properties": {
                "unitId": {"type": "integer"},
                "unitDescription": {"type": "string"}
            }
        }
      }
  }
}

modellingStyle⚓︎

The value of this keyword MUST be a string.

The modellingStyle keyword specifies the data modelling approach or framework within which the structure of the entity described by this schema is defined. This could include frameworks like starSchema, rawDataVault, or unifiedStarSchema, each with its own set of rules for how data is organized, related, and queried. Understanding the modelling style is useful for better contextualizing the schema's structure and gaining a clearer understanding of the roles and relationships of other elements defined within it (see modelRole keyword).

contactPoints⚓︎

The value of this keyword MUST be an array of object.

The contactPoints keyword specifies a list of contact information for the given schema. Each item in the list is a valid contact point described as follow:

Field Name Type Description
name string:name The name of the contact point.
description string The contact point description. CommonMark syntax MAY be used for rich text representation.
channel string The channel used to address the contact point. It can be for example equal to web, mail, or phone.
address string The address of the contact point. Depending on the channel it can be for example a URL, an email address, or a phone number.

Example-1

JSON
{
  "contactPoints": [{
    "name": "Support Team Mail",
      "description": "The mail address of the team that gives support on this product",
      "channel": "email",
      "address": "trip-execution-support@company-xyz.com"
  }, {
      "name": "Issue Tracker",
      "description": "The address of the issue tracker associated with this product",
      "channel": "web",
      "address": "https://readmine.company-xyz.com/trip-execution"
    }
  ]
}

This keyword cona be mapped to:

status⚓︎

The value of this keyword MUST be a string

The status keyword specifies the state of the schema, which MAY be development, test, or production, depending on how the schema's lifecycle is defined.

This keyword is equivalent to:

Keywords applicable only to properties⚓︎

primaryKey⚓︎

The value of this keyword MUST be a boolean.

When set to true, the primaryKey keyword indicates that the property is part of the dataset's primary key.

The default value is false.

This keyword is equivalent to:

primaryKeyPosition⚓︎

The value of this keyword MUST be a integer.

When the property is part of the primary key, the primaryKeyPosition keyword specifies the position of the property within th primary key. Starts from 1.

The default value is -1.

Example-1: Given the table TRIP with a composite primary key defined as follow PRIMARY KEY (UNIT_ID, TRIP_ID, LEG_SEQUENCE) the annotation of the properties in the schema will be as follow:

JSON
  {
    "properties": {
      "UNIT_ID": {
        "type": "string",
        "primaryKey": true,
        "primaryKeyPosition": 1
      },
      "TRIP_ID": {
        "type": "string",
        "primaryKey": true,
        "primaryKeyPosition": 2
      },
      "LEG_SEQUENCE": {
        "type": "integer",
        "primaryKey": true,
        "primaryKeyPosition": 3
      }
    }
  }

This keyword is equivalent to: - bitol.schema.primaryKeyPosition

unique⚓︎

The value of this keyword MUST be a boolean.

The unique keyword specifies whether two distinct entries in the dataset can have the same value for this property.

The default value is false.

Example-1:

JSON
{
  "title": "TransportOrder",

 "type": "object",
  "properties": {
    "orderId": { "type": "integer", "unique": true},
    "customerName": { "type": "string"}
  },
}

This keyword is equivalent to: - bitol.schema.unique

nullable⚓︎

The value of this keyword MUST be a boolean.

The nullable keyword specifies if the property value can be null. The value of a property is considered null if the property is not defined or its value is: - undefined - contained in the list of values specified by the nullValues keyword - matched by the regular expression specified by the nullValuePattern keyword

The default value is true.

Example-1:

JSON
{
  "title": "TransportOrder",

 "type": "object",
  "properties": {
    "orderId": { "type": "integer"},
    "customerName": {"type": "string", "nullable": false},
    "orderDate": {"type": "string"},
    "deliveryDate": {"type": "string"},
    "destination": {"type": "string"},
    "orderStatus": {"type": "string"}
  },
}

This keyword is the opposite of: - bitol.schema.required

nullValuesEnum⚓︎

The value of this keyword MUST be an array whose item type is equal to the type of the property.

The nullValuesEnum keyword specifies a list of values for the property that can be considere as null.

The default value is [].

Example-1:

JSON
{
  "title": "TransportOrder",

 "type": "object",
  "properties": {
    "orderId": { "type": "integer"},
    "customerName": {
        "type": "string", 
        "nullable": false,
        "nullValuesEnum": ["UNKNOWN", "TBD", "NA"]
    }
  },
}

nullValuesPattern⚓︎

The value of this keyword MUST be a string. This string SHOULD be a valid regular expression, according to the ECMA-262 regular expression dialect.

The nullValuesPattern keyword specifies the regular expression matched by property null values

The default value is ``.

Example-1:

JSON
{
  "title": "TransportOrder",

  "type": "object",
  "properties": {
    "orderId": { "type": "integer"},
    "customerName": {
      "type": "string", 
      "nullable": false,
      "nullValuesPattern": "^\\s*(UNKNOWN|TBD|NA)?\\s*$"
    }
  }
}

validityTime⚓︎

The value of this keyword MUST be a boolean.

When set to true, the validityTime keyword indicates that this property value represents when a fact is true in the real world. For example, a dataset's entry showing the employment status of an employee might have a valid time indicating when that employment status was valid in reality.

The default value is false.

creationTime⚓︎

The value of this keyword MUST be a boolean.

When set to true, the creationTime keyword indicates that this property value represents when a fact is recorded in the system. It's the transaction time of the entry's creation.

The default value is false.

lastUpdateTime⚓︎

The value of this keyword MUST be a boolean.

When set to true, the lastUpdateTime keyword indicates that this property value represents when a fact is recorded or updated in the system. It's the transaction time of the entry's last update.

The default value is false.

deletionTime⚓︎

The value of this keyword MUST be a boolean.

When set to true, the deletionTime keyword indicates that this property value represents when a fact is soft deletaed from the system. It's the transaction time of the entry's delation.

The default value is false.

sequenceKey⚓︎

The value of this keyword MUST be a boolean.

When set to true, the sequenceKey keyword indicates that this property can be used to order the dataset's entries from the oldest to the most recent, based on creation transaction time. The property is updated with each entry's change and is typically a timestamp or an incremental key.

The default value is true if keywords creationTime set to true for this property, false otherwise.

watermarkKey⚓︎

The value of this keyword MUST be a boolean.

When set to true, the watermarkKey keyword indicates that this property can be used to order the dataset's entries from the oldest to the most recent, based on the last update transaction time. The property is updated with each entry's change and is typically a timestamp or an incremental key.

The default value is true if keywords lastUpdateTime set to true for this property, false otherwise.