5 changes: 0 additions & 5 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,9 +1,4 @@
.DS_Store
.docusaurus
node_modules/
docs/_examples/
docs/_schema/
docs/schema/
docs/schema
static/_schema/
build/
9 changes: 9 additions & 0 deletions .markdownlint.json
@@ -0,0 +1,9 @@
{
"default": true,
"MD013": false,
"MD033": false,
"MD041": false,
"MD024": {
"siblings_only": true
}
}
6 changes: 6 additions & 0 deletions .prettierignore
@@ -0,0 +1,6 @@
build/
.docusaurus/
node_modules/
package-lock.json
*.md
*.mdx
10 changes: 10 additions & 0 deletions .prettierrc
@@ -0,0 +1,10 @@
{
"semi": true,
"singleQuote": true,
"tabWidth": 2,
"trailingComma": "es5",
"printWidth": 100,
"bracketSpacing": true,
"arrowParens": "always",
"proseWrap": "preserve"
}
69 changes: 69 additions & 0 deletions docs/schema/index.md
@@ -0,0 +1,69 @@
---
title: Overview
slug: /schema

# This page is available at docs.overturemaps.org/schema
---
import CodeBlock from '@theme/CodeBlock';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

### Top-level properties

In the Overture schema, all features have a unique `id` called a [GERS ID](https://docs.overturemaps.org/gers/), a `geometry` object that follows the OGC geometry specification, and other top-level properties.

<details>
<summary>**GeoParquet columns for top-level Overture properties**</summary>
| column_name | column_type | description |
| --- | --- | --- |
| **id** | *string* | an Overture feature's unique id, part of the Global Entity Reference System (GERS) |
| **geometry** | *binary* | well-known binary (WKB) representation of the feature geometry |
| **bbox** | *struct\<xmin: float, xmax: float, ymin: float, ymax: float\>* | area defined by two longitudes and two latitudes: latitude is a decimal number between -90.0 and 90.0; longitude is a decimal number between -180.0 and 180.0. |
| **theme** | *string* | one of six Overture data themes |
| **type** | *string* | one of 14 Overture feature types |
| **version** | *int32* | version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed |
| **sources** | *list\<element: struct\<property: string, dataset: string, record_id: string, update_time: string, confidence: double, between: list\<double\>\>\>* | array of source information for the properties of a given feature |
</details>
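
As a rough illustration of the columns listed above, the following Python sketch checks that a record carries each top-level property and that its `bbox` stays within the stated longitude and latitude ranges. The helper name `check_top_level` and the sample values are hypothetical and are not part of any Overture tooling.

```python
# Hypothetical helper for illustration; not part of any Overture tooling.
TOP_LEVEL_COLUMNS = ("id", "geometry", "bbox", "theme", "type", "version", "sources")


def check_top_level(record: dict) -> list:
    """Return a list of problems found in a record's top-level properties."""
    problems = ["missing column: " + name for name in TOP_LEVEL_COLUMNS if name not in record]
    bbox = record.get("bbox")
    if bbox:
        # Longitude must lie in [-180, 180] and latitude in [-90, 90].
        if not all(-180.0 <= bbox[k] <= 180.0 for k in ("xmin", "xmax")):
            problems.append("bbox longitude out of range")
        if not all(-90.0 <= bbox[k] <= 90.0 for k in ("ymin", "ymax")):
            problems.append("bbox latitude out of range")
    return problems


# Illustrative values only; the geometry bytes are elided.
sample_record = {
    "id": "416ab01c-d836-4c4f-aedc-2f30941ce94d",
    "geometry": b"",  # WKB bytes elided in this sketch
    "bbox": {"xmin": -176.5638, "xmax": -176.5637, "ymin": -43.9472, "ymax": -43.9471},
    "theme": "addresses",
    "type": "address",
    "version": 1,
    "sources": None,
}

print(check_top_level(sample_record))
```

An empty list means no problems were found. The check treats a column that is present but null (such as `sources` here) as acceptable, which is an assumption of this sketch rather than a rule stated by the schema.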

### Other key schema properties

Most but not all of the feature types in the Overture schema require data for the `names`, `subtype`, and `class` properties. The `names` property is complex enough to have its own schema.

### Properties may be specific to a feature type

Some properties in the Overture schema are only populated with data for specific feature types. For example, the `place` feature type must include data for the `categories` property, as required by the schema. The `division_area` and `address` feature types require the `country` property to be populated with ISO 3166-1 alpha-2 country codes. The `segment` feature type in the transportation theme is the only feature type that includes data for a complex set of properties that describe roads. The reference section of this documentation digs into the details of these complexities.

## Schema conventions

### Notations

#### Snake case

We use snake case instead of camel case for all property names, string enumeration members, and string-valued enumeration equivalents. We do this because databases and query engines differ in how they handle and transform letter case. For example, Athena/Trino lowercases everything, so the word boundaries in camel case property names are lost; in contrast, snake case passes through unchanged.
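
The effect can be sketched in a few lines of Python. The `downcase` function below only mimics an engine that lowercases identifiers, and `hasStreetLights` is a hypothetical camel case spelling used for contrast.

```python
def downcase(name: str) -> str:
    """Mimic a query engine that lowercases every identifier."""
    return name.lower()


camel = "hasStreetLights"    # hypothetical camel case spelling
snake = "has_street_lights"  # snake case, as used in the schema

print(downcase(camel))             # hasstreetlights
print(downcase(snake))             # has_street_lights
print(downcase(snake).split("_"))  # ['has', 'street', 'lights']
```

After lowercasing, the camel case name can no longer be split back into its words, while the snake case name keeps its underscores.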

#### Booleans

Boolean properties have a prefix verb "is" or "has" in a way that grammatically makes sense, e.g. `has_street_lights=true` and `is_accessible=false`.

### Measurements

<!-- add to the docs: if we're using both feet and meters in measurements, what's the best way to determine the unit of measure? the schema, presumably, but also the bounding box of the data?
-->

Measurements of real-world objects and features, such as heights, widths, and lengths, follow [The International System of Units (SI)](https://www.bipm.org/en/publications/si-brochure). In the Overture schema, these values are provided as scalar numeric values without an accompanying unit such as feet or meters. Overture does this to maximize consistency and predictability.

Quantities specified in regulatory rules, norms, and customs follow local specifications wherever possible. In the schema, these values are provided as two-element arrays where the first element is the scalar numeric value and the second element is the unit. Overture uses local units of measurement: feet in the United States and meters in the EU, for example. The exact unit is confirmed in the specification of the property but is not repeated in the data.

### Regulations and restrictions

All quantities that relate to posted or ordinance regulations and restrictions are expressed in the same units as used in the regulation. The unit is explicitly included with the property in the data.
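
The two conventions can be contrasted with a small Python sketch. It assumes a regulatory quantity arrives as a two-element `[value, unit]` array as described above; the `Quantity` type, the `parse_regulatory` helper, and the sample values are hypothetical.

```python
from typing import NamedTuple


class Quantity(NamedTuple):
    """A regulatory quantity that carries its own unit."""
    value: float
    unit: str


def parse_regulatory(pair) -> Quantity:
    """Unpack a two-element [value, unit] array into a Quantity."""
    value, unit = pair
    return Quantity(float(value), str(unit))


# Physical measurement: a bare scalar; its unit is defined by the schema, not the data.
height = 12.5

# Regulatory quantity (hypothetical sample): the unit travels with the value.
speed_limit = parse_regulatory([55, "mph"])

print(height)
print(speed_limit.value, speed_limit.unit)
```

Keeping the unit next to regulatory values avoids converting a posted limit into another unit system and losing the exact figure that appears in the regulation.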

### Opening hours and validity periods

Opening hours, and the time frames during which time-dependent properties apply, follow the [OSM Opening Hours specification](https://wiki.openstreetmap.org/wiki/Key:opening_hours/specification).

<!-- This is not yet true
### Extensions

Overture allows for ad hoc extensions beyond what is described in the schema. All extensions are prefixed with `ext_`. Extensions can be provided at the theme level, the type level, or the property level.
-->
16 changes: 16 additions & 0 deletions docs/schema/reference/Names/name_rule.md
@@ -0,0 +1,16 @@
# NameRule

Name rule with variant and language specification.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `side` | `string` ([Side](/schema/codegen/Names/name_rule/side/)) (optional) | Examples: `left`, `right` |
| `between` | `list<float64>` (optional) | |
| `value` | `string` | |
| `variant` | `string` ([NameVariant](/schema/codegen/Names/name_rule/name_variant/)) | Examples: `common`, `official`, `alternate`, ... |
| `language` | `string` (optional) | |
| `perspectives` | `object` ([Perspectives](perspectives)) (optional) | |
| `perspectives.mode` | `string` ([PerspectiveMode](perspective_mode)) | Whether the perspective holder accepts or disputes this name. Examples: `accepted_by`, `disputed_by` |
| `perspectives.countries` | `list` | Countries holding the given mode of perspective. |
10 changes: 10 additions & 0 deletions docs/schema/reference/Names/name_variant.md
@@ -0,0 +1,10 @@
# NameVariant

An enumeration.

## Values

- `common`
- `official`
- `alternate`
- `short`
19 changes: 19 additions & 0 deletions docs/schema/reference/Names/names.md
@@ -0,0 +1,19 @@
# Names

Multilingual names container.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `primary` | `string` | The most commonly used name. |
| `common` | `object` (optional) | |
| `rules` | `list<object (`[NameRule](name_rule)`)>` (optional) | Rules for names that cannot be specified in the simple common names property. These rules can cover other name variants such as official, alternate, and short; and they can optionally include geometric scoping (linear referencing) and side-of-road scoping for complex cases. |
| `rules.side` | `string` ([Side](side)) (optional) | Examples: `left`, `right` |
| `rules.between` | `list<float64>` (optional) | |
| `rules.value` | `string` | |
| `rules.variant` | `string` ([NameVariant](name_variant)) | Examples: `common`, `official`, `alternate`, ... |
| `rules.language` | `string` (optional) | |
| `rules.perspectives` | `object` ([Perspectives](perspectives)) (optional) | |
| `rules.perspectives.mode` | `string` ([PerspectiveMode](perspective_mode)) | Whether the perspective holder accepts or disputes this name. Examples: `accepted_by`, `disputed_by` |
| `rules.perspectives.countries` | `list` | Countries holding the given mode of perspective. |
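
To make the relationship between `primary`, `common`, and `rules` more concrete, here is a minimal Python sketch that looks up names by variant and, optionally, language. The `find_names` helper and the sample data are hypothetical; in particular, treating `common` as a mapping from language code to name is an assumption, since the table above only describes it as an `object`.

```python
from typing import Optional


def find_names(names: dict, variant: str, language: Optional[str] = None) -> list:
    """Return the values of all name rules matching a variant and optional language."""
    rules = names.get("rules") or []  # `rules` is optional and may be null
    return [
        rule["value"]
        for rule in rules
        if rule["variant"] == variant
        and (language is None or rule.get("language") == language)
    ]


# Hypothetical sample data shaped after the fields in the table above.
sample_names = {
    "primary": "Example Street",
    "common": {"en": "Example Street"},  # assumed shape: language code -> name
    "rules": [
        {"variant": "official", "language": "en", "value": "Example Street North"},
        {"variant": "short", "language": "en", "value": "Example St", "side": "left"},
    ],
}

print(find_names(sample_names, "official"))     # ['Example Street North']
print(find_names(sample_names, "short", "fr"))  # []
```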
8 changes: 8 additions & 0 deletions docs/schema/reference/Names/perspective_mode.md
@@ -0,0 +1,8 @@
# PerspectiveMode

Perspective mode for disputed names.

## Values

- `accepted_by`
- `disputed_by`
10 changes: 10 additions & 0 deletions docs/schema/reference/Names/perspectives.md
@@ -0,0 +1,10 @@
# Perspectives

Political perspectives container.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `mode` | `string` ([PerspectiveMode](/schema/codegen/Names/perspectives/perspective_mode/)) | Whether the perspective holder accepts or disputes this name. Examples: `accepted_by`, `disputed_by` |
| `countries` | `list` | Countries holding the given mode of perspective. |
10 changes: 10 additions & 0 deletions docs/schema/reference/Names/side.md
@@ -0,0 +1,10 @@
# Side

Represents the side on which something appears relative to a facing or heading
direction, e.g. the side of a road relative to the road orientation, or relative to
the direction of travel of a person or vehicle.

## Values

- `left`
- `right`
13 changes: 13 additions & 0 deletions docs/schema/reference/address.md
@@ -0,0 +1,13 @@
# Address

Base model that forbids additional properties in JSON Schema.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `freeform` | `string` (optional) | Free-form address that contains street name, house number and other address info |
| `locality` | `string` (optional) | Name of the city or neighborhood where the address is located |
| `postcode` | `string` (optional) | Postal code where the address is located |
| `region` | `string` (optional) | |
| `country` | `string` (optional) | |
59 changes: 59 additions & 0 deletions docs/schema/reference/addresses/address/address.md
@@ -0,0 +1,59 @@
# Address

Addresses are geographic points used for locating businesses and individuals. The
rules, fields, and fieldnames of an address can vary extensively between locations.
We use a simplified schema to capture worldwide address points. This initial schema
is largely based on the OpenAddresses (www.openaddresses.io) project.

The address schema allows up to 5 "admin levels". Rather than have field names that
apply across all countries, we provide an array called "address_levels" containing
the necessary administrative levels for an address.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `id` | `string` | |
| `theme` | `"addresses"` | |
| `type` | `"address"` | |
| `geometry` | `geometry` | Geometry (Point) |
| `version` | `int32` | |
| `sources[]` | `list<object (`[SourcePropertyItem](../../Sources/source_property_item)`)>` (optional) | |
| `sources[].between` | `list<float64>` (optional) | |
| `sources[].property` | `string` | |
| `sources[].dataset` | `string` | |
| `sources[].record_id` | `string` (optional) | Refers to the specific record within the dataset that was used. |
| `sources[].update_time` | `string` (optional) | |
| `sources[].confidence` | `float64` (optional) | |
| `address_levels` | `list<object (`[AddressLevel](address_level)`)>` (optional) | The administrative levels present in an address. The number of values in this list and their meaning is country-dependent. For example, in the United States we expect two values: the state and the municipality. In other countries there might be only one. Other countries could have three or more. The array is ordered with the highest levels first. Note: when a level is not known (most likely because the data provider has not supplied it and we have not derived it from another source), the array element container must be present, but the "value" field should be omitted. |
| `address_levels.value` | `string` (optional) | |
| `country` | `string` (optional) | |
| `number` | `string` (optional) | The house number for this address. This field may not strictly be a number. Values such as "74B", "189 1/2", "208.5" are common as the number part of an address and they are not part of the "unit" of this address. |
| `postal_city` | `string` (optional) | In some countries or regions, a mailing address may need to specify a different city name than the city that actually contains the address coordinates. This optional field can be used to specify the alternate city name to use. Example from US National Address Database: 716 East County Road, Winchester, Indiana has "Ridgeville" as its postal city. Another example, from Slovenia: Tomaj 71, 6221 Dutovlje, Slovenia |
| `postcode` | `string` (optional) | The postcode for the address |
| `street` | `string` (optional) | The street name associated with this address. The street name can include the street "type" or street suffix, e.g., Main Street. Ideally this is fully spelled out and not abbreviated, but we acknowledge that many address datasets abbreviate the street name, so abbreviations are acceptable. |
| `unit` | `string` (optional) | The suite/unit/apartment/floor number |

## Examples

| Column | Value |
|-------:|-------|
| `geometry` | `POINT (-176.5637854 -43.9471955)` |
| `address_levels[0].value` | `Chatham Islands` |
| `address_levels[1].value` | `Chatham Island` |
| `country` | `NZ` |
| `number` | `54` |
| `postal_city` | `null` |
| `postcode` | `null` |
| `street` | `Tikitiki Hill Road` |
| `unit` | `null` |
| `id` | `416ab01c-d836-4c4f-aedc-2f30941ce94d` |
| `sources[0].between` | `null` |
| `sources[0].confidence` | `null` |
| `sources[0].dataset` | `OpenAddresses/LINZ` |
| `sources[0].property` | |
| `sources[0].record_id` | `null` |
| `sources[0].update_time` | `null` |
| `theme` | `addresses` |
| `type` | `address` |
| `version` | `1` |
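
As a sketch of how these fields might be combined, the following Python snippet builds a single display string from the example record above. The `format_address` helper and its output ordering (most specific address level first) are assumptions made for illustration; the schema does not prescribe a display format.

```python
def format_address(record: dict) -> str:
    """Join the populated address fields into one display string."""
    street_line = " ".join(
        part for part in (record.get("number"), record.get("street")) if part
    )
    # address_levels is ordered highest level first. Elements whose "value"
    # is omitted (unknown levels) are skipped.
    levels = [
        level["value"]
        for level in (record.get("address_levels") or [])
        if level.get("value")
    ]
    parts = [
        record.get("unit"),
        street_line,
        record.get("postal_city"),
        *reversed(levels),  # assumed display order: most specific level first
        record.get("postcode"),
        record.get("country"),
    ]
    return ", ".join(part for part in parts if part)


# Values copied from the example table above; null columns become None.
example = {
    "address_levels": [{"value": "Chatham Islands"}, {"value": "Chatham Island"}],
    "country": "NZ",
    "number": "54",
    "postal_city": None,
    "postcode": None,
    "street": "Tikitiki Hill Road",
    "unit": None,
}

print(format_address(example))
# 54 Tikitiki Hill Road, Chatham Island, Chatham Islands, NZ
```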
14 changes: 14 additions & 0 deletions docs/schema/reference/addresses/address/address_level.md
@@ -0,0 +1,14 @@
# AddressLevel

An address "admin level".

We want to avoid the phrase "admin level" and have chosen "address level". These
represent states, regions, districts, cities, neighborhoods, etc. The address schema
defines several numbered levels with per-country rules indicating which parts of a
country's address go to which numbered level.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `value` | `string` (optional) | |
48 changes: 48 additions & 0 deletions docs/schema/reference/base/bathymetry/bathymetry.md
@@ -0,0 +1,48 @@
# Bathymetry

Topographic representation of an underwater area, such as a part of the ocean
floor.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `cartography` | `object` ([CartographicHints](../../cartographic_hints)) (optional) | |
| `cartography.prominence` | `integer` (optional) | |
| `cartography.min_zoom` | `integer` (optional) | |
| `cartography.max_zoom` | `integer` (optional) | |
| `cartography.sort_key` | `integer` (optional) | |
| `id` | `string` | |
| `theme` | `"base"` | |
| `type` | `"bathymetry"` | |
| `geometry` | `geometry` | Geometry (Polygon or MultiPolygon) |
| `version` | `int32` | |
| `sources[]` | `list<object (`[SourcePropertyItem](../../Sources/source_property_item)`)>` (optional) | |
| `sources[].between` | `list<float64>` (optional) | |
| `sources[].property` | `string` | |
| `sources[].dataset` | `string` | |
| `sources[].record_id` | `string` (optional) | Refers to the specific record within the dataset that was used. |
| `sources[].update_time` | `string` (optional) | |
| `sources[].confidence` | `float64` (optional) | |
| `depth` | `int32` | |

## Examples

| Column | Value |
|-------:|-------|
| `geometry` | `MULTIPOLYGON (((-170.71296928 -76.744313428, -170.719841483 -76.757076376, -170.731061124 -76.761566...` |
| `depth` | `500` |
| `cartography.max_zoom` | `null` |
| `cartography.min_zoom` | `null` |
| `cartography.prominence` | `null` |
| `cartography.sort_key` | `12` |
| `id` | `5d40bd6c-db14-5492-b29f-5e25a59032bc` |
| `sources[0].between` | `null` |
| `sources[0].confidence` | `null` |
| `sources[0].dataset` | `ETOPO/GLOBathy` |
| `sources[0].property` | |
| `sources[0].record_id` | `2024-12-09T00:00:00.000Z` |
| `sources[0].update_time` | `null` |
| `theme` | `base` |
| `type` | `bathymetry` |
| `version` | `0` |
69 changes: 69 additions & 0 deletions docs/schema/reference/base/infrastructure/infrastructure.md
@@ -0,0 +1,69 @@
# Infrastructure

Various features from OpenStreetMap such as bridges, airport runways, aerialways,
or communication towers and lines.

## Fields

| Name | Type | Description |
|-----:|:----:|-------------|
| `source_tags` | `record<string, Any>` (optional) | |
| `wikidata` | `string` (optional) | |
| `level` | `int32` (optional) | |
| `names` | `object` ([Names](../../Names/names)) (optional) | |
| `names.primary` | `string` | The most commonly used name. |
| `names.common` | `object` (optional) | |
| `names.rules[]` | `list<object (`[NameRule](../../Names/name_rule)`)>` (optional) | Rules for names that cannot be specified in the simple common names property. These rules can cover other name variants such as official, alternate, and short; and they can optionally include geometric scoping (linear referencing) and side-of-road scoping for complex cases. |
| `names.rules[].side` | `string` ([Side](../../Names/side)) (optional) | Examples: `left`, `right` |
| `names.rules[].between` | `list<float64>` (optional) | |
| `names.rules[].value` | `string` | |
| `names.rules[].variant` | `string` ([NameVariant](../../Names/name_variant)) | Examples: `common`, `official`, `alternate`, ... |
| `names.rules[].language` | `string` (optional) | |
| `names.rules[].perspectives` | `object` ([Perspectives](../../Names/perspectives)) (optional) | |
| `names.rules[].perspectives.mode` | `string` ([PerspectiveMode](../../Names/perspective_mode)) | Whether the perspective holder accepts or disputes this name. Examples: `accepted_by`, `disputed_by` |
| `names.rules[].perspectives.countries` | `list` | Countries holding the given mode of perspective. |
| `id` | `string` | |
| `theme` | `"base"` | |
| `type` | `"infrastructure"` | |
| `geometry` | `geometry` | Geometry (Point, LineString, Polygon, or MultiPolygon) |
| `version` | `int32` | |
| `sources[]` | `list<object (`[SourcePropertyItem](../../Sources/source_property_item)`)>` (optional) | |
| `sources[].between` | `list<float64>` (optional) | |
| `sources[].property` | `string` | |
| `sources[].dataset` | `string` | |
| `sources[].record_id` | `string` (optional) | Refers to the specific record within the dataset that was used. |
| `sources[].update_time` | `string` (optional) | |
| `sources[].confidence` | `float64` (optional) | |
| `class` | `string` ([InfrastructureClass](infrastructure_class)) | Examples: `aerialway_station`, `airport`, `airport_gate`, ... |
| `subtype` | `string` ([InfrastructureSubtype](infrastructure_subtype)) | Examples: `aerialway`, `airport`, `barrier`, ... |
| `height` | `float64` (optional) | |
| `surface` | `string` ([SurfaceMaterial](../surface_material)) (optional) | Examples: `asphalt`, `cobblestone`, `compacted`, ... |

## Examples

| Column | Value |
|-------:|-------|
| `geometry` | `LINESTRING (-176.6518141 -44.0074721, -176.6509243 -44.0063362)` |
| `subtype` | `barrier` |
| `height` | `null` |
| `surface` | `null` |
| `class` | `fence` |
| `id` | `06e4de8d-bdce-314c-8e25-90ce70b8fe57` |
| `level` | `null` |
| `names.common` | `null` |
| `names.primary` | `null` |
| `names.rules` | `null` |
| `source_tags.LINZ:source_version` | `V16` |
| `source_tags.attribution` | `http://wiki.osm.org/wiki/Attribution#LINZ` |
| `source_tags.barrier` | `fence` |
| `source_tags.source_ref` | `http://www.linz.govt.nz/topography/topo-maps/` |
| `sources[0].between` | `null` |
| `sources[0].confidence` | `null` |
| `sources[0].dataset` | `OpenStreetMap` |
| `sources[0].property` | |
| `sources[0].record_id` | `w56754564@1` |
| `sources[0].update_time` | `2010-04-28T12:01:53.000Z` |
| `theme` | `base` |
| `type` | `infrastructure` |
| `version` | `0` |
| `wikidata` | `null` |