Merged
Changes from 67 commits
Commits
70 commits
d6241c9
remove references to dataclass objects
kmoscoe Jun 12, 2025
101c5fd
Fix a copy-paste error.
kmoscoe Jun 12, 2025
504d852
remove extra file
kmoscoe Jun 12, 2025
1697b22
Merge branch 'datacommonsorg:master' into master
kmoscoe Jun 12, 2025
426a81c
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Jun 24, 2025
a40c7c8
Merge branch 'master' of https://github.com/kmoscoe/docsite
kmoscoe Jun 24, 2025
4c69c15
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 17, 2025
dd5b50f
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 23, 2025
23cb4c4
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 23, 2025
1157311
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 24, 2025
23d3429
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 25, 2025
2a3409f
Merge branch 'datacommonsorg:master' into master
kmoscoe Sep 30, 2025
516ed75
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Oct 7, 2025
564457c
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 7, 2025
7052453
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 7, 2025
5da50d1
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 8, 2025
a011388
Merge branch 'master' of https://github.com/kmoscoe/docsite
kmoscoe Oct 8, 2025
12b9749
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 14, 2025
f4861e4
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 15, 2025
33234d2
Fix a copy-paste error.
kmoscoe Jun 12, 2025
169781f
Merge branch 'datacommonsorg:master' into master
kmoscoe Oct 24, 2025
f3a9005
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Oct 27, 2025
205cb04
Merge branch 'datacommonsorg:master' into master
kmoscoe Nov 3, 2025
3daf24f
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Nov 5, 2025
7599eff
Merge branch 'datacommonsorg:master' into master
kmoscoe Nov 5, 2025
4dd0251
Merge branch 'master' of https://github.com/kmoscoe/docsite
kmoscoe Nov 5, 2025
54ca4cf
Remove unused file
kmoscoe Nov 5, 2025
5ab6c5c
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Nov 11, 2025
1c2a36c
Merge branch 'datacommonsorg:master' into master
kmoscoe Nov 24, 2025
5d2800b
Merge branch 'datacommonsorg:master' into master
kmoscoe Nov 25, 2025
b2cbfd4
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 3, 2025
494375d
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 9, 2025
f4da5c3
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 9, 2025
73c0d41
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 9, 2025
735db87
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 16, 2025
d165d9e
Merge branch 'datacommonsorg:master' into master
kmoscoe Dec 17, 2025
4559295
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Dec 17, 2025
85d15a4
Merge branch 'master' of https://github.com/kmoscoe/docsite
kmoscoe Dec 17, 2025
d80ed79
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Jan 13, 2026
6184f92
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Jan 14, 2026
4994889
Start draft on new resolver parameters
kmoscoe Jan 27, 2026
bf1871d
Merge branch 'master' of https://github.com/datacommonsorg/docsite
kmoscoe Jan 28, 2026
171028b
Merge branch 'master' of https://github.com/datacommonsorg/docsite in…
kmoscoe Jan 28, 2026
9e5f038
first draft of new resolve API
kmoscoe Jan 28, 2026
5464aaa
First complete draft of new resolve API parameter
kmoscoe Jan 29, 2026
c0ba399
cut redundant text
kmoscoe Jan 29, 2026
2a581dc
Remove cohort from glossary
kmoscoe Jan 29, 2026
56506a2
Update api/python/v2/resolve.md
kmoscoe Jan 29, 2026
c435dfc
Update api/python/v2/resolve.md
kmoscoe Jan 29, 2026
4fb516f
minor fixes from Gemini
kmoscoe Jan 29, 2026
329fe2c
Merge branch 'datacommonsorg:master' into master
kmoscoe Feb 4, 2026
85af0c9
Merge branch 'master' of https://github.com/kmoscoe/docsite into api
kmoscoe Feb 4, 2026
878a9ab
Update api/python/v2/resolve.md
kmoscoe Feb 4, 2026
10aaa7b
Some changes from Christie
kmoscoe Feb 4, 2026
84868fc
Merge branch 'datacommonsorg:master' into api
kmoscoe Feb 4, 2026
91e27fc
Merge branch 'api' of https://github.com/kmoscoe/docsite into api
kmoscoe Feb 4, 2026
540239a
minor changes
kmoscoe Feb 4, 2026
f5c7a49
Update api/rest/v2/resolve.md
kmoscoe Feb 4, 2026
22d4ab6
Update api/rest/v2/resolve.md
kmoscoe Feb 4, 2026
6b506f2
Update api/rest/v2/resolve.md
kmoscoe Feb 4, 2026
5736dd7
Update api/rest/v2/resolve.md
kmoscoe Feb 4, 2026
41c8ab0
changes from Christie
kmoscoe Feb 4, 2026
7ba5375
Merge branch 'api' of https://github.com/kmoscoe/docsite into api
kmoscoe Feb 4, 2026
52c76eb
Changes for consistency
kmoscoe Feb 4, 2026
10adb2a
small fix from Christie
kmoscoe Feb 4, 2026
37cc7d3
Merge branch 'master' of https://github.com/datacommonsorg/docsite in…
kmoscoe Mar 4, 2026
fb93e9d
Updates to data model page
kmoscoe Mar 4, 2026
98171c0
Update data_model.md
kmoscoe Mar 4, 2026
af03d94
Remove link
kmoscoe Mar 4, 2026
83577ca
Merge branch 'api' of https://github.com/kmoscoe/docsite into api
kmoscoe Mar 4, 2026
20 changes: 12 additions & 8 deletions data_model.md
@@ -22,7 +22,7 @@ Whether you're just exploring the data on [datacommons.org](http://datacommons.o

To allow data from hundreds of organizations around the world, in a myriad of models and formats to be interoperable and queryable in a unified way, Data Commons needs to have a common way of understanding and representing this data. To do so, it applies a schema, or vocabulary to all its data, that is largely derived from earlier schemes developed for semantic understanding of web pages – most notably, the data models and schemas of [Schema.org](http://Schema.org) (which were in turn based on earlier schemes such as Microformats and Resource Description Framework (RDF)).

The Data Commons schema is in fact a superset of [Schema.org schemas](https://schema.org/docs/schemas.html){: target="_blank"}, with a particular emphasis on time series and statistical data. Every data set must have an associated schema, written in [Meta Content Format](https://en.wikipedia.org/wiki/Meta_Content_Framework){: target="_blank"} (MCF) language, that maps the provider's data to existing concepts in the Data Commons.
The Data Commons schema is in fact a superset of [Schema.org schemas](https://schema.org/docs/schemas.html){: target="_blank"}, with a particular emphasis on time series and statistical data. Every data set must have an associated schema, written in [Meta Content Format](https://en.wikipedia.org/wiki/Meta_Content_Framework){: target="_blank"} (MCF) language, that maps the provider's data to existing concepts in the Data Commons knowledge graph.

## Knowledge Graph

@@ -43,9 +43,9 @@ Each node consists of some kind of entity or value, and each edge describes some
- Various [properties](#property)
- A [provenance](#provenance-source-dataset)

As in other knowledge graphs, each pair of connected nodes is a _triple_ consisting of a subject node, predicate (or "edge") and object node. The Data Commons knowledge graph is made up of billions of triples. The triple is not generally exposed in Data Commons as a concept that you need to know (although it can be queried from some APIs).
As in other knowledge graphs, each pair of connected nodes is a _triple_ consisting of a subject node, predicate (or "edge") and object node. The Data Commons knowledge graph is made up of billions of triples. The triple is not generally exposed in Data Commons as a concept that you need to know.
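
As a rough illustration of the triple structure described above, the following minimal Python sketch models a few triples as plain tuples and lists the outgoing edges of one node. The `Triple` class and the `edges_from` helper are hypothetical names invented for this example, and the sample triples are illustrative values written by hand, not results fetched from the Data Commons knowledge graph or its APIs.

```python
from typing import List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One subject -> predicate -> object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Hand-written sample triples for illustration only; not queried from the live graph.
SAMPLE_TRIPLES = [
    Triple("geoId/06", "typeOf", "State"),
    Triple("geoId/06", "containedInPlace", "country/USA"),
    Triple("country/USA", "typeOf", "Country"),
]


def edges_from(node: str, triples: List[Triple]) -> List[Tuple[str, str]]:
    """Return (predicate, object) pairs for every triple whose subject is `node`."""
    return [(t.predicate, t.obj) for t in triples if t.subject == node]


if __name__ == "__main__":
    for predicate, obj in edges_from("geoId/06", SAMPLE_TRIPLES):
        print(f"geoId/06 --{predicate}--> {obj}")
```
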

You can get all the information about a node and its edges by looking at the Knowledge Graph browser. If you know the [DCID](#unique-identifier-dcid) for a node, you can access it directly by typing <code>https://datacommons.org/browser/<var>DCID</var></code>. For example, here is the entry for the `City` node, available at [https://datacommons.org/browser/City](https://datacommons.org/browser/City){: target="_blank"}:
You can get information about a node and its edges by looking at the [Knowledge Graph browser](https://datacommons.org/browser){: target="_blank"}. If you know the [DCID](#unique-identifier-dcid) for a node, you can access it directly by typing <code>https://datacommons.org/browser/<var>DCID</var></code>. For example, here is the entry for the `City` node, available at [https://datacommons.org/browser/City](https://datacommons.org/browser/City){: target="_blank"}:

![KG browser]({{site.url}}/assets/images/dc/concept2.png){: width="900"}
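
The URL pattern above can be sketched in a few lines of Python. This is only a convenience for building the browser link from a DCID; `browser_url` is a hypothetical helper name, not part of any Data Commons client library, and it assumes that slashes inside a DCID are kept as path separators, as in the browser links elsewhere on this page.

```python
from urllib.parse import quote

# Base URL taken from the pattern https://datacommons.org/browser/DCID described above.
BROWSER_BASE = "https://datacommons.org/browser/"


def browser_url(dcid: str) -> str:
    """Build the Knowledge Graph browser URL for a DCID (hypothetical helper).

    Slashes are left unescaped because some DCIDs contain them
    (assumption based on browser links such as .../browser/dc/base/AustraliaStatistics).
    """
    return BROWSER_BASE + quote(dcid, safe="/")


if __name__ == "__main__":
    print(browser_url("City"))
    print(browser_url("dc/base/AustraliaStatistics"))
```
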

@@ -118,14 +118,19 @@ To find the DCID for a place using other methods:
- Use the REST v2 [Resolve API](/api/rest/v2/resolve.html), either interactively (e.g. using curl or the browser address bar) or programmatically.
- Use the Python v2 [`resolve.fetch_dcids_by_name`](/api/python/v2/resolve.html#fetch_dcids_by_name) API method, either interactively or programmatically.

To find the DCID for a statistical variable:
To find the DCID for a statistical variable using the datacommons.org website:

1. Open the Statistical Variable Explorer.
1. Search for the variable of interest, and optionally filter by data source and dataset.
1. Look under the heading for the DCID.

![Stat Var Explorer]({{site.url}}/assets/images/dc/concept9.png){: width="900"}

To find the DCID for a statistical variable using other methods:

- Use the REST v2 [Resolve API](/api/rest/v2/resolve.html) with the `resolver=indicator` option.
- Use the Python v2 [`resolve.fetch_indicators`](/api/python/v2/resolve.html#fetch_indicators) API method.

## Property

Every node also contains properties or characteristics that describe its entity, event, or statistical variable. Each property is actually an edge to another node, with a label. If the object node is a primitive type, such as a string, date, or number, it is a "leaf", or terminal node, which we call an _attribute_. Examples are properties such as `latitude`, `year`, various unique IDs and so on.
@@ -152,20 +157,19 @@ Time series made up of many observations underlie the data available in the [Tim

Every node and triple also have some important properties that indicate the origin of the data.

- [`Provenance`](https://datacommons.org/browser/Provenance){: target="_blank"}: All triples have a provenance, typically the URL of the data provider's website; for example, [www.abs.gov.au](https://datacommons.org/browser/dc/base/AustraliaStatistics){: target="_blank"}. In addition, all entity types also have a provenance, defined with a DCID, such as [`AustraliaStatistics`](https://datacommons.org/browser/dc/base/AustraliaStatistics){: target="_blank"}. It also (For many property types, which are defined by the Data Commons schema, their provenance is always [datacommons.org](datacommons.org).)
- [`Provenance`](https://datacommons.org/browser/Provenance){: target="_blank"}: All triples have a provenance, typically the URL of the data provider's website; for example, [www.abs.gov.au](https://datacommons.org/browser/dc/base/AustraliaStatistics){: target="_blank"}. In addition, all entity types also have a provenance, defined with a DCID, such as [`AustraliaStatistics`](https://datacommons.org/browser/dc/base/AustraliaStatistics){: target="_blank"}. (For many property types, which are defined by the Data Commons schema, their provenance is always [datacommons.org](datacommons.org).)
- [`Source`](https://datacommons.org/browser/Source){: target="_blank"}: This is a property of a provenance, and a dataset, usually the name of an organization that provides the data or the schema. For example, for provenance [www.abs.gov.au](www.abs.gov.au), the source is the [Australian Bureau of Statistics](https://datacommons.org/browser/dc/s/AustralianBureauOfStatistics){: target="_blank"}.
- [`Dataset`](https://datacommons.org/browser/Dataset){: target="_blank"}: This is the name of a specific dataset provided by a provider. Many sources provide multiple datasets. For example, the source Australian Bureau of Statistics provides two datasets, [Australia Statistics](https://datacommons.org/browser/dc/d/AustralianBureauOfStatistics_AustraliaStatistics){: target="_blank"} (not to be confused with the provenance above), and [Australia Subnational Administrative Boundaries](https://datacommons.org/browser/dc/d/AustralianBureauOfStatistics_AustraliaSubnationalAdministrativeBoundaries){: target="_blank"}.

![Knowledge graph]({{site.url}}/assets/images/dc/concept12.png){: width="600"}


Note that a given statistical variable may have multiple provenances, since many data sets define the same variables. You can see the list of all the data sources for a given statistical variable in the Statistical Variable Explorer. For example, the explorer shows multiple sources (Censuses from India, Mexico, Vietnam, OECD, World Bank, etc.) for the variable [Life Expectancy](https://datacommons.org/tools/statvar#LifeExpectancy_Person=&sv=LifeExpectancy_Person){: target="_blank"}:

![Stat Var Explorer]({{site.url}}/assets/images/dc/concept13.png){: width="900"}

You can see a list of all sources and data sets in several places:

- The [Data sources](/datasets/) pages in this site.
- The **Data source** and **Dataset** drop-down menus in the Statistical Variable Explorer.
- The [Data Sources](https://datacommons.org/data/){: target="_blank"} pages
- The **Data source** and **Dataset** drop-down menus in the Statistical Variable Explorer

![Stat Var Explorer]({{site.url}}/assets/images/dc/concept14.png){: width="600"}