diff --git a/api/index.md b/api/index.md index 0c7b3bb99..ce0ececb4 100644 --- a/api/index.md +++ b/api/index.md @@ -18,7 +18,7 @@ The Data Commons APIs allow developers to programmatically access the data in Da * A [REST API](/api/rest/v2) that can be used on the command line as well as in any language with an HTTP library. * A [Python](/api/python/v2) client library that wraps the REST APIs and includes support for [Pandas](https://pandas.pydata.org/){: target="_blank}. -The endpoints can be roughly grouped into four categories: +The endpoints can be roughly grouped into three categories: - **Statistical data**: Given a set of statistical variables, dates and entities, get observations. @@ -26,10 +26,6 @@ The endpoints can be roughly grouped into four categories: - **Resolution to DCIDs**: Given a set of place nodes identified by other means, get their Data Commons IDs. -- **Graph query/SPARQL**: Given a subgraph where some of the nodes are - variables, retrieve possible matches. This corresponds to a subset of the - graph query language [SPARQL](https://www.w3.org/TR/rdf-sparql-query/){: target="_blank"}. This is useful for complex node connections which would require multiple API calls; for example, "hate crimes motivated by disability status in Californian cities". - In addition, Data Commons provides additional tools for accessing its data that call the REST APIs under the hood: - [Google Sheets](sheets/index.md): provides several custom functions that populate spreadsheets with data from the Data Commons knowledge graph diff --git a/api/pandas/index.md b/api/pandas/index.md deleted file mode 100644 index 6493c8467..000000000 --- a/api/pandas/index.md +++ /dev/null @@ -1,43 +0,0 @@ ---- -layout: default -title: Pandas (V1) -nav_order: 40 -parent: API - Query data programmatically -has_children: true ---- - -# Data Commons Pandas API - -> **Warning:** This version of the Data Commons Python API will be deprecated in early 2026. 
Please migrate your applications to [V2](/api/python/v2). For help on translating your requests, see the [Migration guide](/api/python/v2/migration.html). - -The **Data Commons Pandas API** is a superset of the Data Commons Python API: -all functions from the Python API are also accessible from -the Pandas API, and supplemental functions help with directly creating -[pandas](https://pandas.pydata.org/){: target="_blank"} -objects using data from the Data Commons knowledge graph for common -use cases. - -Before proceeding, make sure you have followed the setup instructions below. - -## Install the Python Data Commons API - -1. If not done already, install python3 and pip3. See [Setting up a Python development environment](https://cloud.google.com/python/docs/setup#installing_python){: target="_blank"} for procedures. -1. Install the `datacommons_pandas` package: - -```bash -$ pip install datacommons_pandas -``` -You are ready to go! You can view our [tutorials](/api/python/tutorials.html) on how to use the -API to perform certain tasks using [Google Colab](https://colab.sandbox.google.com/){: target="_blank"}, or refer to pages in the navigation bar for detailed information about all the methods available. - -## Run Python interactively - -The pages in this site demonstrate running the Pandas methods interactively from the Bash shell. 
To use this facility, be sure to import the `datacommons_pandas` package: - -From your virtual environment, run: - -```bash -python3 ->>> import datacommons_pandas ->>> -``` diff --git a/api/pandas/multivariate_dataframe.md b/api/pandas/multivariate_dataframe.md deleted file mode 100644 index 86351763d..000000000 --- a/api/pandas/multivariate_dataframe.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -layout: default -title: Get multivariate DataFrame -nav_order: 3 -parent: Pandas (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve multivariate DataFrame - -Returns a `pandas.DataFrame` with [`places`](https://datacommons.org/browser/Place) -as index and [`stat_vars`](https://datacommons.org/browser/StatisticalVariable) -as columns, where each cell is latest observed statistic for -its `Place` and `StatisticalVariable`. - -See the [full list of `StatisticalVariable` classes](/statistical_variables.html). - -## General information about this method - -**Signature**: - -```python -datacommons_pandas.build_multivariate_dataframe(places, stat_vars) -``` - -**Required arguments**: - -* [`places`](/glossary.html): The `dcid` or `dcid` list of the [`Place`](https://datacommons.org/browser/Place) objects to query for. -* [`stat_vars`](/glossary.html): The `dcid` or `dcid` list of the [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable) objects to query for. - -**NOTE:** In Data Commons, [`dcid`](/glossary.html) stands for Data Commons ID and indicates the unique identifier assigned to every node in the knowledge graph. - -## Assembling the information you will need for a call to the build_multivariate_dataframe method - -Going into more detail on how to assemble the values for the required arguments: - - - `places`: Data Commons uniquely identifies nodes by assigning them DCIDs, or Data Commons IDs. Your query will need to specify the DCIDs for the nodes of interest. 
- - `stat_vars`: This argument specifies the types of places sought in the response. For example, when examining places contained within American `States`, you would be able to select `City` or `County` (among others). For a full list of available types, see [`subClassOf Place`](https://datacommons.org/browser/Place). - -## Example: Compare the historic populations, median ages, and unemployment rates of the US, California, and Santa Clara County. - -```python ->>> datacommons_pandas.build_multivariate_dataframe(["country/USA", "geoId/06", "geoId/06085"],["Count_Person", "Median_Age_Person", "UnemploymentRate_Person"]) - Median_Age_Person Count_Person UnemploymentRate_Person -place -country/USA 37.9 328239523 NaN -geoId/06 36.3 39512223 11.6 -geoId/06085 37.0 1927852 7.5 -``` - -## Error Returns - -If a nonexistent place is passed as an argument, it will not render in the dataframe. In the following example, "geoId/123123123123123123" is one such nonexistent place. - -```python ->>> datacommons_pandas.build_multivariate_dataframe(["country/USA", "geoId/06", "geoId/123123123123123123"],["Count_Person", "Median_Age_Person", "UnemploymentRate_Person"]) - Count_Person Median_Age_Person UnemploymentRate_Person -place -country/USA 328239523 37.9 NaN -geoId/06 39512223 36.3 11.4 -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons_pandas.build_multivariate_dataframe(["country/USA", "geoId/06", "geoId/123123123123123123"]) -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: build_multivariate_dataframe() missing 1 required positional argument: 'stat_vars' -``` diff --git a/api/pandas/time_series.md b/api/pandas/time_series.md deleted file mode 100644 index d43cc3096..000000000 --- a/api/pandas/time_series.md +++ /dev/null @@ -1,166 +0,0 @@ ---- -layout: default -title: Get time series for a place -nav_order: 1 -parent: Pandas (V1) -grand_parent: API - Query data programmatically ---- - -# 
Retrieve time series for a place - -Returns a `pandas.Series` representing a time series for the [`place`](https://datacommons.org/browser/Place) and -[`stat_var`](https://datacommons.org/browser/StatisticalVariable) satisfying any optional parameters. - -See the [full list of `StatisticalVariable` classes](/statistical_variables.html). - -## General information about this method - -**Signature**: -```python -datacommons_pandas.build_time_series(place, stat_var, measurement_method=None,observation_period=None, unit=None, scaling_factor=None) -``` - -**Required arguments**: - -* `place`: The `dcid` of the [`Place`](https://datacommons.org/browser/Place) to query for. -* `stat_var`: The `dcid` of the [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). See the [full list of `StatisticalVariable` variables](/statistical_variables.html). - -**NOTE:** In Data Commons, [`dcid`](/glossary.html) stands for Data Commons ID and indicates the unique identifier assigned to every node in the knowledge graph. - -## Assembling the information you will need for a call to the build_time_series method - -Going into more detail on how to assemble the values for the required arguments: - -- [`place`](/glossary.html): For this parameter, you will need to specify the DCID (the unique ID assigned by Data Commons to each node in the graph) of the place you are interested in. -- [`stat_var`](/glossary.html): The statistical variable whose value you are interested in. - -In addition to these required properties, this endpoint also allows for other, optional arguments. Here are helpful arguments in regular use by Data Commons developers: - -- [`measurement_method`](/glossary.html): The technique used for measuring a statistical variable. - -- [`observation_period`](/glossary.html): The time period over which an observation is made. - -- [`unit`](/glossary.html): The unit of measurement. 
- -- [`scaling_factor`](/glossary.html): Property of statistical variables indicating factor by which a measurement is multiplied to fit a certain format. - -Note that specifying arguments that do not exist for the target place and variable will result in an empty response. For more information on any of these arguments, check out [the glossary](/glossary.html). - -## Examples - -### Example 1: Retrieve the count of men in the state of California. - -```python ->>> datacommons_pandas.build_time_series("geoId/05", "Count_Person_Male") -2017 1461651 -2018 1468412 -2011 1421287 -2012 1431252 -2013 1439862 -2014 1447235 -2015 1451913 -2016 1456694 -dtype: int64 -``` - -### Example 2: Retrieve the number of people in Bosnia and Herzegovina as counted by the Bosnian census. - -```python ->>> datacommons_pandas.build_time_series("country/BIH", "Count_Person", measurement_method="BosniaCensus") -2013 3791622 -dtype: int64 -``` - -### Example 3: Retrieve the death count in Miami-Dade County over a period of one year. - -```python ->>> datacommons_pandas.build_time_series("geoId/12086", "Count_Death", observation_period="P1Y") -2001 19049 -2004 18384 -2008 18012 -2011 17997 -2000 18540 -2003 18399 -2006 18261 -2013 18473 -1999 19170 -2002 18176 -2009 17806 -2014 19013 -2015 19542 -2016 20277 -2005 18400 -2007 17982 -2010 18048 -2012 18621 -2017 20703 -dtype: int64 - -``` - -### Example 4: Retrieve the distrubtion of naloxone in Miami-Dade County in grams. - -```python ->>> datacommons_pandas.build_time_series("geoId/12086", "RetailDrugDistribution_DrugDistribution_Naloxone", unit="Grams") -2006-10 55.21 -2007-01 59.63 -2007-04 65.98 -2007-07 80.34 -2007-10 118.79 -2006-01 44.43 -2006-04 48.28 -2006-07 54.98 -dtype: float64 -``` - -### Example 5: Retrieve the percentage of nominal GDP spent by the government of the Gambia on education. 
- ```python ->>> datacommons_pandas.build_time_series("country/GMB", "Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal", scaling_factor="100.0000000000") -1986 3.48473 -2008 3.52738 -2012 4.10118 -1991 3.78061 -1996 2.56628 -1999 1.56513 -2002 1.44292 -2003 1.36338 -2014 2.17849 -2006 1.20949 -2013 1.82979 -1989 2.97409 -1990 2.82584 -2001 1.15810 -2004 1.03450 -2007 1.30849 -1985 4.29515 -1992 1.16984 -1995 2.55356 -2015 2.13528 -2000 1.46587 -2005 1.13919 -2009 3.07235 -2010 4.15610 -2011 3.92511 -2016 2.05946 -2018 2.43275 -dtype: float64 -``` - -## Error Returns - -If there is no value associated with the requested property, an empty `Series` object is returned: - -```python ->>> datacommons_pandas.build_time_series("geoId/000", "Count_Person_Male") -Series([], dtype: float64) -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons_pandas.build_time_series("geoId/000") -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: build_time_series() missing 1 required positional argument: 'stat_var' -``` diff --git a/api/pandas/time_series_dataframe.md b/api/pandas/time_series_dataframe.md deleted file mode 100644 index d3be5d33c..000000000 --- a/api/pandas/time_series_dataframe.md +++ /dev/null @@ -1,86 +0,0 @@ ---- -layout: default -title: Get time series DataFrame -nav_order: 2 -parent: Pandas (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve time series DataFrame - -Returns a `pandas.DataFrame` with [`places`](https://datacommons.org/browser/Place) -as index and dates as columns, where each cell is the observed statistic for -its place and date for the -[`stat_var`](https://datacommons.org/browser/StatisticalVariable). - -See the [full list of `StatisticalVariable` classes](/statistical_variables.html). 
- -## General information about this method - -**Signature**: - -```python -datacommons_pandas.build_time_series_dataframe(places, stat_var) -``` - -**Required arguments**: - -* `places`: The `dcid` or `dcid` list of the [`Place`](https://datacommons.org/browser/Place) objects to query for. -* `stat_var`: The `dcid` of the [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). - -**NOTE:** In Data Commons, [`dcid`](/glossary.html) stands for Data Commons ID and indicates the unique identifier assigned to every node in the knowledge graph. - -## Assembling the information you will need for a call to the build_time_series method - -Going into more detail on how to assemble the values for the required arguments: - - - [`places`](/glossary.html): Data Commons uniquely identifies nodes by assigning them DCIDs, or Data Commons IDs. Your query will need to specify the DCIDs for the places of interest. - - - [`stat_var`](/glossary.html): This argument specifies the statistical variable whose measurement you seek. - -For more information, check out [the glossary](/glossary.html). - -## Examples - -### Example 1: Retrieve the count of men in the state of California. - -```python ->>> datacommons_pandas.build_time_series_dataframe("geoId/05", "Count_Person_Male") - 2010 2011 2012 ... 2017 2018 2019 -place ... -geoId/05 1430837 1447850 1449265 ... 1479682 1476680 1474705 - -[1 rows x 10 columns] -``` - -### Example 2: Compare the historic populations of Sudan and South Sudan. - -```python ->>> datacommons_pandas.build_time_series_dataframe(["country/SSD","country/SDN"], "Count_Person") - 2019 2019-06 -place -country/SDN NaN 41592539.0 -country/SSD 12778250.0 NaN -``` - -## Error Returns - -If a nonexistent place is passed as an argument, it will not render in the dataframe, as follows: - -```python ->>> datacommons_pandas.build_time_series_dataframe(["geoId/123123123123123123","geoId/36"], "Count_Person") - 2001 2002 2003 ... 2017 2018 2019 -place ... 
-geoId/36 19082800 19137800 19175900 ... 19589600 19530400 19453600 - -[1 rows x 19 columns] -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons_pandas.build_time_series_dataframe(["geoId/123123123123123123","geoId/36"]) -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: build_time_series_dataframe() missing 1 required positional argument: 'stat_var' -``` diff --git a/api/python/index.md b/api/python/index.md index 7fc737912..0dd24819a 100644 --- a/api/python/index.md +++ b/api/python/index.md @@ -8,41 +8,4 @@ has_children: true # Data Commons Python API -> **Warning:** This version of the Data Commons Python API will be deprecated in early 2026. Please migrate your applications to [V2](/api/python/v2). For help on translating your requests, see the [Migration guide](/api/python/v2/migration.html). - -The Data Commons Python API is a Python library that enables developers to -programmatically access nodes in the Data Commons knowledge graph. This package -allows users to explore the structure of the graph, integrate statistics from -the graph into data analysis workflows and much more. - -Before proceeding, make sure you have followed the setup instructions below. - -## Install the Python Data Commons API - -This procedure uses a Python virtual environment as recommended by Google Cloud [Setting up a Python development environment](https://cloud.google.com/python/docs/setup){: target="_blank"}. - -1. If not done already, install python3 and pip3. See [Installing Python](https://cloud.google.com/python/docs/setup#installing_python) for procedures. -1. Go to your project directory and create a virtual environment using venv, as described in [Using venv to isolate dependencies](https://cloud.google.com/python/docs/setup#installing_and_using_virtualenv){: target="_blank"}. -1. Install the the `datacommons` package: - -```bash -$ pip install datacommons -``` - -You are ready to go! 
You can view our [tutorials](/api/python/tutorials.html) on how to use the -API to perform certain tasks using [Google Colab](https://colab.sandbox.google.com/){: target="_blank"}, or refer to pages in the navigation bar for detailed information about all the methods available. - -## Run Python interactively - -The pages in this site demonstrate running Python methods interactively from the Bash shell. To use this facility, be sure to import the `datacommons` package: - -From your virtual environment, run: - -```bash -python3 ->>> import datacommons ->>> -``` - - - +> This version of the Data Commons Python API is deprecated. Please migrate your applications to [V2](/api/python/v2). For help on translating your requests, see the [Migration guide](/api/python/v2/migration.html). \ No newline at end of file diff --git a/api/python/place_in.md b/api/python/place_in.md deleted file mode 100644 index 88d58891e..000000000 --- a/api/python/place_in.md +++ /dev/null @@ -1,88 +0,0 @@ ---- -layout: default -title: Get places within other places -nav_order: 33 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve places contained within other places - -## General information about this endpoint - -**Signature**: - -```python -datacommons.get_places_in(dcids, place_type) -``` - -**Required arguments**: - -* `dcids`: A list of nodes to query, identified by their DCID. -* `place_type`: The type of the contained child `Places` within the given - DCIDs to filter by. - -## Assembling the information you will need for a call to the get_places_in method - -Going into more detail on how to assemble the values for the required arguments: - - - `dcids`: Data Commons uniquely identifies nodes by assigning them DCIDs, or Data Commons IDs. Your query will need to specify the DCIDs for the nodes of interest. More information about DCIDs is available in [the glossary](/glossary.html). 
- - `place_type`: This argument specifies the type of place sought in the response. For example, when examining places contained within American `States`, you would be able to select `City` or `County` (among others). For a full list of available types, see [the place types page](/place_types.html). - -## What to expect in the function return - -The method's return value will always be a `dict` in the following form: - -```python -{ - "": ["string", ...] - ... -} -``` - -## Example requests and responses - -### Example 1: Retrieve a list of all counties in Delaware. - -#### Method call - -```python -datacommons.get_places_in(["geoId/10"], "County") -``` - -#### Response - -```python -{'geoId/10': ['geoId/10001', 'geoId/10003', 'geoId/10005']} -``` - -### Example 2: Retrieve a list of congressional districts in Alaska and Hawaii. - -```python -datacommons.get_places_in(["geoId/15","geoId/02"], "CongressionalDistrict") -``` - -#### Response - -```python -{'geoId/15': ['geoId/1501', 'geoId/1502'], 'geoId/02': ['geoId/0200']} -``` - -## Error Returns - -If there is no value associated with the requested property, an empty list is returned: - -```python ->>> datacommons.get_places_in(["geoId/1021"], "CongressionalDistrict") -{'geoId/1021': []} -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons.get_places_in(["geoId/1021"]) -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: get_places_in() missing 1 required positional argument: 'place_type' -``` diff --git a/api/python/property_label.md b/api/python/property_label.md deleted file mode 100644 index b237cd69c..000000000 --- a/api/python/property_label.md +++ /dev/null @@ -1,85 +0,0 @@ ---- -layout: default -title: Get property labels of nodes -nav_order: 36 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve property labels of nodes - -Returns the labels of properties defined for the given node DCIDs. 
- -## General information about this method - -**Signature**: -```python -datacommons.get_property_labels(dcids, out=True) -``` - -**Required arguments**: - -* `dcids`: A list of nodes to query, identified by their DCID. - -**Optional arguments**: - -* `out`: The label's direction. Defaults to `True` (only returning response nodes directed towards the requested node). If set to `False`, will only return response nodes directed away from the request node. - -## Assembling the information you will need for a call to the get_property_values method - -Going into more detail on how to assemble the values for the required argument: - - - `dcids`: Data Commons uniquely identifies nodes by assigning them DCIDs, or Data Commons IDs. Your query will need to specify the DCIDs for the nodes of interest. More information about DCIDs is available in [the glossary](/glossary.html). - -In addition to this required property, this endpoint also allows for an additional, optional argument: - - - `out`: This is a boolean value that refers to the orientation, or direction, of the edge. You can specify this argument as `True` to indicate that you desire the response to only include nodes with the value of the property equivalent to one or more of the specified `DCIDs`, or `False` to only return nodes equivalent to one or more of the values of the properties of the specified `DCIDs`. (To visualize this, Figure 1 illustrates the directions for the property `containedInPlace` of the node for Argentina.) - -![](/assets/images/rest/property_value_direction_example.png) - -*Figure 1. Relationship diagram for the property `containedInPlace` of the country Argentina. 
Note the directionality of the property `containedInPlace`: the API returns both nodes with direction `in` (Buenos Aires is `containedInPlace` of Argentina) and nodes with direction `out` (Argentina is `containedInPlace` of South America).* - -## What to expect in the function return - -The method's return value will always be a `dict` in the following form: - -```python -{ - "": ["string", ...] - ... -} -``` - -## Example requests and responses - -### Example 1: Retrieve the outwardly directed property labels of Wisconsin's eighth congressional district. - -```python ->>> datacommons.get_property_labels(['geoId/5508']) -{'geoId/5508': ['containedInPlace', 'geoId', 'geoJsonCoordinates', 'geoOverlaps', 'kmlCoordinates', 'landArea', 'latitude', 'longitude', 'name', 'provenance', 'typeOf', 'waterArea']} -``` - -### Example 2: Retrieve the inwardly directed property labels of two different leukocyte cell lines. - -```python ->>> datacommons.get_property_labels(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2'],out=False) -{'dc/c3j78rpyssdmf': ['biosampleOntology'], 'dc/7hfhd2ek8ppd2': ['biosampleOntology']} -``` - -## Error Returns - -If there are no properties associated with the node, an empty list is returned: - -```python ->>> datacommons.get_property_labels(['geoId/123123123123123']) -{'geoId/123123123123123': []} -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons.get_property_labels() -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: get_property_labels() missing 1 required positional argument: 'dcids' -``` diff --git a/api/python/property_value.md b/api/python/property_value.md deleted file mode 100644 index 5b3a30adc..000000000 --- a/api/python/property_value.md +++ /dev/null @@ -1,131 +0,0 @@ ---- -layout: default -title: Get property values of nodes -nav_order: 37 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve property values of nodes - -Given a 
list of nodes and a property label, returns values associated with the -given property for each node. - -## General information about this endpoint - -**Signature**: -```python -datacommons.get_property_values(dcids, prop, out=True, value_type=None, limit=datacommons.utils._MAX_LIMIT) -``` - -**Required arguments**: - -* [`dcids`](/glossary.html): A list of nodes to query, identified by their Data Commons identifiers. -* `prop`: The property to query for. - -**Optional arguments**: - -* `value_type`: The type of the property value to filter by. Defaults to `NONE`. Only applicable if - the value refers to a node. -* `out`: The label's direction. Defaults to `True` (only returning response nodes directed towards the requested node). If set to `False`, will only return response nodes directed away from the request node. -* `limit`: (≤ 500) Maximum number of values returned per node. Defaults to `datacommons.utils._MAX_LIMIT`. - -## Assembling the information you will need for a call to the get_property_values method - -Going into more detail on how to assemble the values for the required arguments: - - - `dcids`: Data Commons uniquely identifies nodes by assigning them DCIDs, or Data Commons IDs. Your query will need to specify the DCIDs for the nodes of interest. More information about DCIDs is available in [the glossary](/glossary.html). - - - `prop`: The property whose value you are interested in, such as "name" for the name of a node, or "typeOf" for the type of a node. - -In addition to these required properties, this endpoint also allows for other, optional arguments. Here are helpful arguments in regular use by Data Commons developers: - - - `value_type`: If the property queried only takes on node values, you can use this argument to filter nodes in the response, ensuring the response only contains nodes with the specified type. 
For example, if you query the property `containedInPlace` on the DCID `geoId/06085`, your response will contain many results that may not be relevant to your question. If you instead specify the `value_type` as `City`, your result list will be shorter, narrower, and easier to parse. - - - `out`: This is a boolean value that refers to the orientation, or direction, of the edge. You can specify this argument as `True` to indicate that you desire the response to only include nodes with the value of the property equivalent to one or more of the specified `DCIDs`, or `False` to only return nodes equivalent to one or more of the values of the properties of the specified `DCIDs`. (To visualize this, Figure 1 illustrates the directions for the property `containedInPlace` of the node for Argentina.) - -![](/assets/images/rest/property_value_direction_example.png) - -*Figure 1. Relationship diagram for the property `containedInPlace` of the country Argentina. Note the directionality of the property `containedInPlace`: the API returns both nodes with direction `in` (Buenos Aires is `containedInPlace` of Argentina) and nodes with direction `out` (Argentina is `containedInPlace` of South America).* - -## What to expect in the function return - -The method's return value will always be a `dict` in the following form: - -```json -{ - "": ["string", ...] - ... -} -``` - -## Examples - -### Example 1: Retrieve the common names of the country of Côte d'Ivoire. - -#### Method call - -```python ->>> datacommons.get_property_values(['country/CIV'],'name') -{'country/CIV': ["Côte d'Ivoire", 'Ivory Coast']} -``` - -### Example 2: Retrieve the order to which the plant _Austrobaileya scandens_ belongs. - -```python ->>> datacommons.get_property_values(['dc/bsmvthtq89217'],'order') -{'dc/bsmvthtq89217': ['Austrobaileyales']} -``` - -### Example 3: Retrieve the addresses of Stuyvesant High School in New York and Gunn High School in California. 
- -```python ->>> datacommons.get_property_values(["nces/360007702877","nces/062961004587"],'address') -{'nces/360007702877': ['345 Chambers St, New York, New York'], 'nces/062961004587': ['780 Arastradero Rd., Palo Alto, California']} -``` - -### Example 4: Retrieve a list of earthquake events in Madagascar. - -> **NOTE:** -> Unlike in the REST API, the Python endpoint returns only one direction. Hence, you must specify `out` as False to obtain results _in_ Madagascar. - -```python ->>> datacommons.get_property_values(dcids=["country/MDG"],prop='affectedPlace',out=False,value_type='EarthquakeEvent') -{'country/MDG': ['earthquake/us200040me', 'earthquake/us60003r15', 'earthquake/usc000evr6', 'earthquake/usp00005zf', 'earthquake/usp00006yt', 'earthquake/usp0000afz', 'earthquake/usp0001fcd', 'earthquake/usp0001ss5', 'earthquake/usp00020ud', 'earthquake/usp0002kfd', 'earthquake/usp0004qn4', 'earthquake/usp0005gu9', 'earthquake/usp0007k9j', 'earthquake/usp0008vc6', 'earthquake/usp000dckw', 'earthquake/usp000fu24', 'earthquake/usp000gmuf', 'earthquake/usp000h6zw', 'earthquake/usp000jgbb']} -``` - -### Example 5: Retrieve just one cyclone event in India. - -```python ->>> datacommons.get_property_values(dcids=["country/IND"],prop='affectedPlace',out=False,value_type='CycloneEvent',limit=1) -{'country/IND': ['cyclone/ibtracs_2018314N12093']} -``` - -### Example 6: Retrieve the country in which Buenos Aires is located. - -![](/assets/images/rest/property_value_direction_example.png) - -*Figure 2. Relationship diagram for the property `containedInPlace` of the country Argentina. 
Note the directionality of the property `containedInPlace`: the API returns both nodes with direction `in` (Buenos Aires is `containedInPlace` of Argentina) and nodes with direction `out` (Argentina is `containedInPlace` of South America).* - -```python ->>> datacommons.get_property_values(dcids=["country/ARG"],prop='containedInPlace') -{'country/ARG': ['southamerica']} -``` - -> **NOTE:** -> If there is no value associated with the property, an empty list is returned: -> ```python -> >>> dc.get_property_values(["geoId/06", "geoId/21"], "foo") -> {'geoId/06': [], 'geoId/21': []} -> ``` - -## Error Returns - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> dc.get_property_values('address') -Traceback (most recent call last): - File "<stdin>", line 1, in <module> -TypeError: get_property_values() missing 1 required positional argument: 'prop' -``` \ No newline at end of file diff --git a/api/python/query.md b/api/python/query.md deleted file mode 100644 index 165c32f2e..000000000 --- a/api/python/query.md +++ /dev/null @@ -1,189 +0,0 @@ ---- -layout: default -title: Query using SPARQL -nav_order: 38 -parent: Python (V1) -grand_parent: Retrieve statistics value for a place ---- - -# Query the Data Commons knowledge graph using SPARQL - -Returns the results of running a graph query on the Data Commons knowledge graph -using [SPARQL](https://www.w3.org/TR/rdf-sparql-query/){: target="_blank"}. Note that Data Commons is only -able to support a limited subsection of SPARQL functionality at this time: specifically only the keywords `ORDER BY`, `DISTINCT`, and `LIMIT`. - -Note: The Python SPARQL library currently only supports the [V1](/api/v1/query.html) version of the API. - -## General information about the query() method - -**Signature**: - -```python -datacommons.query(query_string, select=None) -``` - -**Required arguments**: - -* `query_string`: A SPARQL query string. 
- -## How to construct a call to the query() method - -This method makes it possible to query the Data Commons knowledge graph using SPARQL. SPARQL is a query language developed to retrieve data from websites. It leverages the graph structure innate in the data it queries to return specific information to an end user. For more information on assembling SPARQL queries, check out [the Wikipedia page about SPARQL](https://en.wikipedia.org/wiki/SPARQL){: target="_blank"} and [the W3C specification information](https://www.w3.org/TR/sparql11-query/){: target="_blank"}. - -This method accepts the additional optional argument `select`. This function selects rows to be returned by `query`. Under the hood, the `select` function examines a row in the results of executing `query_string` and returns `True` if and only if the row is to be returned by `query`. The row passed in as an argument is represented as a `dict` that maps a query variable in `query_string` to its value in the given row. - -> **NOTE:** -> - In the query, each variable should have a `typeOf` condition, e.g. `"?var typeOf City ."`. - -## What to expect in the function return - -A correct response will always look like this: - -```json -[ - { - "": "", - ... - }, - ... -] -``` - -The response contains an array of dictionaries, each corresponding to one node matching the conditions of the query. Each dictionary's keys match the variables in the query SELECT clause, and the values in the dictionaries are those associated to the given node's query-specified properties. - -## Examples and error responses - -The following examples and error responses, along with explanations and fixes for the errors, are available in this [Python notebook](https://colab.research.google.com/drive/1Jd0IDHnMdtxhsmXhL5Ib5tL0zgJud1k5?usp=sharing){: target="_blank"}. - -### Example 1: Retrieve the name of the state associated with DCID geoId/06. - -```python ->>> geoId06_name_query = 'SELECT ?name ?dcid WHERE { ?a typeOf Place . 
?a name ?name . ?a dcid ("geoId/06" "geoId/21" "geoId/24") . ?a dcid ?dcid }' ->>> datacommons.query(geoId06_name_query) -[{'?name': 'Kentucky', '?dcid': 'geoId/21'}, {'?name': 'California', '?dcid': 'geoId/06'}, {'?name': 'Maryland', '?dcid': 'geoId/24'}] -``` - -### Example 2: Retrieve a list of ten biological specimens in reverse alphabetical order. - -```python ->>> bio_specimens_reverse_alphabetical_order_query = 'SELECT ?name WHERE { ?biologicalSpecimen typeOf BiologicalSpecimen . ?biologicalSpecimen name ?name } ORDER BY DESC(?name) LIMIT 10' ->>> datacommons.query(bio_specimens_reverse_alphabetical_order_query) -[{'?name': 'x Triticosecale'}, {'?name': 'x Silene'}, {'?name': 'x Silene'}, {'?name': 'x Silene'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}, {'?name': 'x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey'}] -``` - -### Example 3: Retrieve a list of GNI observations by country. - -```python ->>> gni_by_country_query = 'SELECT ?observation ?place WHERE { ?observation typeOf StatVarObservation . ?observation variableMeasured Amount_EconomicActivity_GrossNationalIncome_PurchasingPowerParity_PerCapita . ?observation observationAbout ?place . ?place typeOf Country . 
} ORDER BY ASC (?place) LIMIT 10' ->>> datacommons.query(gni_by_country_query) -[{'?observation': 'dc/o/syrpc3m8q34z7', '?place': 'country/ABW'}, {'?observation': 'dc/o/bqtfmc351v0f2', '?place': 'country/ABW'}, {'?observation': 'dc/o/md36fx6ty4d64', '?place': 'country/ABW'}, {'?observation': 'dc/o/bm28zvchsyf4b', '?place': 'country/ABW'}, {'?observation': 'dc/o/3nleez1feevw6', '?place': 'country/ABW'}, {'?observation': 'dc/o/x2yg38d0xecnf', '?place': 'country/ABW'}, {'?observation': 'dc/o/7swdqf6yjdyw8', '?place': 'country/ABW'}, {'?observation': 'dc/o/yqmsmbx1qskfg', '?place': 'country/ABW'}, {'?observation': 'dc/o/6hlhrz3k8p5wf', '?place': 'country/ABW'}, {'?observation': 'dc/o/txfw505ydg629', '?place': 'country/ABW'}] -``` - -### Example 4: Retrieve a sample list of observations with the unit InternationalDollar. - -```python ->>> internationalDollar_obs_query = 'SELECT ?observation WHERE { ?observation typeOf StatVarObservation . ?observation unit InternationalDollar } LIMIT 10' ->>> datacommons.query(internationalDollar_obs_query) -[{'?observation': 'dc/o/s3gzszzvj34f1'}, {'?observation': 'dc/o/gd41m7qym86d4'}, {'?observation': 'dc/o/wq62twxx902p4'}, {'?observation': 'dc/o/d93kzvns8sq4c'}, {'?observation': 'dc/o/6s741lstdqrg4'}, {'?observation': 'dc/o/2kcq1xjkmrzmd'}, {'?observation': 'dc/o/ced6jejwv224f'}, {'?observation': 'dc/o/q31my0dmcryzd'}, {'?observation': 'dc/o/96frt9w0yjwxf'}, {'?observation': 'dc/o/rvjz5xn9mlg73'}] -``` - -### Example 5: Retrieve a list of ten distinct annual estimates of life expectancy, along with the year of estimation, for forty-seven-year-old Hungarians. - -```python ->>> life_expectancy_query = 'SELECT DISTINCT ?LifeExpectancy ?year WHERE { ?o typeOf StatVarObservation . ?o variableMeasured LifeExpectancy_Person_47Years . ?o observationAbout country/HUN . ?o value ?LifeExpectancy . 
?o observationDate ?year } ORDER BY ASC(?LifeExpectancy) LIMIT 10' ->>> datacommons.query(life_expectancy_query) -[{'?LifeExpectancy': '26.4', '?year': '1993'}, {'?LifeExpectancy': '26.5', '?year': '1992'}, {'?LifeExpectancy': '26.7', '?year': '1990'}, {'?LifeExpectancy': '26.7', '?year': '1994'}, {'?LifeExpectancy': '26.8', '?year': '1991'}, {'?LifeExpectancy': '26.9', '?year': '1995'}, {'?LifeExpectancy': '27.2', '?year': '1996'}, {'?LifeExpectancy': '27.4', '?year': '1999'}, {'?LifeExpectancy': '27.5', '?year': '1997'}, {'?LifeExpectancy': '27.5', '?year': '1998'}] -``` - -### Example 6: Use the `select` function to filter returns based on name. - -```python ->>> names_for_places_query = 'SELECT ?name ?dcid WHERE { ?a typeOf Place . ?a name ?name . ?a dcid ("geoId/06" "geoId/21" "geoId/24") . ?a dcid ?dcid }' ->>> maryland_selector = lambda row: row['?name'] == 'Maryland' ->>> result = datacommons.query(names_for_places_query, select=maryland_selector) ->>> for r in result: -... print(r) -... -{'?name': 'Maryland', '?dcid': 'geoId/24'} -``` - -### Error response 1: Malformed SPARQL query - -```python ->>> gni_by_country_query = 'SELECT ?observation WHERE { ?observation typeOf StatVarObservation . ?observation variableMeasured Amount_EconomicActivity_GrossNationalIncome_PurchasingPowerParity_PerCapita . ?observation observationAbout ?place . ?place typeOf Country . 
} ORDER BY ASC (?place) LIMIT 10' ->>> datacommons.query(gni_by_country_query) -Traceback (most recent call last): - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/query.py", line 102, in query - res = six.moves.urllib.request.urlopen(req) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 222, in urlopen - return opener.open(url, data, timeout) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 531, in open - response = meth(req, response) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 641, in http_response - 'http', request, response, code, msg, hdrs) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 569, in error - return self._call_chain(*args) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain - result = func(*args) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 649, in http_error_default - raise HTTPError(req.full_url, code, msg, hdrs, fp) -urllib.error.HTTPError: HTTP Error 500: Internal Server Error - -During handling of the above exception, another exception occurred: - -Traceback (most recent call last): - File "", line 1, in - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/query.py", line 104, in query - raise ValueError('Response error {}:\n{}'.format(e.code, e.read())) -ValueError: Response error 500: -b'{\n "code": 2,\n "message": "googleapi: Error 400: Unrecognized name: place; Did you mean name? at [1:802], invalidQuery",\n "details": [\n {\n "@type": "type.googleapis.com/google.rpc.DebugInfo",\n "stackEntries": [],\n "detail": "internal"\n }\n ]\n}\n' -``` - -### Error response 2: Malformed SPARQL query string - -```python ->>> gni_by_country_query = 'SELECT ?observation WHERE { ?observation typeOf StatVarObservation . ?observation variableMeasured Amount_EconomicActivity_GrossNationalIncome_PurchasingPowerParity_PerCapita . 
?observation observationAbout ?place . ?place typeOf Country . } ORDER BY ASC (?place) LIMIT 10' ->>> datacommons.query(gni_by_country_query) -Traceback (most recent call last): - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/query.py", line 102, in query - res = six.moves.urllib.request.urlopen(req) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 222, in urlopen - return opener.open(url, data, timeout) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 531, in open - response = meth(req, response) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 641, in http_response - 'http', request, response, code, msg, hdrs) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 569, in error - return self._call_chain(*args) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 503, in _call_chain - result = func(*args) - File "/home/porpentina/miniconda3/lib/python3.7/urllib/request.py", line 649, in http_error_default - raise HTTPError(req.full_url, code, msg, hdrs, fp) -urllib.error.HTTPError: HTTP Error 500: Internal Server Error - -During handling of the above exception, another exception occurred: - -Traceback (most recent call last): - File "", line 1, in - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/query.py", line 104, in query - raise ValueError('Response error {}:\n{}'.format(e.code, e.read())) -ValueError: Response error 500: -b'{\n "code": 2,\n "message": "googleapi: Error 400: Unrecognized name: place; Did you mean name? at [1:802], invalidQuery",\n "details": [\n {\n "@type": "type.googleapis.com/google.rpc.DebugInfo",\n "stackEntries": [],\n "detail": "internal"\n }\n ]\n}\n' ->>> gni_by_country_query = 'SELECT ?observation WHERE { ?observation typeOf StatVarObservation . \\\\\ ?observation variableMeasured Amount_EconomicActivity_GrossNationalIncome_PurchasingPowerParity_PerCapita . 
?observation observationAbout ?place . ?place typeOf Country . } ORDER BY ASC (?place) LIMIT 10' -``` - -### Error response 3: Bad selector - -```python ->>> names_for_places_query = 'SELECT ?name ?dcid WHERE { ?a typeOf Place . ?a name ?name . ?a dcid ("geoId/06" "geoId/21" "geoId/24") . ?a dcid ?dcid }' ->>> bad_selector = lambda row: row['?earthquake'] == 'Nonexistent' ->>> result = datacommons.query(names_for_places_query, select=bad_selector) -Traceback (most recent call last): - File "", line 1, in - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/query.py", line 127, in query - if select is None or select(row_map): - File "", line 1, in -KeyError: '?earthquake' -``` - - diff --git a/api/python/stat_all.md b/api/python/stat_all.md deleted file mode 100644 index 84ae5555c..000000000 --- a/api/python/stat_all.md +++ /dev/null @@ -1,153 +0,0 @@ ---- -layout: default -title: Get statistical data for multiple places -nav_order: 35 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve a collection of statistical data for multiple places - -Returns a nested `dict` of all time series for [`places`](https://datacommons.org/browser/Place) and [`stat_vars`](https://datacommons.org/browser/StatisticalVariable). -Note that in Data Commons, a `StatisticalVariable` is any type of statistical metric that can be measured at a place and -time. See the [full list of StatisticalVariables](/statistical_variables.html). - -## General information about this method - -**Signature**: - -```python -datacommons.get_stat_all(places, stat_vars) -``` - -**Required arguments** - -- `places`: The [`DCID`](/glossary.html) IDs of the [`Place`](https://datacommons.org/browser/Place) objects to query for. (Here DCID stands for Data Commons ID, the unique identifier assigned to all entities in Data Commons.) 
-- [`stat_vars`](/glossary.html): The `dcids` of the [`StatisticalVariables`](https://datacommons.org/browser/StatisticalVariable). - -## Assembling the information you will need for a call to the get_stat_all method - -Going into more detail on how to assemble the values for the required arguments: - -- `place`: For this parameter, you will need to specify the DCID (the unique ID assigned by Data Commons to each node in the graph) of the place you are interested in. -- [`stat_var`](/glossary.html): The statistical variable whose value you are interested in. - -> **NOTE:** -> Be sure to initialize the library. Check the [Python library setup guide](/api/python/) for more details. - -## What to expect in the function return - -The method's return value will always be an object in the following form: - -```python -{ - "": { - "stat_var": { - "sourceSeries": [ - { - "val": { - <"time series"> - } - "measurementMethod": "", - "observationPeriod": "", - "importName": "", - "provenanceDomain": "" - } - ... - ] - } - ... - } - ... -} -``` - -For more information on the key terms in this sample response, see [the glossary](/glossary.html). - -## Examples - -### Example 1: Retrieve the total population as well as the male population of Arkansas. - -```python ->>> import datacommons as dc ->>> dc.get_stat_all(["geoId/05"], ["Count_Person", "Count_Person_Male"]) -{ - 'geoId/05': { - 'Count_Person_Female': { - 'sourceSeries': [ - { - 'val': { - '2001': 1376360 - '2002': 1382090, - ... - '2017': 1521170, - '2018': 1527580, - }, - 'measurementMethod': 'OECDRegionalStatistics', - 'observationPeriod': 'P1Y', - 'importName': 'OECDRegionalDemography', - 'provenanceDomain': 'oecd.org' - }, - { - 'val': { - '2011': 1474641, - '2012': 1485120 - ... 
- '2017': 1516293, - '2018': 1522259, - }, - 'measurementMethod': 'CensusACS5yrSurvey', - 'importName': 'CensusACS5YearSurvey', - 'provenanceDomain': 'census.gov' - } - ] - }, - 'Count_Person_Male': { - 'sourceSeries': [ - { - 'val': { - '2001': 1315210, - '2002': 1323840, - ... - '2017': 1475420, - '2018': 1480140, - }, - 'measurementMethod': 'OECDRegionalStatistics', - 'observationPeriod': 'P1Y', - 'importName': 'OECDRegionalDemography', - 'provenanceDomain': 'oecd.org' - }, - { - 'val': { - '2011': 1421287 - '2012': 1431252, - ... - '2017': 1461651, - '2018': 1468412, - }, - 'measurementMethod': 'CensusACS5yrSurvey', - 'importName': 'CensusACS5YearSurvey', - 'provenanceDomain': 'census.gov' - } - ] - } - } -} -``` - -### Example 2: Retrieve the populations of people with doctoral degrees in Minnesota and Wisconsin. - -```python ->>> datacommons.get_stat_all(["geoId/27","geoId/55"], ["Count_Person_EducationalAttainmentDoctorateDegree"]) -{'geoId/27': {'Count_Person_EducationalAttainmentDoctorateDegree': {'sourceSeries': [{'val': {'2016': 50039, '2017': 52737, '2018': 54303, '2012': 40961, '2013': 42511, '2014': 44713, '2015': 47323}, 'measurementMethod': 'CensusACS5yrSurvey', 'importName': 'CensusACS5YearSurvey', 'provenanceDomain': 'census.gov', 'provenanceUrl': 'https://www.census.gov/'}]}}, 'geoId/55': {'Count_Person_EducationalAttainmentDoctorateDegree': {'sourceSeries': [{'val': {'2017': 43737, '2018': 46071, '2012': 38052, '2013': 38711, '2014': 40133, '2015': 41387, '2016': 42590}, 'measurementMethod': 'CensusACS5yrSurvey', 'importName': 'CensusACS5YearSurvey', 'provenanceDomain': 'census.gov', 'provenanceUrl': 'https://www.census.gov/'}]}}} -``` - -## Error returns - -When no data is found, the API returns a dictionary with no values: - -```python ->>> import datacommons as dc ->>> dc.get_stat_all(["bad value"],["another bad value"]) -{'bad value': {'another bad value': {}}} -``` diff --git a/api/python/stat_series.md b/api/python/stat_series.md deleted 
file mode 100644 index bf3aa6411..000000000 --- a/api/python/stat_series.md +++ /dev/null @@ -1,115 +0,0 @@ ---- -layout: default -title: Get time series for a place -nav_order: 34 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve statistical time series for a place - -Returns a `dict` mapping date to value for a `place` based on the -[`stat_var`](https://datacommons.org/browser/StatisticalVariable), with optional -filter parameters. See the [full list of `StatisticalVariable` classes](/statistical_variables.html). - -## General information about this endpoint - -**Signature**: - -```python -datacommons.get_stat_series(place, stat_var, measurement_method=None,observation_period=None, unit=None, scaling_factor=None) -``` - -**Required arguments**: - -* `place`: The `dcid` of the [`Place`](https://datacommons.org/browser/Place) to query for. -* `stat_var`: The `dcid` of the [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). - -> **NOTE:** -> - In Data Commons, [`dcid`](/glossary.html) stands for Data Commons ID and indicates the unique identifier assigned to every node in the knowledge graph. - -## Assembling the information you will need for a call to the get_stat_series method - -Going into more detail on how to assemble the values for the required arguments: - -- [`place`](/glossary.html): For this parameter, you will need to specify the DCID (the unique ID assigned by Data Commons to each node in the graph) of the place you are interested in. - -- [`stat_var`](/glossary.html): The statistical variable whose value you are interested in. - -In addition to these required properties, this endpoint also allows for other, optional arguments. Here are helpful arguments in regular use by Data Commons developers: - -- [`measurement_method`](https://docs.datacommons.org/glossary.html): The technique used for measuring a statistical variable. 
- -- [`observation_period`](https://docs.datacommons.org/glossary.html): The time period over which an observation is made. - -- [`unit`](https://docs.datacommons.org/glossary.html): The unit of measurement. - -- [`scaling_factor`](https://docs.datacommons.org/glossary.html): Property of statistical variables indicating factor by which a measurement is multiplied to fit a certain format. - -Note that specifying arguments that do not exist for the target place and variable will result in an empty response. - -## What to expect in the function return - -The method's return value will always be a `dict` in the following form: - -```python -{ - "": - ... -} -``` - -## Examples - -### Example 1: Retrieve the count of men in the state of California. - -```python ->>> datacommons.get_stat_series("geoId/05", "Count_Person_Male") -{'2011': 1421287, '2012': 1431252, '2013': 1439862, '2014': 1447235, '2015': 1451913, '2016': 1456694, '2017': 1461651, '2018': 1468412} -``` - -### Example 2: Retrieve the number of people in Bosnia and Herzegovina as counted by the Bosnian census. - -```python ->>> datacommons.get_stat_series("country/BIH", "Count_Person", measurement_method="BosniaCensus") -{'2013': 3791622} -``` - -### Example 3: Retrieve the death count in Miami-Dade County over a period of one year. - -```python ->>> datacommons.get_stat_series("geoId/12086", "Count_Death", observation_period="P1Y") -{'1999': 19170, '2001': 19049, '2002': 18176, '2009': 17806, '2012': 18621, '2015': 19542, '2005': 18400, '2008': 18012, '2010': 18048, '2017': 20703, '2000': 18540, '2003': 18399, '2006': 18261, '2013': 18473, '2014': 19013, '2004': 18384, '2007': 17982, '2011': 17997, '2016': 20277} -``` - -### Example 4: Retrieve the distrubtion of naloxone in Miami-Dade County in grams. 
- -```python ->>> datacommons.get_stat_series("geoId/12086", "RetailDrugDistribution_DrugDistribution_Naloxone", unit="Grams") -{'2007-07': 80.34, '2007-10': 118.79, '2006-01': 44.43, '2006-04': 48.28, '2006-07': 54.98, '2006-10': 55.21, '2007-01': 59.63, '2007-04': 65.98} -``` - -### Example 5: Retrieve the percentage of nominal GDP spent by the government of the Gambia on education. - -```python ->>> datacommons.get_stat_series("country/GMB", "Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal", scaling_factor="100.0000000000") -{'1986': 3.48473, '1996': 2.56628, '2000': 1.46587, '2010': 4.1561, '2014': 2.17849, '2012': 4.10118, '2013': 1.82979, '1999': 1.56513, '1985': 4.29515, '1992': 1.16984, '1995': 2.55356, '2002': 1.44292, '2015': 2.13528, '2005': 1.13919, '2018': 2.43275, '2008': 3.52738, '2016': 2.05946, '1989': 2.97409, '1990': 2.82584, '1991': 3.78061, '2011': 3.92511, '2004': 1.0345, '2007': 1.30849, '2009': 3.07235, '2001': 1.1581, '2003': 1.36338, '2006': 1.20949} -``` - -## Error Returns - -If there is no series associated with the requested property, an empty list is returned: - -```python ->>> datacommons.get_stat_series("geoId/1001", "Count_Person_Male") -{} -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons.get_stat_series("geoId/1001") -Traceback (most recent call last): - File "", line 1, in -TypeError: get_stat_series() missing 1 required positional argument: 'stat_var' -``` diff --git a/api/python/stat_value.md b/api/python/stat_value.md deleted file mode 100644 index 02051089c..000000000 --- a/api/python/stat_value.md +++ /dev/null @@ -1,124 +0,0 @@ ---- -layout: default -title: Get a single statistical value for a place -nav_order: 33 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve statistics value for a place - -Returns a statistical value for 
a place based on the -[`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). -See the [full list of `StatisticalVariable` classes](/statistical_variables.html). - -When there are multiple sources for the same statistical variable, a preferred -source with more recent data or more authority is selected. - -## General information about this endpoint - -**Signature**: - -```python -datacommons.get_stat_value(place, stat_var, date=None, measurement_method=None,observation_period=None, unit=None, scaling_factor=None) -``` - -**Required arguments**: - -* `place`: The [DCID](https://docs.datacommons.org/glossary.html) of the [`Place`](https://datacommons.org/browser/Place) to query for. -* `stat_var`: The DCID of the [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). - -You can find a list of StatisticalVariables with human-readable names [here](/statistical_variables.html). - -**Optional arguments**: - -* `date`: The preferred date of observation in ISO 8601 format. If not specified, returns the latest observation. -* `measurement_method`: The DCID of the preferred `measurementMethod` value. -* `observation_period`: The preferred `observationPeriod` value. -* `unit`: The DCID of the preferred `unit` value. -* `scaling_factor`: The preferred `scalingFactor` value. - -## Assembling the information you will need for a call to the get_stat_value method - -Going into more detail on how to assemble the values for the required arguments: - -- [`place`]((/glossary.html)): For this parameter, you will need to specify the DCID (the unique ID assigned by Data Commons to each node in the graph) of the place you are interested in. -- [`stat_var`](/glossary.html): The statistical variable whose value you are interested in. - -In addition to these required properties, this method also allows for other, optional arguments. 
Here are helpful arguments in regular use by Data Commons developers: - -- [`date`](https://docs.datacommons.org/glossary.html): Specified in ISO 8601 format. Examples include `2011` (the year 2011), `2019-06` (the month of June in the year 2019), and `2019-06-05T17:21:00-06:00` (5:17PM on June 5, 2019, in CST). - -- [`measurement_method`](https://docs.datacommons.org/glossary.html): The technique used for measuring a statistical variable. - -- [`observation_period`](https://docs.datacommons.org/glossary.html): The time period over which an observation is made. - -- [`unit`](https://docs.datacommons.org/glossary.html): The unit of measurement. - -- [`scaling_factor`](https://docs.datacommons.org/glossary.html): Property of statistical variables indicating factor by which a measurement is multiplied to fit a certain format. - -## What to expect in the return - -The method will return a simple number, like '1.20949' or '1431252'. - -## Examples - -### Example 1: Retrieve the count of men in the state of California. - -```python ->>> datacommons.get_stat_value("geoId/05", "Count_Person_Male") -1474705 -``` - -### Example 2: Retrieve the count of men in the state of California in the year 2012. - -```python ->>> datacommons.get_stat_value("geoId/05", "Count_Person_Male", date="2012") -1431252 -``` - -### Example 3: Retrieve the number of people in Bosnia and Herzegovina as counted by the Bosnian census. - -```python ->>> datacommons.get_stat_value("country/BIH", "Count_Person", measurement_method="BosniaCensus") -3791622 -``` - -### Example 4: Retrieve the death count in Miami-Dade County over a period of one year. - -```python ->>> datacommons.get_stat_value("geoId/12086", "Count_Death", observation_period="P1Y") -20703 -``` - -### Example 5: Retrieve the distrubtion of the drug naloxone in Miami-Dade County in grams. 
- -```python ->>> datacommons.get_stat_value("geoId/12086", "RetailDrugDistribution_DrugDistribution_Naloxone", unit="Grams") -118.79 -``` - -### Example 6: Retrieve the percentage of nominal GDP spent by the government of the Gambia on education. - -```python ->>> datacommons.get_stat_value("country/GMB", "Amount_EconomicActivity_ExpenditureActivity_EducationExpenditure_Government_AsFractionOf_Amount_EconomicActivity_GrossDomesticProduction_Nominal", scaling_factor="100.0000000000") -2.43275 -``` - -## Error Returns - -If there is no value associated with the requested property, `nan` is returned: - -```python ->>> datacommons.get_stat_value("geoId/1001", "Count_Person_Male") -nan -``` - -If you do not pass a required positional argument, a TypeError is returned: - -```python ->>> datacommons.get_stat_value("geoId/1001") -Traceback (most recent call last): - File "", line 1, in -TypeError: get_stat_value() missing 1 required positional argument: 'stat_var' -``` \ No newline at end of file diff --git a/api/python/stats.md b/api/python/stats.md deleted file mode 100644 index ed0c87030..000000000 --- a/api/python/stats.md +++ /dev/null @@ -1,81 +0,0 @@ ---- -layout: default -title: Get statistics for places -parent: Python (V1) -grand_parent: API - Query data programmatically -published: false ---- - -# Get statistics for places - -## `datacommons.get_stats(dcids, stats_var, obs_dates='latest')` - -Returns a time series of statistical values for queried places based on the -[`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable) filtered by the specified dates. -See the [full list of StatisticalVariables](/statistical_variables.html). - -**Arguments** - -* `dcids`: A list of dcids of the - [`Place`](https://datacommons.org/browser/Place) to query for. - -* `stats_var`: The dcid of the - [`StatisticalVariable`](https://datacommons.org/browser/StatisticalVariable). 
- -* `obs_dates`: The dates of [`Observation`](https://datacommons.org/browser/Observation)s to include. Can be 'latest', 'all', or an list of dates in ISO-8601 format, e.g. "YYYY", "YYYY-MM" or "YYYY-MM-DD". - -**Returns** - -A dictionary mapping each `Place` identified by the given `dcid` -to its name and the time series associated with the -`StatisticalVariable` identified by the given `stats_var` -and filtered by `obs_dates`. -See example below for more detail about how the returned dictionary is -structured. - -**Raises** - -* `ValueError` - If the payload returned by the Data Commons REST API is -malformed. - -Be sure to initialize the library. Check the [Python library setup guide](/api/python/) for more details. - -You can find a list of StatisticalVariables with human-readable names [here](/statistical_variables.html). - -## Examples - -We would like to get the time series of the [male population](https://datacommons.org/browser/Count_Person_Male) -in [Arkansas](https://datacommons.org/browser/geoId/05) -and [Santa Clara County](https://datacommons.org/browser/geoId/06085). 
- -```python ->>> import datacommons as dc ->>> dc.set_api_key(YOUR_API_KEY_HERE) ->>> dc.get_stats(["geoId/05", "geoId/06085"], "Count_Person_Male", obs_dates="all") -{ - 'geoId/05': { - 'data': { - '2011': 1421287, - '2012': 1431252, - '2013': 1439862, - '2014': 1447235, - '2015': 1451913, - '2016': 1456694, - '2017': 1461651, - '2018': 1468412 - } - }, - 'geoId/06085': { - 'data': { - '2011': 885307, - '2012': 898013, - '2013': 910196, - '2014': 924848, - '2015': 939004, - '2016': 949223, - '2017': 963317, - '2018': 970469 - } - } -} -``` diff --git a/api/python/triple.md b/api/python/triple.md deleted file mode 100644 index 1153f4fe2..000000000 --- a/api/python/triple.md +++ /dev/null @@ -1,97 +0,0 @@ ---- -layout: default -title: Get triples associated with nodes -nav_order: 32 -parent: Python (V1) -grand_parent: API - Query data programmatically ---- - -# Retrieve triples associated with nodes - -Given a list of nodes, return [triples](https://docs.datacommons.org/glossary.html) which are associated with the specified -node(s). - -## General information about this method - -**Signature**: - -```python -datacommons.get_triples(dcids, limit=datacommons.utils._MAX_LIMIT) -``` - -**Required arguments**: - -* `dcids` - A list of nodes to query, identified by their [DCID](https://docs.datacommons.org/glossary.html). - -**Optional arguments**: - -* `limit` - The maximum number of triples per combination of - property and type associated with nodes linked by that property to fetch, - ≤ 500. - -## Assembling the information you will need for a call to the get_triples method - -This endpoint requires the argument [`dcids`](https://docs.datacommons.org/glossary.html), which are unique node identifiers defined by Data Commons. Your query will need to specify the DCIDs for the nodes of interest. - -In addition to this required property, this endpoint also allows you to specify a limit on how many triples (up to 500) you would like to see in the response. 
- -## What to expect in the function return - -The method's return value will always be a `dict` in the following form: - -```python -{ - "": [, ...] - ... -} -``` - -While all triples contain subjects, predicates, and objects, those entities may be specified using any of a few possible fields. Here are possible keys that you may find associated to triples in the JSON response: - - - `SubjectID` - - `SubjectName` - - `SubjectTypes` - - `Predicate` - - `ObjectID` - - `ObjectName` - - `ObjectValue` - - `ObjectTypes` - - `ProvenanceID` - -## Example requests and responses - -### Example 1: Retrieve triples associated with zip code 94043. - -```python ->>> datacommons.get_triples(['zip/94043']) -{'zip/94043': [('dc/p/zx34sdjfl5v75', 'location', 'zip/94043'), ... ]} -``` - -### Example 2: Retrieve triples associated with two American biological research labs. - -```python ->>> datacommons.get_triples(['dc/c3j78rpyssdmf','dc/7hfhd2ek8ppd2']) -{'dc/c3j78rpyssdmf': [('dc/c3j78rpyssdmf', 'provenance', 'dc/h2lkz1'), ('dc/zn6l0flenf3m6', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/tkcknpfwxfrhf', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/jdzbbfhgzghv1', 'biosampleOntology', 'dc/c3j78rpyssdmf'), ('dc/4f9w8lhcwggxc', 'biosampleOntology', 'dc/c3j78rpyssdmf')], 'dc/7hfhd2ek8ppd2': [('dc/4mjs95b1meh1h', 'biosampleOntology', 'dc/7hfhd2ek8ppd2'), ('dc/13xcyzcr819cb', 'biosampleOntology', 'dc/7hfhd2ek8ppd2'), ('dc/7hfhd2ek8ppd2', 'provenance', 'dc/h2lkz1')]} -``` - -## Error Returns - -If a non-existent triple is passed, a KeyError is thrown: - -```python ->>> datacommons.get_triples(['geoId/123']) -Traceback (most recent call last): - File "", line 1, in - File "/home/porpentina/miniconda3/lib/python3.7/site-packages/datacommons/core.py", line 251, in get_triples - for t in payload[dcid]: -KeyError: 'geoId/123' -``` - -If you do not pass the required positional argument, a TypeError is returned: - -```python ->>> datacommons.get_triples() -Traceback (most recent call last): 
- File "", line 1, in -TypeError: get_triples() missing 1 required positional argument: 'dcids' -``` diff --git a/api/python/v2/index.md b/api/python/v2/index.md index d6db2ddb4..8a42afe73 100644 --- a/api/python/v2/index.md +++ b/api/python/v2/index.md @@ -33,7 +33,6 @@ Here are just some of the changes from the previous version of the libraries: - Requests to base datacommons.org require an [API key](/api/index.html#get-key). - The primary interface is a set of classes representing the REST V2 API endpoints. - Each class provides a `fetch` method that takes an API [_relation expression_](/api/rest/v2/index.md#relation-expressions) as an argument as well as several convenience methods for commonly used operations. -- There is no SPARQL endpoint. {: #install} ## Install the Python Data Commons V2 API diff --git a/api/rest/v2/index.md b/api/rest/v2/index.md index 9583f1287..fe7576425 100644 --- a/api/rest/v2/index.md +++ b/api/rest/v2/index.md @@ -49,7 +49,6 @@ The URIs for the V2 API are below: | Observation | [/observation](/api/rest/v2/observation) | Fetches statistical observations | | Node | [/node](/api/rest/v2/node) | Fetches information about edges and neighboring nodes | | Resolve entities | [/resolve](/api/rest/v2/resolve) | Returns a Data Commons ID ([`DCID`](/glossary.html#dcid)) for entities in the graph | -| SPARQL | [/v2/sparql](/sparql) | Returns matches to a [SPARQL](https://www.w3.org/TR/rdf-sparql-query/){: target="_blank"} graph query | ### Base URL for custom instances @@ -198,7 +197,7 @@ You can combine multiple properties together within `[]`. For example, to reques ### Filters -V2 supports limited filtering of result candidates. Currently the only support is to restrict candidates by entity type. The format of this filter (for non-SPARQL queries) is: +V2 supports limited filtering of result candidates. Currently the only support is to restrict candidates by entity type. The format of this filter is:
 {typeOf:VALUE}
@@ -210,7 +209,6 @@ Here are the contexts where this filter is currently supported:
 |-----|--------------------------------------|-------------|
 | Node and Observation | Incoming property `<-containedInPlace+`  | Return entities of the specified type, that are contained in the selected place entity (or entities). **Note:** the `+` character is required between the property and filter. |
 | Resolve entity | Incoming properties `<-description` 
`<-wikiId`
`<-geoCoordinate` | Return entities of the specified type, that match a selected name, wiki ID, or geocoordinate. | -| SPARQL | In a `WHERE` clause, for any entity being queried | Return only entities of the specified type. | See the endpoint pages for examples. diff --git a/api/rest/v2/migration.md b/api/rest/v2/migration.md index 583ed24f0..c8372c503 100644 --- a/api/rest/v2/migration.md +++ b/api/rest/v2/migration.md @@ -29,7 +29,6 @@ The Data Commons [REST API V2](index.md) is significantly different from V1. Thi | APIs for node information | Multiple endpoints: `find/entities`, `info/place`, `info/variable`, `info/variable-group` and `bulk` versions | Endpoint `node` with `property` parameter and `resolve` endpoint for place DCIDs | | APIs for statistical observations | Endpoints `observations/series` and `observations/point` and `bulk` versions | Single endpoint `observation` | | APIs for statistical variables | Endpoint `variables` and `bulk` equivalent | Endpoint `node` with `property` parameter and relation expressions | -| SPARQL API | Endpoint `query` | Endpoint `sparql` | | HTTP requests | POST requests supported for some bulk endpoints | POST requests supported for all endpoints | ## Examples diff --git a/api/rest/v2/sparql.md b/api/rest/v2/sparql.md deleted file mode 100644 index a6f5d060d..000000000 --- a/api/rest/v2/sparql.md +++ /dev/null @@ -1,291 +0,0 @@ ---- -layout: default -title: Query with SPARQL -nav_order: 5 -parent: REST (V2) -grand_parent: API - Query data programmatically -published: true ---- -{: .no_toc} -# /v2/sparql - -This endpoint makes it possible to query the Data Commons knowledge graph using -[SPARQL](https://www.w3.org/TR/rdf-sparql-query/){: target="_blank"}. SPARQL is a query language developed to retrieve data from [RDF graph](https://en.wikipedia.org/wiki/Resource_Description_Framework){: target="_blank"} content on the web. It leverages the graph structure innate in the data it -queries to return specific information. 
- -> **Note:** Data Commons only supports a limited subset of SPARQL functionality at this time: specifically, only the keywords `WHERE`, `ORDER BY`, `DISTINCT`, and `LIMIT` are supported. - -* TOC -{:toc} - -## Request - -> Note: GET requests are not provided because they are inconvenient to use with SPARQL. - -

-POST request -

- -

-URL: https://api.datacommons.org/v2/sparql - -Header: X-API-Key: AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI - -JSON data: { "query": "SPARQL_QUERY" } -

- - - - -### Query parameters - -| Name | Type | Description | -| --------------------------------------------------- | ------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| key
Required | string | Your API key. See the section on [authentication](/api/rest/v2/index.html#authentication) for details. | -| query
Required | string | A SPARQL query string.
In the query, all desired entities must be specified; wildcards are not supported. Each node or entity should have a `typeOf` condition, for example, ?ENTITY_NAME typeOf City. | -{: .doc-table } - -## Response - -The response looks like: - -
-{
-  "header": [
-    STRING
-  ],
-  "rows": [
-    {
-      "cells": [
-        {
-          "value": STRING
-        }
-      ]
-    },
-    ...
-  ]
-}
-
-{: .response-signature .scroll} - -### Response fields - -| Name | Type | Description | -| ------ | ------ | -------------------------------------------------------------------------------- | -| header | list | List of strings corresponding to the query variables. | -| rows | list | List of `row` objects, with each containing a list of cells and its cell values. | -| cells | object | Contains string field `value` corresponding to the queried variable. | -{: .doc-table} - -## Examples - -### Example 1: Get a list of all cities with a particular property - -Retrieve a list of the DCIDs of cities that have the property [`unDataLabel`](https://datacommons.org/browser/unDataLabel){: target="_blank"}. - -Request: -{: .example-box-title} - -```bash -curl --request POST \ - --url https://api.datacommons.org/v2/sparql \ - --header 'X-API-Key: AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI' \ - --data '{ - "query": "SELECT DISTINCT ?subject \ - WHERE { \ - ?subject unDataLabel ?object . \ - ?subject typeOf City - } - LIMIT 10" - }' -``` - -Response: -{: .example-box-title} - -```json -{ - "header" : [ - "?subject" - ], - "rows" : [ - { - "cells" : [ - { - "value" : "nuts/DED12" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q623736" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q33986" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q221921" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q841241" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q200797" - } - ] - }, - { - "cells" : [ - { - "value" : "undata-geo/C16801100" - } - ] - }, - { - "cells" : [ - { - "value" : "undata-geo/C07501077" - } - ] - }, - { - "cells" : [ - { - "value" : "undata-geo/C23400001" - } - ] - }, - { - "cells" : [ - { - "value" : "wikidataId/Q1005090" - } - ] - } - ] -} -``` -{: .example-box-content .scroll} - -### Example 2: Get a list of biological specimens - -Retrieve a list of 10 biological specimens (DCID: `BiologicalSpecimen`) in -reverse 
alphabetical order. - -Request: -{: .example-box-title} - -```bash -curl --request POST \ - --url https://api.datacommons.org/v2/sparql \ - --header 'X-API-Key: AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI' \ - --data '{ - "query": "SELECT ?name \ - WHERE { \ - ?biologicalSpecimen typeOf BiologicalSpecimen . \ - ?biologicalSpecimen name ?name - } - ORDER BY DESC(?name) - LIMIT 10" -}' -``` -{: .example-box-content .scroll} - -Response: -{: .example-box-title} - -```json -{ - "header": ["?name"], - "rows": [ - { - "cells": [ - { - "value": "x Triticosecale" - } - ] - }, - { - "cells": [ - { - "value": "x Silene" - } - ] - }, - { - "cells": [ - { - "value": "x Silene" - } - ] - }, - { - "cells": [ - { - "value": "x Silene" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - }, - { - "cells": [ - { - "value": "x Pseudelymus saxicola (Scribn. & J.G.Sm.) Barkworth & D.R.Dewey" - } - ] - } - ] -} -``` -{: .example-box-content .scroll} diff --git a/index.md b/index.md index c0f8ac146..094ad37a9 100644 --- a/index.md +++ b/index.md @@ -32,7 +32,7 @@ To find out what data is available in Data Commons, see the [Data sources](/data There are several options for directly querying the data, without accessing the datacommons.org website, both interactive and programmatic: -- **APIs**: Data Commons publishes REST, Python, Pandas, and SPARQL [APIs](/api/index.html). 
These APIs support both low-level exploration of the knowledge graph as well as higher-level statistical analysis of data. You can call them from any application that supports REST protocols. +- **APIs**: Data Commons publishes REST, Python, and Pandas [APIs](/api/index.html). These APIs support both low-level exploration of the knowledge graph as well as higher-level statistical analysis of data. You can call them from any application that supports REST protocols. The [Python](/api/python/v2) APIs provide convenient wrappers, that you can call programmatically or interactively, for example, from a Python virtual environment shell or from [Google Colab](https://colab.sandbox.google.com/){: target="_blank"}. We have developed a set of [Google Colab tutorials](/tutorials/index.html) to help you get started with analysis.