Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about hstore and geometry data types #1093

Closed
sfkeller opened this issue Jun 3, 2019 · 10 comments
Closed

Question about hstore and geometry data types #1093

sfkeller opened this issue Jun 3, 2019 · 10 comments

Comments

@sfkeller
Copy link

sfkeller commented Jun 3, 2019

I'm submitting a feature request, newest PostGraphile version.

I'd like to make a query which is typical for a database derived from OpenStreetMap and I've read the issues, the docs and made several tests without success so far (struggling with #575 ).

Prerequisites: PostgreSQL 11 and PostGIS 2.5. Datatype hstore requires "CREATE EXTENSION hstore;" and geometry/geograph requires "CREATE EXTENSION postgis;" .

The OSM/PostGIS dataset can be generated using https://wiki.openstreetmap.org/wiki/Osm2pgsql#Installation with a OSM PBF file (e.g. Switzerland) from http://download.geofabrik.de/ . The resulting schema in PostgreSQL contains among others a table planet_osm_point (short: osm_point) and the attributes tags (type hstore), and attribute geom (type geography or geometry).

Let's say we want to get all bakeries in Uster (Switzerland) and a resultset in GeoJSON containing osm_id, name, street and geometry.

This is a solution in SQL (Query with PostgreSQL/PostGIS):

with boundary as ( -- bbox of Uster 
	select st_geomfromtext('POLYGON((8.677501 47.383873, 8.760421 47.383873, 8.760421 47.320153, 8.677501 47.320153, 8.677501 47.383873))',4326) as geom
)
select osm_id, name, tags->'addr:street' as street, geom 
from osm_point
where name is not null
and (tags->'shop') = 'bakery'
and st_intersects(geom, (select geom from boundary))
--and st_within(geom::geometry, (select geom from boundary)::geometry)

This is the GraphQL query I think which would come close to what we want:

Query:

query osm_point($boundary: geometry){
  osm_point(
    where: {
      name: {_is_null: false},
      tags: {_contains: {"shop": "bakery"}},
      geom: {_st_within: $polygon}
    }
  ){
    id
    name
    tags  # want not all kv pairs only value_of (tags->'addr:street'): how?
    geom
  }
}

Query variables:

{
  "boundary": {
    "type": "Polygon",
    "coordinates": [
      [
        [8.677501,47.320153],
        [8.760421,47.320153],
        [8.760421,47.383873],
        [8.677501,47.383873],
        [8.677501,47.320153]
      ]
    ]
  }
}

This is a typical GeoJSON dataset containing bakeries in Uster: uster_bakeries_geojson.txt

@sfkeller
Copy link
Author

sfkeller commented Jun 7, 2019

@Murthy10 and I made some more tests and investigations with Postgraphile and this is the current (untested) query to get what we (and GeoJSON) want:

query osm_polygon_query($boundary: geometry){
  osm_polygon(
    first: 1, 
    filter: {
      and: [
        {name: {isNull: false}},
        tags: {contains: {shop: "bakery"}},        
        geom: {st_within: $polygon}
      ]
    }
  )
  {
    geom{
      geojson{
        properties{
          id
          name
          tags{
            addr.street
          }
        }
      }
    }
  }
}

Expecting following valid GeoJSON output (see http://geojsonlint.com/ ) edited by hand:

{
  "type": "FeatureCollection",
  "name": "osm_polygon",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "id": "1111111",
        "name": "My Bakery",
        "tags": {
          "addr.street": "Main Str."
        }
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          888937.17,
          5994550.27
        ]
      }
    }
  ]
}

@benjie
Copy link
Member

benjie commented Jun 13, 2019

I think the geojson field is a scalar, so you can just stop at the geojson field; it doesn’t need (and will not accept) a selection set. I advise writing queries like this in GraphiQL, it should give you guidance.

@sfkeller
Copy link
Author

sfkeller commented Jun 13, 2019

Thx for the advice.

If you look at the desired GeoJSON output given above, it becomes obvious that it's currently not possible that the client can tell the GraphQL server to output the mandatory standardized outer JSON block, which looks like this: {"type": "FeatureCollection", "name": "osm_polygon", "features": [...]}.

Can you give some hints on how and where to extend the Postgraphile resolvers?

@benjie
Copy link
Member

benjie commented Jun 13, 2019

Sure. I don't use PostGIS so you're going to have to connect the dots yourself; but here's some options:

  1. (preferred) write an SQL function that returns the data you desire and exposes it sensibly
create function bakeries_within_geom(geom geometry) returns geometry as $$
  select st_collect(feature) from ...
$$ language sql stable;

This is preferred because it allows you to use all the PostGIS features inside of PostgreSQL whilst only exposing to the user exactly what you want them to see (your GraphQL API shouldn't really be a 1-to-1 map of your database). We call this a "Custom Query" and you can read more about them here: https://www.graphile.org/postgraphile/custom-queries/

  1. Extend your GraphQL schema however you like with makeExtendSchemaPlugin

This enables you to write GraphQL schema language and add fields and resolvers to your GraphQL API. It's a very manual way of doing things, but results in an API that's exactly the shape you specify. Read more here: https://www.graphile.org/postgraphile/make-extend-schema-plugin/

Perhaps @mattbretl or @singingwolfboy could shed more light?

@sfkeller
Copy link
Author

I appreciate solution 1 to write an SQL function that returns the data you desire and exposes it sensibly.

But my goal is to offer an API for spatial data (from OpenStreetMap in this case) which involves two "non-base data types" 'geometry' and 'hstore' including the typical associated functions which are basically handled the same as base data types. So the "responsibility" to resolve 'geometry' and 'hstore' is at data type level (not schema/query type, nor custom query).

What's more tricky is the fact, that GeoJSON mandates in a rather awkward way that the resultset has a fixed outer JSON block {"type": "FeatureCollection", "name": "<some_name>", "features": [...]} . The single key "features" has a JSON array as value. This array finally contains "type": "feature" as "nodes" which have one geometry, and all remaining attributes of each "node" are part of key "properties".

@benjie
Copy link
Member

benjie commented Jun 14, 2019

Is that what ST_collect does - wrap a list of features into a feature collection? PostGraphile has early support for custom aggregates, so you could potentially support this via adding an aggregate field on a connection with a plugin.

The feature collection will still be nested inside the response object, because that’s how GraphQL works. Is that problematic for your use-case?

@sfkeller
Copy link
Author

@benjie
Copy link
Member

benjie commented Jun 14, 2019

I'm not sure what your "not really" relates to; I think we're talking at cross-purposes. I understand the basics of the GeoJSON format, but I'm not super familiar with all the PostGIS operators. Seems that though ST_Collect does form a collection (e.g. from the docs:

SELECT ST_AsText(
  ST_Collect(
    ST_GeomFromText('CIRCULARSTRING(220268 150415,220227 150505,220227 150406)'),
    ST_GeomFromText('CIRCULARSTRING(220227 150406,2220227 150407,220227 150406)')
  )
);
																st_astext
------------------------------------------------------------------------------------
 GEOMETRYCOLLECTION(CIRCULARSTRING(220268 150415,220227 150505,220227 150406),
 CIRCULARSTRING(220227 150406,2220227 150407,220227 150406))

), but you cannot ST_Collect(...) over features since features effectively don't exist in PostGIS?


Imagine I have the following GraphQL schema (nothing to do with PostGraphile):

scalar GeoJSON
type Query {
  geojson: GeoJSON
}

and I query it with this query:

{ geojson }

Then (assuming success) the response I get over HTTP would be:

{
  "data": {
    "geojson": /* SOMETHING HERE */
  }
}

for example:

{
  "data": {
    "geojson": {
      "type": "FeatureCollection",
      "name": "GR_NBK_2_modif_Stefan",
      ...
    }
  }
}

There will never be a root field, "type": "FeatureCollection"; instead you have to access data.geojson to get the object you care about. If this is problematic for your use case, then I don't think GraphQL is the technology you want for your API — REST may be a better fit.


However, if the above is fine for your API (that the GeoJSON is nested inside regular JSON), then what I think you need is a single FeatureCollection somewhere inside your response. I don't know what osm_polygon is or does; but from the GeoJSON output you've said you're expecting, I'm going to assume it returns a set of "features" somehow. From my understanding a "feature" is something that exists in GeoJSON but does not exist in PostGIS (?), so you have to do a bit of work to achieve this.

Judging by this StackOverflow answer you should be able to do something like this to construct a FeatureCollection from a table that represent features (via the gid and geom columns, plus any other columns which go into the properties field):

SELECT json_build_object(
  'type', 'FeatureCollection',
  'features', json_agg(json_build_object(
    'type',       'Feature',
    'id',         YOUR_TABLE_HERE.gid,
    'geometry',   ST_AsGeoJSON(YOUR_TABLE_HERE.geom)::json,
    'properties', (to_jsonb(YOUR_TABLE_HERE) - 'gid' - 'geom')::json
  ))
)
FROM YOUR_TABLE_HERE

(I've changed it from JSONB to JSON for performance reasons, and rewritten the query so that it will work in our aggregates system.)

So you can either write a function that returns this (as JSON), or you could implement it as a custom aggregate. Our custom aggregations does not have any helpers yet, but you can base it on our totalCount column which uses it:

https://github.com/graphile/graphile-engine/blob/44785dec481826f29bfab9e50e7bcd32df7c4a96/packages/graphile-build-pg/src/plugins/PgConnectionTotalCount.js#L33-L55

You might customise it to something like this (untested):

export default function PgConnectionFeatureCollectionPlugin(builder) {
  builder.hook(
    "GraphQLObjectType:fields",
    (fields, build, context) => {
      const {
        extend,
        graphql: { GraphQLInt, GraphQLNonNull },
        pgSql: sql,
        getTypeByName,
      } = build;
      const {
        scope: { isPgRowConnectionType, pgIntrospection: table },
        fieldWithHooks,
        Self,
      } = context;

      // Return if this isn't a table connection
      if (
        !isPgRowConnectionType ||
        !table ||
        table.kind !== "class" ||
        !table.namespace
      ) {
        return fields;
      }

      // Return if this table doesn't have `gid` and `geom` columns
      if (!table.attributes.some(attr => attr.name === 'gid')
        || !table.attributes.some(attr => attr.name === 'geom')) {
        return fields;
      }

      const GraphQLJSONType = getTypeByName('JSON');

      return extend(
        fields,
        {
          totalCount: fieldWithHooks(
            "featureCollection",
            ({ addDataGenerator }) => {
              addDataGenerator(() => {
                return {
                  pgAggregateQuery: aggregateQueryBuilder => {
                    const sqlTable = aggregateQueryBuilder.getTableAlias();
                    aggregateQueryBuilder.select(
                      sql.fragment`

json_build_object(
  'type', 'FeatureCollection',
  'features', json_agg(json_build_object(
    'type',       'Feature',
    'id',         ${sqlTable}.gid,
    'geometry',   ST_AsGeoJSON(${sqlTable}.geom)::json,
    'properties', (to_jsonb(${sqlTable}) - 'gid' - 'geom')::json
  ))
)

`,
                      "featureCollection"
                    );
                  },
                };
              });
              return {
                description: `Feature collection from the features in this table.`,
                type: GraphQLJSONType,
                resolve(parent) {
                  return (
                    (parent.aggregates && parent.aggregates.featureCollection) || null
                  );
                },
              };
            },
            {
              isPgConnectionFeatureCollectionAggregateField: true,
            }
          ),
        },
        `Adding featureCollection field to connection '${Self.name}'`
      );
    }
  );
}

(Note this requires PostGraphile 4.4.0 or higher)

This should add the featureCollection field to any connection which represents a table that has gid and geom columns.

@sfkeller
Copy link
Author

sfkeller commented Jun 18, 2019

Many thanks for your reply. Will take a while for us to digest.

You wrote

Seems that though ST_Collect does form a collection (e.g. from the docs:

and

Judging by this StackOverflow answer you should be able to do something like this to construct a FeatureCollection from a table

Correct. See the most recent solution here: https://giswiki.hsr.ch/PostGIS_-_Tipps_und_Tricks#Export_a_table_with_a_geometry_attribute_to_GeoJSON which is based on this blog post http://blog.cleverelephant.ca/2019/03/geojson.html .

@benjie benjie changed the title Decent support of hstore and geometry data types Question about hstore and geometry data types Jun 19, 2019
@benjie
Copy link
Member

benjie commented Sep 28, 2020

I'm not sure what action is necessary on this, and it's not been active for over a year, so I'm going to close it. Hopefully the discussion above is useful though :)

@benjie benjie closed this as completed Sep 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants