Replies: 4 comments
-
Hi @e-lo! One thing I should be upfront about is that we're viewing GeoJSON as a "mental model" or "canonical data format" that allows us to describe the schema, but not necessarily as a viable final sharing format. That being said, in our current line of thinking, we're looking at data formats that would be losslessly convertible to/from the GeoJSON "canonical model", so your core concerns would probably still remain regardless of the data format. Can I ask you a couple of questions that would help us put your feedback in context?
-
That's a good question. I don't know of a good definition (one probably exists!) but what I do know is that my initial test of some of the example data provided got icky quickly. For example, a few rows of this:
pd.DataFrame([flatten_json.flatten_json(row['properties']) for row in data_json])
turn into a really ugly dataframe where each level of nesting ends up as a variable in the column name, in a way that isn't easily predictable for a random dataset.
When I started looking at writing some simple code to make it more usable, it got complex fast, especially for array items. Rather than invest more time in that (for now), I thought I would write this discussion issue instead ;-) (if nothing else, I would love to learn about a straightforward solution to handling the deep nesting even if the schema doesn't change)
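To make the column-name problem concrete, here is a minimal standard-library sketch of this kind of flattening. The `flatten` helper and the sample `rows` are hypothetical stand-ins written for illustration; they are not the `flatten_json` implementation and not data from the draft schema.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts/lists into one dict keyed by underscore-joined paths."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # array positions become part of the key
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}_{key}" if prefix else str(key)
        flat.update(flatten(child, path))
    return flat


# Hypothetical feature properties (assumed shape, for illustration only).
rows = [
    {"road": {"lanes": [{"direction": "forward"}]}},
    {"road": {"lanes": [{"direction": "forward"}, {"direction": "backward"}]}},
]

flat_rows = [flatten(row) for row in rows]
columns = sorted({key for row in flat_rows for key in row})
print(columns)
```

The set of columns depends on the longest array found anywhere in the data, so two datasets with the same schema can flatten to different table layouts.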
-
I don't have any terrific answers to your other questions, as I think if there were, there would be less need for (and attention on) what you all are doing!
Something similar to
In general, I see a lot of useful usage of
-
I can imagine that being very useful!
-
As somebody potentially developing tools to analyze and edit roadway networks, I'd like to have a more straightforward conversion between the JSON-flavored data and a table-based schema that can be more efficiently analyzed and manipulated.
The advent of osmnx as a tool which took OSM data and put it in pandas GeoDataFrames has unlocked a great deal of research and development and subsequent tooling in analyzing roadway networks around the world.
I'm concerned that the deep nesting in the draft schema would effectively prevent straightforward translation of the JSON-based data to a series of related tables. While I understand why a JSON-based data format was chosen, it can (and should 🤞) be structured so that sub-tables of relationships can be cleanly broken out.
See GMNS for a rough example of a table-based format that you might hope to achieve from manipulating/summarizing the JSON-based data. A goal could (should 🤞) be to be able to summarize several example datasets in Overture's schema into GMNS or a similar format without significant loss of data or hard-coding.
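As a rough illustration of what "breaking out sub-tables" could look like, here is a minimal standard-library sketch that splits one nested array into a child table linked back to its parent by a key. The `split_tables` helper, the field names (`id`, `class`, `lanes`, `parent_id`, `position`) and the sample features are all assumptions made up for this example; they come from neither the draft schema nor GMNS.

```python
def split_tables(features, array_key):
    """Split features into a parent table and a child table for one nested array.

    Each child row carries the parent's id plus its position in the array,
    so the original nesting and ordering can be rebuilt from the two tables.
    """
    parents, children = [], []
    for feature in features:
        props = dict(feature["properties"])  # copy so the input is not mutated
        items = props.pop(array_key, [])
        parents.append({"id": feature["id"], **props})
        for position, item in enumerate(items):
            children.append({"parent_id": feature["id"], "position": position, **item})
    return parents, children


# Hypothetical GeoJSON-like features (assumed shape, for illustration only).
features = [
    {"id": "seg1", "properties": {"class": "primary",
                                  "lanes": [{"direction": "forward"},
                                            {"direction": "backward"}]}},
    {"id": "seg2", "properties": {"class": "residential",
                                  "lanes": [{"direction": "forward"}]}},
]

segments, lanes = split_tables(features, "lanes")
print(segments)
print(lanes)
```

Both lists of rows have a fixed set of columns regardless of how many array items each feature has, and either could be handed to `pd.DataFrame` directly. This only works without hard-coding when the nesting is shallow and regular, which is the property being asked for above.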