Schema Discovery Tool for SpaceCurve System

The Schema Discovery Tool reads JSON data to create a data definition file for SpaceCurve System.

Prerequisite: Python 2.7
Note: This program uses recursion.

Parameters

Parameter & Alternative	Value	Description
-f, --input_path	inputPathName	Required. GeoJSON input filename, pathname, or partial path. Wildcards OK when all schemas match.
-o, --output_path	outputPathName	DDL output filename, pathname, or partial path. If omitted, no DDL is saved.
-c, --schema_name	schemaName	Schema name to use in DDL output.
-t, --table_name	tableName	Table name to use in DDL output.
-s, --sample_freq	sampleFrequency	Only sample every nth record. Default: 1
-l, --limit	sampleLimit	Only sample the first n records. Default: 100,000,000
-a, --attribs_to_lower		Flag. Convert all attribute names to lowercase (implies data conversion).
-v, --verbose		Flag. Show verbose log of tool activity.
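
The interaction of --sample_freq and --limit can be pictured with a short sketch. This is not the tool's own code: the function name sample_records is hypothetical, the sketch is written for Python 3 rather than the Python 2.7 the tool requires, and it assumes the limit counts records read before sampling is applied, which the table above does not state explicitly.

```python
import itertools


def sample_records(lines, sample_freq=1, limit=100000000):
    """Return an iterator over every sample_freq-th record among the first `limit` records.

    Hypothetical helper: assumes `limit` bounds the records read,
    not the number of records kept after sampling.
    """
    # islice(iterable, start, stop, step) keeps indices 0, step, 2*step, ... below stop.
    return itertools.islice(lines, 0, limit, sample_freq)


if __name__ == "__main__":
    records = ["record-%d" % i for i in range(10)]
    # Every 3rd record among the first 8 records.
    print(list(sample_records(records, sample_freq=3, limit=8)))
```

Under the opposite reading, where the limit counts sampled records, the slice would be applied after the step instead.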

Usage

Use this tool to analyze existing JSON and GeoJSON data. This tool can create a file in data definition language (DDL) that defines a database schema. You can use the scctl tool to import the DDL file into SpaceCurve System. For information about importing a DDL file, see Creating Databases and Tables in the SpaceCurve documentation.

This tool infers data types and gathers value distribution statistics for the fields in the source data. This information appears in a comment for each data type in the DDL output.
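
As a rough illustration of this kind of inference, the following sketch walks each record recursively and counts the Python type seen at every field path. It is only a sketch under stated assumptions, not the algorithm in schema-discovery.py: the names walk and discover are hypothetical, the dotted path notation is invented for the example, and it is written for Python 3.

```python
import json
from collections import Counter, defaultdict


def walk(value, path, stats):
    """Recursively count the type of each leaf value, keyed by dotted field path."""
    if isinstance(value, dict):
        for key, child in value.items():
            walk(child, path + (key,), stats)
    else:
        # Lists are treated as leaves here to keep the sketch short.
        stats[".".join(path)][type(value).__name__] += 1


def discover(lines):
    """Build {field path: Counter of type names} from one JSON record per line."""
    stats = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            walk(json.loads(line), (), stats)
    return stats


if __name__ == "__main__":
    sample = [
        '{"properties": {"name": "a", "pop": 3}}',
        '{"properties": {"name": "b", "pop": 2.5}}',
    ]
    for field, counts in sorted(discover(sample).items()):
        print(field, dict(counts))
```

A field that shows more than one type, such as properties.pop in the sample, is the kind of case where a human decision about the final data type is needed.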

Choose Data Types

The DDL file produced by this tool contains inferred data types wherever the tool can determine them. However, you must review the DDL file to confirm that it suits your data. The tool inserts the word Choose where you can or must choose a precise data type. For example, SpaceCurve System accepts both geometry (flat) and geography (globe-based) geospatial data, so the tool leaves that choice to you. If the tool cannot determine any data type from your source data, <<Choose appears in the comment. For these fields, you must choose a data type that can handle your source data values.

Partition and Index

Creating a DDL file with correct data types is just one step in importing data into SpaceCurve System. Your database also needs partitioning and indexing that reflect the kinds of queries you will run. See System Data Management in the SpaceCurve documentation for guidance on optimizing queries.

JSON Format

SpaceCurve System uses a data format similar to GeoJSON. You can see the GeoJSON specification at http://geojson.org/. SpaceCurve input omits the FeatureCollection wrapper: GeoJSON objects appear sequentially, with no commas between records. Each record must appear on a single line, with a line break separating records.

This call uses the jq tool to convert standard GeoJSON into a format readable by schema-discovery.py and SpaceCurve System:

jq --compact-output '.features[]' standard.json > spacecurve.json
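
Where jq is not available, the same conversion can be approximated in Python. The sketch below is an assumption-laden equivalent rather than part of the tool: the function name to_line_delimited is hypothetical, it is written for Python 3, and it loads the whole input into memory, so it only suits files that fit in RAM.

```python
import json


def to_line_delimited(collection_text):
    """Return one compact JSON feature per line from a FeatureCollection string."""
    collection = json.loads(collection_text)
    lines = [
        # Compact separators mimic jq's compact output; ensure_ascii=False
        # keeps non-ASCII characters unescaped.
        json.dumps(feature, separators=(",", ":"), ensure_ascii=False)
        for feature in collection["features"]
    ]
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # File names match the jq command above.
    with open("standard.json", encoding="utf-8") as src:
        converted = to_line_delimited(src.read())
    with open("spacecurve.json", "w", encoding="utf-8") as dst:
        dst.write(converted)
```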

See the radar.json data file included in the SpaceCurve documentation for an example of data in ingestible GeoJSON format.

Example

These lines of bash script scan a GeoJSON file, create a DDL file, and create a database instance on the master node.

python schema-discovery.py --input_path spacecurve.json --table_name places --output_path places_schema.sql
scctl shell --ddl --instance_name=places --file=places_schema.sql
