HOCON schema is a type-safe data validation framework inspired by Basho's cuttlefish.
There are two high-level kinds of data types defined in HOCON schema: primitive and complex.
If we think of an Erlang data structure as a 'tree', then primitive types denote the 'leaves'
of the 'tree', i.e. the terminal values,
while complex types denote the values which enclose other complex or primitive values.
Most of the primitive types are provided by (and can also be extended from) the typerefl library.
Typerefl is highly composable, so it could also be used to define complex types.
However, in HOCON schema we only use it to define primitive types.
Here is a list of the primitive types for reference:
- enum: Enum is a list of Erlang atoms
- singleton: singleton is an Erlang atom
- integer: typerefl:integer()
- string: typerefl:string()
And an extended primitive type example: ip_port
-type ip_port() :: tuple().
-typerefl_from_string({ip_port/0, this_module, to_ip_port}).
to_ip_port(String) ->
    %% string:tokens/2 requires the separator characters as its second argument
    case string:tokens(String, ":") of
        ....
    end.
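As an illustration of what such a conversion function could look like, here is a minimal, self-contained sketch that uses only the Erlang standard library. The module name ip_port_sketch, the error term bad_ip_port, and the {ok, Value} | {error, Reason} return convention are assumptions made for this example; check the typerefl documentation for the exact contract expected by typerefl_from_string. The sketch only handles IPv4-style "Host:Port" strings, since splitting on ":" does not work for IPv6 literals.

```erlang
-module(ip_port_sketch).
-export([to_ip_port/1]).

%% Hypothetical converter: "127.0.0.1:8080" -> {ok, {{127,0,0,1}, 8080}}.
%% The return convention is an assumption, not taken from the text above.
to_ip_port(String) ->
    case string:tokens(String, ":") of
        [IpStr, PortStr] ->
            %% Parse both halves and require the port to be a plain integer
            case {inet:parse_address(IpStr), string:to_integer(PortStr)} of
                {{ok, Ip}, {Port, []}} when Port >= 0, Port =< 65535 ->
                    {ok, {Ip, Port}};
                _ ->
                    {error, bad_ip_port}
            end;
        _ ->
            %% Missing port, or more than one ':' (e.g. an IPv6 literal)
            {error, bad_ip_port}
    end.
```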
HOCON schema supports 3 different complex types: struct, array, and union.
NOTE: to make future extensions easier, it is recommended to use the hoconsc module APIs to define schemas.
NOTE: HOCON schema does not support non-struct root level data types, e.g. it is not allowed to
define a root level schema with just an integer() type.
Structs consist of data fields, which can be defined using hocon_schema behaviour callbacks.
- roots/0: This callback returns all the root level fields.
- fields/1: This callback returns the schema for each data field (in a list, so order matters).
For example, to define a struct named foo having one integer field, the schema module may look like:
-export([roots/0, fields/1]).
roots() -> ["foo"]. %% 'exported' root names, equivalent to `[{"foo", hoconsc:ref("foo")}].`
fields("foo") -> [{"field1", typerefl:integer()}].
In this case, the schema for use in hocon_schema APIs is the module name. There is another way to
define a struct, as an Erlang map(), so that we do not have to implement the behaviour callbacks
(this is, however, mostly for test cases):
#{roots => ["foo"], %% 'exported' root names
fields => #{"foo" => [{"field1", typerefl:integer()}]}
}
In order to promote code abstraction and prevent copy-paste as much as possible, HOCON schema provides no way to define nested structs (a child struct defined inline in a parent struct). The parent-child relationship has to be defined by struct 'referencing'.
For example, if the type of a parent struct's field is another struct, the field's type should be defined as:
[ ...,
{field_N, hoconsc:ref("field_struct_name")},
...
].
The root struct names exported by the roots/0 callback serve as the field names of the top level struct,
like listener, zone and broker in etc/emqx.conf.
An array is a sequence of values of another type, and is defined as {array, Type}.
A union type is what is called a one_of type in some contexts.
When data is validated against the schema (recursively), the code enumerates
the union member types in the defined order until the given data matches one of the union members.
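The three complex types can be combined in one schema module. The sketch below is illustrative only: the module name demo_schema and the field names are invented for this example, and it assumes the hoconsc module provides array/1 and union/1 constructors alongside the ref/1 used above, and that a bare atom in a union is treated as a singleton type. Because union members are tried in the defined order, the more specific member (the infinity atom) is listed first.

```erlang
-module(demo_schema).
-behaviour(hocon_schema).
-export([roots/0, fields/1]).

%% One exported root struct named "demo"
roots() -> ["demo"].

fields("demo") ->
    [ %% array: a sequence of integers
      {"ports", hoconsc:array(typerefl:integer())}
      %% union: members are tried in order, 'infinity' first, then integer()
    , {"limit", hoconsc:union([infinity, typerefl:integer()])}
      %% struct: nested structs are always expressed as references
    , {"child", hoconsc:ref("child")}
    ];
fields("child") ->
    [ {"name", typerefl:string()} ].
```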
Starting an Erlang node usually requires a system configuration file (usually named sys.config),
see the Erlang documentation for more details.
When using HOCON config format, we need a tool to transform a HOCON file to a config file of sys.config format.
hocon_schema is such a tool.
The content of the above mentioned config file for Erlang node to bootstrap is essentially an Erlang expression which evaluates to an Erlang term (the 'object' in Erlang).
To map HOCON objects (or their fields) to Erlang terms, we need to define a set of rules. In HOCON schema, such rules are called 'mapping' rules.
This is when we need to introduce metadata to struct fields' schema.
The way to define a 'mapping' metadata is like below:
fields("struct_foo") ->
    [ {field1, #{type => integer(),
                 mapping => "app_foo.field1"
                }}
    ].
This should map HOCON config {struct_foo: {field1: 12}} to sys.config like [{app_foo, [{field1, 12}]}].
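To make the shape of the result concrete, here is a small standalone sketch of how a dotted mapping path and a value can be turned into a sys.config-style proplist. This is not the hocon_schema implementation; the module name mapping_sketch and the function to_sys_config/2 are hypothetical, and the way deeper paths are nested is an assumption made for illustration.

```erlang
-module(mapping_sketch).
-export([to_sys_config/2]).

%% "app_foo.field1" with value 12 becomes [{app_foo, [{field1, 12}]}].
%% The first path segment is taken as the application name.
%% A mapping without at least one key after the application name is
%% not handled here and will crash.
to_sys_config(Mapping, Value) ->
    %% list_to_atom/1 is acceptable in a sketch; real code should be
    %% careful about creating atoms from untrusted input.
    [App | Path] = [list_to_atom(T) || T <- string:tokens(Mapping, ".")],
    [{App, nest(Path, Value)}].

%% Wrap the value in one proplist level per remaining path segment
nest([Key], Value) -> [{Key, Value}];
nest([Key | Rest], Value) -> [{Key, nest(Rest, Value)}].
```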
Sometimes it's impossible to perform a perfect mapping from HOCON object to Erlang term.
This is when translation is used.
Translations are defined as callbacks too. For example, if we want to translate
two config entries named 'min' and 'max' into a range tuple in sys.config,
the schema below should do it.
-module(myapp_schema).
translation("foo") ->
    [{"range", fun range/1}].
range(Conf) ->
    Min = hocon_maps:get("foo.min", Conf),
    Max = hocon_maps:get("foo.max", Conf),
    case Min < Max of
        true ->
            {Min, Max};
        _ ->
            undefined
    end.
As in the example, a translation callback is provided with the global config.
Specific field values can be retrieved with the hocon_maps:get API.
Inter-field or even inter-object config validation can be done by implementing
the optional validations callback.
Validations work similarly to translations, except that a passing return value (ok or true) is discarded
and failures are raised as exceptions in the map call.
NOTE: the integrity validation is performed after all fields are checked and converted.
Below is an example to ensure that the min field is never greater than max field.
-module(myapp_schema).
validations() ->
    [{"min =< max", fun min_max/1}].
min_max(Conf) ->
    Min = hocon_maps:get("foo.min", Conf),
    Max = hocon_maps:get("foo.max", Conf),
    case Min =< Max of
        true -> ok; %% return true | ok to pass this validation
        false -> "min > max is not allowed" %% or, to return early, use throw(Reason)
    end.
Besides the mapping metadata introduced above for config mapping,
HOCON schema also supports the following field metadata:
- converter: an anonymous function evaluated during config generation to convert the field value.
- validator: a field value validator; an anonymous function which should return true or ok if the value is as expected. NOTE: the validator receives the value after the converter (if present) has been applied.
- default: the default value of the field. NOTE: default values are treated as raw inputs, meaning they are put through the converters, validators etc., and then type-checked.
- required: set to false if this field is allowed to be undefined. NOTE: there is no point setting it to true if the field has a default value.
- sensitive: set to true if this field's value is sensitive, so that it is obfuscated as ******** when logged.
- desc: text for document generation.
- hidden: a boolean flag to hide the field from the generated config document.
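A field carrying several of these metadata entries might look like the sketch below. The module name, struct name, field names and values are invented for illustration; only the metadata keys come from the list above. The validator is written as a one-argument function returning a boolean, which is an assumption about its arity.

```erlang
-module(listener_schema).
-behaviour(hocon_schema).
-export([roots/0, fields/1]).

roots() -> ["listener"].

fields("listener") ->
    [ {"port",
       #{ type => typerefl:integer()
          %% the default is treated as raw input, so it is validated too
        , default => 8080
          %% returns true when the value is acceptable
        , validator => fun(Port) -> Port > 0 andalso Port =< 65535 end
        , desc => "TCP port to listen on"
        }}
    , {"password",
       #{ type => typerefl:string()
          %% no default, so explicitly allow the field to be undefined
        , required => false
          %% obfuscated when logged
        , sensitive => true
        }}
    ].
```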
By default, a field (unless it is inside an array element) can be overridden by an environment variable. The variable name is derived from the field's absolute path, with dots replaced by double underscores, and then prepended with a prefix.
For example, the value of config entry foo.bar.field1 can be overridden by
PREFIX_FOO__BAR__FIELD1 or PREFIX_foo__bar__field1 (i.e. the name is not case-sensitive), where PREFIX_
is configurable via another environment variable, HOCON_ENV_OVERRIDE_PREFIX.
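The naming rule can be expressed as a small standalone function. This is a sketch of the rule as described here, not the library's own code; the module name env_name_sketch and the function env_name/2 are hypothetical. It produces the upper-case form of the variable name.

```erlang
-module(env_name_sketch).
-export([env_name/2]).

%% Prefix is the value of HOCON_ENV_OVERRIDE_PREFIX, e.g. "EMQX_".
%% Path is the field's absolute path, e.g. "foo.bar.field1".
env_name(Prefix, Path) ->
    %% Replace every dot with a double underscore, then upper-case
    Replaced = lists:flatten(string:replace(Path, ".", "__", all)),
    Prefix ++ string:uppercase(Replaced).
```

For instance, env_name_sketch:env_name("EMQX_", "foo.bar.field1") yields "EMQX_FOO__BAR__FIELD1".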
To name the overriding environment variable explicitly for a field, define override_env in the struct field's metadata.
Environment variables are not parsed as plain strings, but as HOCON values. This gives the flexibility to override config values in different ways:
- Set individual object paths, for example:
  export EMQX_MY__KEY__name=zz; export EMQX_MY__KEY__fingers=10
- Set the entire object as an escaped HOCON value:
  export EMQX_MY__KEY="{name = \"zz\", fingers = 10}"
- Load the object from another file:
  export EMQX_MY__KEY="{\"include /config/my-key-override.conf\"}"
Using {include "path/to/file"} is extremely useful to override a value with a large object or an array.
NOTE: currently HOCON schema does not support array index (KEY__1, KEY__2 etc) overrides.