Commit c859a4b

Merge pull request #101 from bjhargrave/schema-taxonomy-api
taxonomy API
2 parents 0228952 + b5b8b39 commit c859a4b

1 file changed

+13
-0
lines changed

docs/schema/taxonomy-api.md

@@ -0,0 +1,13 @@
# Central API for taxonomy reading and validation

Currently, the taxonomy `qna.yaml` files are read, parsed, and validated in multiple places: a `check_yaml.py` script in the `taxonomy` repository, and methods in the `src/instructlab/utils.py` file of the `instructlab` repository.

The methods in `utils` are used both by the `ilab taxonomy diff` command and by the SDG code, which has been moved to the `sdg` repository. This arrangement results in a circular dependency: the `instructlab` package depends on the `instructlab-sdg` package to access the SDG code, and the SDG code in `instructlab-sdg` depends on `instructlab` to access the `utils` methods that read and validate the taxonomy files.

## Use the instructlab-schema package for the central API

We now have an `instructlab-schema` package on PyPI, which holds the JSON schema files for the taxonomy `qna.yaml` files. `instructlab` already uses this package to access the schema files for taxonomy file validation.

We should relocate the taxonomy reading and validation code from `instructlab` to `instructlab-schema`. This places a shared API for reading, parsing, and validating taxonomy `qna.yaml` files in one central location, next to the JSON schema it uses.

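As a rough illustration of what a shared read/parse/validate entry point could look like, here is a minimal sketch. The names `TaxonomyFileResult` and `read_taxonomy_file` are hypothetical and are not the actual `instructlab-schema` API. The parser and validator are passed in as callables so the sketch runs with only the standard library; a real implementation would presumably plug in a YAML parser and a JSON schema validator here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable

# Hypothetical types: a parser turns file text into data, a validator
# returns a list of human-readable error messages (empty when valid).
Parser = Callable[[str], Any]
Validator = Callable[[Any], "list[str]"]


@dataclass
class TaxonomyFileResult:
    """Outcome of reading one taxonomy file (hypothetical shape)."""

    path: Path
    contents: Any = None
    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def read_taxonomy_file(
    path: str | Path, parse: Parser, validate: Validator
) -> TaxonomyFileResult:
    """Read, parse, and validate a single file, collecting errors."""
    result = TaxonomyFileResult(path=Path(path))
    try:
        text = result.path.read_text(encoding="utf-8")
    except OSError as exc:
        result.errors.append(f"cannot read {result.path}: {exc}")
        return result
    try:
        result.contents = parse(text)
    except Exception as exc:  # parser-specific errors vary by library
        result.errors.append(f"cannot parse {result.path}: {exc}")
        return result
    # Validation only runs on successfully parsed contents.
    result.errors.extend(validate(result.contents))
    return result
```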
Then we can modify the `instructlab` and `instructlab-sdg` packages to depend on the `instructlab-schema` package for these APIs, which removes the circular dependency. We can also use these APIs in the `taxonomy` repository's `check_yaml.py` script.

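The dependency change described above can be checked with a small sketch. The two graphs below are a simplified reading of this document (package names as written here, edges limited to the three packages discussed), and `find_cycle` is a hypothetical helper, not part of any InstructLab package.

```python
from __future__ import annotations


def find_cycle(graph: dict[str, set[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of packages, or None."""
    visiting: set[str] = set()  # packages on the current DFS path
    done: set[str] = set()      # packages fully explored
    path: list[str] = []

    def visit(node: str) -> list[str] | None:
        visiting.add(node)
        path.append(node)
        for dep in sorted(graph.get(node, ())):
            if dep in visiting:
                # Back edge: the cycle runs from dep around to dep again.
                return path[path.index(dep):] + [dep]
            if dep not in done:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for node in sorted(graph):
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Current arrangement: instructlab-sdg reaches back into instructlab
# for the `utils` taxonomy methods.
current = {
    "instructlab": {"instructlab-sdg", "instructlab-schema"},
    "instructlab-sdg": {"instructlab"},
    "instructlab-schema": set(),
}

# Proposed arrangement: both packages depend on instructlab-schema.
proposed = {
    "instructlab": {"instructlab-sdg", "instructlab-schema"},
    "instructlab-sdg": {"instructlab-schema"},
    "instructlab-schema": set(),
}

print(find_cycle(current))   # ['instructlab', 'instructlab-sdg', 'instructlab']
print(find_cycle(proposed))  # None
```

In the proposed graph, `instructlab-schema` is a leaf that both other packages depend on, so no path leads back to `instructlab`.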