Schema salad uses the ruamel.yaml "round trip" YAML parser.
This parser preserves comments and line numbers by using ruamel.yaml.comments.CommentedMap and ruamel.yaml.comments.CommentedSeq. These objects behave like Python maps and sequences, but have an additional field, lc (which stands for "line column", I think). The lc field records where the map or sequence itself starts, as well as where each of its contained items starts. In addition, we set our own filename field to track which file an object came from.
This information is used to give better CWL errors, so it is possible to communicate what part of the file contains a warning or error. Specifically, look at the SourceLine class, which is used to wrap a code block such that any uncaught exceptions will be re-thrown with additional line number information added to the message.
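To make the wrapping pattern concrete, here is a small standard-library sketch of a context manager that re-raises exceptions with position information prepended. SourceLineSketch, its constructor arguments, and the message format are all hypothetical; this is not the real SourceLine class or its signature, only the general idea.

```python
class SourceLineSketch:
    """Hypothetical stand-in for the pattern, not schema-salad's SourceLine."""

    def __init__(self, filename, line, col, raise_type=ValueError):
        self.filename = filename
        self.line = line            # zero-based, as stored in lc
        self.col = col              # zero-based, as stored in lc
        self.raise_type = raise_type

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Let a clean exit, KeyboardInterrupt, SystemExit, etc. pass through.
        if exc is None or not isinstance(exc, Exception):
            return False
        # Re-raise with a human-friendly (one-based) position prefix.
        prefix = f"{self.filename}:{self.line + 1}:{self.col + 1}"
        raise self.raise_type(f"{prefix}: {exc}") from exc


def check_input(record):
    """Toy validation step that fails when 'type' is missing."""
    if "type" not in record:
        raise ValueError("missing 'type' field")


try:
    with SourceLineSketch("tool.cwl", line=2, col=4):
        check_input({"id": "message"})
except ValueError as err:
    message = str(err)
    print(message)
```

The original exception is kept as the cause (via `from exc`), so a full traceback is still available when needed.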
The purpose of Schema salad is to validate documents against a schema. The primary user is CWL, but Schema salad is intended to be general purpose.
Schema salad supports two ways of parsing and validating documents. The original way is to load the schema into a data structure and then use the ref_resolver.Loader.resolve_all method followed by validate.validate. The newer way is to generate Python code from the schema that implements the same logic. The benefit of the code generation approach is that the resulting parser is much, much faster.
However, if you want to "round trip" a CWL document by using the codegen parser (which is based on loading records into objects), then exporting it back to maps and sequences, you lose the line number information.
For this project, we want to preserve the line number and filename information so that if you re-export the document (using save()) it preserves, as best as possible, the original line/column and filename annotations for use by CWL. As a stretch goal, it would also be neat if it preserved the YAML comments (which are also recorded by the "CommentedMap" / "CommentedSeq" classes) so that the ruamel round trip exporter would include all the comments from the original document.
The code generator can be found in python_codegen.py. The parsers are ultimately released in the cwl-utils project, and the development section of the cwl-utils README describes how the CWL parsers are generated.
We're currently retaining the original CommentedMap in the _doc field but not doing anything with it, so one approach is to have the save() method use the annotations from _doc to annotate objects that are returned. Among other things, you'll need to return CommentedMap and CommentedSeq instead of Dict and List.
How the CWL parsers are generated: https://github.com/common-workflow-language/cwl-utils#development