Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

verifier points to old draft #12

Open
anuradhawick opened this issue Jul 13, 2022 · 10 comments
Open

verifier points to old draft #12

anuradhawick opened this issue Jul 13, 2022 · 10 comments
Assignees

Comments

@anuradhawick
Copy link

Hi,

It seems the beacon-verifier code base has the following defaults.

model: default_value = https://github.com/MrRobb/beacon-v2-Models/BEACON-V2-draft4-Model
framework: default_value = https://github.com/MrRobb/beacon-framework-v2

I am referring to the latest schemas for model and framework, hence the verifier wasn't running correctly. Changing model to;
default_value = https://github.com/ga4gh-beacon/beacon-v2-Models/BEACON-V2-Model fixed the issue. However, I cannot point to the latest framework as it breaks some JSON schema references.

If it helps, the exact error I get is as follows;

{
  "Info": [
    {
      "name": "CSIRO Serverless Beacon",
      "url": "https://ip35zfam3d.execute-api.us-east-1.amazonaws.com/prod/info",
      "valid": false,
      "error": "Response does not match the schema: failed to resolve json-schema:///sections/beaconInformationalResponseMeta.json: cannot resolve relative external schema without root schema ID ()\nfailed to resolve json-schema:///sections/beaconInfoResults.json: cannot resolve relative external schema without root schema ID ()\n"
    }
  ],
}

It would be great if this issue could be addressed or a proper workaround can be shown.

Thanks in advance.

cheers,
Anuradha

@mbaudis
Copy link
Member

mbaudis commented Jul 13, 2022

@anuradhawick Thanks for the note! I think there is some cleanup to be done ... Can you give some feedback about the errors when using the new value? Should be

.. at least concerning the correct repositories. However, there have been internal changes, e.g. a general witch to relative references which apparently haven't vetted for verifier compatibility.

@redmitry
Copy link

Hello,

I executed the verifier and it passes the /info test.

It should be the json schema library bug, as it can't resolve the referred schema:
failed to resolve json-schema:///sections/beaconInformationalResponseMeta.json

Could it be because of the old RUST version (rustc --version) ? I have 1.59.0

The background:
In the beaconInfoResponse.json:

"properties": {
  "meta": {
    "description": "Information about the response that could be relevant for the Beacon client in order to interpret the results.",
    "$ref": "./sections/beaconInformationalResponseMeta.json"
},

IMO the fail in the json schema validation library is in the chain of relative "$ref"s. The library should resolve the "$ref" against current document.

BEACON-V2-Model/endpoints.json uses the absolute reference:
"$ref":"https://raw.githubusercontent.com/ga4gh-beacon/beacon-framework-v2/main/responses/beaconInfoResponse.json"
while
beacon-framework-v2/blob/main/endpoints.json not:
"$ref":"../beacon-framework-v2/responses/beaconInfoResponse.json"

Kind regards,

Dmitry

@mbaudis
Copy link
Member

mbaudis commented Jul 13, 2022

@redmitry But can you try https://github.com/ga4gh-beacon/beacon-v2/tree/main/framework/json - which is the valid one now? There we went full-out relative refs. I've only checked this through materializing the schemas in Python; and there the only error I've encountered was when using a ref from models to the framework/common/ontologyTerm.json, where the the ontologyTerm then used itself a relative reference (to CURIE). This all looks like "the schemas are fine, the tools are ... not so much".

So this is a larger AI: Make sure & document that the current reference repository version checks out w/ various tools. https://github.com/ga4gh-beacon/beacon-v2 ... json for model and framework.

@redmitry
Copy link

redmitry commented Jul 13, 2022

Verifier doesn't report which version(s?) of model(s) it uses...
https://ip35zfam3d.execute-api.us-east-1.amazonaws.com/prod passes the default model (not all, but /info is valid).

I am unable to specify any --model link as it always fails :-(

I checked the https://ip35zfam3d.execute-api.us-east-1.amazonaws.com/prod/info against the beaconInfoResponse.json with my own schema library and it is valid.

D.

@anuradhawick
Copy link
Author

anuradhawick commented Jul 13, 2022

@mbaudis I get the same error @redmitry gets. For some strange reason, the links you put do not work. We need to remove /tree/main part from the URL. Probably something to do with how strings are parsed.

@redmitry this URL is just a dev in progress and not all are implemented. I realised that verifier from cargo install (rustc version 1.62.0 (a8314ef7d 2022-06-27)) points to some old repositories, so I recompiled it with the new model path I mentioned in the original post. However, schema references are broken in the new framework repo (at least as I understand.).

The straightforward fix, I believe, would be to resolve schema errors and update the verifier repository to point to the latest framework and model URLs.

Just going overboard, but in compiled JSONs, fully dereferenced JSON schemas would be great. Makes it quite easy to read, without having to navigate many files (and better support some parsers that have relative path issues).

Anyway thanks heaps for getting into this so soon! Awesome job!

@mbaudis
Copy link
Member

mbaudis commented Jul 14, 2022

@anuradhawick @redmitry IMO a general problem is to have a complex schema with a gazillion of references pointing to its own parts on GH, like it was originally. AFAIK one should be able to pull the schemas into a local storage and use this... In Progenetix / bycon the local version (in the package) works well w/ the relative $ref values (same beacon-v2 directory structure). As part of instantiating data objects we de-reference the respective schemas (materialize). As said, the only problem ATM is the multiple-hops dereferencing from models -> framework/common/ontologyTerm -> ... CURIE which may be due to a double local de-referencing???

In general I hate to have a standard where such $ref to Github raw paths are sprinkled throughout.

So: Should't the option be to pull the schema repos and then verify against this localized version? Really asking; not a software engineer ¯\_(ツ)_/¯

CAVE: The variant schema uses now references to the VRS repo (as only external).

Suggestions?

1 similar comment
@mbaudis
Copy link
Member

mbaudis commented Jul 14, 2022

@anuradhawick @redmitry IMO a general problem is to have a complex schema with a gazillion of references pointing to its own parts on GH, like it was originally. AFAIK one should be able to pull the schemas into a local storage and use this... In Progenetix / bycon the local version (in the package) works well w/ the relative $ref values (same beacon-v2 directory structure). As part of instantiating data objects we de-reference the respective schemas (materialize). As said, the only problem ATM is the multiple-hops dereferencing from models -> framework/common/ontologyTerm -> ... CURIE which may be due to a double local de-referencing???

In general I hate to have a standard where such $ref to Github raw paths are sprinkled throughout.

So: Should't the option be to pull the schema repos and then verify against this localized version? Really asking; not a software engineer ¯\_(ツ)_/¯

CAVE: The variant schema uses now references to the VRS repo (as only external).

Suggestions?

@redmitry
Copy link

redmitry commented Jul 14, 2022

In general I hate to have a standard where such $ref to Github raw paths are sprinkled throughout.

IMO standard shouldn't have links to the github. In this case, local copies would inevitably link to "other" piece of schema outside.

@mbaudis
Copy link
Member

mbaudis commented Jul 14, 2022

So we should have a verifier which pulls / rsyncs whatever schema root it is pointed at & verifies against this, but accept external $ref values in it, from a general POV, right?

@anuradhawick
Copy link
Author

@mbaudis only issue is multiple hops, it creates confusion for the program, whether the relative path is from the program itself or the referencing JSON file or relative to the remote URL. This could make the different behaviour depending on the language (python vs rustc).

Having external URLs within schema makes it impossible to use within other programs directly. This is mainly due to security and we would typically block outbound connections from containers/clusters.

I like the idea of strictly local verification with zero HTTP calls to GitHub.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants