Allow to set the source location to an URL pointing to an archive #801

Open
sschuberth opened this issue Aug 15, 2019 · 9 comments

@sschuberth
Member

Currently, setting the source location triggers a wizard to browse for a GitHub repository. While a VCS location certainly is to be preferred, some sources are only available as source artifacts. Please support setting those as the source location.

@storrisi
Contributor

@fossygirl @geneh what do you think about this suggestion?

@fossygirl
Member

@storrisi Let's please bring all of these questions to the biweekly meeting so everyone in the community can weigh in.

@sschuberth
Member Author

Just to be clear: as I read the spec, sourceLocation can already be a sourcearchive, so I'm not asking for a new spec feature, only for the existing one to be exposed through the web UI.

@tmarble
Contributor

tmarble commented Dec 3, 2019

@sschuberth Given that sourceLocation must point to an exact, commit-level revision, can you clarify what "source artifacts" actually means? I suppose it could, for example, point to a tarball. Can you elaborate?

I've included a screenshot of the current prompt, which shows:
GitHub [ user / organization ] [ repo ]
serviceLocation

Do you have some thoughts about how the UI should look for sourcearchive?
Specifically, I think we might need a new provider to represent the kind of thing(s) you want to point to.

I note that, despite the large cross product of possible type and provider combinations, the data currently covers only a small subset, and a provider is often associated with a particular type
(the URI scheme is /curations/{type}/{provider}/{namespace}/{name}):

tmarble@avenir 244 :) pwd
/home/tmarble/src/github/clearlydefined/curated-data
tmarble@avenir 245 :) find curations -mindepth 2 -maxdepth 2 -type d 
curations/pypi/pypi
curations/nuget/nuget
curations/git/github
curations/gem/rubygems
curations/sourcearchive/mavencentral
curations/npm/npmjs
curations/maven/mavencentral
tmarble@avenir 246 :) 
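
To make the URI scheme quoted above concrete, here is a minimal Python sketch that splits a curation path into its four coordinates and joins them back. The names `CurationCoordinates`, `parse_curation_path` and `format_curation_path` are hypothetical helpers for illustration, not part of any ClearlyDefined code base, and the example path is made up from coordinates mentioned in this thread.

```python
from typing import NamedTuple


class CurationCoordinates(NamedTuple):
    """The four variable segments of /curations/{type}/{provider}/{namespace}/{name}."""
    type: str
    provider: str
    namespace: str
    name: str


def parse_curation_path(path: str) -> CurationCoordinates:
    """Split a curation path into coordinates; reject anything that does not fit the scheme."""
    parts = path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "curations":
        raise ValueError(
            f"expected curations/{{type}}/{{provider}}/{{namespace}}/{{name}}, got {path!r}"
        )
    return CurationCoordinates(*parts[1:])


def format_curation_path(coordinates: CurationCoordinates) -> str:
    """Inverse of parse_curation_path (without a leading slash)."""
    return "/".join(("curations", *coordinates))


# Illustrative path only; it is not claimed to exist in curated-data.
example = parse_curation_path(
    "/curations/sourcearchive/mavencentral/com.google.inject/guice-parent"
)
```

Here `example.type` is `"sourcearchive"` and `example.provider` is `"mavencentral"`, which is the one pairing in the listing above where a provider appears under a second type.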

@sschuberth
Member Author

I suppose, for example, it could point to a tarball? Can you elaborate?

Indeed, sourceLocation should be able to point to any type of archive like

Do you have some thoughts about how the UI should look for sourcearchive?

I guess there's no way around adding something like "Custom URL" next to "GitHub" where any URL could be entered. Whether it's valid must be verified by a human as part of reviewing the curation before merging.

Specifically I think we might need a new provider to represent the kind of thing(s) you want to point to.

Correct, at least the last of my examples (antlr-2.7.3) would require a new "custom" provider. For the former two examples (mime-types-2.1.25 and semver4j-3.1.0), the npmjs and mavencentral providers should be used, respectively, each with a type of sourcearchive.

Note that for cases like npmjs, where the ("binary") package pulled in by the package manager for building is identical to the source package (i.e. the language is interpreted rather than compiled), you could argue that for a URL like https://registry.npmjs.org/mime-types/-/mime-types-2.1.25.tgz both the type / provider pair npm / npmjs and the pair sourcearchive / npmjs are valid. Does ClearlyDefined specify clearly (no pun intended) how to handle such cases?

@tmarble
Contributor

tmarble commented Dec 4, 2019

Thank you @sschuberth for your comments!

Based on your examples, I propose the following type/provider pairs, respectively:

  • sourcearchive/npmjs
  • sourcearchive/mavencentral
  • sourcearchive/archive <- as you point out "archive" as a provider is new

My initial thought about the UI is that there should be a combo box or some other choice for type.
Then, based on the type chosen, the list of providers (in an adjacent combo box) would be narrowed to only those that apply to that type. Finally, depending on the type/provider, text boxes would allow entry of namespace, URL, and/or path.
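
The dependent combo boxes described above could be driven by a simple list of allowed pairs. The following Python sketch assumes the pairs currently present in curated-data (from the directory listing earlier in this thread) plus the two pairs proposed here; `KNOWN_PAIRS`, `PROPOSED_PAIRS`, `type_choices` and `provider_choices` are hypothetical names, and this is not how the website is actually implemented.

```python
# (type, provider) pairs taken from the curated-data directory listing in this thread.
KNOWN_PAIRS = [
    ("pypi", "pypi"),
    ("nuget", "nuget"),
    ("git", "github"),
    ("gem", "rubygems"),
    ("sourcearchive", "mavencentral"),
    ("npm", "npmjs"),
    ("maven", "mavencentral"),
]

# Pairs proposed in this comment; "archive" would be a new provider.
PROPOSED_PAIRS = [
    ("sourcearchive", "npmjs"),
    ("sourcearchive", "archive"),
]


def type_choices(pairs):
    """Entries for the first combo box: every type that has at least one provider."""
    return sorted({ctype for ctype, _provider in pairs})


def provider_choices(selected_type, pairs):
    """Entries for the second combo box: only providers valid for the selected type."""
    return sorted({provider for ctype, provider in pairs if ctype == selected_type})
```

With only `KNOWN_PAIRS`, selecting `sourcearchive` leaves `mavencentral` as the single provider; passing `KNOWN_PAIRS + PROPOSED_PAIRS` would widen that list to `archive`, `mavencentral` and `npmjs`.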

Your last paragraph highlights a question that I have: "How are the curations interpreted based on type/provider"? I'm not sure where to look for this code in either the website or services repos... For example

Finally allow me to add a new question: How much validation of a sourceLocation should the website do while entering a curation? Based on the fully elaborated URL (calculated using the logic from the code that interprets type/provider) should we try to start downloading the artifact (to ensure we don't get a 404 or some other error)? Allow me to note two example curations:

@geneh can you help answer these questions?

@geneh
Contributor

geneh commented Dec 4, 2019

@tmarble This is a lower priority issue. Let's please bring all of these questions to the biweekly meeting so everyone in the community can weigh in.

@bduranc

bduranc commented Feb 14, 2020

Hi,
@tmarble or @geneh: Were these questions ever answered on a call, by any chance? I have the same questions as Tom and cannot find them documented where I'd expect to see them (CD docs, minutes, etc.).

Two questions I have that might be similar to what was being asked previously here...

What's the difference between maven/mavencentral and sourcearchive/mavencentral?

If the above picture is accurate, then I'm also curious what happens if one of the JARs doesn't exist.

For example, in multi-module projects such as https://repo1.maven.org/maven2/com/google/inject/guice-parent/4.0/, there is no source or binary JAR at the parent level, just some docs and the parent POM.

And lastly, is the type of "sourcearchive" still exclusively for the "mavencentral" provider?

Thanks and apologies in advance if I'm asking duplicate questions with answers already documented somewhere else :)

@sschuberth
Member Author

  • Using your example: do npm/npmjs and sourcearchive/npmjs both expand to https://registry.npmjs.org/mime-types/-/mime-types-2.1.25.tgz ?

Yes.

In other words what is the difference between an npm package and npm source?

In the case of npm, there is none.
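
As a small illustration of that answer, the Python sketch below expands both coordinate pairs to the same tarball URL. It relies on the public npm registry's conventional tarball layout; `npm_tarball_url` and `expand_npmjs` are hypothetical helpers, and the way ClearlyDefined itself performs this expansion is an assumption, not something confirmed in this thread.

```python
NPM_REGISTRY = "https://registry.npmjs.org"


def npm_tarball_url(name: str, version: str) -> str:
    """Build the registry tarball URL for a package.

    For scoped packages ("@scope/name") the scope stays in the path
    but is dropped from the file name.
    """
    basename = name.rsplit("/", 1)[-1]
    return f"{NPM_REGISTRY}/{name}/-/{basename}-{version}.tgz"


def expand_npmjs(ctype: str, name: str, version: str) -> str:
    """Assumed expansion: npm/npmjs and sourcearchive/npmjs resolve identically."""
    if ctype not in ("npm", "sourcearchive"):
        raise ValueError(f"unsupported type for provider npmjs: {ctype!r}")
    return npm_tarball_url(name, version)


package_url = expand_npmjs("npm", "mime-types", "2.1.25")
source_url = expand_npmjs("sourcearchive", "mime-types", "2.1.25")
```

Both `package_url` and `source_url` come out as the mime-types URL quoted in the question, which is the point being made: for npm the package and its source are the same artifact.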

  • What's the difference between maven/mavencentral and sourcearchive/mavencentral?

Like you say, the former is the binary artifact whereas the latter is the sources artifact.
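
A sketch of what that distinction could look like in terms of URLs, assuming the standard Maven repository layout and the conventional `sources` classifier. The function names are hypothetical and the mapping from type to classifier is an assumption for illustration, not ClearlyDefined's confirmed behavior.

```python
from typing import Optional

MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_artifact_url(group_id: str, artifact_id: str, version: str,
                       classifier: Optional[str] = None, extension: str = "jar") -> str:
    """Build an artifact URL following the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    suffix = f"-{classifier}" if classifier else ""
    return (f"{MAVEN_CENTRAL}/{group_path}/{artifact_id}/{version}/"
            f"{artifact_id}-{version}{suffix}.{extension}")


def expand_mavencentral(ctype: str, group_id: str, artifact_id: str, version: str) -> str:
    """Assumed mapping: maven -> binary JAR, sourcearchive -> sources JAR."""
    if ctype == "maven":
        return maven_artifact_url(group_id, artifact_id, version)
    if ctype == "sourcearchive":
        return maven_artifact_url(group_id, artifact_id, version, classifier="sources")
    raise ValueError(f"unsupported type for provider mavencentral: {ctype!r}")


binary_url = expand_mavencentral("maven", "com.google.inject", "guice-parent", "4.0")
sources_url = expand_mavencentral("sourcearchive", "com.google.inject", "guice-parent", "4.0")
```

The sketch only constructs URLs. For a parent POM module such as guice-parent, neither file is reported to exist, which is exactly the open question raised above.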

How much validation of a sourceLocation should the website do while entering a curation?

I believe a simple file existence check should suffice.
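
Such an existence check could be as small as an HTTP HEAD request. The Python sketch below uses only the standard library; `source_location_exists` is a hypothetical helper, and the `opener` parameter exists purely so the check can be exercised without network access.

```python
import urllib.error
import urllib.request


def source_location_exists(url: str, timeout: float = 10.0,
                           opener=urllib.request.urlopen) -> bool:
    """Return True if a HEAD request to `url` succeeds with a 2xx status.

    Redirects are followed by urlopen. Any HTTP error (such as 404),
    network failure, timeout or malformed URL counts as "does not exist".
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers bad URLs.
        return False
```

Some servers reject HEAD requests, so a negative result would still need human review before a curation is turned down.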

Correct. Which is why I would want to set sourcearchive/mavencentral for antlr/2.7.7 to https://www.antlr2.org/download/antlr-2.7.7.tar.gz.
