feat(hub): list collections #1568
const search = new URLSearchParams([
	...Object.entries({
		limit: String(Math.min(totalToFetch, 100)),
According to the docs, the maximum number of collections per page is 100.
Technically it can be 10000 if all of the following hold:
- owner is set
- expand is set to false explicitly
but not sure we want to bother.
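For illustration only, the rule described above could be sketched as follows. The helper name `pageLimit` and the `LimitOptions` shape are hypothetical and not part of the PR; the 100 and 10000 caps are taken from the comments above, not verified against the API.

```typescript
// Hypothetical helper: choose the per-page limit for the collections listing.
interface LimitOptions {
	owner?: string[];
	expand?: boolean;
}

function pageLimit(totalToFetch: number, options: LimitOptions = {}): number {
	// Assumption from the thread: the larger cap applies only when an owner
	// is set AND expand is explicitly false; otherwise the cap is 100.
	const hasOwner = (options.owner?.length ?? 0) > 0;
	const maxPerPage = hasOwner && options.expand === false ? 10000 : 100;
	return Math.min(totalToFetch, maxPerPage);
}
```

The PR itself keeps the simpler `Math.min(totalToFetch, 100)`, which is always within both caps.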
Thanks for the PR!
I will look into providing a schema for the existing collection APIs
If we’re using Without the schema, it's very difficult to guess what it is. By reverse-engineering 10,000 collections, I identified the type, but I'm not 100% sure it's correct.
Hey @quanghuynt14, here's an OpenAPI JSON spec for some of the hub endpoints, including the collection ones: https://gist.github.com/coyotte508/ec7d12713d94a4fe4988afa3fffc23f1
It includes JSON schemas for the various responses and requests, and you can also load the OpenAPI schema in https://editor-next.swagger.io/, for example.
Do you think it's helpful for your PRs?
Yes, it's very complete and helpful. I will check whether the type in this PR is correct.
I checked the schema and found a mismatch in the schema of We have:
in the After verifying
{
"slug": "google/gemma-3n-685065323f5984ef315c93f4",
"title": "Gemma 3n",
"description": "",
"gating": false,
"lastUpdated": "2025-06-26T15:55:44.512Z",
"owner": {
...
},
"items": [
...
],
"theme": "purple",
"private": false,
"upvotes": 146,
"isUpvotedByUser": false
},
These 2 props (
{
"slug": "kristenq/reasoning-684330e0ce0c4e30fe59456a",
"title": "Reasoning",
"description": "Advanced reasoning models. ",
"gating": false,
"lastUpdated": "2025-06-06T18:18:26.733Z",
"owner": {
...
},
"items": [
{
"_id": "684330f27ce524322498baa7",
"position": 0,
"type": "collection",
"id": "67ee7145ec3d31f7c7a75cab",
"slug": "Tesslate/synthia-s1-reasoning-model-67ee7145ec3d31f7c7a75cab",
"title": "Synthia-S1 REASONING MODEL",
"description": "Creative, Scientific, and Coding",
"lastUpdated": "2025-04-03T11:31:05.262Z",
"numberItems": 3,
"owner": {
...
},
"theme": "blue",
"shareUrl": "https://huggingface.co/collections/Tesslate/synthia-s1-reasoning-model-67ee7145ec3d31f7c7a75cab",
"upvotes": 3,
"isUpvotedByUser": false
}
],
"theme": "green",
"private": false,
"upvotes": 0,
"isUpvotedByUser": false
},
Here is the corrected schema: https://gist.github.com/quanghuynt14/1c55f44978248bc12889b1bde2359c0c
hmm not sure, https://huggingface.co/api/collections/google/gemma-3n-685065323f5984ef315c93f4 I do have
ah sorry. I assumed that the type of collection returned by The collection object within the collection list does not contain But the
Yes indeed, we'll fix the schema on our side, thanks for pointing it out.
You can review the PR. I think the type is now complete.
Thanks a lot!
I think just one change on how the owner and item params are handled and it'll be ready to merge (I'll still re-review).
const search = new URLSearchParams([
	...Object.entries({
		limit: String(Math.min(totalToFetch, 100)),
		...(params?.search?.owner ? { owner: Array.isArray(params.search.owner) ? params.search.owner.join(",") : params.search.owner } : undefined),
		...(params?.search?.item ? { item: Array.isArray(params.search.item) ? params.search.item.join(",") : params.search.item } : undefined),
		...(params?.search?.q ? { q: params.search.q } : undefined),
	}),
]).toString();
When params.search.owner and params.search.item are arrays, we should just repeat the parameter, e.g.:
for (const owner of params.search.owner) {
	search.append("owner", owner);
}
E.g. https://huggingface.co/api/collections?owner=coyotte508&owner=julien-c if owner=['coyotte508','julien-c']
In contrast, https://huggingface.co/api/collections?owner=coyotte508,julien-c seems to return all collections... which is a bug that's getting fixed now that it's been found!
By the way, for the JS lib we can force the user to pass arrays for both owner and item if we want to.
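A minimal sketch of the repeated-parameter approach suggested above, assuming the array-only variant for owner and item. The function name `buildCollectionsQuery` and the `CollectionSearch` interface are hypothetical and are not the code that was merged.

```typescript
// Hypothetical search shape: owner and item are always arrays here.
interface CollectionSearch {
	owner?: string[];
	item?: string[];
	q?: string;
}

function buildCollectionsQuery(search: CollectionSearch, limit: number): string {
	const params = new URLSearchParams({ limit: String(limit) });
	// Repeat the key once per value instead of joining values with commas.
	for (const owner of search.owner ?? []) {
		params.append("owner", owner);
	}
	for (const item of search.item ?? []) {
		params.append("item", item);
	}
	if (search.q) {
		params.append("q", search.q);
	}
	return params.toString();
}

// buildCollectionsQuery({ owner: ["coyotte508", "julien-c"] }, 100)
// returns "limit=100&owner=coyotte508&owner=julien-c"
```

Using `append` rather than building the object up front is what allows the same key to appear more than once, since an object literal can hold only one value per key.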
yes, I fixed it ✅
Thanks looks good!
	| {
			avatarUrl: string;
			fullname: string;
			name: string;
			isHf: boolean;
			isHfAdmin: boolean;
			isMod: boolean;
			followerCount?: number;
			type: "org";
			isEnterprise: boolean;
			isUserFollowing?: boolean;
	  }
	| {
			avatarUrl: string;
			fullname: string;
			name: string;
			isHf: boolean;
			isHfAdmin: boolean;
			isMod: boolean;
			followerCount?: number;
			type: "user";
			isPro: boolean;
			_id: string;
			isUserFollowing?: boolean;
	  };
Can you extract this to ApiAuthor?
Yes ad702e7
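One possible shape for the extracted type, as a sketch only. The field names come from the inline union above, but the `ApiAuthorBase` split and the `describeAuthor` helper are assumptions for illustration, not necessarily what the referenced commit contains.

```typescript
// Fields shared by both variants of the inline union (hypothetical base type).
interface ApiAuthorBase {
	avatarUrl: string;
	fullname: string;
	name: string;
	isHf: boolean;
	isHfAdmin: boolean;
	isMod: boolean;
	followerCount?: number;
	isUserFollowing?: boolean;
}

// Discriminated union on `type`, mirroring the two inline object types.
type ApiAuthor =
	| (ApiAuthorBase & { type: "org"; isEnterprise: boolean })
	| (ApiAuthorBase & { type: "user"; isPro: boolean; _id: string });

// Illustrative helper: narrowing on `type` gives access to variant-only fields.
function describeAuthor(author: ApiAuthor): string {
	if (author.type === "org") {
		return `${author.name} (org, enterprise: ${author.isEnterprise})`;
	}
	return `${author.name} (user, pro: ${author.isPro})`;
}
```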
Issue: #271
This PR only includes the call to list collections. Why?
It's one of my early contributions to the project, so I think I should split @hackpk's PR into several small PRs.
I think listing collections solves the problem of displaying collections in real time.
I’m having a bit of trouble with the types, and I need to search through the project to understand them better. Is there a public API schema?
Thanks for reviewing it.