Question about index parameters #121

Open
xcu opened this issue Aug 15, 2024 · 2 comments
Labels
question Further information is requested

Comments

xcu commented Aug 15, 2024

I manage a vector DB and I'm considering switching to pgvectorscale, but I'm a bit lost as to which index configuration parameters I should use.
The table in question contains more than 50M embeddings of 512 dimensions, and it is partitioned with partman into tables of 100k embeddings each. So we can effectively treat it as 500 small tables of 100k 512-dimensional embeddings.

Would the default build and query parameters for the DiskANN index be suitable here? Or are there build/query parameters that could be tweaked for better recall or search speed?

jonatas commented Aug 19, 2024

Hey @xcu, thanks for asking! @cevian can probably help answer this, but this would also be a great conversation for our Discord. Join us and see what other devs are using too: https://discord.gg/KRdHVXAmkp

Collaborator

cevian commented Aug 22, 2024

@xcu I think the defaults should suffice here
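
For readers who want to see what "the defaults" means in practice, here is a minimal sketch (not from the maintainers) that generates the SQL for a DiskANN index. It assumes a hypothetical partition table named `embeddings_p0001` and a hypothetical column named `embedding`. The option names `num_neighbors` and `search_list_size` and the setting `diskann.query_search_list_size` are taken from the pgvectorscale README as best I recall, so check them against the version you have installed before relying on them.

```python
# Sketch only: table and column names are hypothetical placeholders.
# Option names are assumed from the pgvectorscale README; verify them
# against your installed version.

def diskann_index_sql(table: str, column: str = "embedding", **build_params) -> str:
    """Build a CREATE INDEX statement for a diskann index.

    With no build_params, the statement has no WITH clause, so the
    extension's default build parameters apply.
    """
    sql = f"CREATE INDEX ON {table} USING diskann ({column})"
    if build_params:
        # Sort for a stable, reproducible statement.
        opts = ", ".join(f"{name} = {value}" for name, value in sorted(build_params.items()))
        sql += f" WITH ({opts})"
    return sql + ";"


def query_tuning_sql(search_list_size: int) -> str:
    """Build a session-level SET for the query-time search list size."""
    return f"SET diskann.query_search_list_size = {search_list_size};"


# Defaults only, as suggested in the answer above.
print(diskann_index_sql("embeddings_p0001"))

# Explicit build parameters, in case the defaults need tuning later.
print(diskann_index_sql("embeddings_p0001", num_neighbors=50, search_list_size=100))

# Query-time knob, set per session before running the search.
print(query_tuning_sql(100))
```

Since the data is split into roughly 50,000,000 / 100,000 = 500 partitions, each index covers only about 100k vectors, which is one plausible reason the defaults are adequate. If recall turns out to be too low, the query-time setting is the cheaper knob to try first because it does not require rebuilding the index.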

cevian added the question label on Aug 22, 2024