-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to EF type mapping #50
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks @roji! Left a small comment inline.
Edit: Sorry, added another.
? new VectorTypeMapping(mappingInfo.StoreTypeName ?? "vector") | ||
: null; | ||
{ | ||
if (mappingInfo.StoreTypeName is not null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new logic for this method seems a bit complex. Thoughts on checking the ClrType
first, then falling back to StoreTypeName
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that users may provide both a store type and a CLR type - this can be done by annotating a Vector property in the model with a store type (e.g. vector(3)
). In that case we want to go into the top piece of logic, looking at the user-provided string (which may include the dimension), and not just look at the CLR type.
So there's basically three possibilities here:
- Only a store type is provided (ClrType is null) - this is scaffolding. We go into the code above, simple.
- Only a CLR type is provided (StoreType is null) - this is the regular case of a Vector property on the user's type, which isn't otherwise configured for the column type. We go into the code below, also simple.
- Both StoreType and ClrType are provided. We go into the code above, do the same as scaffolding, except that we also validate that the user's CLR type is compatible with their requested store type (so they don't get it wrong and e.g. put
[Column(TypeName = "sparsevec")]
on a Vector property by accident.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the third case, I think it's better to use the CLR type (at least for now) for backwards compatibility.
Also, the error message when there is a mismatch is a bit confusing.
[Column("half_embedding", TypeName = "vector(3)")]
public HalfVector? HalfEmbedding { get; set; }
outputs
The property 'Item.HalfEmbedding' could not be mapped because it is of type 'HalfVector', which is not a supported primitive type or a valid entity type.
(however, HalfVector
is a valid type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the third case, I think it's better to use the CLR type (at least for now) for backwards compatibility.
Are you saying that if a user specifies an incompatible combination, e.g. a Vector (or even string) CLR type with halfvec
as the store type, we should just ignore and return an incorrect type mapping? That is bound to fail later anyway (when you try to actually read/write the thing), only with a more cryptic message - so I don't think there should be a backwards compatibility concern... Or am I missing something?
FWIW this behavior is what's generally done in EF provider and other extensions.
Also, the error message when there is a mismatch is a bit confusing
Yeah, that's something that should be improved on the EF side - can you open an issue on the EF repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests currently pass even with a type mismatch - diff and CI logs (typecasting must be happening somewhere).
That's odd - is there any sort of implicit casting between the vector types? Even if so, try using string
as the CLR type or similar.
In any case, I'd consider this a bug - the user is specifying an invalid combination of store and CLR types. But if you really prefer, I can remove the compatibility check for the CLR type, and make the store type be the only thing that matters once it's specified. Let me know.
* Support scaffolding of vector types * Implemented more proper support for the size facet * Consolidated different mappins and plugins into the same files * A bit of test code cleanup Closes pgvector#44
Hmm, I'm not able to get scaffolding support to work when testing (after adding the changes through a local NuGet source). dotnet ef dbcontext scaffold "Host=localhost;Database=repro" Npgsql.EntityFrameworkCore.PostgreSQL still outputs Could not find type mapping for column 'public.items.embedding' with data type 'vector(3)'. Skipping column. Edit: When adding logging, it doesn't appear to call This is with Npgsql 8.0.7 / Npgsql.EntityFrameworkCore.PostgreSQL 8.0.11. |
Take a look at this doc page - EF's design-time tools are probably not picking up your context with its UseVector() configuration. |
NodaTime works without additional configuration (the |
You're absolutely right... I forgot about some of the design-time integration infra, and was a bit sloppy with my testing. I've pushed an additional commit which takes care of all that, and I can see everything working with a real local nuget etc. (but please verify on your side as well). |
Nevermind, tested again it's now working 🎉 |
Big thanks @roji! Merged a minimal version of this in the commit above. Will look into incorporating some of the other changes in follow-up commits. |
Co-authored-by: Shay Rojansky <[email protected]>
Merged the consolidation changes in the commit above. Does the size change have any impact on users? |
Created an issue for the exception message in dotnet/efcore#35846. |
Great that it all works!
Nothing important comes to mind... Since the mapping is now size-aware, users should now be able to use EF APIs to manage the size separately rather than giving the entire store type (this stuff) - it's not named super well (since in the pgvector case it's a fixed absolute length rather than a max length, but that naming problem exists elsewhere as well...). |
Co-authored-by: Shay Rojansky <[email protected]>
Thanks, I appreciate the help. Merged |
Looks like (it looks like Npgsql.EntityFrameworkCore.PostgreSQL.NodaTime has the same situation, fwiw) |
That shouldn't actually matter (AFAIK netstandard2.0 stuff should get picked up by a net8.0 library/executable etc); but in any case, as long as it works, it works... |
Here you go @ankane, this takes care of scaffolding, and also does a bit of general type mapping improvements and cleanup to simplify the project. Note the added unit tests which ensures that the TypeMappingSourcePlugin returns the correct mappings - that should be enough to ensure that everything is working properly (I also did a manual test).
Closes #44