Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to EF type mapping #50

Closed
wants to merge 2 commits into from
Closed

Conversation

roji
Copy link
Contributor

@roji roji commented Mar 24, 2025

Here you go @ankane, this takes care of scaffolding, and also does a bit of general type mapping improvements and cleanup to simplify the project. Note the added unit tests which ensures that the TypeMappingSourcePlugin returns the correct mappings - that should be enough to ensure that everything is working properly (I also did a manual test).

  • Support scaffolding of vector types
  • Implemented more proper support for the size facet
  • Consolidated different mappings and plugins into the same files
  • A bit of test code cleanup

Closes #44

@roji roji marked this pull request as draft March 24, 2025 11:26
@roji roji marked this pull request as ready for review March 24, 2025 12:03
Copy link
Member

@ankane ankane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks @roji! Left a small comment inline.

Edit: Sorry, added another.

? new VectorTypeMapping(mappingInfo.StoreTypeName ?? "vector")
: null;
{
if (mappingInfo.StoreTypeName is not null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new logic for this method seems a bit complex. Thoughts on checking the ClrType first, then falling back to StoreTypeName?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that users may provide both a store type and a CLR type - this can be done by annotating a Vector property in the model with a store type (e.g. vector(3)). In that case we want to go into the top piece of logic, looking at the user-provided string (which may include the dimension), and not just look at the CLR type.

So there's basically three possibilities here:

  • Only a store type is provided (ClrType is null) - this is scaffolding. We go into the code above, simple.
  • Only a CLR type is provided (StoreType is null) - this is the regular case of a Vector property on the user's type, which isn't otherwise configured for the column type. We go into the code below, also simple.
  • Both StoreType and ClrType are provided. We go into the code above, do the same as scaffolding, except that we also validate that the user's CLR type is compatible with their requested store type (so they don't get it wrong and e.g. put [Column(TypeName = "sparsevec")] on a Vector property by accident.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the third case, I think it's better to use the CLR type (at least for now) for backwards compatibility.

Also, the error message when there is a mismatch is a bit confusing.

[Column("half_embedding", TypeName = "vector(3)")]
public HalfVector? HalfEmbedding { get; set; }

outputs

The property 'Item.HalfEmbedding' could not be mapped because it is of type 'HalfVector', which is not a supported primitive type or a valid entity type.

(however, HalfVector is a valid type)

Copy link
Contributor Author

@roji roji Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the third case, I think it's better to use the CLR type (at least for now) for backwards compatibility.

Are you saying that if a user specifies an incompatible combination, e.g. a Vector (or even string) CLR type with halfvec as the store type, we should just ignore and return an incorrect type mapping? That is bound to fail later anyway (when you try to actually read/write the thing), only with a more cryptic message - so I don't think there should be a backwards compatibility concern... Or am I missing something?

FWIW this behavior is what's generally done in EF provider and other extensions.

Also, the error message when there is a mismatch is a bit confusing

Yeah, that's something that should be improved on the EF side - can you open an issue on the EF repo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The tests currently pass even with a type mismatch - diff and CI logs (typecasting must be happening somewhere).
  2. Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests currently pass even with a type mismatch - diff and CI logs (typecasting must be happening somewhere).

That's odd - is there any sort of implicit casting between the vector types? Even if so, try using string as the CLR type or similar.

In any case, I'd consider this a bug - the user is specifying an invalid combination of store and CLR types. But if you really prefer, I can remove the compatibility check for the CLR type, and make the store type be the only thing that matters once it's specified. Let me know.

* Support scaffolding of vector types
* Implemented more proper support for the size facet
* Consolidated different mappins and plugins into the same files
* A bit of test code cleanup

Closes pgvector#44
@ankane
Copy link
Member

ankane commented Mar 25, 2025

Hmm, I'm not able to get scaffolding support to work when testing (after adding the changes through a local NuGet source).

dotnet ef dbcontext scaffold "Host=localhost;Database=repro" Npgsql.EntityFrameworkCore.PostgreSQL

still outputs

Could not find type mapping for column 'public.items.embedding' with data type 'vector(3)'. Skipping column.

Edit: When adding logging, it doesn't appear to call FindMapping. Is there a way to call UseVector() when scaffolding?

This is with Npgsql 8.0.7 / Npgsql.EntityFrameworkCore.PostgreSQL 8.0.11.

@roji
Copy link
Contributor Author

roji commented Mar 25, 2025

When adding logging, it doesn't appear to call FindMapping. Is there a way to call UseVector() when scaffolding?

Take a look at this doc page - EF's design-time tools are probably not picking up your context with its UseVector() configuration.

@ankane
Copy link
Member

ankane commented Mar 25, 2025

NodaTime works without additional configuration (the Npgsql.EntityFrameworkCore.PostgreSQL.NodaTime package just needs to be present), so think this PR might be missing something. Looking through the efcore.pg code, UseNodaTime is referenced in NodaTimeDataSourceConfigurationPlugin and NpgsqlNodaTimeCodeGeneratorPlugin. Do either of those affect scaffolding?

@roji
Copy link
Contributor Author

roji commented Mar 25, 2025

You're absolutely right... I forgot about some of the design-time integration infra, and was a bit sloppy with my testing.

I've pushed an additional commit which takes care of all that, and I can see everything working with a real local nuget etc. (but please verify on your side as well).

@ankane
Copy link
Member

ankane commented Mar 26, 2025

Thanks, feels like it's getting closer, but still not finding the design-time services. With --verbose:

Finding design-time services referenced by assembly 'dotnet-repro2'...
Finding design-time services referenced by assembly 'dotnet-repro2'...
Using design-time services from assembly 'Npgsql.EntityFrameworkCore.PostgreSQL.NodaTime'.
Finding design-time services for provider 'Npgsql.EntityFrameworkCore.PostgreSQL'...
Using design-time services from provider 'Npgsql.EntityFrameworkCore.PostgreSQL'.
Finding IDesignTimeServices implementations in assembly 'dotnet-repro2'...
No design-time services were found.

Nevermind, tested again it's now working 🎉

@ankane ankane closed this in b8e167d Mar 26, 2025
@ankane
Copy link
Member

ankane commented Mar 26, 2025

Big thanks @roji! Merged a minimal version of this in the commit above. Will look into incorporating some of the other changes in follow-up commits.

ankane added a commit that referenced this pull request Mar 26, 2025
@ankane
Copy link
Member

ankane commented Mar 26, 2025

Merged the consolidation changes in the commit above. Does the size change have any impact on users?

@ankane
Copy link
Member

ankane commented Mar 26, 2025

Created an issue for the exception message in dotnet/efcore#35846.

@roji roji deleted the EfTypeMapping branch March 26, 2025 06:41
@roji
Copy link
Contributor Author

roji commented Mar 26, 2025

Great that it all works!

Does the size change have any impact on users?

Nothing important comes to mind... Since the mapping is now size-aware, users should now be able to use EF APIs to manage the size separately rather than giving the entire store type (this stuff) - it's not named super well (since in the pgvector case it's a fixed absolute length rather than a max length, but that naming problem exists elsewhere as well...).

ankane added a commit that referenced this pull request Mar 26, 2025
@ankane
Copy link
Member

ankane commented Mar 26, 2025

Thanks, I appreciate the help. Merged MaxLength support in the commit above, and will push a new release shortly.

@ankane
Copy link
Member

ankane commented Mar 26, 2025

Looks like build file added a netstandard2.0 target, so renamed to net8.0 in 123eeaa.

(it looks like Npgsql.EntityFrameworkCore.PostgreSQL.NodaTime has the same situation, fwiw)

@roji
Copy link
Contributor Author

roji commented Mar 26, 2025

Looks like build file added a netstandard2.0 target, so renamed to net8.0 in 123eeaa.

That shouldn't actually matter (AFAIK netstandard2.0 stuff should get picked up by a net8.0 library/executable etc); but in any case, as long as it works, it works...

@ankane
Copy link
Member

ankane commented Mar 26, 2025

Yeah, it doesn't affect the functionality, just what NuGet reports as supported (0.2.2 vs 0.2.1).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support for dotnet ef dbcontext scaffold
2 participants