Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Remove default collection initialization for perf reasons #2284

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

MaggieKimani1
Copy link
Contributor

@MaggieKimani1 MaggieKimani1 commented Mar 24, 2025

  • Uses the Lazy pattern to defer initialization of collections till the first time they are accessed.
  • This helps reduce unnecessary resource/memory allocations

Fixes #1971

@MaggieKimani1 MaggieKimani1 changed the title feat: use lazy get for collections to reduce resource allocation feat: use Lazy<T> to initialize collections on access Mar 25, 2025
@MaggieKimani1 MaggieKimani1 marked this pull request as ready for review March 25, 2025 14:36
Copy link
Member

@baywet baywet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For those sets of changes, it's always better to do it at one place with a draft, have a conversation, and then replicate all over.

@MaggieKimani1
Copy link
Contributor Author

Working on reverting these changes to remove default collection initialization altogether. There are lots of failing tests due to null reference exceptions but I'm working on fixing them

@@ -115,7 +115,7 @@ public void ValidateDefaultShouldNotHaveDataTypeMismatchForComplexSchema()
var schema = new OpenApiSchema
{
Type = JsonSchemaType.Object,
Properties =
Properties = new Dictionary<string, IOpenApiSchema>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the shortcut syntax work where you only put "new" and the type is inferred?

Copy link
Contributor Author

@MaggieKimani1 MaggieKimani1 Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No that doesn't work.
Properties is an implementation of the IDictionary interface so you have to explicitly define the concrete type here.

public IDictionary<string, IOpenApiSchema>? Properties { get; set; } = new Dictionary<string, IOpenApiSchema>(StringComparer.Ordinal);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we could change the property type as well. (see my latest comment)

@darrelmiller
Copy link
Member

remove default collection initialization altogether

Looks like I missed this conversation. Is there a summary of why we want to force users to initialize these collections manually? Is ASP.NET team ok with this change?

@baywet
Copy link
Member

baywet commented Mar 26, 2025

If we allocate by default with field initialization (current implementation), we allocate the memory when the object gets created.

If we lazy initialize the collection:

  • we allocate memory we're going to discard right away on deserialization
  • we have no way of telling the difference between empty and null

I believe the reason why we allocated on initialization historically was because:

  • nrt was not a thing, so there was no way for the consumers to get help from the compiler for those scenarios
  • collection expression syntax didn't exist, making the syntax verbose.

In addition to all that, in a lot of cases, the consumers end up doing object initialization syntax, which means they are initializing the collections anyway.

For all those reasons I suggest we stick to the idiomatic expressions of the platform, and leave it null, letting the compiler help consumers while reducing memory usage.

@MaggieKimani1
Copy link
Contributor Author

remove default collection initialization altogether

Looks like I missed this conversation. Is there a summary of why we want to force users to initialize these collections manually? Is ASP.NET team ok with this change?

cc: @captainsafia

@MaggieKimani1 MaggieKimani1 marked this pull request as draft April 1, 2025 09:17
@darrelmiller
Copy link
Member

darrelmiller commented Apr 1, 2025

we allocate memory we're going to discard right away on deserialization

I had not considered this. And I don't think we should change deserialization to leverage existing collections.

we have no way of telling the difference between empty and null

I think most of the time that isn't an important distinction to make. I think there are only a couple of collections in the model where that is significant. e.g. Paths and security requirements.

the reason why we allocated on initialization historically was because:

The reason was so that a consumer editing/creating a document did not have instantiate a type that they may or may not know what that type is. e.g. sometimes we use dictionary<string,foo> sometimes we have dedicated classes OpenAPIResponses.

Eight years ago the object initialization syntax was not as developed and wasn't used nearly as much.

stick to the idiomatic expressions of the platform

Is it? I agree that C# has been introducing new ways to initialize objects/arrays/collections over the past few years, but is there some official design guidance that this is idiomatic?

Anyway, I think because the support for object initialization is better and because we want to optimize perf for the common case of deserialization, and because we now have NRT, I am more comfortable moving to a model where we don't lazy instantiate.

@MaggieKimani1 MaggieKimani1 marked this pull request as ready for review April 3, 2025 10:12
@MaggieKimani1 MaggieKimani1 requested a review from a team as a code owner April 3, 2025 10:12
@MaggieKimani1 MaggieKimani1 changed the title feat: use Lazy<T> to initialize collections on access feat: Remove default collection initialization for perf reasons Apr 3, 2025
Copy link
Member

@baywet baywet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great progress!

One thing I also wanted us to discuss but forgot to mention earlier: Why are collections returning their interface type? What benefit does it provide? the obvious drawback is that it prevents from using the collection initialization syntax [], especially for dictionaries as we can see in the unit tests. CC @darrelmiller

@MaggieKimani1
Copy link
Contributor Author

One thing I also wanted us to discuss but forgot to mention earlier: Why are collections returning their interface type? What benefit does it provide? the obvious drawback is that it prevents from using the collection initialization syntax [], especially for dictionaries as we can see in the unit tests. CC @darrelmiller

I'm not sure what the initial intention was for defining the dictionaries as interface types instead of concrete types.
Maybe @darrelmiller might have more historical context into it?
I'm open to refactoring it to simplify collection initialization syntax if we're all onboard with this change.

@darrelmiller
Copy link
Member

I suspect the reason we implemented these properties as interfaces was to leave the possibility of changing the concrete implementation if necessary. I don't have any objection to changing to the concrete classes now there is a clear benefit.

Copy link

sonarqubecloud bot commented Apr 6, 2025

Quality Gate Failed Quality Gate failed

Failed conditions
63.4% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[API Review] Memory performance improvements
3 participants