Skip to content

Speed up client initialization with lazy loading #473

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
May 12, 2025

Conversation

jhamon
Copy link
Collaborator

@jhamon jhamon commented Apr 8, 2025

Problem

In the upcoming release we want to take a dependency on pinecone-plugin-assitants. But we don't want this dependency to come at a cost of degraded performance for users who are not using the features in the plugin. Adding it with no modifications to existing code resulted in adding 93 milliseconds to the already sluggish time needed to import the Pinecone class of 230 milliseconds.

With the changes in this PR, we are able to add the assistant functionality and bring the time needed to load and do initial instantiation of the Pinecone client down by about 65%.

Solution

To accomplish these goals I wanted to do some significant refactoring without breaking any existing functionality.

Some functional requirements of the refactor include:

  • Existing integration tests able to pass without major modification.
    • Things should still be importable as before (e.g. from pinecone import Pinecone). This includes many objects which are less-often discussed but still needed by customers such as various data objects. Anyone working heavily with types in python will need access to these both before and after the refactoring, so we don't want to accidentally remove items that used to be importable from the top-level.
    • Existing client methods should continue to work as before (create_index, etc)
  • mypy type-checking still passing even with new lazy-loading approach in the top-level __init__.py

With all that said, this refactor has implemented the following major changes:

  • Define small resource-centric classes. Move the implementation for actions on a resource from pinecone.py into a class with a narrower focus. Then only load and instantiate this class when the user is attempting to call these methods.
    • For example, actions are completed on the index resource with Pinecone methods such as create_index, list_indexes, delete_index that historically were defined inline in the Pinecone class. After this refactoring, the behavior from those functions has been moved into a new IndexResource class that exposes create, list, etc methods. The parent Pinecone class now delegates to this class which is lazily initialized only when needed. This speeds up the time needed to import and instantiate Pinecone significantly.
    • So far these resource classes have been implemented for both the sync and asyncio versions of the Index and Collections classes under pinecone/db_control/resources.
  • Remove unnecessary type imports with TYPE_CHECKING. Use the TYPE_CHECKING boolean from the typing package to avoid importing code at runtime that is only needed for type-checking (which does not occur at runtime.) Type-checking with mypy is a static code inspection, and this TYPE_CHECKING variable will be treated as True when analyzing type information. This allows mypy to understand what types are being used when type checking without the runtime overhead (since during runtime TYPE_CHECKING always evaluates to False). Without using this technique, most of the benefits of refactoring into smaller lazy-loaded classes would be undone by loading classes to use in type signatures.
  • Implemented a proxy module loader in the top-level __init__.py so that every importable item in the the entire package does not have to be loaded in order to gain access to the Pinecone client object and get started.
  • Modify PluginAware class to defer plugin loading. The PluginAware class is something that other classes can extend in order to become pluggable and is currently used by the Pinecone, Index, and Inference classes to implement plugins. In the past, on initialization of a class extending PluginAware, the environment would be scanned for the presence of plugins and if any are available they get installed. This means we could not have a plugin in the environment without incurring an initialization cost on every user. Since we want to ship with the Assistant plugin in the upcoming release, but not every user is using Assistant, a big startup penalty seems highly undesirable. So now the PluginAware class has been reformulated. Now PluginAware implements a __getattr__ method that will install plugins only at the moment a user tries to use them.
  • Removed urllib3 info from user-agent. This seems like it should be inconsequential, but importing the entire urllib3 package just to get the version during initialization of the Pinecone client was contributing significant latency. Since we're not using that info for anything anymore, we can nix it.
  • Added new integration tests: To ensure the backwards compatibility of these changes, most integration tests were left as they were. Some new ones have been added to exercise new usage patterns such as pc.db.index.delete(name='foo'), pc.db.collection.list(), etc. These new tests now run in dedicated CI builds. I need to continue expanding coverage on these, particularly for the async ones, but most of the functionality is implicitly covered by the existing integration tests.
  • Reorganize some folders to align with API spec organization. Along the way to making these changes, it seemed appropriate to create some new folder structures such as pinecone/db_control to mirror the structure of our API definitions. Where things have been moved, I've tried to add alias packages with warning messages so that anyone reaching deeply into packages to import things should not be broken. A warning message is now displayed, for example, if you attempt to import from a legacy subpackage package: The module at pinecone.controlhas moved topinecone.db_control. This warning will become an error in a future version of the Pinecone Python SDK. Very few people will ever see these, I think, but they should help a few people. This is a best-effort thing, but there's no way to ensure that I have covered every possible way that somebody may have tried to import something from our internals in the past.

New dependencies:

  • pinecone-plugin-assistant to bring in assistant functions
  • tuna: a dev dependency for visualizing load performance
  • python-dotenv: dev dependency for managing environment variables more easily in testing

Initialization performance

To assess the load time for the pinecone package, I used a built-in package called importtime.

poetry run python3 -X importtime -c "from pinecone import Pinecone; pc = Pinecone(api_key='foo')" 2> main.log

Then I visualized the results using a new dev-dependency called tuna

poetry run tuna main.log

These steps can be used to show that before any refactoring, the initialization time was more than 300ms(!)

Screenshot 2025-05-08 at 3 20 27 PM

After refactoring to make PluginAware lazy, and also restructure code related to operations on indexes, collections, inference, etc to take advantage of lazy loading we can improve the client initialization time very significantly. This is a big improvement because it means users will no longer need to wait to load a bunch of code for features they are not using.

Screenshot 2025-05-08 at 3 18 16 PM

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

Describe specific steps for validating this change.

@jhamon jhamon force-pushed the jhamon/client-reorg branch from b89d903 to cd15bf9 Compare April 25, 2025 18:42
@jhamon jhamon marked this pull request as ready for review May 12, 2025 15:20
@jhamon jhamon requested a review from austin-denoble May 12, 2025 15:20
@@ -72,7 +70,8 @@ def _get_socket_options(
"""
# Source: https://www.finbourne.com/blog/the-mysterious-hanging-client-tcp-keep-alives

socket_params = HTTPConnection.default_socket_options
# urllib3.connection.HTTPConnection.default_socket_options
Copy link
Collaborator Author

@jhamon jhamon May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config gets evaluated during intialization, and loading the entire urllib3 package just to get access to a reference to these socket constants was adding a big penalty to our load time. So I decided to inline it and remove the urllib3 dep here.

logger = logging.getLogger(__name__)
""" @private """

if TYPE_CHECKING:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving these imports under this if TYPE_CHECKING means they won't be loaded except during the static type analysis process. A big performance savings during initialization at runtime. Many of these things used to be needed here at runtime but no longer are because functionality has been pulled out into classes like IndexResource.

""" @private """


class IndexResourceAsyncio:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this and other resource classes was simply moved from the parent client class.

Copy link
Contributor

@austin-denoble austin-denoble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, really impressed with how thorough you were in getting this architected and implemented. The import and initialization gains are huge, great to see that come together. 🚢

from pinecone.db_control.types import CreateIndexForModelEmbedTypedDict


class LegacyPineconeDBControlInterface(ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Legacy" here is just differentiating from asyncio further, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "legacy" adjective really refers to the create_index, describe_index, etc form of method access vs db.index.create and db.index.describe. But I should add some comments or something to make that more clear.

Comment on lines +156 to +159
from pinecone.db_data import _AsyncioInference

self._inference = _AsyncioInference(api_client=self.db._index_api.api_client)
return self._inference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👍

@@ -1,10 +1,12 @@
from pinecone.core.openapi.db_control import API_VERSION
def versioned_url(template: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really need to look at adding some kind of utility or script like this to the ts client, there's a lot of static links in that repo at this point.

@jhamon jhamon merged commit e919d41 into release-candidate/2025-04 May 12, 2025
58 checks passed
@jhamon jhamon deleted the jhamon/client-reorg branch May 12, 2025 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants