-
Notifications
You must be signed in to change notification settings - Fork 92
Speed up client initialization with lazy loading #473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b89d903
to
cd15bf9
Compare
@@ -72,7 +70,8 @@ def _get_socket_options( | |||
""" | |||
# Source: https://www.finbourne.com/blog/the-mysterious-hanging-client-tcp-keep-alives | |||
|
|||
socket_params = HTTPConnection.default_socket_options | |||
# urllib3.connection.HTTPConnection.default_socket_options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This config gets evaluated during intialization, and loading the entire urllib3 package just to get access to a reference to these socket constants was adding a big penalty to our load time. So I decided to inline it and remove the urllib3 dep here.
logger = logging.getLogger(__name__) | ||
""" @private """ | ||
|
||
if TYPE_CHECKING: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving these imports under this if TYPE_CHECKING
means they won't be loaded except during the static type analysis process. A big performance savings during initialization at runtime. Many of these things used to be needed here at runtime but no longer are because functionality has been pulled out into classes like IndexResource
.
""" @private """ | ||
|
||
|
||
class IndexResourceAsyncio: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this and other resource classes was simply moved from the parent client class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, really impressed with how thorough you were in getting this architected and implemented. The import and initialization gains are huge, great to see that come together. 🚢
from pinecone.db_control.types import CreateIndexForModelEmbedTypedDict | ||
|
||
|
||
class LegacyPineconeDBControlInterface(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "Legacy" here is just differentiating from asyncio further, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "legacy" adjective really refers to the create_index
, describe_index
, etc form of method access vs db.index.create
and db.index.describe
. But I should add some comments or something to make that more clear.
from pinecone.db_data import _AsyncioInference | ||
|
||
self._inference = _AsyncioInference(api_client=self.db._index_api.api_client) | ||
return self._inference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice 👍
@@ -1,10 +1,12 @@ | |||
from pinecone.core.openapi.db_control import API_VERSION | |||
def versioned_url(template: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really need to look at adding some kind of utility or script like this to the ts client, there's a lot of static links in that repo at this point.
Problem
In the upcoming release we want to take a dependency on
pinecone-plugin-assitants
. But we don't want this dependency to come at a cost of degraded performance for users who are not using the features in the plugin. Adding it with no modifications to existing code resulted in adding 93 milliseconds to the already sluggish time needed to import thePinecone
class of 230 milliseconds.With the changes in this PR, we are able to add the assistant functionality and bring the time needed to load and do initial instantiation of the
Pinecone
client down by about 65%.Solution
To accomplish these goals I wanted to do some significant refactoring without breaking any existing functionality.
Some functional requirements of the refactor include:
from pinecone import Pinecone
). This includes many objects which are less-often discussed but still needed by customers such as various data objects. Anyone working heavily with types in python will need access to these both before and after the refactoring, so we don't want to accidentally remove items that used to be importable from the top-level.create_index
, etc)__init__.py
With all that said, this refactor has implemented the following major changes:
pinecone.py
into a class with a narrower focus. Then only load and instantiate this class when the user is attempting to call these methods.Pinecone
methods such ascreate_index
,list_indexes
,delete_index
that historically were defined inline in thePinecone
class. After this refactoring, the behavior from those functions has been moved into a newIndexResource
class that exposescreate
,list
, etc methods. The parentPinecone
class now delegates to this class which is lazily initialized only when needed. This speeds up the time needed to import and instantiatePinecone
significantly.Index
andCollections
classes underpinecone/db_control/resources
.TYPE_CHECKING
boolean from thetyping
package to avoid importing code at runtime that is only needed for type-checking (which does not occur at runtime.) Type-checking with mypy is a static code inspection, and thisTYPE_CHECKING
variable will be treated as True when analyzing type information. This allows mypy to understand what types are being used when type checking without the runtime overhead (since during runtimeTYPE_CHECKING
always evaluates to False). Without using this technique, most of the benefits of refactoring into smaller lazy-loaded classes would be undone by loading classes to use in type signatures.__init__.py
so that every importable item in the the entire package does not have to be loaded in order to gain access to thePinecone
client object and get started.Pinecone
,Index
, andInference
classes to implement plugins. In the past, on initialization of a class extendingPluginAware
, the environment would be scanned for the presence of plugins and if any are available they get installed. This means we could not have a plugin in the environment without incurring an initialization cost on every user. Since we want to ship with the Assistant plugin in the upcoming release, but not every user is using Assistant, a big startup penalty seems highly undesirable. So now the PluginAware class has been reformulated. NowPluginAware
implements a__getattr__
method that will install plugins only at the moment a user tries to use them.urllib3
package just to get the version during initialization of the Pinecone client was contributing significant latency. Since we're not using that info for anything anymore, we can nix it.pc.db.index.delete(name='foo')
,pc.db.collection.list()
, etc. These new tests now run in dedicated CI builds. I need to continue expanding coverage on these, particularly for the async ones, but most of the functionality is implicitly covered by the existing integration tests.pinecone/db_control
to mirror the structure of our API definitions. Where things have been moved, I've tried to add alias packages with warning messages so that anyone reaching deeply into packages to import things should not be broken. A warning message is now displayed, for example, if you attempt to import from a legacy subpackage package:The module at
pinecone.controlhas moved to
pinecone.db_control. This warning will become an error in a future version of the Pinecone Python SDK.
Very few people will ever see these, I think, but they should help a few people. This is a best-effort thing, but there's no way to ensure that I have covered every possible way that somebody may have tried to import something from our internals in the past.New dependencies:
pinecone-plugin-assistant
to bring in assistant functionstuna
: a dev dependency for visualizing load performancepython-dotenv
: dev dependency for managing environment variables more easily in testingInitialization performance
To assess the load time for the pinecone package, I used a built-in package called
importtime
.Then I visualized the results using a new dev-dependency called
tuna
These steps can be used to show that before any refactoring, the initialization time was more than 300ms(!)
After refactoring to make
PluginAware
lazy, and also restructure code related to operations on indexes, collections, inference, etc to take advantage of lazy loading we can improve the client initialization time very significantly. This is a big improvement because it means users will no longer need to wait to load a bunch of code for features they are not using.Type of Change
Test Plan
Describe specific steps for validating this change.