-
Notifications
You must be signed in to change notification settings - Fork 1.1k
PYTHON-5306 - Fix use of public MongoClient attributes before connection #2285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1205,6 +1206,18 @@ def topology_description(self) -> TopologyDescription: | |||
|
|||
.. versionadded:: 4.0 | |||
""" | |||
if self._topology is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the client has not connected yet, return a TopologyDescription
with placeholder servers derived from the seeds passed to the client's constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure the returned TopologyDescription has the same _topology_id. Otherwise it could be confusing because _topology_id will change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also can you give an example of what topology_description looks like in the connect=False srv case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<TopologyDescription id: 67ffb68a6e12ca176991f521, topology_type: Unknown, servers: [<ServerDescription (('test1.test.build.10gen.cc', None), 27017) server_type: Unknown, rtt: None>]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include an example of this new/old behavior in the Jira ticket when closing.
@@ -1218,6 +1231,8 @@ def nodes(self) -> FrozenSet[_Address]: | |||
to any servers, or a network partition causes it to lose connection | |||
to all servers. | |||
""" | |||
if self._topology is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the docstring defines, we expect nodes
to return an empty set if the client has not yet connected.
@@ -1205,6 +1206,18 @@ def topology_description(self) -> TopologyDescription: | |||
|
|||
.. versionadded:: 4.0 | |||
""" | |||
if self._topology is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make sure the returned TopologyDescription has the same _topology_id. Otherwise it could be confusing because _topology_id will change.
test/asynchronous/test_client.py
Outdated
{("unknown", None): ServerDescription(("unknown", None))}, | ||
) | ||
await client.aconnect() | ||
self.assertEqual(await client.address, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a tests for address
and all the helper helpers without calling aconnect
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was mistakenly included: our test SRV URIs aren't real servers, so we can't actually connect to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a larger restriction: the new test needs to be run against a real SRV server, or have accurate responses mocked out, to ensure correct behavior.
For all non-SRV URIs, we immediately create a Topology, making this test on such connections useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have real SRV addresses which resolve to localhost:
>>> client = MongoClient("mongodb+srv://test1.test.build.10gen.cc", tlsInsecure=True)
>>> client.topology_description
<TopologyDescription id: 68000f362bab7f34cc9902c2, topology_type: ReplicaSetWithPrimary, servers: [<ServerDescription ('localhost', 27017) server_type: RSPrimary, rtt: 0.001482416999351699>]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I forgot that SRV implies replica + SSL, thanks!
test/asynchronous/test_client.py
Outdated
# arbiters causes client to block until connected | ||
await c.arbiters | ||
self.assertIsNotNone(c._topology) | ||
c = await self.async_rs_or_single_client(connect=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this kind of test it's a good idea to close each client once it's no longer needed to avoid building up unclosed clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async_rs_or_single_client
already automatically closes all clients at the end of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know but I'm saying it's better to close them explicitly to avoid building up unclosed clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't currently use that pattern in our testcases, relying on the helper client creation methods to clean up clients for us. If it's better to avoid relying on those, should we rethink our approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying for tests that specifically open multiple/many clients, it's good practice to add an explicit call to close().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not trying to pushback stubbornly here, just want to keep our codebase consistent where possible since we don't currently explicitly close clients created with the helpers. Do we have a measurable improvement or benefit from explicitly closing clients that will be cleaned up by the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not strictly a performance issue. It's also about easier debugging. Leaving open clients around unnecessarily like this makes debugging harder because each client has many threads/tasks causing noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense! I'll open a ticket to make sure we follow the same pattern elsewhere too.
@@ -1205,6 +1206,18 @@ def topology_description(self) -> TopologyDescription: | |||
|
|||
.. versionadded:: 4.0 | |||
""" | |||
if self._topology is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also can you give an example of what topology_description looks like in the connect=False srv case?
pymongo/synchronous/mongo_client.py
Outdated
None, | ||
None, | ||
None, | ||
TopologySettings(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing certain settings that are publicly accessible, like TopologyDescription.heartbeat_frequency.
In the ticket I suggested making the MongoClient constructor always create the Topology instance, and then mutate it later once the SRV info is resolved. Did you consider that approach? It would avoid the problem of attempting to mock out these attributes and avoid the type-ignores since everything would always be non-None.
(It could be that this approach is a better fit, I just want to know if an alternative was considered.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started with the approach you suggested, but switched to the current implementation for a few reasons:
topology_description
is the only public attribute that requires changes of meaningful size. The rest either already do IO or arenodes
, which has a trivial workaround.- We don't currently mutate
Topology
instances orTopologySettings
instances directly. To do so, we'd need to make more code changes and be confident that we haven't overlooked any problematic edge cases. - I foresee more potential bugs/confusion caused by introducing a new state specifically for SRV Topologies where we know that the Topology we currently have is not actually correct, and will be mutated upon SRV resolution.
This approach has fewer changes that could cause errors, especially since it isolates changes to resolving the specific issue at hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, could you update TopologySettings()
here to be more accurate then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TopologySettings
sets defaults for each attribute, is passing no arguments to it invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like I said above, client.topology_description.heartbeat_frequency
is one example of an attribute we need to set here to avoid heartbeat_frequency appearing to change from one value to another. There may be others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example:
>>> client = MongoClient(..., connect=False, heartbeatFrequencyMS=99999)
>>> client.topology_description.heartbeat_frequency
10
>>> client.admin.command("ping")
...
>>> client.topology_description.heartbeat_frequency
99.999
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I misunderstood what you meant. The example makes it clear, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we always initialize a TopologySettings
in the constructor. If SRV is enabled, we create a new one after resolution but keep the existing topology id.
test/asynchronous/test_client.py
Outdated
@@ -849,6 +849,54 @@ async def test_init_disconnected_with_auth(self): | |||
with self.assertRaises(ConnectionFailure): | |||
await c.pymongo_test.test.find_one() | |||
|
|||
@async_client_context.require_replica_set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can remove require_replica_set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it causes failures on any non-replica topology: https://spruce.mongodb.com/task/mongo_python_driver_mongodb_latest_test_python3.9_auth_ssl_standalone_cov_patch_846b1fc25ced5979ade934bb27d6ffbb60f68267_68010865cc419c000710589f_25_04_17_13_56_05?execution=0&sortBy=STATUS&sortDir=ASC
I assume this is due to SRV requiring SSL + replica to function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. "mongodb+srv://test1.test.build.10gen.cc" returns 2 hosts so it can't connect to standalone. It can connect to Mongos though so we should run this test on both replica set and sharded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require_no_standalone
, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Load balancers probably dont work here so it would need to be @require_no_standalone+@require_no_load_balancer+@require_no_load_balancer
test/asynchronous/test_client.py
Outdated
# arbiters causes client to block until connected | ||
await c.arbiters | ||
self.assertIsNotNone(c._topology) | ||
c = await self.async_rs_or_single_client(connect=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying for tests that specifically open multiple/many clients, it's good practice to add an explicit call to close().
pymongo/asynchronous/mongo_client.py
Outdated
@@ -767,6 +768,7 @@ def __init__( | |||
self._timeout: float | None = None | |||
self._topology_settings: TopologySettings = None # type: ignore[assignment] | |||
self._event_listeners: _EventListeners | None = None | |||
self._initial_topology_id: Optional[ObjectId] = None # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove the type ignore here?
@@ -1205,6 +1206,18 @@ def topology_description(self) -> TopologyDescription: | |||
|
|||
.. versionadded:: 4.0 | |||
""" | |||
if self._topology is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include an example of this new/old behavior in the Jira ticket when closing.
pymongo/synchronous/mongo_client.py
Outdated
None, | ||
None, | ||
None, | ||
TopologySettings(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, could you update TopologySettings()
here to be more accurate then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a changelog entry and call it a day!
doc/changelog.rst
Outdated
@@ -10,6 +10,7 @@ Version 4.12.1 is a bug fix release. | |||
- Fixed a bug that could raise ``UnboundLocalError`` when creating asynchronous connections over SSL. | |||
- Fixed a bug causing SRV hostname validation to fail when resolver and resolved hostnames are identical with three domain levels. | |||
- Fixed a bug that caused direct use of ``pymongo.uri_parser`` to raise an ``AttributeError``. | |||
- Fixed a bug that could cause public ``pymongo.MongoClient`` and ``pymongo.AsyncMongoClient`` attributes to raise errors when accessed before a connection was established. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's describe it a bit more specifically, something like:
Fixed a bug where clients created with connect=False and a "mongodb+srv://" connection string
could cause public ``pymongo.MongoClient`` and ``pymongo.AsyncMongoClient`` attributes (topology_description,
nodes, address, primary, secondaries, arbiters) to incorrectly return a Database, leading to type
errors such as: "NotImplementedError: Database objects do not implement truth value testing or bool()".
doc/changelog.rst
Outdated
- Fixed a bug where clients created with connect=False and a "mongodb+srv://" connection string | ||
could cause public ``pymongo.MongoClient`` and ``pymongo.AsyncMongoClient`` attributes (topology_description, | ||
nodes, address, primary, secondaries, arbiters) to incorrectly return a Database, leading to type | ||
errors such as: "NotImplementedError: Database objects do not implement truth value testing or bool(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing trailing "
for "NotImplementedError.
The public attributes of
MongoClient
aretopology_description
,nodes
,address
,primary
,secondaries
, andarbiters
. Of these, onlytopology_description
andnodes
do not already perform network I/O. The rest can safely ensure connection as part of their execution without modifying their previous behavior.