Multithreaded, concurrent and intensive use often causes stale containers to remain up forever #808
Comments
Hi @mad-p, could you please try with the latest release? https://github.com/zalando/zalenium/releases/tag/3.141.59e These issues should have been solved in that release.
OK, I'll try.
No worries. Closing due to lack of activity; please let us know if things didn't work out.
I tried it twice. Unfortunately, in one of the two tries, some containers remained unreclaimed. I'll try some more. Upgrading Jersey eliminated the problem completely on our side. Can you have a look at it?
How did you upgrade the Jersey version?
Please look at the "Tentative workaround" above.
In the permanent workaround you mention:
But the current version we are using is
The problem has been fixed in version 2.29 of jersey-apache-connector, but the latest version of com.spotify/docker-client still uses version 2.22.
…operations see also: - eclipse-ee4j/jersey#3772 - zalando#808
Zalenium Image Version(s): 3.14.0g ( also reproducible with 3.14.0c )
Docker Version: 18.09.0, build 4d60db4
If using docker-compose, version: 1.23.1, build b02f1306
OS: OSX High Sierra ( also reproducible on CentOS 7.2.1511 and latest Arch Linux )
Docker Command to start Zalenium: Executing through docker-compose.yml
Expected Behavior
Stale containers are removed after the idle timeout. The AutoStartProxyPoolPoller thread works as expected.
Actual Behavior
Stale containers are not removed and remain up forever, even after the idle timeout. The AutoStartProxyPoolPoller thread hangs forever.
Note that those containers can still be reused as normal.
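To make the expected idle-timeout behavior concrete, here is a minimal sketch of the kind of check a poller could run on each pass. It is not Zalenium's implementation: the class IdleReaperSketch, the method findStale, and the map of last-use timestamps are all hypothetical names chosen for illustration. If the polling thread hangs, a check like this simply never runs again, which would match the symptom of containers staying up while still being reusable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not Zalenium code: selects the containers
// that an idle-timeout poller would remove on one pass.
class IdleReaperSketch {

    /** Returns the names of containers idle for at least idleTimeoutMillis at nowMillis. */
    static List<String> findStale(Map<String, Long> lastUsedMillis,
                                  long nowMillis,
                                  long idleTimeoutMillis) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastUsedMillis.entrySet()) {
            if (nowMillis - entry.getValue() >= idleTimeoutMillis) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Long> lastUsed = new LinkedHashMap<>();
        lastUsed.put("node-1", 0L);       // idle for 100 s at t = 100 s
        lastUsed.put("node-2", 95_000L);  // idle for 5 s at t = 100 s
        // With a 90 s idle timeout, only node-1 qualifies as stale.
        System.out.println(findStale(lastUsed, 100_000L, 90_000L));
    }
}
```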
Minimal code to reproduce the problem
docker-compose.yml
--desiredContainers 0 lets us tell whether the idle timeout is working: the number of node containers stays above zero when the problem occurs.
--debugEnabled true lets us tell whether the AutoStartProxyPoolPoller thread is working, i.e. whether the debug log 'Checking containers...' is still being printed to standard output.
Ruby script
A shorter idleTimeout, i.e. a higher frequency of stale containers being removed, seems to make the problem reproducible in less time.
In a "real" UI testing environment where idleTimeout defaults to 90 seconds, each UI test takes a dozen or so seconds, and about four tests run concurrently, it usually takes a couple of hours for the problem to occur.
Java thread dump taken after the problem occurs.
https://gist.github.com/mad-p/6082c9ee556ad84d1304be1c9f91b562
The Java thread dump was taken as follows.
Root cause
The root cause seems to be the issue below.
Properly close the Apache response so that connections can be reused
eclipse-ee4j/jersey#3861
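As a rough illustration of why an unclosed response can starve later requests, the toy model below uses a Semaphore as a stand-in for a bounded connection pool. Nothing here is Jersey, Apache HttpClient, or docker-client code; ConnectionLeakSketch, Lease, and tryLease are hypothetical names, and the pool size of 2 is an arbitrary assumption.

```java
import java.util.concurrent.Semaphore;

// Toy model only: a Semaphore stands in for a bounded HTTP connection pool.
class ConnectionLeakSketch {

    /** A leased connection; closing it returns the permit to the pool. */
    static final class Lease implements AutoCloseable {
        private final Semaphore pool;
        private boolean closed;

        Lease(Semaphore pool) {
            this.pool = pool;
        }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                pool.release();
            }
        }
    }

    /** Tries to lease a connection without waiting; returns null if the pool is exhausted. */
    static Lease tryLease(Semaphore pool) {
        return pool.tryAcquire() ? new Lease(pool) : null;
    }

    /** Counts requests that obtain a connection when leases are never closed. */
    static int serveWithoutClosing(int poolSize, int requests) {
        Semaphore pool = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            Lease lease = tryLease(pool); // never closed, so the permit leaks
            if (lease != null) {
                served++;
            }
        }
        return served;
    }

    /** Counts requests that obtain a connection when every lease is closed after use. */
    static int serveWithClosing(int poolSize, int requests) {
        Semaphore pool = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            try (Lease lease = tryLease(pool)) { // close() returns the permit
                if (lease != null) {
                    served++;
                }
            }
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println(serveWithoutClosing(2, 5)); // prints 2
        System.out.println(serveWithClosing(2, 5));    // prints 5
    }
}
```

In this sketch an exhausted pool returns null immediately so that the example terminates. A real pool would typically make the caller wait for a free connection instead, which would be consistent with the poller thread hanging once enough connections have leaked.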
Tentative workaround
Use a patched version of jersey-apache-connector.
Permanent workaround
Please consider upgrading com.spotify/docker-client:8.11.7 to a newer version (not released yet as of Nov 2018) in which docker-client uses jersey-apache-connector:2.29 (scheduled for release in spring 2019).
Zalenium:3.14.0g uses docker-client:8.11.7.
https://github.com/zalando/zalenium/blob/3.14.0g/pom.xml#L62
docker-client:8.11.7 uses jersey-apache-connector:2.22.2.
https://github.com/spotify/docker-client/blob/v8.11.7/pom.xml#L109-L113
See also
com.spotify/docker-client
https://github.com/spotify/docker-client
https://mvnrepository.com/artifact/com.spotify/docker-client
jersey-apache-connector
https://github.com/jersey/jersey/ (old repo)
https://github.com/eclipse-ee4j/jersey/
https://mvnrepository.com/artifact/org.glassfish.jersey.connectors/jersey-apache-connector
Issues at com.spotify/docker-client
spotify/docker-client#727
spotify/docker-client#727 (comment)
spotify/docker-client#727 (comment)
Issues at jersey-apache-connector
https://github.com/jersey/jersey/issues/3772 (old repo)
eclipse-ee4j/jersey#3772
eclipse-ee4j/jersey#3772 (comment)
Jersey release schedule and roadmap
https://projects.eclipse.org/projects/ee4j.jersey
https://github.com/eclipse-ee4j/jersey/wiki/Road-Map