Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jersey threads stuck in WAITING #3800

Open
jerseyrobot opened this issue Apr 5, 2018 · 6 comments
Open

Jersey threads stuck in WAITING #3800

jerseyrobot opened this issue Apr 5, 2018 · 6 comments

Comments

@jerseyrobot
Copy link
Contributor

As part of the upgrade path from Dropwizard 0.x -> 1.x the jersey version was bumped to 2.25.1

What we've noticed is that across a number of our production services, we're seeing threads randomly moving from a RUNNABLE to a WAITING state and staying in that waiting state. This seems to occur randomly, but mostly centered around higher throughput paths that can sometimes incur higher latency. As these blocked threads increase over time, the thread pool is eventually starved and the service topples over.

This seems somewhat related to this previous issue (targeted for 2.26+ milestone) https://github.com/jersey/jersey/issues/3207

Here's an example from a recent thread dump. We see something similar for each stuck thread and the entries in the thread dump slowly increase over time as they're not evicted from the thread pool.

"dw-2113 - GET /path/being/called..." id=2113 state=WAITING
    - waiting on <0x1f2f7a6b> (a jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync)
    - locked <0x1f2f7a6b> (a jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.getResponseContext(ResponseWriter.java:299)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.callSendError(ResponseWriter.java:215)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.commit(ResponseWriter.java:194)
    at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:413)
    at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:784)
    at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:444)
    at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:490)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:334)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:34)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
    at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at com.amazonaws.xray.javax.servlet.AWSXRayServletFilter.doFilter(AWSXRayServletFilter.java:151)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:241)
    at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:527)
    at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:68)
    at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
    at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.Server.handle(Server.java:561)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:334)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:104)
    at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:289)
    at org.eclipse.jetty.io.ssl.SslConnection$3.succeeded(SslConnection.java:149)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:104)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:243)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:679)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:597)
    at java.lang.Thread.run(Thread.java:748)
@jerseyrobot
Copy link
Contributor Author

@joekarl Commented
And to be clear, all of these requests are synchronous. They do reach out to other network resources (other services/database/etc...) but those are generally done synchronously within the request thread (sometimes we'll spawn futures internally to fetch resources in parallel, but we resolve those futures synchronously in the request thread).

@jerseyrobot
Copy link
Contributor Author

@daniloradenovic Commented
Could this be related to #3772 ?

@jerseyrobot
Copy link
Contributor Author

@joekarl Commented
We've discovered that while we can't seem to figure out why the threads never return, we can at least forcefully clean them up using Jetty's blockingTimeout config option http://download.eclipse.org/jetty/stable-9/apidocs/org/eclipse/jetty/server/HttpConfiguration.html#getBlockingTimeout--

Though seems more of a bandaid than an actual fix.

@jerseyrobot
Copy link
Contributor Author

@joekarl Commented
Reporting back, the blockingTimeout setting didn't actually help the situation, still seeing threads getting in a stuck state.

@jerseyrobot
Copy link
Contributor Author

@jansupol
Copy link
Contributor

jansupol commented May 31, 2018

To understand this issue correctly:

  1. When you run a dropwizard 1.x
  2. with a Jetty Server
  3. and Jersey 2.25.1 application
  4. you see a Lock?

Well, I admit a reproducer would help here....
Can you try to upgrade Jersey to 2.26+? There has been quite some development between these versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants