We need to create some scripts (maybe as part of presto-admin) that will help identify issues with a Presto cluster.
Ideally we should be able to detect:
- long GC pauses, based on the GC log if it is enabled
- JVM crashes
It would create a timeline of events that happened in a given time period:
{code}
presto-admin show-events 24h
2015-01-01 00:00:000 Node 10.10.0.1 started
2015-01-01 01:00:000 Node 10.10.0.2 crashed (Out of memory error)
2015-01-01 02:00:000 Node 10.10.0.3 long STW GC pause (22.003 seconds)
{code}
We should be able to do this based on the GC and launcher logs.
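A rough sketch of how such a timeline could be assembled from the two log types, not a proposal for the actual presto-admin implementation. Everything named here is hypothetical: the functions `gc_events`, `launcher_events` and `timeline`, the 10 second pause threshold, and the log formats. The GC pattern assumes JDK 8 style output with `-XX:+PrintGCDateStamps` and `-XX:+PrintGCApplicationStoppedTime`; the launcher pattern assumes lines that start with an ISO timestamp and contain `SERVER STARTED` or `java.lang.OutOfMemoryError`. Real logs may differ.

```python
import re
from datetime import datetime, timedelta

# Assumed GC log line (JDK 8 with date stamps and stopped-time logging), e.g.
# 2015-01-01T02:00:00.000+0000: 7200.123: Total time for which application
# threads were stopped: 22.0030000 seconds, Stopping threads took: ...
GC_STOP_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d{3}[+-]\d{4}: .*"
    r"Total time for which application threads were stopped: "
    r"(?P<secs>\d+\.\d+) seconds"
)
# Assumed launcher/server log line shape: "<ISO timestamp> ... <message>".
LOG_TS_RE = re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def parse_ts(text):
    """Parse the second-resolution part of an ISO timestamp."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")


def gc_events(node, lines, threshold_secs=10.0):
    """Yield (timestamp, node, message) for each pause at or above the threshold."""
    for line in lines:
        match = GC_STOP_RE.match(line)
        if match and float(match.group("secs")) >= threshold_secs:
            message = "long STW GC pause (%.3f seconds)" % float(match.group("secs"))
            yield (parse_ts(match.group("ts")), node, message)


def launcher_events(node, lines):
    """Yield start and out-of-memory events found in launcher/server log lines."""
    for line in lines:
        match = LOG_TS_RE.match(line)
        if not match:
            continue  # e.g. stack trace continuation lines without a timestamp
        if "SERVER STARTED" in line:
            yield (parse_ts(match.group("ts")), node, "started")
        elif "java.lang.OutOfMemoryError" in line:
            yield (parse_ts(match.group("ts")), node, "crashed (Out of memory error)")


def timeline(events, now, window):
    """Return formatted lines for events inside the window, oldest first."""
    cutoff = now - window
    recent = sorted(event for event in events if event[0] >= cutoff)
    return ["%s Node %s %s" % (ts.strftime("%Y-%m-%d %H:%M:%S"), node, message)
            for ts, node, message in recent]
```

Events from every node would be concatenated into one list and passed to `timeline` with something like `timedelta(hours=24)` for the `24h` argument shown above. Detecting crashes that leave no trace in the launcher log (for example, a JVM killed by the OS) would need a different signal and is not covered by this sketch.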
This is an extension to the existing presto-admin collect logs command. Basically, it would look through the logs (and maybe also JMX stats) to produce a timeline of what's happening on the cluster.
This seems to me like a fun hackathon project, but not something that's essential to work on right now.