This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This PR tries to reduce the memory cost of Neurow subscriptions.

**TL;DR** By hibernating SSE monitors and manually running the garbage collector in SSE processes, the PR decreases the memory cost of SSE subscriptions from about 63 KB per subscription to 30 KB per subscription. In a server that handles 80k SSE subscriptions, it should save about **2.5 GB of RAM**. This optimization should however slightly increase the CPU usage, because garbage collection runs more often.

Note: This PR does not actually change the maximum memory required by a Neurow server. Without it, the server would eventually run a garbage collection anyway, freeing memory only to reuse it right after. The PR just releases the memory earlier, without waiting for a GC triggered when the server needs more RAM. This makes the server's behavior more predictable and allows setting auto-scaling rules based on the actual RAM usage.

## Analysis

The function `Neurow.Observability.SystemStats.process_groups` analyses the memory usage of processes: it groups processes by name and sums their memory cost, number of processes and number of queued messages. It can be used in the IEx console by calling `Neurow.Observability.SystemStats.process_groups`, or by calling `GET /process_stats` on the internal API.

Locally, after starting 2k concurrent subscriptions that subscribe to 10 shared topics and then continuously sending messages on these 10 topics, here are some interesting outliers:

| name / initial function | current function | process count | memory (bytes) | cost per subscription (KB) | cost for 80k subscriptions (MB) |
| -- | -- | -- | -- | -- | -- |
| cowboy_stream_h:request_process/3 | Elixir.Neurow.PublicApi.Endpoint:loop/7 | 2000 | 80791512 | 39.45 | 3,081.95 |
| cowboy_clear:connection_process/4 | cowboy_http:loop/1 | 2000 | 43334224 | 21.16 | 1,653.07 |
| Elixir.Neurow.PublicApi.SSEMonitor:init/1 | gen_server:loop/7 | 2000 | 5561152 | 2.72 | 212.14 |
| | | | **Total** | **63.32** | **4,947.16** |

No issues were found in the number of messages.
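As a sanity check on the table, the per-subscription figures can be recomputed from the raw `memory` column. This is a minimal sketch that assumes the memory values are bytes (as reported by `Process.info/2`) and that the units are 1024-based; `CostEstimate` is a hypothetical helper, not part of Neurow.

```elixir
defmodule CostEstimate do
  # Hypothetical helper used only to illustrate the arithmetic behind the table.

  # Average memory of one process in the group, in 1024-based kilobytes.
  def per_subscription_kb(total_bytes, process_count) do
    Float.round(total_bytes / process_count / 1024, 2)
  end

  # Projected memory of the whole group for a target number of subscriptions,
  # in 1024-based megabytes.
  def projected_mb(total_bytes, process_count, target_subscriptions) do
    Float.round(total_bytes / process_count * target_subscriptions / (1024 * 1024), 2)
  end
end

# First row of the table: 2000 request processes using 80,791,512 bytes.
CostEstimate.per_subscription_kb(80_791_512, 2000)
# => 39.45
CostEstimate.projected_mb(80_791_512, 2000, 80_000)
# => 3081.95
```

The same two calls applied to the other rows give the remaining columns; the totals are the sums over the three process groups.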
## Optimizations

### Hibernate SSE Monitors

Erlang allows manually putting processes into [hibernation](https://www.erlang.org/doc/apps/erts/erlang.html#hibernate/3):

> Puts the calling process into a wait state where its memory allocation has been reduced as much as possible. This is useful if the process does not expect to receive any messages soon.

`SSEMonitor` is a GenServer, so putting it into hibernation is simple: it only requires returning `{:ok, state, :hibernate}` from the `init` function.

### Manual garbage collection in SSE subscriptions

Because SSE request handlers are not GenServers, hibernating them is much more complex. `:erlang.hibernate` exists, and combining it with `Process.send_after` could implement a ticker, but `:erlang.hibernate` discards the process call stack. That means all instructions that should run after `Neurow.PublicApi.Endpoint#subscribe` returns would never run if the SSE process hibernated.

So instead of hibernating, we simply force a garbage collection during each SSE passive loop. Compared to an actual hibernation, the call stack is not deleted and live data is not moved to a contiguous heap, but it still provides significant improvements.

### Results

| name / initial function | current function | process count | memory (bytes) | cost per subscription (KB) | cost for 80k subscriptions (MB) |
| -- | -- | -- | -- | -- | -- |
| cowboy_stream_h:request_process/3 | Elixir.Neurow.PublicApi.Endpoint:loop/7 | 2000 | 14767616 | 7.21 | 563.34 |
| cowboy_clear:connection_process/4 | cowboy_http:loop/1 | 2000 | 43316544 | 21.15 | 1,652.40 |
| Elixir.Neurow.PublicApi.SSEMonitor:init/1 | gen_server:loop/7 | 2000 | 2656000 | 1.30 | 101.32 |
| | | | **Total** | **29.66** | **2,317.05** |

The subscription cost is reduced from 63 KB to 30 KB. In a server that handles 80k SSE subscriptions, it should save about **2.5 GB of RAM**.

A debounce is added to the manual garbage collection so that each SSE process runs at most one manual GC every 60 s. In practice, the SSE passive loop is triggered much less often than that, so the number of manual GCs should be much lower. A manual GC is always triggered during the first run of the SSE passive loop.
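The two optimizations can be sketched as follows. This is an illustration under stated assumptions, not the code of this PR: `SketchMonitor`, `SketchLoop`, the message shapes and the idle timeout are hypothetical, and the "passive loop" is assumed here to be the `after` branch of a `receive`. Only `GenServer`, `:erlang.garbage_collect/0` and `System.monotonic_time/1` are real APIs.

```elixir
defmodule SketchMonitor do
  use GenServer

  # Returning :hibernate as the third element makes the GenServer hibernate
  # right after init, and wake up when the next message arrives.
  @impl true
  def init(state) do
    {:ok, state, :hibernate}
  end
end

defmodule SketchLoop do
  # At most one manual GC per process within this interval (60 s).
  @gc_interval_ms 60_000

  # Hypothetical receive loop standing in for an SSE request handler.
  # `last_gc_ms` is nil until the first manual GC has run.
  def loop(last_gc_ms \\ nil, idle_timeout_ms \\ 5_000) do
    receive do
      :stop ->
        :ok

      {:message, _payload} ->
        loop(last_gc_ms, idle_timeout_ms)
    after
      idle_timeout_ms ->
        # Passive branch: nothing to send, so try to release memory.
        loop(maybe_gc(last_gc_ms), idle_timeout_ms)
    end
  end

  # Runs a GC of the calling process unless one ran less than
  # @gc_interval_ms ago, and returns the timestamp of the last manual GC.
  def maybe_gc(last_gc_ms, now_ms \\ System.monotonic_time(:millisecond))

  # First passive run: always collect.
  def maybe_gc(nil, now_ms) do
    :erlang.garbage_collect()
    now_ms
  end

  def maybe_gc(last_gc_ms, now_ms) when now_ms - last_gc_ms >= @gc_interval_ms do
    :erlang.garbage_collect()
    now_ms
  end

  # Debounced: the previous GC is too recent, keep its timestamp.
  def maybe_gc(last_gc_ms, _now_ms), do: last_gc_ms
end
```

Unlike hibernation, the manual GC keeps the call stack, so code placed after the loop still runs once the loop returns.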
Showing 14 changed files with 338 additions and 200 deletions.
@@ -0,0 +1,44 @@

    defmodule Neurow.Observability.HttpInterfacesStats do
      use Prometheus.Metric

      def setup() do
        Summary.declare(
          name: :http_request_duration_ms,
          labels: [:interface],
          help: "HTTP request duration"
        )

        Counter.declare(
          name: :http_request_count,
          labels: [:interface, :status],
          help: "HTTP request count"
        )

        Summary.reset(name: :http_request_duration_ms, labels: [:public_api])
        Summary.reset(name: :http_request_duration_ms, labels: [:internal_api])

        # Please read https://github.com/beam-telemetry/cowboy_telemetry
        :telemetry.attach_many(
          "cowboy_telemetry_handler",
          [
            [:cowboy, :request, :stop]
          ],
          &Neurow.Observability.HttpInterfacesStats.handle_event/4,
          nil
        )
      end

      def handle_event([:cowboy, :request, :stop], measurements, metadata, _config) do
        endpoint =
          case metadata[:req][:ref] do
            Neurow.PublicApi.Endpoint.HTTP -> :public_api
            Neurow.InternalApi.Endpoint.HTTP -> :internal_api
          end

        duration_ms = System.convert_time_unit(measurements[:duration], :native, :millisecond)
        resp_status = metadata[:resp_status]

        Counter.inc(name: :http_request_count, labels: [endpoint, resp_status])
        Summary.observe([name: :http_request_duration_ms, labels: [endpoint]], duration_ms)
      end
    end

@@ -0,0 +1,82 @@

    defmodule Neurow.Observability.MessageBrokerStats do
      use Prometheus.Metric

      def setup() do
        Gauge.declare(
          name: :concurrent_subscription,
          labels: [:issuer],
          help: "Amount of concurrent topic subscriptions"
        )

        Counter.declare(
          name: :subscription_lifecycle,
          labels: [:kind, :issuer],
          help: "Count subscriptions and unsubscriptions"
        )

        Summary.declare(
          name: :subscription_duration_ms,
          labels: [:issuer],
          help: "Duration of topic subscriptions"
        )

        Counter.declare(
          name: :message,
          labels: [:kind, :issuer],
          help: "Messages sent through topic subscriptions"
        )

        Counter.declare(
          name: :history_rotate,
          help: "History rotate counter"
        )

        Gauge.declare(
          name: :topic_count,
          help: "Number of topics in the message history"
        )

        Counter.reset(name: :history_rotate)

        Gauge.set([name: :topic_count], 0)

        Enum.each(Neurow.Configuration.issuers(), fn issuer ->
          Gauge.set([name: :concurrent_subscription, labels: [issuer]], 0)
          Counter.reset(name: :subscription_lifecycle, labels: [:created, issuer])
          Counter.reset(name: :subscription_lifecycle, labels: [:released, issuer])
          Counter.reset(name: :message, labels: [:published, issuer])
          Counter.reset(name: :message, labels: [:sent, issuer])
          Summary.reset(name: :subscription_duration_ms, labels: [issuer])
        end)

        Periodic.start_link(
          run: fn ->
            Gauge.set([name: :topic_count], Neurow.Broker.ReceiverShardManager.topic_count())
          end,
          every: :timer.seconds(10)
        )
      end

      def inc_subscriptions(issuer) do
        Counter.inc(name: :subscription_lifecycle, labels: [:created, issuer])
        Gauge.inc(name: :concurrent_subscription, labels: [issuer])
      end

      def dec_subscriptions(issuer, duration_ms) do
        Counter.inc(name: :subscription_lifecycle, labels: [:released, issuer])
        Gauge.dec(name: :concurrent_subscription, labels: [issuer])
        Summary.observe([name: :subscription_duration_ms, labels: [issuer]], duration_ms)
      end

      def inc_message_published(issuer) do
        Counter.inc(name: :message, labels: [:published, issuer])
      end

      def inc_message_sent(issuer) do
        Counter.inc(name: :message, labels: [:sent, issuer])
      end

      def inc_history_rotate() do
        Counter.inc(name: :history_rotate)
      end
    end

@@ -0,0 +1,3 @@

    defmodule Neurow.Observability.MetricsPlugExporter do
      use Prometheus.PlugExporter
    end

@@ -0,0 +1,11 @@

    defmodule Neurow.Observability do
      use Prometheus.Metric

      def setup() do
        Neurow.Observability.MessageBrokerStats.setup()
        Neurow.Observability.HttpInterfacesStats.setup()
        Neurow.Observability.MetricsPlugExporter.setup()
        Neurow.Observability.SecurityStats.setup()
        Neurow.Observability.SystemStats.setup()
      end
    end

@@ -0,0 +1,19 @@

    defmodule Neurow.Observability.SecurityStats do
      use Prometheus.Metric

      def setup() do
        Counter.declare(
          name: :jwt_errors,
          labels: [:interface],
          help: "JWT Errors"
        )
      end

      def inc_jwt_errors_public() do
        Counter.inc(name: :jwt_errors, labels: [:public])
      end

      def inc_jwt_errors_internal() do
        Counter.inc(name: :jwt_errors, labels: [:internal])
      end
    end

@@ -0,0 +1,108 @@

    defmodule Neurow.Observability.SystemStats do
      use Prometheus.Metric

      def setup() do
        Gauge.declare(
          name: :memory_usage,
          help: "Memory usage"
        )

        Boolean.declare(
          name: :stopping,
          help: "The node is currently stopping"
        )

        Gauge.set([name: :memory_usage], 0)
        Boolean.set([name: :stopping], false)

        Periodic.start_link(
          run: fn -> Gauge.set([name: :memory_usage], :recon_alloc.memory(:usage)) end,
          every: :timer.seconds(10)
        )
      end

      def report_shutdown() do
        Boolean.set([name: :stopping], true)
      end

      defmodule ProcessesStats do
        defstruct [
          :name_or_initial_func,
          :current_func,
          process_count: 0,
          memory: 0,
          message_queue_len: 0
        ]
      end

      # Group processes by name or initial function, and current function.
      # Then sort by memory usage and return the top consuming process groups
      def process_groups(result_count \\ 50) do
        Process.list()
        |> Enum.reduce(%{}, fn pid, acc ->
          {name_or_initial_func, current_func} = grouping_attributes(pid)
          key = {name_or_initial_func, current_func}
          process_info = Process.info(pid, [:memory, :message_queue_len])

          # Map.update/4 stores its default as-is for a new key, without calling
          # the update function, which would leave the first process of each
          # group uncounted. The group is therefore fetched (or created empty)
          # and then always incremented.
          current_stats =
            Map.get(acc, key, %ProcessesStats{
              name_or_initial_func: name_or_initial_func,
              current_func: current_func
            })

          Map.put(acc, key, %{
            current_stats
            | process_count: current_stats.process_count + 1,
              memory: current_stats.memory + (process_info[:memory] || 0),
              message_queue_len:
                current_stats.message_queue_len + (process_info[:message_queue_len] || 0)
          })
        end)
        |> Map.values()
        |> Enum.sort(&(&1.memory > &2.memory))
        |> Enum.take(result_count)
      end

      defp mfa_to_string({module, function, arity}) do
        "#{module}:#{function}/#{arity}"
      end

      defp grouping_attributes(pid) do
        name_or_initial_func =
          case Process.info(pid, [:registered_name, :dictionary, :initial_call]) do
            [{:registered_name, name} | _rest] when is_atom(name) ->
              name

            [{:registered_name, [first_name | _other_names]}, _rest] ->
              first_name

            [
              {:registered_name, []},
              {:dictionary, [{:"$initial_call", initial_call} | _rest_dictionary]} | _rest
            ] ->
              mfa_to_string(initial_call)

            [
              {:registered_name, []},
              {:dictionary, _rest_dictionary},
              {:initial_call, initial_call}
            ] ->
              mfa_to_string(initial_call)

            _ ->
              :undefined
          end

        case Process.info(pid, :current_function) do
          {:current_function, current_function} ->
            {name_or_initial_func, mfa_to_string(current_function)}

          nil ->
            {name_or_initial_func, :undefined}
        end
      end
    end
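
One detail worth keeping in mind when accumulating per-group totals in a reduce, as `process_groups` does: `Map.update/4` inserts its default value unchanged when the key is new and only calls the update function for keys that already exist. The minimal sketch below uses made-up atoms instead of real pids to show the effect on a simple counter.

```elixir
# Three hypothetical processes that all belong to the same group.
pids = [:a, :b, :c]

# With a default of 0, the first element only creates the entry and is never
# counted, because the update function is not applied to the default.
undercounted =
  Enum.reduce(pids, %{}, fn _pid, acc ->
    Map.update(acc, :group, 0, &(&1 + 1))
  end)

# => %{group: 2}

# Using the value of the first element as the default counts every element.
counted =
  Enum.reduce(pids, %{}, fn _pid, acc ->
    Map.update(acc, :group, 1, &(&1 + 1))
  end)

# => %{group: 3}
```

When the first value is not a constant (for example the memory of the first process), fetching the current entry with `Map.get/3` and writing it back with `Map.put/3` gives the same result without special-casing the first element.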