Memory optimizations (#39)
This PR reduces the memory cost of Neurow subscriptions.

**TL;DR** 
By hibernating SSE monitors and manually running the garbage collector
in SSE processes, this PR decreases the memory cost of SSE subscriptions
from about 63 kB to about 30 kB per subscription. On a server
that handles 80k SSE subscriptions, it should save about **2.5 GB of
RAM**. This optimization should, however, slightly increase CPU usage
because garbage collection runs more often.

Note: This PR does not actually change the maximum memory required by a
Neurow server. The server would eventually run a garbage collection to
free memory and reuse it right away. The PR just releases that memory
earlier, without waiting for the GC that runs when the server needs more
RAM. This makes the server's behavior more predictable and makes it
possible to set auto-scaling rules based on actual RAM usage.

## Analysis
The function `Neurow.Observability.SystemStats.process_groups` analyzes process memory usage:
it groups processes by name and reports, for each group, its memory cost,
number of processes, and number of queued messages.

It can be used from the IEx console by calling
`Neurow.Observability.SystemStats.process_groups`, or by calling `GET
/process_stats` on the internal API.

Locally, after starting 2k concurrent subscriptions to 10 shared topics
and then continuously sending messages on these 10 topics, these process
groups stand out:

| name / initial function | current function | process count | memory (bytes) | cost per subscription (kB) | cost for 80k subscriptions (MB) |
| -- | -- | -- | -- | -- | -- |
| cowboy_stream_h:request_process/3 | Elixir.Neurow.PublicApi.Endpoint:loop/7 | 2000 | 80791512 | 39.45 | 3,081.95 |
| cowboy_clear:connection_process/4 | cowboy_http:loop/1 | 2000 | 43334224 | 21.16 | 1,653.07 |
| Elixir.Neurow.PublicApi.SSEMonitor:init/1 | gen_server:loop/7 | 2000 | 5561152 | 2.72 | 212.14 |
|   |   |   | **Total** | **63.32** | **4,947.16** |
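
The derived columns can be reproduced from the raw memory column. The sketch below assumes the memory values are bytes (as reported by `Process.info/2`) and that the table divides by 1024 at each step, which matches the published figures. The `CostTable` module is hypothetical and not part of Neurow.

```elixir
defmodule CostTable do
  # Hypothetical helper, not part of Neurow: reproduces the table arithmetic.
  @measured_subscriptions 2_000
  @target_subscriptions 80_000

  # Bytes used by a whole process group -> cost per subscription (bytes / 1024).
  def per_subscription_kb(group_bytes), do: group_bytes / @measured_subscriptions / 1024

  # Cost per subscription -> projected cost for 80k subscriptions (divided by 1024 again).
  def projected_mb(kb_per_subscription), do: kb_per_subscription * @target_subscriptions / 1024
end

# Memory (bytes) of the three process groups listed in the table above.
group_bytes = [80_791_512, 43_334_224, 5_561_152]

total_kb = group_bytes |> Enum.map(&CostTable.per_subscription_kb/1) |> Enum.sum()
total_mb = CostTable.projected_mb(total_kb)

IO.puts("per subscription: #{Float.round(total_kb, 2)}")
IO.puts("for 80k subscriptions: #{Float.round(total_mb, 2)}")
```

Run as a script, this should print the totals of the last table row (63.32 and 4947.16).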

No issues were found in the number of messages.

## Optimizations

### Hibernate SSE Monitors
Erlang makes it possible to manually put processes into
[hibernation](https://www.erlang.org/doc/apps/erts/erlang.html#hibernate/3):
> Puts the calling process into a wait state where its memory allocation
has been reduced as much as possible. This is useful if the process does
not expect to receive any messages soon.

SSEMonitor is a GenServer, so putting it in hibernation is simple: its
`init` function just has to return `{:ok, state, :hibernate}`.
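
As a minimal sketch of that return value, here is a standalone GenServer that hibernates right after `init`. The module `HibernatingMonitor` and its `:state` call are hypothetical stand-ins, not the actual `SSEMonitor` code. Returning `:hibernate` from `handle_call` is an extra illustration that the PR description does not mention.

```elixir
defmodule HibernatingMonitor do
  # Hypothetical stand-in for an SSE monitor; only the hibernation-related
  # return values are of interest here.
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg) do
    # The third tuple element asks the GenServer to hibernate before it
    # starts waiting for its first message.
    {:ok, arg, :hibernate}
  end

  @impl true
  def handle_call(:state, _from, state) do
    # Any incoming message wakes the process up. Returning :hibernate again
    # sends it back to hibernation once the reply has been sent.
    {:reply, state, state, :hibernate}
  end
end

{:ok, pid} = HibernatingMonitor.start_link(%{subscription: "example"})
IO.inspect(GenServer.call(pid, :state))
```

The state survives hibernation, so callers do not need to know whether the process is hibernating.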

### Manual garbage collection in SSE subscriptions
Because SSE request handlers are not GenServers, hibernating them is
much more complex.

`:erlang.hibernate` exists, and combining it with `Process.send_after`
could be used to implement a ticker, but `:erlang.hibernate` discards the
process call stack. That means all instructions that should run after
`Neurow.PublicApi.Endpoint#subscribe` returns would never run if
the SSE process hibernated.
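
The call stack issue can be illustrated with a small standalone sketch. The `StackDemo` module is hypothetical: it hibernates in the middle of a function, and the instruction placed after the `:erlang.hibernate/3` call is never executed because the process resumes in the function given to `hibernate/3`.

```elixir
defmodule StackDemo do
  # Hypothetical module, not part of Neurow.
  def run(parent) do
    send(parent, :before_hibernate)
    # Discards the call stack. On the next message, the process resumes
    # by calling StackDemo.resume(parent).
    :erlang.hibernate(__MODULE__, :resume, [parent])
    # Never reached: hibernate/3 does not return to its caller.
    send(parent, :after_hibernate)
  end

  def resume(parent) do
    receive do
      msg -> send(parent, {:resumed_with, msg})
    end
  end
end

pid = spawn(StackDemo, :run, [self()])
send(pid, :wake_up)

# Collect the two messages the process is expected to send back.
messages =
  for _ <- 1..2 do
    receive do
      msg -> msg
    after
      1_000 -> :timeout
    end
  end

IO.inspect(messages)
```

The collected messages should be the one sent before hibernation and the one sent from `resume/1`. `:after_hibernate` is never delivered.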

So instead of hibernating, we force a garbage collection during each SSE
passive loop. Compared to an actual hibernation, the call stack is not
discarded and live data is not moved to one continuous heap, but it
still provides significant improvements.
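
A minimal sketch of the idea, assuming a plain `receive` loop: the `GcDemo` module and its `{:work, size}` and `:idle` messages are hypothetical and only stand in for the SSE loop. The process creates garbage while working, then forces a collection when idle and reports its memory before and after.

```elixir
defmodule GcDemo do
  # Hypothetical loop, not the Neurow implementation.
  def loop(parent) do
    receive do
      {:work, size} ->
        # Allocate a large list that becomes garbage as soon as it is built.
        _ = Enum.map(1..size, &Integer.to_string/1)
        loop(parent)

      :idle ->
        before_gc = memory()
        # Forces a garbage collection of the calling process only.
        :erlang.garbage_collect()
        send(parent, {:memory, before_gc, memory()})
        loop(parent)
    end
  end

  defp memory do
    {:memory, bytes} = Process.info(self(), :memory)
    bytes
  end
end

pid = spawn(GcDemo, :loop, [self()])
send(pid, {:work, 100_000})
send(pid, :idle)

{before_gc, after_gc} =
  receive do
    {:memory, before_bytes, after_bytes} -> {before_bytes, after_bytes}
  end

IO.puts("before: #{before_gc} bytes, after: #{after_gc} bytes")
```

Unlike hibernation, the loop keeps its call stack, so it simply carries on after the collection.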

### Results

| name / initial function | current function | process count | memory (bytes) | cost per subscription (kB) | cost for 80k subscriptions (MB) |
| -- | -- | -- | -- | -- | -- |
| cowboy_stream_h:request_process/3 | Elixir.Neurow.PublicApi.Endpoint:loop/7 | 2000 | 14767616 | 7.21 | 563.34 |
| cowboy_clear:connection_process/4 | cowboy_http:loop/1 | 2000 | 43316544 | 21.15 | 1,652.40 |
| Elixir.Neurow.PublicApi.SSEMonitor:init/1 | gen_server:loop/7 | 2000 | 2656000 | 1.30 | 101.32 |
|   |   |   | Total | 29.66 | 2,317.05 |

The subscription cost is reduced from 63 kB to 30 kB. On a server that
handles 80k SSE subscriptions, this should save about **2.5 GB of RAM**.

The manual garbage collection is debounced so that each SSE process runs
at most one manual GC every 60 s. In practice, the SSE passive loop is
triggered much less often, so the number of manual GCs should be much
lower. A manual GC is always triggered during the first run of the SSE
passive loop.
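
The debounce rule can be sketched as a pure function over timestamps. The `GcDebounce` module, its `maybe_gc/2` function, and the `nil` marker for "never collected" are assumptions made for illustration, not the code from this PR.

```elixir
defmodule GcDebounce do
  # Hypothetical helper, not the Neurow implementation.
  @min_interval_ms 60_000

  # `last_gc_ms` is nil until the first run, so the first call always collects.
  def maybe_gc(last_gc_ms, now_ms \\ System.monotonic_time(:millisecond))

  def maybe_gc(nil, now_ms), do: collect(now_ms)

  def maybe_gc(last_gc_ms, now_ms) when now_ms - last_gc_ms >= @min_interval_ms,
    do: collect(now_ms)

  # Less than 60 s since the last manual GC: skip and keep the old timestamp.
  def maybe_gc(last_gc_ms, _now_ms), do: {:skipped, last_gc_ms}

  defp collect(now_ms) do
    :erlang.garbage_collect()
    {:collected, now_ms}
  end
end

IO.inspect(GcDebounce.maybe_gc(nil, 0))
IO.inspect(GcDebounce.maybe_gc(0, 59_999))
IO.inspect(GcDebounce.maybe_gc(0, 60_000))
```

A loop would carry the returned timestamp in its state and pass it to the next `maybe_gc/2` call.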
achouippe authored Nov 6, 2024
1 parent 7fad989 commit 9904d56
Showing 14 changed files with 338 additions and 200 deletions.
3 changes: 0 additions & 3 deletions neurow/lib/metric_plug_exporter.ex

This file was deleted.

3 changes: 1 addition & 2 deletions neurow/lib/neurow/application.ex
@@ -92,8 +92,7 @@ defmodule Neurow.Application do
[]
end

-MetricsPlugExporter.setup()
-Neurow.Stats.setup()
+Neurow.Observability.setup()
JOSE.json_module(:jiffy)

opts = [strategy: :one_for_one, name: Neurow.Supervisor]
2 changes: 1 addition & 1 deletion neurow/lib/neurow/broker/receiver_shard_manager.ex
@@ -42,7 +42,7 @@ defmodule Neurow.Broker.ReceiverShardManager do
end

def rotate do
-Neurow.Stats.MessageBroker.inc_history_rotate()
+Neurow.Observability.MessageBrokerStats.inc_history_rotate()

Enum.each(receiver_shards(), fn {_shard, pid} ->
send(pid, {:rotate})
27 changes: 23 additions & 4 deletions neurow/lib/neurow/internal_api/endpoint.ex
@@ -6,7 +6,7 @@ defmodule Neurow.InternalApi.Endpoint do
alias Neurow.Broker.Message

use Plug.Router
-plug(MetricsPlugExporter)
+plug(Neurow.Observability.MetricsPlugExporter)

plug(Neurow.JwtAuthPlug,
credential_headers: ["x-interservice-authorization", "authorization"],
@@ -16,8 +16,15 @@ defmodule Neurow.InternalApi.Endpoint do
&Neurow.Configuration.internal_api_verbose_authentication_errors/0,
max_lifetime: &Neurow.Configuration.internal_api_jwt_max_lifetime/0,
send_forbidden: &Neurow.InternalApi.Endpoint.send_forbidden/3,
-inc_error_callback: &Neurow.Stats.Security.inc_jwt_errors_internal/0,
-exclude_path_prefixes: ["/ping", "/nodes", "/cluster_size_above", "/history"]
+inc_error_callback: &Neurow.Observability.SecurityStats.inc_jwt_errors_internal/0,
+exclude_path_prefixes: [
+"/ping",
+"/nodes",
+"/cluster_size_above",
+"/history",
+"/process_stats",
+"/favicon.ico"
+]
)

plug(:match)
@@ -55,6 +62,18 @@ defmodule Neurow.InternalApi.Endpoint do
)
end

+get "/process_stats" do
+conn
+|> put_resp_header("content-type", "application/json")
+|> send_resp(
+200,
+Jason.encode!(
+Neurow.Observability.SystemStats.process_groups()
+|> Enum.map(&Map.from_struct/1)
+)
+)
+end
+
get "/cluster_size_above/:size" do
size = String.to_integer(size)
cluster_size = length(Node.list()) + 1
@@ -97,7 +116,7 @@ defmodule Neurow.InternalApi.Endpoint do
end)
end)

-Neurow.Stats.MessageBroker.inc_message_published(issuer)
+Neurow.Observability.MessageBrokerStats.inc_message_published(issuer)

conn
|> put_resp_header("content-type", "application/json")
44 changes: 44 additions & 0 deletions neurow/lib/neurow/observability/http_interfaces_stats.ex
@@ -0,0 +1,44 @@
defmodule Neurow.Observability.HttpInterfacesStats do
  use Prometheus.Metric

  def setup() do
    Summary.declare(
      name: :http_request_duration_ms,
      labels: [:interface],
      help: "HTTP request duration"
    )

    Counter.declare(
      name: :http_request_count,
      labels: [:interface, :status],
      help: "HTTP request count"
    )

    Summary.reset(name: :http_request_duration_ms, labels: [:public_api])
    Summary.reset(name: :http_request_duration_ms, labels: [:internal_api])

    # Please read https://github.com/beam-telemetry/cowboy_telemetry
    :telemetry.attach_many(
      "cowboy_telemetry_handler",
      [
        [:cowboy, :request, :stop]
      ],
      &Neurow.Observability.HttpInterfacesStats.handle_event/4,
      nil
    )
  end

  def handle_event([:cowboy, :request, :stop], measurements, metadata, _config) do
    endpoint =
      case metadata[:req][:ref] do
        Neurow.PublicApi.Endpoint.HTTP -> :public_api
        Neurow.InternalApi.Endpoint.HTTP -> :internal_api
      end

    duration_ms = System.convert_time_unit(measurements[:duration], :native, :millisecond)
    resp_status = metadata[:resp_status]

    Counter.inc(name: :http_request_count, labels: [endpoint, resp_status])
    Summary.observe([name: :http_request_duration_ms, labels: [endpoint]], duration_ms)
  end
end
82 changes: 82 additions & 0 deletions neurow/lib/neurow/observability/message_broker_stats.ex
@@ -0,0 +1,82 @@
defmodule Neurow.Observability.MessageBrokerStats do
  use Prometheus.Metric

  def setup() do
    Gauge.declare(
      name: :concurrent_subscription,
      labels: [:issuer],
      help: "Amount of concurrent topic subscriptions"
    )

    Counter.declare(
      name: :subscription_lifecycle,
      labels: [:kind, :issuer],
      help: "Count subscriptions and unsubscriptions"
    )

    Summary.declare(
      name: :subscription_duration_ms,
      labels: [:issuer],
      help: "Duration of topic subscriptions"
    )

    Counter.declare(
      name: :message,
      labels: [:kind, :issuer],
      help: "Messages sent through topic subscriptions"
    )

    Counter.declare(
      name: :history_rotate,
      help: "History rotate counter"
    )

    Gauge.declare(
      name: :topic_count,
      help: "Number of topics in the message history"
    )

    Counter.reset(name: :history_rotate)

    Gauge.set([name: :topic_count], 0)

    Enum.each(Neurow.Configuration.issuers(), fn issuer ->
      Gauge.set([name: :concurrent_subscription, labels: [issuer]], 0)
      Counter.reset(name: :subscription_lifecycle, labels: [:created, issuer])
      Counter.reset(name: :subscription_lifecycle, labels: [:released, issuer])
      Counter.reset(name: :message, labels: [:published, issuer])
      Counter.reset(name: :message, labels: [:sent, issuer])
      Summary.reset(name: :subscription_duration_ms, labels: [issuer])
    end)

    Periodic.start_link(
      run: fn ->
        Gauge.set([name: :topic_count], Neurow.Broker.ReceiverShardManager.topic_count())
      end,
      every: :timer.seconds(10)
    )
  end

  def inc_subscriptions(issuer) do
    Counter.inc(name: :subscription_lifecycle, labels: [:created, issuer])
    Gauge.inc(name: :concurrent_subscription, labels: [issuer])
  end

  def dec_subscriptions(issuer, duration_ms) do
    Counter.inc(name: :subscription_lifecycle, labels: [:released, issuer])
    Gauge.dec(name: :concurrent_subscription, labels: [issuer])
    Summary.observe([name: :subscription_duration_ms, labels: [issuer]], duration_ms)
  end

  def inc_message_published(issuer) do
    Counter.inc(name: :message, labels: [:published, issuer])
  end

  def inc_message_sent(issuer) do
    Counter.inc(name: :message, labels: [:sent, issuer])
  end

  def inc_history_rotate() do
    Counter.inc(name: :history_rotate)
  end
end
3 changes: 3 additions & 0 deletions neurow/lib/neurow/observability/metric_plug_exporter.ex
@@ -0,0 +1,3 @@
defmodule Neurow.Observability.MetricsPlugExporter do
  use Prometheus.PlugExporter
end
11 changes: 11 additions & 0 deletions neurow/lib/neurow/observability/observability.ex
@@ -0,0 +1,11 @@
defmodule Neurow.Observability do
  use Prometheus.Metric

  def setup() do
    Neurow.Observability.MessageBrokerStats.setup()
    Neurow.Observability.HttpInterfacesStats.setup()
    Neurow.Observability.MetricsPlugExporter.setup()
    Neurow.Observability.SecurityStats.setup()
    Neurow.Observability.SystemStats.setup()
  end
end
19 changes: 19 additions & 0 deletions neurow/lib/neurow/observability/security_stats.ex
@@ -0,0 +1,19 @@
defmodule Neurow.Observability.SecurityStats do
  use Prometheus.Metric

  def setup() do
    Counter.declare(
      name: :jwt_errors,
      labels: [:interface],
      help: "JWT Errors"
    )
  end

  def inc_jwt_errors_public() do
    Counter.inc(name: :jwt_errors, labels: [:public])
  end

  def inc_jwt_errors_internal() do
    Counter.inc(name: :jwt_errors, labels: [:internal])
  end
end
108 changes: 108 additions & 0 deletions neurow/lib/neurow/observability/system_stats.ex
@@ -0,0 +1,108 @@
defmodule Neurow.Observability.SystemStats do
  use Prometheus.Metric

  def setup() do
    Gauge.declare(
      name: :memory_usage,
      help: "Memory usage"
    )

    Boolean.declare(
      name: :stopping,
      help: "The node is currently stopping"
    )

    Gauge.set([name: :memory_usage], 0)
    Boolean.set([name: :stopping], false)

    Periodic.start_link(
      run: fn -> Gauge.set([name: :memory_usage], :recon_alloc.memory(:usage)) end,
      every: :timer.seconds(10)
    )
  end

  def report_shutdown() do
    Boolean.set([name: :stopping], true)
  end

  defmodule ProcessesStats do
    defstruct [
      :name_or_initial_func,
      :current_func,
      process_count: 0,
      memory: 0,
      message_queue_len: 0
    ]
  end

  # Group processes by name or initial function, and current function.
  # Then sort by memory usage and return the top consuming process groups
  def process_groups(result_count \\ 50) do
    Process.list()
    |> Enum.reduce(%{}, fn pid, acc ->
      {name_or_initial_func, current_func} = grouping_attributes(pid)

      # Create the group the first time it is seen, then count the process in
      # every case, so the first process of a group is included in the totals.
      acc
      |> Map.put_new({name_or_initial_func, current_func}, %ProcessesStats{
        name_or_initial_func: name_or_initial_func,
        current_func: current_func
      })
      |> Map.update!({name_or_initial_func, current_func}, fn current_stats ->
        process_info = Process.info(pid, [:memory, :message_queue_len])

        %{
          current_stats
          | process_count: current_stats.process_count + 1,
            memory: current_stats.memory + (process_info[:memory] || 0),
            message_queue_len:
              current_stats.message_queue_len + (process_info[:message_queue_len] || 0)
        }
      end)
    end)
    |> Map.values()
    |> Enum.sort(&(&1.memory > &2.memory))
    |> Enum.take(result_count)
  end

  defp mfa_to_string({module, function, arity}) do
    "#{module}:#{function}/#{arity}"
  end

  defp grouping_attributes(pid) do
    name_or_initial_func =
      case Process.info(pid, [:registered_name, :dictionary, :initial_call]) do
        [{:registered_name, name} | _rest] when is_atom(name) ->
          name

        [{:registered_name, [first_name | _other_names]} | _rest] ->
          first_name

        [
          {:registered_name, []},
          {:dictionary, [{:"$initial_call", initial_call} | _rest_dictionary]} | _rest
        ] ->
          mfa_to_string(initial_call)

        [
          {:registered_name, []},
          {:dictionary, _rest_dictionary},
          {:initial_call, initial_call}
        ] ->
          mfa_to_string(initial_call)

        _ ->
          :undefined
      end

    case Process.info(pid, :current_function) do
      {:current_function, current_function} ->
        {name_or_initial_func, mfa_to_string(current_function)}

      nil ->
        {name_or_initial_func, :undefined}
    end
  end
end