Add Registry partition_by option for duplicate registries #14654

Open: 7 commits to merge into base: main

@studzien studzien commented Jul 16, 2025

Summary

This PR adds key-based partitioning for duplicate Registry entries, allowing optimization for workloads with many keys and few entries per key (e.g., many topics with few subscribers each).

Background

This feature addresses performance concerns first noticed in phoenixframework/phoenix_pubsub#198, where Registry's PID-based partitioning wasn't optimal for workloads with many topics and relatively few subscribers per topic.

Solution

Enhanced Keys API

  • :unique: Traditional unique registry behavior
  • :duplicate: Traditional duplicate registry with PID-based partitioning (default)
  • {:duplicate, :pid}: Explicit PID-based partitioning for duplicate registries
  • {:duplicate, :key}: New key-based partitioning for duplicate registries

Performance Optimization

  • Key-based lookups with {:duplicate, :key} now only need to check a single partition
  • Reduces the number of partitions a key-based lookup must check from all of them, O(partitions), to exactly one, O(1)
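The difference between the two strategies can be sketched as follows. This is only an illustration, not the code in registry.ex: the module PartitionSketch and its functions are hypothetical names, and hashing with :erlang.phash2/2 is assumed here as one plausible way to map a term to a partition.

```elixir
defmodule PartitionSketch do
  # Hypothetical helper (not part of Registry): maps any term to a
  # partition index in 0..partitions-1.
  def partition_for(term, partitions), do: :erlang.phash2(term, partitions)

  # PID-based partitioning: entries for one key may live in any partition,
  # because each entry is placed by the registering process. A key lookup
  # therefore has to visit every partition.
  def partitions_to_check(:pid, _key, partitions) do
    Enum.to_list(0..(partitions - 1))
  end

  # Key-based partitioning: every entry for a key is placed by hashing the
  # key itself, so a key lookup visits exactly one partition.
  def partitions_to_check(:key, key, partitions) do
    [partition_for(key, partitions)]
  end
end
```

With 8 partitions, PartitionSketch.partitions_to_check(:pid, "topic:1", 8) returns all eight indexes, while the :key variant returns a single index.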

Changes

lib/elixir/lib/registry.ex:

  • Updated keys typespec to include {:duplicate, :key} and {:duplicate, :pid}
  • Enhanced all internal functions to handle key-based partitioning strategy
  • Updated documentation with usage examples and performance guidance
  • Modified lookup/2, values/3, match/3, count/1, count_match/3, select/2 and other functions to support both partitioning strategies

lib/elixir/test/elixir/registry_test.exs:

  • Added comprehensive tests for all partitioning strategies
  • Extended cleanup and functionality tests to cover key-based partitioning
  • Added specific test cases for {:duplicate, :key} behavior

API Changes

New Keys Format

# Traditional duplicate registry (PID partitioning)
Registry.start_link(keys: :duplicate, name: MyRegistry)

# Explicit PID partitioning  
Registry.start_link(keys: {:duplicate, :pid}, name: MyRegistry)

# New key-based partitioning
Registry.start_link(keys: {:duplicate, :key}, name: MyRegistry)

Usage Guidelines

  • Use :duplicate or {:duplicate, :pid}: When you have few keys with many entries (e.g., one topic with many subscribers)
  • Use {:duplicate, :key}: When you have many keys with few entries each (e.g., many topics with few subscribers)
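For context, a minimal sketch of the topic/subscriber workload these guidelines describe, written against the existing :duplicate API so it runs without this PR. The registry name SketchRegistry, the topic string, and the partition count are assumptions made for illustration; under this proposal, only the keys: value would change to {:duplicate, :key}.

```elixir
# Start a duplicate registry with several partitions (existing behaviour:
# entries are partitioned by the registering PID).
{:ok, _sup} = Registry.start_link(keys: :duplicate, name: SketchRegistry, partitions: 4)

# The calling process subscribes twice to the same topic with different values.
{:ok, _} = Registry.register(SketchRegistry, "topic:1", :subscriber_a)
{:ok, _} = Registry.register(SketchRegistry, "topic:1", :subscriber_b)

# lookup/2 returns {pid, value} pairs for every entry under the key.
entries = Registry.lookup(SketchRegistry, "topic:1")

# Sort the values so the result does not depend on storage order.
values = entries |> Enum.map(fn {_pid, value} -> value end) |> Enum.sort()
```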

Backward Compatibility

This change is fully backward compatible:

  • Default behavior remains unchanged (:duplicate still uses PID partitioning)
  • Existing code continues to work without modification
  • Only adds new functionality for those who opt into {:duplicate, :key} partitioning
  • Key-based partitioning is only supported for duplicate registries (validated at startup)

Related

🤖 Generated with Claude Code

@@ -255,7 +256,7 @@ defmodule Registry do

   defp whereis_name(registry, key) do
     case key_info!(registry) do
-      {:unique, partitions, key_ets} ->
+      {:unique, partitions, key_ets, _} ->

Member

Instead of adding a new element, I wonder if we should introduce a new internal type. So we have :unique, :duplicate and :duplicate_by_key.

Member

We could even have :duplicate_by_pid and :duplicate_by_key. Or, alternatively, have the type be {:duplicate, :pid | :key}. But basically there is no need to store more information that only applies to duplicates and not unique.

Author

I think this makes sense, considering that the new key doesn't make sense for a unique registry. Just wondering if :duplicate should also have an explicit alias, something like :duplicate_by_pid.

Member

Yeah, I think we commented at the same time. Let's go with either :duplicate_by_pid and :duplicate_by_key or {:duplicate, partition_by}. Your call. The second one will allow us to partially match when we don't care about the partition.

Author

Sounds good. I'll also make sure that :duplicate keeps the existing behaviour of :duplicate_by_pid (or {:duplicate, :pid}).
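The two points agreed in this thread (bare :duplicate meaning PID partitioning, and partial matching on the tuple form) could look roughly like this. KindSketch and its function names are hypothetical and are not taken from the PR's code.

```elixir
defmodule KindSketch do
  # Hypothetical normalization step: a bare :duplicate keeps the existing
  # PID-partitioned behaviour, so old configurations are unaffected.
  def normalize(:unique), do: :unique
  def normalize(:duplicate), do: {:duplicate, :pid}

  def normalize({:duplicate, partition_by}) when partition_by in [:pid, :key],
    do: {:duplicate, partition_by}

  # Partial match: code that only needs to tell unique from duplicate can
  # ignore the partitioning strategy entirely.
  def duplicate?({:duplicate, _partition_by}), do: true
  def duplicate?(:unique), do: false
end
```

Any other value, such as {:unique, :key}, matches no clause of normalize/1 and raises a FunctionClauseError, which is one simple way to reject unsupported combinations at startup.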

Author

studzien commented Jul 17, 2025

@josevalim I updated it to use {:duplicate, :key} or {:duplicate, :pid}.
I was not certain whether the partition supervisor strategy should change, but I think it should stay the same (:one_for_one) since we partition both ETS tables by the same key (:key or :pid).

Will update the PR description shortly.

@@ -467,7 +511,7 @@ defmodule Registry do
 :error
 end

-{kind, _, _} ->
+{kind, _, _, _} ->

Member

This clause needs to be reverted, but we need to be careful with #{kind} below: once kind is a tuple, that interpolation will no longer work.
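A small sketch of why the interpolation breaks, assuming kind becomes a tuple such as {:duplicate, :key}. Atoms implement String.Chars and tuples do not, so #{kind} raises for the tuple form while inspect/1 handles any term. The message text and the anonymous function names are made up for illustration.

```elixir
# Interpolation calls to_string/1, which needs a String.Chars implementation.
interpolate = fn kind -> "registry kind #{kind}" end

# Works for the old atom kinds.
atom_message = interpolate.(:duplicate)

# Raises for a tuple kind, because tuples do not implement String.Chars.
tuple_raises? =
  try do
    _ = interpolate.({:duplicate, :key})
    false
  rescue
    Protocol.UndefinedError -> true
  end

# inspect/1 accepts any term, so it is safe for both forms.
describe = fn kind -> "registry kind #{inspect(kind)}" end
tuple_message = describe.({:duplicate, :key})
```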

Comment on lines +1624 to +1626

arg =
{kind, registry, i, partitions, key_partition, pid_partition, listeners, compressed}

Member

Reverting:

Suggested change
- arg =
-   {kind, registry, i, partitions, key_partition, pid_partition, listeners, compressed}
+ arg = {kind, registry, i, partitions, key_partition, pid_partition, listeners, compressed}

Member

@josevalim josevalim left a comment

I have dropped a few comments.

I also think we need to better parameterize the tests:

  1. Break the current RegistryTest into Registry.UniqueTest and Registry.DuplicateTest
  2. Make sure the DuplicateTest is parameterized by partition count AND partition by

The new tests you added can be part of the new DuplicateTest.
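One way to picture the parameterization asked for here is a matrix with one entry per combination of partition count and partition_by. The concrete partition counts (1 and 8) and the map keys are assumptions for illustration, not values from the PR.

```elixir
# Hypothetical parameter matrix: every combination of partition count and
# partitioning strategy for duplicate registries.
params =
  for partitions <- [1, 8],
      partition_by <- [:pid, :key] do
    %{partitions: partitions, keys: {:duplicate, partition_by}}
  end
```

A list like this could then be handed to ExUnit's parameterize option (use ExUnit.Case, async: true, parameterize: params), which, as far as I know, is available from Elixir 1.18 and merges each map into the test context, so every test in the module would run once per combination.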
