Flaky test: publication_manager_test race condition in relation tracker restart #3909

@alco

Summary

The test "handles relation tracker restart" in publication_manager_test.exs:503 has a race condition that causes intermittent CI failures.

Observed in run 22354924262 on main (2026-02-24).

Error

** (exit) exited in: GenServer.call({:via, Registry, {:"Electric.ProcessRegistry:...", {Electric.Replication.PublicationManager.RelationTracker, nil}}}, {:remove_shape, "36215155-..."}, 5000)
    ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name

Root Cause

The test at test/electric/replication/publication_manager_test.exs:503:

  1. Line 515: GenServer.stop(relation_tracker_name) kills the RelationTracker
  2. Line 519: assert_pub_tables(ctx, [ctx.relation], 2_000) polls Postgres publication tables until they match
  3. Line 522: PublicationManager.remove_shape(ctx.stack_id, shape_handle) does a GenServer.call to the RelationTracker

The problem is that assert_pub_tables checks Postgres state, not whether the RelationTracker GenServer has been re-registered by the supervisor. There's a window where publication tables are correct (from the previous state) but the new RelationTracker process isn't yet alive or hasn't finished handle_continue(:restore_relations, ...).
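The window described above can be illustrated with a small, self-contained sketch. It does not use Electric's code: `RaceSketch.Tracker`, `RaceSketch.safe_call/2`, and the registered names are hypothetical stand-ins, and a locally registered name replaces the Registry-based `:via` tuple. The sketch assumes only standard `GenServer` and `Supervisor` behaviour: a call to a name with no registered process exits with `:noproc`, and a supervisor restarts a stopped child asynchronously.

```elixir
defmodule RaceSketch.Tracker do
  # Hypothetical stand-in for a tracker that restores state after init.
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, :restoring, {:continue, :restore}}

  @impl true
  def handle_continue(:restore, :restoring) do
    # Placeholder for the real restore work.
    Process.sleep(20)
    {:noreply, :ready}
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state, state}
end

defmodule RaceSketch do
  # Turns the :noproc exit from GenServer.call into a tagged tuple.
  def safe_call(name, request) do
    {:ok, GenServer.call(name, request)}
  catch
    :exit, {:noproc, _call_info} -> {:error, :noproc}
  end
end

{:ok, _sup} =
  Supervisor.start_link(
    [%{id: :tracker, start: {RaceSketch.Tracker, :start_link, [:race_sketch_tracker]}}],
    strategy: :one_for_one
  )

# Stop the child; the supervisor restarts it, but not synchronously.
:ok = GenServer.stop(:race_sketch_tracker)

# Depending on scheduling, this call either reaches the restarted process
# or finds no process registered under the name yet.
result = RaceSketch.safe_call(:race_sketch_tracker, :status)

# A name that was never registered always produces the :noproc case.
never_started = RaceSketch.safe_call(:race_sketch_never_started, :status)

IO.inspect(result, label: "call right after stop")
IO.inspect(never_started, label: "call to unregistered name")
```

In this sketch the outcome of the first call depends on whether the supervisor has already restarted the child, which is the same kind of timing dependence that makes the test fail only intermittently.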

Suggested Fix

Call RelationTracker.wait_for_restore(ctx.stack_id) before remove_shape on line 522. This function already exists (lines 79-82 of relation_tracker.ex) and blocks until handle_continue(:restore_relations) completes, which guarantees the process is registered and ready.
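The general pattern behind this fix, waiting on the process itself rather than on an external side effect, can be sketched as follows. This is not the implementation of `RelationTracker.wait_for_restore/1`, which is not shown in this issue; `WaitSketch.Tracker`, `WaitSketch.await_registered/2`, and the polling approach are assumptions made purely for illustration.

```elixir
defmodule WaitSketch.Tracker do
  # Hypothetical stand-in for a supervised tracker process.
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, :restoring, {:continue, :restore}}

  @impl true
  def handle_continue(:restore, :restoring) do
    # Placeholder for the real restore work.
    Process.sleep(20)
    {:noreply, :ready}
  end

  @impl true
  def handle_call(:status, _from, state), do: {:reply, state, state}
end

defmodule WaitSketch do
  # Polls until `name` resolves to a pid or the timeout elapses.
  def await_registered(name, timeout_ms \\ 2_000) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    do_await(name, deadline)
  end

  defp do_await(name, deadline) do
    cond do
      is_pid(GenServer.whereis(name)) ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        {:error, :timeout}

      true ->
        Process.sleep(10)
        do_await(name, deadline)
    end
  end
end

{:ok, _sup} =
  Supervisor.start_link(
    [%{id: :tracker, start: {WaitSketch.Tracker, :start_link, [:wait_sketch_tracker]}}],
    strategy: :one_for_one
  )

:ok = GenServer.stop(:wait_sketch_tracker)

# Wait for the restarted process to be registered before calling it.
:ok = WaitSketch.await_registered(:wait_sketch_tracker)

# handle_continue runs before any queued call is handled, so the reply
# reflects the state after the restore step.
status = GenServer.call(:wait_sketch_tracker, :status)
IO.inspect(status, label: "status after waiting")
```

In the actual test, the existing `wait_for_restore` call would take the place of the hypothetical `await_registered/2` helper, so no new polling code should be needed there.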

Context: Broader CI Flakiness

While investigating, I looked at all sync-service workflow failures from the last 2 days: 12 failures vs 14 successes (~46% failure rate). The failures are spread across many test files — only 1 of the 12 was this publication_manager_test:

Test file                           Failures
shape_cache_test.exs:501            4
request_batcher_test.exs:100        2
publication_manager_test.exs:503    1
api_test.exs:925                    1
delete_shape_plug_test.exs:100      1
shape_db_test.exs:553               1
shape_cache_test.exs:877            1
