Defending Against Random Subdomain Attacks with dnsdist #16982

johnhtodd · 2026-03-15T04:13:40Z


Yup, this was formatted (and the gruntwork done) by Claude. It's 21:00 on a Saturday; I had been thinking about this while doing other chores, so I proposed the core idea to Claude and let it write the documentation. I'm not sure whether this will work as described; I haven't tried it yet, but I wanted to put it somewhere others could comment and confirm or refute the method. One big question: load.

The other problem I think we have is that queries with RD=0 going back to pdns-recursor won't trigger its prefetch routines, which could let an attacker create localized, domain-specific cache expiration. If we could at least let RD=0 packets tickle the prefetch counters on the backends, then "valid" names would be refetched because legitimate traffic would keep them alive. (This assumes they were already in the cache; uncached names would just be out of luck, but at least the random subdomain attack would do no damage to the remote nameservers.) This would require code work in pdns-recursor and might be somewhat difficult: how do we tag queries that are "sorta-no-recurse"?

Worth noting that some concepts or discussion here might be relevant in https://github.com/orgs/PowerDNS/discussions/15673 also.

The Problem

A Random Subdomain Attack (RSDA), sometimes called a "water torture" attack, floods a recursive resolver with queries for randomly generated subdomains under a target zone, for example a1b2c3.attack.example.com, x9y8z7.attack.example.com, and so on. Because every name is new, each query misses the cache and has to be resolved against the zone's authoritative servers, ending in NXDOMAIN, or in SERVFAIL once those servers stop answering. The attack has two effects: it saturates the recursive resolver with work the cache cannot absorb, and it hammers the authoritative servers for the target zone with a high query rate.

The signal that distinguishes this attack from normal traffic is an elevated rate of NXDOMAIN and SERVFAIL responses aggregated at a common zone cut, across many distinct random labels.

The Approach

dnsdist's DynBlockRulesGroup:setSuffixMatchRule() provides a mechanism that maps directly onto this problem. Each time the rules group is applied (normally from the maintenance() hook, which dnsdist calls approximately once per second), dnsdist walks its response ring buffer and builds a label tree of all recently observed domain names. For each node in that tree, a Lua visitor function receives a StatNodeStats object containing aggregated response counts, including nxdomains and servfails, for that node and all its children. This means that all the random subdomain hits under attack.example.com accumulate naturally at the attack.example.com node, regardless of how many distinct random labels the attacker is using.
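
To make the aggregation concrete, here is a small standalone sketch in plain Lua. It is not dnsdist's StatNode implementation, only an illustration of the idea; new_node, record and the sample responses are made up for the example.

```lua
-- Standalone illustration of per-label aggregation; this is NOT dnsdist code.
local function new_node()
  return { children = {}, nxdomains = 0, servfails = 0 }
end

-- Record one response: walk from the rightmost label to the leftmost,
-- bumping the counters on every node along the path, so each node holds
-- the cumulative total for itself and everything beneath it.
local function record(root, qname, rcode)
  local labels = {}
  for label in string.gmatch(qname, "[^.]+") do
    table.insert(labels, 1, label)   -- reversed: TLD first
  end
  local node = root
  for _, label in ipairs(labels) do
    node.children[label] = node.children[label] or new_node()
    node = node.children[label]
    if rcode == "NXDOMAIN" then
      node.nxdomains = node.nxdomains + 1
    elseif rcode == "SERVFAIL" then
      node.servfails = node.servfails + 1
    end
  end
end

local root = new_node()
record(root, "a1b2c3.attack.example.com", "NXDOMAIN")
record(root, "x9y8z7.attack.example.com", "NXDOMAIN")
record(root, "q4r5s6.attack.example.com", "SERVFAIL")
record(root, "www.example.com", "NOERROR")

-- Three distinct random labels, one shared parent node.
local attack = root.children.com.children.example.children.attack
print(attack.nxdomains, attack.servfails)
```

Each random leaf holds a single error, while the attack node holds all three. The same totals also show up at example.com and com, which is worth keeping in mind when choosing thresholds.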

When the visitor function returns true, dnsdist installs a dynamic block matching that suffix and all names beneath it, for a configurable duration.

Two-Tier Response

The proposed defense uses two thresholds with different actions, evaluated within the same visitor function.

Tier 1: Truncate. At a lower threshold (an average of A errors per second, estimated over B seconds), dnsdist answers any UDP query under the affected zone with a truncated response (DNSAction.Truncate). This forces a TCP retry: a legitimate client will come back over TCP, while a volumetric attacker sending spoofed or fire-and-forget UDP will not. Queries that already arrive over TCP are passed through. This tier engages quickly and at low cost.

Tier 2: NoRecurse. At a higher threshold (an average of C errors per second over D seconds), dnsdist clears the RD bit on queries matching the affected zone (DNSAction.NoRecurse) and forwards them to the backend resolver. How the backend treats RD=0 queries depends on the implementation.

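PowerDNS Recursor will serve answers from its cache for names it has previously resolved. Names absent from cache will return REFUSED. No new recursive work is generated for the attacked zone. Older Recursor versions behave this way without special configuration. Newer versions (5.2.0 onward, if I read the settings documentation correctly) refuse all RD=0 queries unless allow-no-rd is enabled, so check that setting before relying on this tier.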

Unbound requires explicit configuration to achieve equivalent behavior. By default, Unbound's allow access-control action refuses to serve dynamic cache entries for non-recursive queries — meaning RD=0 queries that would hit cache are refused rather than answered. To obtain the desired behavior — cache hits served, REFUSED for unknown names — Unbound's access-control for the dnsdist-facing interface must be set to allow_snoop. This permits non-recursive queries to read from the dynamic cache. The allow_snoop setting is safe in this architecture because Unbound is not directly reachable by end clients; only dnsdist forwards to it, so the cache-snooping risk that the setting is named after does not apply.

In both cases, the effect under Tier 2 is the same: previously resolved names under the attacked zone are served from cache, no upstream recursion occurs for the attacked zone, and unknown names receive REFUSED.

The two tiers are intentionally ordered so that the Truncate threshold is lower and faster-triggering than the NoRecurse threshold (A < C, B <= D).
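
A minimal sketch of how the two thresholds interact, as a plain Lua function that runs outside dnsdist. It follows the arithmetic of the example configuration, which treats A and C as per-second rates; select_tier is a hypothetical helper name, not a dnsdist API.

```lua
-- Hypothetical helper: decide which tier applies to one node, given the
-- error count seen over the D-second scan window.
local A, B = 100, 5     -- Truncate: errors/second, divisor in seconds
local C, D = 500, 10    -- NoRecurse: errors/second, window in seconds

local function select_tier(errors_in_window)
  if errors_in_window / D > C then
    return "norecurse"            -- higher threshold is checked first
  elseif errors_in_window / B > A then
    return "truncate"
  end
  return nil                      -- below both thresholds: no block
end

print(select_tier(300))    -- below both thresholds
print(select_tier(800))    -- Truncate only
print(select_tier(6000))   -- both exceeded, NoRecurse wins
```

Checking the higher threshold first matters: any count that exceeds the NoRecurse threshold also exceeds the Truncate one, so the opposite order would never reach NoRecurse.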

Presumptions

dnsdist ≥ 1.9.0 (required for per-block action selection from the visitor function's return value)
PowerDNS Recursor and/or Unbound are the downstream recursive resolvers
If Unbound is in use, its access-control for the dnsdist-facing address is set to allow_snoop
End clients have no direct access to either resolver — all queries are proxied through dnsdist
setRingBuffersSize() is sized to hold every response received during the longer window D, of which at least C × D are errors by the time the NoRecurse threshold is reached, so that the rate calculation has sufficient history
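
A rough sizing sketch for the last presumption, in plain Lua. The helper name, the headroom factor and the traffic figures are all assumptions for illustration; total_qps has to come from your own measurements and covers all responses, not just errors.

```lua
-- Hypothetical sizing helper, not part of dnsdist.
-- total_qps:      peak responses per second across ALL traffic
-- window_seconds: the longer scan window (D)
-- headroom:       arbitrary safety factor, defaults to 2
local function ring_entries_needed(total_qps, window_seconds, headroom)
  headroom = headroom or 2
  return math.ceil(total_qps * window_seconds * headroom)
end

-- 5000 responses/second overall, D = 10 s, 2x headroom
print(ring_entries_needed(5000, 10))
-- Errors alone at threshold C = 500/s: the bare minimum, with no headroom
print(ring_entries_needed(500, 10, 1))
```

The first call comes out at 100000 entries and the second at 5000, which is why C × D is only a floor and not a target.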

Example Configuration

-- The ring buffer must hold every response seen during the larger window (D),
-- not only the errors. At the NoRecurse threshold the errors alone account for
-- C x D = 500 x 10 = 5000 entries, so size generously for overall traffic.
setRingBuffersSize(100000)

-- Thresholds - adjust to suit your traffic baseline
local A = 100   -- truncate trigger: errors per second
local B = 5     -- truncate divisor: seconds (see Notes)
local C = 500   -- norecurse trigger: errors per second
local D = 10    -- norecurse window and ring buffer scan window: seconds
local ZZ = 120  -- block duration for both tiers: seconds

local dbr = dynBlockRulesGroup()

-- The outer 'seconds' parameter controls how far back the ring buffer is
-- scanned. Use the larger window (D) so both tiers have sufficient history.
-- The duration (ZZ) and action given here are the defaults for every block.
dbr:setSuffixMatchRule(D, "RSDA detection", ZZ, DNSAction.NoRecurse,
  function(parent, current, cumulative)
    -- Combine NXDOMAIN and SERVFAIL counts for this node and all its children.
    -- This aggregates all random subdomain hits at their common zone cut.
    local combined = cumulative.nxdomains + cumulative.servfails

    -- As far as the dnsdist documentation goes, the visitor may return up to
    -- three values: block (boolean), reason (string, 1.7.0+) and action
    -- (1.9.0+). No per-block duration is documented, so both tiers share ZZ.

    -- Evaluate NoRecurse tier first (higher threshold)
    if combined / D > C then
      return true, "RSDA-norecurse", DNSAction.NoRecurse
    end

    -- Evaluate Truncate tier (lower threshold)
    if combined / B > A then
      return true, "RSDA-truncate", DNSAction.Truncate
    end

    return false
  end
)

function maintenance()
  dbr:apply()
end
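
One thing the configuration above does not address: because counts are cumulative, shallow nodes such as com also accumulate every error beneath them, so a busy resolver could end up blocking a whole TLD. Below is a sketch of one possible guard, written as a plain function so it can be exercised with mock tables outside dnsdist. It assumes the node object passed as the visitor's first argument exposes a labelsCount field, which should be verified against the dnsdist documentation for your version; should_block and MIN_LABELS are made-up names.

```lua
local C, D = 500, 10      -- NoRecurse threshold (errors/second) and window
local MIN_LABELS = 2      -- hypothetical tunable: never act on root or bare TLDs

-- Returns true when the node is deep enough and its cumulative error rate
-- over the D-second window exceeds C.
local function should_block(node, cumulative)
  if node.labelsCount < MIN_LABELS then
    return false
  end
  local combined = cumulative.nxdomains + cumulative.servfails
  return combined / D > C
end

-- Mock tables standing in for dnsdist's StatNode / StatNodeStats objects.
local tld   = { fullname = "com.",                labelsCount = 1 }
local zone  = { fullname = "attack.example.com.", labelsCount = 3 }
local stats = { nxdomains = 5500, servfails = 700 }

print(should_block(tld, stats))    -- false: too shallow, whatever the counts
print(should_block(zone, stats))   -- true: 6200 / 10 = 620 > 500
```

Inside the real visitor the same check would sit before the threshold comparisons. A depth limit is crude (it treats co.uk like any two-label zone), so an explicit allow list may be the better fit for some deployments.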

Unbound Configuration Supplement

For deployments using Unbound as a backend, the following unbound.conf access-control entry is required on the interface that dnsdist uses to forward queries. Replace the address or prefix with the actual dnsdist-facing address:

server:
    # Allow dnsdist to query cache non-recursively (RD=0).
    # allow_snoop is safe here because Unbound is not directly
    # reachable by end clients - only dnsdist forwards to it.
    access-control: 127.0.0.1/32 allow_snoop

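No equivalent change is needed for PowerDNS Recursor versions that serve cache hits on RD=0 queries natively. On versions that offer the allow-no-rd setting (5.2.0 onward, as far as I can tell), it has to be enabled, otherwise RD=0 queries are refused.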

Notes

The ring buffer in dnsdist is a fixed-size circular structure shared across all traffic. The rate calculations in the visitor derive from counts of entries observed within the lookback window, so if the buffer is undersized relative to total query volume, older entries will be overwritten before the window expires and rate estimates will be lower than actual. When in doubt, give setRingBuffersSize() a larger value rather than a smaller one.
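
A back-of-the-envelope model of that undercount, in plain Lua. It is a closed-form estimate, not a simulation of dnsdist's ring buffer, and it assumes errors are evenly interleaved with other traffic; the function name and traffic figures are invented for the example.

```lua
-- Toy model of a fixed-size ring: only the newest `capacity` entries survive.
-- Returns how many of the error responses generated during the window are
-- still present when the window is scanned.
local function surviving_errors(capacity, total_qps, error_qps, window_seconds)
  local total = total_qps * window_seconds
  local kept = math.min(capacity, total)
  -- Assumption: errors make up the same fraction of the kept entries
  -- as they do of the overall traffic.
  return math.floor(kept * error_qps / total_qps)
end

-- 20000 responses/s overall, 600 errors/s, D = 10 s: 6000 errors were generated.
print(surviving_errors(100000, 20000, 600, 10))  -- undersized ring: 3000 remain
print(surviving_errors(200000, 20000, 600, 10))  -- large enough: all 6000 remain
```

With the example thresholds, 3000 surviving errors over 10 seconds reads as 300 per second and stays under C = 500, even though the real rate is 600 per second.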

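The combined / B and combined / D rate estimates both use the total error count accumulated over the full scan window D. Dividing by D gives a true average rate for the NoRecurse tier, but dividing the same count by B for the Truncate tier overstates the rate by a factor of D/B whenever errors are spread evenly across the window (2x with the example values). The Truncate figure is exact only when all the errors fell within the last B seconds. In effect, Truncate triggers once more than A × B errors have been seen anywhere in the last D seconds. In practice, bursty attacks will still trigger reliably because the absolute count will be high even if the rate calculation is imprecise.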
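
A short worked example of the Truncate-tier estimate, using the B and D values from the example configuration. The function name is made up and the traffic figures are illustrative.

```lua
local B, D = 5, 10

-- Rate the Truncate tier computes from a count gathered over D seconds.
local function truncate_estimate(errors_in_D)
  return errors_in_D / B
end

-- Steady 60 errors/second for the whole window: 600 errors in D seconds.
-- The estimate evaluates to 120, twice the true rate (D / B = 2).
print(truncate_estimate(60 * D))

-- Burst of 600 errors confined to the last B seconds: the true rate over
-- those B seconds is 120/s, and the estimate matches it.
print(truncate_estimate(600))
```

Both cases produce the same estimate, because the visitor cannot tell where in the window the errors fell. If that matters, lower A by the factor D/B or set B equal to D.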

Dynamic blocks installed by setSuffixMatchRule match the detected suffix and all names beneath it, which is exactly the right scope for RSDA mitigation: the entire attacked zone is covered without needing to enumerate individual random labels.
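
For anyone unsure what "the suffix and all names beneath it" covers, here is a plain-Lua illustration of suffix matching on whole labels. dnsdist does this internally; under_suffix is a hypothetical helper, not part of its API.

```lua
-- Returns true if qname equals suffix or sits anywhere beneath it.
local function under_suffix(qname, suffix)
  qname, suffix = qname:lower(), suffix:lower()
  if qname == suffix then
    return true
  end
  -- Require a dot before the suffix so "notattack.example.com" does not match.
  return #qname > #suffix and qname:sub(-(#suffix + 1)) == "." .. suffix
end

print(under_suffix("a1b2c3.attack.example.com", "attack.example.com"))  -- true
print(under_suffix("attack.example.com", "attack.example.com"))         -- true
print(under_suffix("www.example.com", "attack.example.com"))            -- false
print(under_suffix("notattack.example.com", "attack.example.com"))      -- false
```

The last case is the one to watch: matching is per label, so a name that merely ends in the same characters is not caught by the block.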

johnhtodd · 2026-03-15T04:32:54Z


A more complete solution would be the DNSDist Defender platform, but I'm trying to see whether this can be done in a "cheap and cheerful" way.
