A more complete solution would be the DNSDist Defender platform, but trying to see if this can be done in a "cheap and cheerful" way.
Yup, this was formatted (and the gruntwork done) by Claude. It's 21:00 on a Saturday; I had been thinking about this while doing other chores, so I proposed the core idea to Claude and let it do the documentation. I'm not sure whether this will work as described; I haven't tried it yet, but wanted to put it somewhere others could comment and confirm or refute that this method might work. One big question: load.
The other problem I think we have is that queries with rd=0 going back to pdns-recursor won't trigger prefetch routines, meaning that this could allow an attacker to create localized domain-specific cache expiration conditions. If we could at least let rd=0 packets tickle the counters for prefetch on the backends, then "valid" names would be refetched because legitimate traffic would keep them alive (this assumes they were already in the cache; uncached names would just be out of luck, but at least there would be no damage to the remote nameservers due to the RSDA attack.) This would require code work in pdns-recursor and might be somewhat difficult - how do we tag queries that are "sorta-no-recurse"?
Worth noting that some concepts or discussion here might be relevant in https://github.com/orgs/PowerDNS/discussions/15673 also.
The Problem
A Random Subdomain Attack (RSDA), sometimes called a "water torture" attack, floods a recursive resolver with queries for randomly-generated subdomains under a target zone, for example `a1b2c3.attack.example.com`, `x9y8z7.attack.example.com`, and so on. Because these names do not exist, each query results in an NXDOMAIN or SERVFAIL response after the resolver exhausts its upstream query attempts. The attack has two effects: it saturates the recursive resolver with non-cacheable work, and it hammers the authoritative servers for the target zone with a high query rate.

The signal that distinguishes this attack from normal traffic is an elevated rate of NXDOMAIN and SERVFAIL responses aggregated at a common zone cut, across many distinct random labels.
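To illustrate why the zone cut is the useful place to look, here is a small plain-Lua model (runnable without dnsdist) that counts error responses at every suffix of each queried name. It is only a toy approximation of the idea, not dnsdist's internal data structure; the `suffixes` and `aggregate` helpers and the sample responses are made up for the example.

```lua
-- Return every suffix of a name, e.g. "a.b.c" -> "a.b.c.", "b.c.", "c."
local function suffixes(name)
  local labels = {}
  for label in name:gmatch("[^.]+") do
    labels[#labels + 1] = label
  end
  local out = {}
  for i = 1, #labels do
    out[#out + 1] = table.concat(labels, ".", i) .. "."
  end
  return out
end

-- Count NXDOMAIN/SERVFAIL responses at each suffix of the queried name.
local function aggregate(responses)
  local errors = {}
  for _, r in ipairs(responses) do
    if r.rcode == "NXDOMAIN" or r.rcode == "SERVFAIL" then
      for _, s in ipairs(suffixes(r.qname)) do
        errors[s] = (errors[s] or 0) + 1
      end
    end
  end
  return errors
end

local counts = aggregate({
  { qname = "a1b2c3.attack.example.com", rcode = "NXDOMAIN" },
  { qname = "x9y8z7.attack.example.com", rcode = "NXDOMAIN" },
  { qname = "q4w5e6.attack.example.com", rcode = "SERVFAIL" },
  { qname = "www.example.com",           rcode = "NOERROR"  },
})

print(counts["a1b2c3.attack.example.com."])  -- 1: each random label is seen once
print(counts["attack.example.com."])         -- 3: the errors pile up at the zone cut
```

Each random name contributes a single error on its own, so per-name counters never get large; only the shared suffix does.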
The Approach
dnsdist's `DynBlockRulesGroup:setSuffixMatchRule()` provides a mechanism that maps directly onto this problem. Each time the rules group is applied from the `maintenance()` hook (called approximately once per second), dnsdist walks its response ring buffer and builds a label tree of all recently observed domain names. For each node in that tree, a Lua visitor function receives a `StatNodeStats` object containing aggregated response counts, including `nxdomains` and `servfails`, for that node and all its children. This means that all the random subdomain hits under `attack.example.com` accumulate naturally at the `attack.example.com` node, regardless of how many distinct random labels the attacker is using.

When the visitor function returns `true`, dnsdist installs a dynamic block matching that suffix and all names beneath it, for a configurable duration.

Two-Tier Response
The proposed defense uses two thresholds with different actions, evaluated within the same visitor function.
Tier 1: Truncate. At a lower threshold (`A` errors per `B` seconds), dnsdist replies with `DNSAction.Truncate` to any query matching the affected zone. For UDP clients this forces a TCP retry; a legitimate resolver will reconnect over TCP while a volumetric attacker using UDP will not. This tier engages quickly and at low cost.

Tier 2: NoRecurse. At a higher threshold (`C` errors per `D` seconds), dnsdist strips the RD bit from queries matching the affected zone (`DNSAction.NoRecurse`) and forwards them to the upstream resolver. The behavior of the upstream resolver when receiving RD=0 queries depends on the implementation.

PowerDNS Recursor will serve answers from its cache for names it has previously resolved. Names absent from cache will return REFUSED. No new recursive work is generated for the attacked zone. This behavior is native to PowerDNS Recursor and requires no special configuration.
Unbound requires explicit configuration to achieve equivalent behavior. By default, Unbound's `allow` access-control action refuses to serve dynamic cache entries for non-recursive queries, meaning RD=0 queries that would hit cache are refused rather than answered. To obtain the desired behavior (cache hits served, REFUSED for unknown names), Unbound's `access-control` for the dnsdist-facing interface must be set to `allow_snoop`. This permits non-recursive queries to read from the dynamic cache. The `allow_snoop` setting is safe in this architecture because Unbound is not directly reachable by end clients; only dnsdist forwards to it, so the cache-snooping risk that the setting is named after does not apply.

In both cases, the effect under Tier 2 is the same: previously resolved names under the attacked zone are served from cache, no upstream recursion occurs for the attacked zone, and unknown names receive REFUSED.
The two tiers are intentionally ordered so that the Truncate threshold is lower and faster-triggering than the NoRecurse threshold (`A < C`, `B <= D`).

Presumptions
- `access-control` for the dnsdist-facing address is set to `allow_snoop`
- `setRingBuffersSize()` is sized to hold at least `C × D` entries (the larger of the two threshold products) to ensure the rate calculation has sufficient history

Example Configuration
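What follows is a minimal, untested sketch of how the two-tier visitor could be wired up in a dnsdist configuration; it should not be read as a definitive or previously validated configuration. Several things in it are assumptions: the threshold values, block duration and ring buffer size are placeholders; `A` and `C` are treated as per-second rates compared against `combined / B` and `combined / D`; the `labelsCount` guard is an added safety check; and returning a reason and an action from the visitor relies on behavior the dnsdist documentation lists as added in 1.7.0 and 1.9.0 respectively, so check it against your version.

```lua
-- Placeholder thresholds: tune for your own traffic.
local A, B = 50, 10        -- Tier 1 (Truncate): A errors/s, window B seconds
local C, D = 200, 30       -- Tier 2 (NoRecurse): C errors/s, window D seconds
local BLOCK_SECONDS = 60   -- how long an installed dynamic block lasts

-- The ring buffers must retain D seconds of all responses, not only errors.
-- 1000000 entries across 10 shards is a placeholder, not a recommendation.
setRingBuffersSize(1000000, 10)

local dbr = dynBlockRulesGroup()

-- node:     StatNode for the current suffix
-- self:     StatNodeStats for this exact name only
-- children: StatNodeStats for this name plus everything beneath it
local function rsdaVisitor(node, self, children)
  -- Assumed safety guard: never block the root or a bare TLD.
  if node.labelsCount < 2 then
    return false
  end

  local combined = children.nxdomains + children.servfails

  -- Check the stronger tier first so it wins when both thresholds are met.
  if combined / D >= C then
    return true, "RSDA tier 2 (no-recurse)", DNSAction.NoRecurse
  end
  if combined / B >= A then
    return true, "RSDA tier 1 (truncate)", DNSAction.Truncate
  end
  return false
end

-- Scan window is D seconds; the default action applies if the visitor
-- returns only `true`.
dbr:setSuffixMatchRule(D, "RSDA", BLOCK_SECONDS, DNSAction.Truncate, rsdaVisitor)

-- dnsdist calls maintenance() roughly once per second.
function maintenance()
  dbr:apply()
end
```

One thing this sketch does not verify is whether a Tier 1 block already installed for a suffix is upgraded to Tier 2 before it expires; a short `BLOCK_SECONDS` limits how long that could matter.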
Unbound Configuration Supplement
For deployments using Unbound as a backend, an `access-control` entry of the form `access-control: <dnsdist address or prefix> allow_snoop` is required in `unbound.conf` for the address that dnsdist uses to forward queries. Replace the placeholder with the actual dnsdist-facing address or prefix.

No equivalent configuration change is needed for PowerDNS Recursor, which serves cache hits on RD=0 queries natively.
Notes
The ring buffer in dnsdist is a fixed-size circular structure shared across all traffic. The rate calculations in the visitor derive from counts of entries observed within the lookback window, so if the buffer is undersized relative to total query volume, older entries will be overwritten before the window expires and rate estimates will be lower than actual. Size the buffer conservatively with `setRingBuffersSize()`.

The `combined / B` and `combined / D` rate estimates both use the total error count accumulated over the full scan window `D`; the Truncate tier simply divides that total by `B`. This is an approximation: it matches the true rate over the last `B` seconds only when all the errors fall inside that interval, and it overstates the rate by a factor of `D / B` when they are spread evenly across the window, so Tier 1 errs on the side of engaging early. In practice, bursty attacks will still trigger reliably because the absolute count will be high even if the rate calculation is imprecise at burst edges.

Dynamic blocks installed by `setSuffixMatchRule` match the detected suffix and all names beneath it, which is exactly the right scope for RSDA mitigation: the entire attacked zone is covered without needing to enumerate individual random labels.
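The first two notes can be made concrete with a little arithmetic. The plain-Lua sketch below (runnable without dnsdist) uses made-up numbers throughout: the thresholds, the peak query rate and the headroom factor are placeholders, and `A` and `C` are assumed to be per-second rates. The `ringEntries` and `tier` helpers are hypothetical names for this example only.

```lua
-- Illustrative placeholders, not tuned values.
local A, B = 50, 10     -- Tier 1: A errors/s, nominal window B seconds
local C, D = 200, 30    -- Tier 2: C errors/s, scan window D seconds

-- Entries needed to keep windowSeconds of *all* responses in the ring
-- buffer, with headroom so old entries are not overwritten too early.
local function ringEntries(peakQps, windowSeconds, headroom)
  return math.ceil(peakQps * windowSeconds * headroom)
end

-- Tier selected from the combined error count seen over the D-second scan.
local function tier(combined)
  if combined / D >= C then return "NoRecurse" end
  if combined / B >= A then return "Truncate" end
  return nil
end

print(ringEntries(20000, D, 1.5))  -- 900000 entries for 20k qps over 30 s
print(tier(300))    -- nil: the Tier 1 estimate is 30/s, below A
print(tier(900))    -- Truncate: estimate is 90/s, though spread over D it is 30/s
print(tier(6000))   -- NoRecurse: 200 errors/s sustained over D
```

The `tier(900)` case is the `D / B` overstatement in action: 900 errors spread evenly over 30 seconds is 30 errors per second, but dividing by `B` reports 90.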