Open
Changes from 30 commits
32 commits
c8502be
Initial plan
Copilot Mar 12, 2026
3a11714
docs: add RFC-conformance.md and update table-of-contents
Copilot Mar 12, 2026
8a84d37
fix: return REFUSED for out-of-zone queries (RFC 1034 §6.2)
Copilot Mar 13, 2026
f614b76
test: remove duplicate out-of-zone test
Copilot Mar 13, 2026
dcc9831
feat: add apex SOA to authority for NXDOMAIN and NODATA responses (RF…
Copilot Mar 13, 2026
e3fc9e1
refactor: use rcode/authority/answer variables in zone reader block, …
Copilot Mar 13, 2026
e412051
Refactor _update_response
indisoluble Mar 13, 2026
63562bd
Fix test-docker.yml
indisoluble Mar 13, 2026
c40e60c
feat: validate QDCOUNT == 1, return FORMERR for zero or multiple ques…
Copilot Mar 13, 2026
7b89c2e
Refactor test_dns_server_udp_handler.py
indisoluble Mar 13, 2026
9339165
Refactor test_dns_server_udp_handler.py
indisoluble Mar 13, 2026
1c57a64
feat: validate QCLASS == IN, return REFUSED for non-IN queries
Copilot Mar 13, 2026
d372199
feat: validate QCLASS == IN, return REFUSED for non-IN queries
indisoluble Mar 13, 2026
29c7a5d
feat: validate QCLASS == IN, return REFUSED for non-IN queries
indisoluble Mar 13, 2026
285781c
feat: validate opcode, return NOTIMP for non-QUERY opcodes
Copilot Mar 13, 2026
54db0e9
test: assert Level 1 DNS response header fields (QR, ID, RA, TC, AA)
Copilot Mar 14, 2026
ec09227
test: assert answer/authority/additional section shape for all respon…
Copilot Mar 14, 2026
5c862cd
test: add malformed wire input tests and resolve RFC-conformance.md u…
Copilot Mar 14, 2026
c377605
docs: add manual-validation.md and scripts/validate-level1.sh for Lev…
Copilot Mar 14, 2026
b55df57
feat: add wire-level integration tests and clean up docker behavior a…
Copilot Mar 14, 2026
41a20f5
refactor: reposition component integration tests and add Docker healt…
Copilot Mar 14, 2026
580cad9
refactor: rename test-docker.yml to test-integration.yml, add multi-q…
Copilot Mar 14, 2026
5582579
fix: align validate-tests.yml with renamed workflow and add new docs …
Copilot Mar 14, 2026
eae097e
docs: narrow integration-test docstring, fix script wording, remove r…
Copilot Mar 14, 2026
9db039d
ci: fix workflow name case in validate-tests.yml and remove redundant…
Copilot Mar 14, 2026
6f29f87
refactor: rename DnsServerZoneUpdaterThreated to DnsServerZoneUpdater…
Copilot Mar 14, 2026
0a1069e
docs: clarify validate-tests CI trigger model in project-rules.md §7
Copilot Mar 14, 2026
a6b9feb
docs: remove manual-validation artifacts (docs/manual-validation.md a…
Copilot Mar 14, 2026
3e448a2
docs: rewrite RFC-conformance.md as present-state reference document
Copilot Mar 14, 2026
fadf560
docs: promote RFC-conformance.md to Minimum Reading Set and align REA…
Copilot Mar 14, 2026
eb233d7
feat: return FORMERR for malformed wire with recoverable DNS header (…
Copilot Mar 14, 2026
7902153
docs: replace stale "remaining gap" framing with "broader-than-Level-…
Copilot Mar 14, 2026
@@ -1,4 +1,4 @@
- name: test docker
+ name: test integration

on:
push:
@@ -24,11 +24,6 @@ jobs:
run: |
docker build -t a-healthy-dns:test .

- name: test docker image builds successfully
run: |
# Verify the image was created
docker images a-healthy-dns:test

- name: test docker image with alias zone configuration
run: |
# Create an isolated network for deterministic container-to-container checks
@@ -75,7 +70,6 @@ jobs:
DNS_HOST="127.0.0.1"
DNS_PORT="53053"
BACKEND_IP="172.28.0.10"
EXPECTED_NS="ns1.test.example.com."

wait_for_a_record() {
local fqdn="$1"
@@ -95,56 +89,83 @@ jobs:
return 1
}

assert_dns_status() {
local fqdn="$1"
local rtype="$2"
local expected_status="$3"
local output
local actual_status
for fqdn in \
"www.test.example.com" \
"www.test.other.com" \
"www.test.another.com"; do
wait_for_a_record "${fqdn}"
done

- name: test health-check-driven dns state transitions
run: |
set -euo pipefail

output="$(dig +time=1 +tries=1 +noall +comments @"${DNS_HOST}" -p "${DNS_PORT}" "${fqdn}" "${rtype}")"
actual_status="$(printf '%s\n' "${output}" | sed -n 's/.*status: \([A-Z]*\).*/\1/p' | head -n 1)"
DNS_HOST="127.0.0.1"
DNS_PORT="53053"
BACKEND_IP="172.28.0.10"
MAX_RETRIES=20

if [ "${actual_status}" = "${expected_status}" ]; then
echo "[OK] ${rtype} ${fqdn} ${actual_status}"
return 0
fi
wait_for_a_record() {
local fqdn="$1"
local answer

echo "[FAIL] ${rtype} ${fqdn} expected=${expected_status} got=${actual_status:-none}"
printf '%s\n' "${output}"
for _ in $(seq 1 "${MAX_RETRIES}"); do
answer="$(dig +short +time=1 +tries=1 @"${DNS_HOST}" -p "${DNS_PORT}" "${fqdn}" A)"
printf '%s\n' "${answer}" | grep -qx "${BACKEND_IP}" && {
echo "[OK] A present: ${fqdn}"
return 0
}
sleep 1
done

echo "[FAIL] A never appeared: ${fqdn}"
dig +nocmd +noall +comments +answer @"${DNS_HOST}" -p "${DNS_PORT}" "${fqdn}" A || true
return 1
}

assert_ns_record() {
local zone="$1"
local ns_answer

ns_answer="$(dig +short +time=1 +tries=1 @"${DNS_HOST}" -p "${DNS_PORT}" "${zone}" NS)"
printf '%s\n' "${ns_answer}" | grep -qx "${EXPECTED_NS}" || {
echo "[FAIL] NS ${zone}"
printf '%s\n' "${ns_answer}"
exit 1
}
echo "[OK] NS ${zone}"
wait_for_a_record_gone() {
local fqdn="$1"
local answer

for _ in $(seq 1 "${MAX_RETRIES}"); do
answer="$(dig +short +time=1 +tries=1 @"${DNS_HOST}" -p "${DNS_PORT}" "${fqdn}" A)"
[ -z "${answer}" ] && {
echo "[OK] A removed: ${fqdn}"
return 0
}
sleep 1
done

echo "[FAIL] A still present after backend went down: ${fqdn}"
dig +nocmd +noall +comments +answer @"${DNS_HOST}" -p "${DNS_PORT}" "${fqdn}" A || true
return 1
}

# Verify state transitions for hosted zone and all alias zones
for fqdn in \
"www.test.example.com" \
"www.test.other.com" \
"www.test.another.com"; do
wait_for_a_record "${fqdn}"
done

for zone in \
"test.example.com" \
"test.other.com" \
"test.another.com"; do
assert_ns_record "${zone}"
# Stop backend — health checks will fail; DNS must remove A records from all zones
docker stop a-healthy-dns-backend
for fqdn in \
"www.test.example.com" \
"www.test.other.com" \
"www.test.another.com"; do
wait_for_a_record_gone "${fqdn}"
done

assert_dns_status "www.test.other.com" "AAAA" "NOERROR"
assert_dns_status "missing.test.other.com" "A" "NXDOMAIN"
assert_dns_status "www.not-configured.example.org" "A" "NXDOMAIN"
# Restart backend — health checks will succeed; DNS must re-add A records to all zones
docker start a-healthy-dns-backend
for fqdn in \
"www.test.example.com" \
"www.test.other.com" \
"www.test.another.com"; do
wait_for_a_record "${fqdn}"
done

- name: test docker-compose configuration
run: |
4 changes: 2 additions & 2 deletions .github/workflows/validate-tests.yml
@@ -2,7 +2,7 @@ name: validate tests

on:
workflow_run:
- workflows: ["test docker", "test python code", "test version"]
+ workflows: ["test integration", "test python code", "test version"]
types:
- completed
branches:
@@ -24,7 +24,7 @@ jobs:
COMMIT_SHA="${{ github.event.workflow_run.head_sha }}"
echo "Validating workflows for commit: $COMMIT_SHA"

- WORKFLOWS=("Test Docker" "Test Python Code" "Test Version")
+ WORKFLOWS=("test integration" "test python code" "test version")

for workflow in "${WORKFLOWS[@]}"; do
echo "Checking workflow: $workflow"
1 change: 1 addition & 0 deletions README.md
@@ -59,6 +59,7 @@ Requires Python 3.10+.
| [docs/docker.md](docs/docker.md) | Docker deployment: image details, Compose, deployment patterns, container management, security, and orchestration |
| [docs/configuration-reference.md](docs/configuration-reference.md) | All CLI flags and Docker env vars with defaults and examples |
| [docs/troubleshooting.md](docs/troubleshooting.md) | Common issues, debugging, and operational procedures |
| [docs/RFC-conformance.md](docs/RFC-conformance.md) | RFC conformance reference: Level 1 authoritative UDP scope, minimum RFC set, current coverage per RFC, and remaining gap |
| [docs/project-brief.md](docs/project-brief.md) | Goals, non-goals, constraints, requirements |
| [docs/system-patterns.md](docs/system-patterns.md) | Architecture and design patterns |
| [docs/project-rules.md](docs/project-rules.md) | Toolchain, QA commands, CI/CD workflow, naming conventions |
171 changes: 171 additions & 0 deletions docs/RFC-conformance.md
@@ -0,0 +1,171 @@
# RFC Conformance

RFC conformance reference for **A Healthy DNS** — Level 1 authoritative UDP subset.

---

## 1. General purpose and scope

### What this server is

A Healthy DNS is an **authoritative DNS server**: it holds the definitive answers for one or more configured DNS zones and answers queries about names within those zones. It does not perform recursive lookups, does not cache answers from other servers, and does not forward queries.

This document describes RFC conformance for the current implementation scope. Its intended audience is anyone contributing to or planning work on this project — technical readers who are not necessarily DNS specialists.

### What "RFC conformance" means here

DNS behaviour is standardised in a series of documents called RFCs (Requests for Comments), published by the IETF. A conformant DNS server must produce responses that match the requirements in those RFCs. Failing to do so can cause resolvers, monitoring tools, or other servers to misinterpret or reject responses.

For this project, "RFC conformance" means producing wire-correct responses for every query type within the documented Level 1 scope.

### What Level 1 covers

Level 1 is a deliberately limited scope. It covers the minimum behaviour required to be a correct authoritative UDP server for the record types this project serves (A, SOA, NS, and optionally RRSIG).

| Behaviour | Level 1 target |
|---|---|
| Query is for a name **outside** all hosted zones | Return **REFUSED** |
| Query is for a name **inside** a hosted zone but the owner name does not exist | Return **NXDOMAIN** (name does not exist) |
| Owner name exists but the queried record type is absent | Return **NOERROR** with an empty answer section (a **NODATA** response) |
| NODATA or NXDOMAIN response | Include the apex **SOA** record in the authority section |
| Query cannot be parsed or has an invalid structure | Return **FORMERR** when the header is readable |
| Query uses an unsupported opcode | Return **NOTIMP** |
| Query is not for the **IN** (Internet) class | Return **REFUSED** |
| Query does not contain exactly one question (QDCOUNT ≠ 1) | Return **FORMERR** |
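
The table can be read as a decision sequence. The sketch below is illustrative only: `Query`, `ZONES`, `find_zone`, and `classify` are hypothetical names, the zone data is invented, and the check order (opcode, then question count, then class, then zone lookup) is an assumption rather than a statement about the handler's actual code. It assumes names are already lower-cased and absolute.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Rcode(Enum):
    NOERROR = 0
    FORMERR = 1
    NXDOMAIN = 3
    NOTIMP = 4
    REFUSED = 5


@dataclass
class Query:
    """Hypothetical, already-parsed view of a query (not the project's type)."""
    opcode: int       # 0 is a standard QUERY
    questions: list   # list of (qname, qtype, qclass) tuples


# Hypothetical zone data: zone apex -> {owner name -> record types present}
ZONES = {
    "test.example.com.": {
        "test.example.com.": {"SOA", "NS"},
        "www.test.example.com.": {"A"},
    }
}


def find_zone(qname: str) -> Optional[str]:
    """Return the apex of the hosted zone containing qname, or None."""
    for apex in ZONES:
        if qname == apex or qname.endswith("." + apex):
            return apex
    return None


def classify(query: Query) -> tuple:
    """Return (rcode, soa_in_authority) for a query, per the Level 1 table."""
    if query.opcode != 0:
        return Rcode.NOTIMP, False        # unsupported opcode
    if len(query.questions) != 1:
        return Rcode.FORMERR, False       # QDCOUNT != 1
    qname, qtype, qclass = query.questions[0]
    if qclass != "IN":
        return Rcode.REFUSED, False       # non-IN class
    zone = find_zone(qname)
    if zone is None:
        return Rcode.REFUSED, False       # outside all hosted zones
    types = ZONES[zone].get(qname)
    if types is None:
        return Rcode.NXDOMAIN, True       # owner name absent
    if qtype not in types:
        return Rcode.NOERROR, True        # NODATA: name exists, type absent
    return Rcode.NOERROR, False           # positive answer
```

The second tuple element marks the two negative responses (NXDOMAIN and NODATA), which are the cases where the apex SOA belongs in the authority section.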

### What Level 1 does not cover

- Recursive or iterative resolution
- Zone transfers (AXFR / IXFR)
- EDNS(0) extension processing
- TCP transport
- AAAA (IPv6 address) records
- Any record type beyond A, SOA, NS, and RRSIG

### Key term glossary

| Term | Meaning |
|---|---|
| **Authoritative** | The server holds the definitive records for a zone and sets the AA (Authoritative Answer) flag in its responses |
| **NXDOMAIN** | "Non-Existent Domain" — the queried name does not exist in the zone at all |
| **NODATA / NOERROR empty answer** | The queried name exists but has no records of the requested type; the response code is NOERROR (not an error) and the answer section is empty |
| **SOA in authority** | For negative responses (NXDOMAIN and NODATA) the server includes the zone's Start of Authority record in the authority section so that negative caching behaviour is well-defined |
| **REFUSED** | The server refuses to answer because the query is for a zone it does not serve |
| **FORMERR** | "Format Error" — the server cannot interpret the query because it is malformed |
| **Opcode** | A 4-bit field in the DNS message header indicating the type of operation (e.g. standard query, inverse query, notify) |
| **QCLASS / IN** | The class field in a DNS question; IN (Internet, value 1) is the only class used in modern DNS practice |

---

## 2. Minimum RFCs required to fully meet the described scope

The table below identifies the smallest set of RFCs whose requirements must be met to produce correct Level 1 responses. RFC 7766 (DNS over TCP) is not listed because Level 1 uses UDP only.

| RFC | Title | Why it matters here | Link |
|---|---|---|---|
| RFC 1034 | Domain Names — Concepts and Facilities | Defines the authoritative server model, zone concept, NXDOMAIN, and NOERROR semantics | https://www.rfc-editor.org/rfc/rfc1034 |
| RFC 1035 | Domain Names — Implementation and Specification | Defines the DNS wire format, QDCOUNT, opcode field, response codes FORMERR and NOTIMP, and the message header | https://www.rfc-editor.org/rfc/rfc1035 |
| RFC 2181 | Clarifications to the DNS Specification | Clarifies that a DNS message must contain exactly one question (QDCOUNT = 1); tightens several ambiguities in RFC 1035 | https://www.rfc-editor.org/rfc/rfc2181 |
| RFC 2308 | Negative Caching of DNS Queries (DNS NCACHE) | Specifies that NXDOMAIN and NODATA responses must include the apex SOA in the authority section so resolvers can cache negative results correctly | https://www.rfc-editor.org/rfc/rfc2308 |

---

## 3. Current coverage

The assessments below reflect the current implementation in `indisoluble/a_healthy_dns/dns_server_udp_handler.py` and supporting modules. All Level 1 behaviours are implemented except where noted in [§4](#4-remaining-gap).

---

### 3.1 RFC 1034 — Domain Names: Concepts and Facilities

RFC 1034 establishes the conceptual model for authoritative DNS servers: a server is authoritative for one or more zones, answers queries about names in those zones with the AA flag set, and uses defined response codes for names that are absent or that fall outside its zones.

RFC 1034 §4.3.2 describes the algorithm an authoritative server uses to process a query, and §6.2 walks through example standard queries: https://www.rfc-editor.org/rfc/rfc1034

| Behaviour | Status | Notes |
|---|---|---|
| Authoritative Answer (AA) flag set on all responses | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:112` sets `dns.flags.AA` on every response |
| REFUSED for queries outside all served zones | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:67` returns `dns.rcode.REFUSED` when the query name does not fall within any hosted or alias zone |
| NXDOMAIN when owner name is absent from an in-zone query | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:90` returns `dns.rcode.NXDOMAIN` when `txn.get_node(relative_name)` returns nothing |
| NOERROR when owner name exists and matching records are found | **Implemented** | Handler adds the matching RRset to the answer section |
| SOA in authority for NXDOMAIN responses (RFC 2308 §3) | **Implemented** | `_build_authority_with_apex_soa()` appends the apex SOA to `response.authority` in the NXDOMAIN branch (`indisoluble/a_healthy_dns/dns_server_udp_handler.py:91`) |
| SOA in authority for NODATA responses (RFC 2308 §2.1) | **Implemented** | Same helper populates the authority section for NOERROR/empty-answer responses (`indisoluble/a_healthy_dns/dns_server_udp_handler.py:87`) |

---

### 3.2 RFC 1035 — Domain Names: Implementation and Specification

RFC 1035 defines the DNS wire format: the message header structure (including the QDCOUNT field and opcode field), all standard record types, and the FORMERR and NOTIMP response codes.

- RFC 1035 §4.1.1 defines the header format, including QDCOUNT and OPCODE — https://www.rfc-editor.org/rfc/rfc1035
- RFC 1035 §4.1.2 defines the question section format
- RFC 1035 §4.1.3 defines answer, authority, and additional section formats
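
To make the header layout concrete, here is a minimal standard-library sketch that decodes the 12-byte header described in RFC 1035 §4.1.1. It is not how the project parses messages (the handler uses dnspython's `dns.message.from_wire()`); `parse_header` is a hypothetical helper name and the sample header bytes are made up.

```python
import struct


def parse_header(wire: bytes) -> dict:
    """Decode the fixed 12-byte DNS header into its named fields."""
    if len(wire) < 12:
        raise ValueError("a DNS header needs at least 12 bytes")
    # Six big-endian 16-bit words: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    msg_id, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", wire[:12]
    )
    return {
        "id": msg_id,
        "qr": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 0 = QUERY, 2 = STATUS, 4 = NOTIFY
        "aa": (flags >> 10) & 0x1,
        "tc": (flags >> 9) & 0x1,
        "rd": (flags >> 8) & 0x1,
        "ra": (flags >> 7) & 0x1,
        "rcode": flags & 0xF,
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }


# Made-up example: ID 0x1234, RD set, exactly one question.
sample = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
fields = parse_header(sample)
```

The `opcode` and `qdcount` values are the two header fields that the opcode and question-count checks in the table below depend on.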

| Behaviour | Status | Notes |
|---|---|---|
| Wire parsing of incoming queries | **Implemented** | `dns.message.from_wire()` is used; `dns.exception.DNSException` is caught |
| FORMERR when wire parsing fails | **Not implemented** | When `dns.message.from_wire()` raises `DNSException` the handler logs a warning and returns without sending any response (`indisoluble/a_healthy_dns/dns_server_udp_handler.py:107-109`). RFC 1035 §4.1.1 expects a FORMERR response to be sent when possible. See §4. |
| Opcode validation — NOTIMP for non-QUERY opcodes | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:114-119`: `query.opcode() != dns.opcode.QUERY` check returns `dns.rcode.NOTIMP`. Tested with STATUS (opcode 2) and NOTIFY (opcode 4). UPDATE messages (opcode 5) are rejected by dnspython's wire parser before this check is reached. |
| QDCOUNT validation — FORMERR for ≠ 1 question | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:120-141`: `len(query.question) == 1` check; zero or more-than-one questions return `dns.rcode.FORMERR`. Confirmed: dnspython preserves all questions for QDCOUNT > 1 wire messages. See also RFC 2181 §5.1. |
| QCLASS / IN class validation | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:122-135`: `question.rdclass != dns.rdataclass.IN` check; non-IN queries return `dns.rcode.REFUSED`. Project decision: REFUSED because the server exclusively serves IN-class data. |
| Wire serialisation of responses | **Implemented** | `response.to_wire()` is called before every `sendto()` |
| A, SOA, NS record types in responses | **Implemented** | All three record types are populated by the zone updater |

Note on malformed-wire inputs: dnspython raises `dns.exception.DNSException` (via subclasses `ShortHeader`, `FormError`, or `BadPointer`) for all tested malformed inputs — empty bytes, truncated packets, header-only with a missing question section, self-referential compression pointers, and fully garbage payloads. In all cases the handler's `except dns.exception.DNSException` branch fires and the packet is dropped silently with no response sent. This is confirmed by `test_handle_malformed_wire_input_drops_silently` in `tests/indisoluble/a_healthy_dns/test_dns_server_udp_handler.py`.

---

### 3.3 RFC 2181 — Clarifications to the DNS Specification

RFC 2181 corrects and tightens several ambiguities in RFC 1035. The requirement most relevant to Level 1 is found in §5.1: a DNS query must contain exactly one question; a server receiving a message with QDCOUNT ≠ 1 should return FORMERR — https://www.rfc-editor.org/rfc/rfc2181.

RFC 2181 §4 also clarifies that the AA flag applies to the entire response when the server is authoritative.

| Behaviour | Status | Notes |
|---|---|---|
| AA flag set correctly | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:112` |
| FORMERR for QDCOUNT ≠ 1 (RFC 2181 §5.1) | **Implemented** | `indisoluble/a_healthy_dns/dns_server_udp_handler.py:120-141`: `len(query.question) == 1` check. Verified: dnspython preserves all questions for QDCOUNT > 1 wire messages; the check is necessary and effective. |

No remaining Level 1 gaps in RFC 2181 coverage.

---

### 3.4 RFC 2308 — Negative Caching of DNS Queries

RFC 2308 defines how negative responses (NXDOMAIN and NODATA) must be structured so that resolvers can cache them correctly. Both response types must include the zone's apex SOA record in the authority section — RFC 2308 §3 (NXDOMAIN) and RFC 2308 §2.1 (NODATA/NOERROR) — https://www.rfc-editor.org/rfc/rfc2308.

Without the SOA in the authority section, resolvers either cannot cache the negative result or cache it with an undefined TTL, leading to repeated unnecessary queries.

RFC 2308 §5 defines the SOA minimum TTL field as the negative caching TTL; this project populates `SOA MINIMUM` via `calculate_soa_min_ttl()` in `records/time.py`.
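
As a small worked example of how a resolver derives the negative-cache lifetime from the SOA it receives, the sketch below takes the smaller of the SOA record's own TTL and its MINIMUM field, following RFC 2308. `negative_cache_ttl` is a hypothetical name and the numbers are invented; this shows the resolver-side rule only and says nothing about how `calculate_soa_min_ttl()` computes its value.

```python
def negative_cache_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Seconds a resolver may cache an NXDOMAIN or NODATA answer."""
    # RFC 2308: use the minimum of the SOA record's TTL and its MINIMUM field.
    return min(soa_ttl, soa_minimum)


# Invented values: SOA served with TTL 3600 s and MINIMUM 300 s.
ttl = negative_cache_ttl(soa_ttl=3600, soa_minimum=300)
```

With these values the negative answer is cacheable for 300 seconds, because MINIMUM is the smaller of the two.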

| Behaviour | Status | Notes |
|---|---|---|
| SOA record with correct `MINIMUM` field exists in zone | **Implemented** | `soa_record.py` populates the minimum TTL from `calculate_soa_min_ttl()` |
| SOA in authority section for NXDOMAIN (RFC 2308 §3) | **Implemented** | `_build_authority_with_apex_soa()` at `indisoluble/a_healthy_dns/dns_server_udp_handler.py:29-41` retrieves the apex SOA via `txn.get(dns.name.empty, dns.rdatatype.SOA)` and appends it to `response.authority` |
| SOA in authority section for NODATA (RFC 2308 §2.1) | **Implemented** | Same helper populates the authority section for NOERROR/empty-answer responses |

No remaining Level 1 gaps in RFC 2308 coverage.

---

## 4. Remaining gap

One Level 1 behaviour is not yet fully implemented:

| Behaviour | RFC | Status | Notes |
|---|---|---|---|
| FORMERR when wire parsing fails | RFC 1035 §4.1.1 | **Not implemented** | When `dns.message.from_wire()` raises `dns.exception.DNSException` the handler logs a warning and returns without sending any response (`indisoluble/a_healthy_dns/dns_server_udp_handler.py:107-109`). A conformant server should send a FORMERR response when the header is readable. Constructing a valid FORMERR from a fully unparseable message is constrained by what dnspython exposes from a partial parse. |

All other Level 1 behaviours are implemented and covered by automated tests.
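
One possible way to close this gap, sketched with the standard library only: when the first 12 bytes are readable, echo the message ID, opcode, and RD bit back in a header-only response with RCODE set to FORMERR. `build_formerr` is a hypothetical helper, not code from this repository. Leaving the AA bit clear and refusing to answer messages that already have QR set are assumptions made for the sketch.

```python
import struct
from typing import Optional

FORMERR = 1  # RCODE value for "format error" (RFC 1035 §4.1.1)


def build_formerr(wire: bytes) -> Optional[bytes]:
    """Return a header-only FORMERR reply, or None if no reply is possible."""
    if len(wire) < 12:
        return None  # header not readable: there is no ID to echo
    msg_id, flags = struct.unpack("!HH", wire[:4])
    if flags & 0x8000:
        return None  # QR already set: never reply to a response
    opcode_bits = flags & 0x7800  # echo the client's opcode
    rd_bit = flags & 0x0100       # echo the client's RD bit
    reply_flags = 0x8000 | opcode_bits | rd_bit | FORMERR
    # All four section counts are zero: the reply carries only the header.
    return struct.pack("!6H", msg_id, reply_flags, 0, 0, 0, 0)


# Made-up malformed input: a valid header followed by a dangling pointer byte.
garbage = struct.pack("!6H", 0xBEEF, 0x0100, 1, 0, 0, 0) + b"\xc0"
reply = build_formerr(garbage)
```

Returning `None` for inputs shorter than 12 bytes keeps the existing silent-drop behaviour for packets where nothing can be recovered.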

### Out of scope for Level 1

The following are explicitly not part of Level 1 scope:

- Additional out-of-zone handling such as referrals to delegated zones
- Full EDNS(0) handling (OPT pseudo-RR)
- Additional record types (AAAA, MX, TXT, etc.)
- RFC 2308 §4 referral responses — this server does not delegate sub-zones
- RFC 2308 §6 server-side negative caching — this server is authoritative and does not cache resolver results
- RFC 2181 §8 (class-in-data semantics) and §9 (TTL semantics) — informational for this scope
- RFC 7766 (DNS over TCP) — Level 1 uses UDP only