Commit 1d3f8d2: routing table (1 parent 6265f16)

src/routing/kad-dht.md (+175 -60)
description: >
  […] InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT
  specification, adapting and adding features to support IPFS-specific
  requirements.
date: FIXME
maturity: reliable
editors:
  - name: Guillaume Michel
    url: https://guillaume.michel.id
    github: guillaumemichel
    affiliation:
      name: Shipyard

[…]

FIXME:

Distributed Key-Value Store

The goal of a DHT is to find the peers closest to some key (in a specific geometry). Once routing to the closest nodes is possible, nodes can interact with them in various ways, including asking them to store and serve data.

### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht)

The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT.

[…] in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new
Kademlia-based DHT for use with IPFS, some details in this specification may
appear overly specific or prescriptive.

### Relation to the [Amino DHT](#amino-dht)

Nodes participating in the [Amino DHT Swarm](#amino-dht) MUST implement the
IPFS Kademlia DHT specification. The IPFS Kademlia DHT specification MAY be
used in other DHT swarms as well.

## DHT Swarms

A DHT swarm is a group of interconnected nodes running the IPFS Kademlia DHT protocol, collectively identified by a unique protocol identifier. IPFS nodes MAY participate in multiple DHT swarms simultaneously. DHT swarms can be either public or private.

### Protocol Identifier

All nodes participating in the same DHT swarm MUST use the same libp2p protocol
identifier. The libp2p protocol identifier uniquely identifies a DHT swarm. It
follows the format `/<swarm-prefix>/kad/<version>`, e.g. `/ipfs/kad/1.0.0` for
the Amino DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local
DHT swarm.

Note that there could be multiple distinct DHT swarms using the same libp2p
protocol identifier, as long as they don't have any common peers. This practice
is discouraged, as networks will immediately merge if they come into contact.
Each DHT swarm SHOULD have a dedicated protocol identifier.

### Amino DHT

The [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) is
the swarm of peers also referred to as the _Public IPFS DHT_. It implements the
IPFS Kademlia DHT specification and uses the protocol identifier
`/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino DHT
Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers).

The Amino DHT is utilized by multiple IPFS implementations, including
[`kubo`](https://github.com/ipfs/kubo) and
[`helia`](https://github.com/ipfs/helia).

### Client and Server Mode

A node operating in Server Mode (or DHT Server) is responsible for responding
to lookup queries from other nodes and for storing records. It stores a share
of the global DHT state and needs to ensure that this state is up to date.

A node operating in Client Mode (or DHT Client) is simply a client able to make
requests to DHT Servers. DHT Clients don't answer queries and don't store
records.

Having a large number of reliable DHT Servers benefits the network by
distributing the load of handling queries and storing records. Nodes SHOULD
operate in Server Mode if they are publicly reachable and have sufficient
resources. Conversely, nodes behind NATs or firewalls, or with intermittent
availability, low bandwidth, or limited CPU, RAM, or storage resources, SHOULD
operate in Client Mode. Operating a DHT Server without the capacity to respond
quickly to queries negatively impacts network performance.

DHT Servers advertise the libp2p Kademlia protocol identifier via the [libp2p
identify
protocol](https://github.com/libp2p/specs/blob/master/identify/README.md), and
accept incoming streams using the Kademlia protocol identifier. DHT Clients
neither advertise support for the libp2p Kademlia protocol identifier, nor
offer it on incoming streams.
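
For illustration, in the Go implementation
[`go-libp2p-kad-dht`](https://github.com/libp2p/go-libp2p-kad-dht) the mode is
selected when the DHT is constructed. A minimal sketch, assuming the current
`dht.New` and `dht.Mode` option API:

```go
package main

import (
	"context"

	"github.com/libp2p/go-libp2p"
	dht "github.com/libp2p/go-libp2p-kad-dht"
)

// newServerDHT starts a DHT node in Server Mode. ModeAuto (the default)
// instead switches between Client and Server Mode based on whether the
// node detects itself as publicly reachable.
func newServerDHT(ctx context.Context) (*dht.IpfsDHT, error) {
	h, err := libp2p.New()
	if err != nil {
		return nil, err
	}
	// A server advertises the Kademlia protocol identifier via identify
	// and accepts incoming Kademlia streams; a client does neither.
	return dht.New(ctx, h, dht.Mode(dht.ModeServer))
}
```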

## Kademlia Keyspace

Kademlia [0] operates on a binary keyspace defined as $\{0, 1\}^m$. In
particular, the IPFS Kademlia DHT uses a keyspace of length $m = 256$,
containing all bitstrings of 256 bits. The distance between any pair of keys is
defined as the bitwise XOR of the two keys, resulting in a new key representing
the distance between the two keys. This keyspace is used for indexing both
nodes and content.
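
As an illustration, the XOR distance over 256-bit keys takes only a few lines
of code. This is a non-normative sketch in Go; distances are compared as
big-endian integers:

```go
package keyspace

// Key is a point in the 256-bit Kademlia keyspace.
type Key [32]byte

// Distance returns the XOR distance between two keys: a new key whose
// value, read as a big-endian integer, orders candidates by closeness.
func Distance(a, b Key) Key {
	var d Key
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}
```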

The Kademlia node identifier is derived from the node's [Peer
ID](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md): it is
computed as the SHA256 digest of the binary representation of the Peer ID. The
Kademlia identifier is hence a 256-bit number, used as the node's identifier in
the Kademlia keyspace.

Example:

```sh
PeerID b58 representation: 12D3KooWKudojFn6pff7Kah2Mkem3jtFfcntpG9X3QBNiggsYxK2
PeerID hex representation: 0024080112209e3b433cbd31c2b8a6ebbdca998bd0f4c2141c9c9af5422e976051b1e63af14d
Kademlia identifier (hex): e43d28f0996557c0d5571d75c62a57a59d7ac1d30a51ecedcdb9d5e4afa56100
```
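
The derivation above can be reproduced in a few lines of Go. This is an
illustrative sketch; the base58 decoder (`github.com/mr-tron/base58` here) is
an assumption, and any base58btc decoder works:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"

	"github.com/mr-tron/base58" // assumed dependency; any base58btc decoder works
)

func main() {
	// Decode the base58 Peer ID string into its binary representation.
	bin, err := base58.Decode("12D3KooWKudojFn6pff7Kah2Mkem3jtFfcntpG9X3QBNiggsYxK2")
	if err != nil {
		panic(err)
	}
	// The Kademlia identifier is the SHA256 digest of the binary Peer ID.
	kadID := sha256.Sum256(bin)
	// Prints e43d28f0996557c0d5571d75c62a57a59d7ac1d30a51ecedcdb9d5e4afa56100
	fmt.Println(hex.EncodeToString(kadID[:]))
}
```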

## Routing Table

The Kademlia Routing Table maintains contact information about other DHT
Servers in the network. It has knowledge about all nearby nodes and
progressively fewer nodes as the XOR distance increases. This structure allows
efficient and rapid navigation of the network during lookups.

The Routing Table MUST contain information about at least `k` DHT Servers whose
Kademlia Identifier shares a common prefix of length `l` with the local node,
for every `l` in `[0, 255]`, provided such nodes exist. The set of `k` peers
sharing a common prefix of length `l` with the local node is called the
_bucket_ `l`.
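
Equivalently, the bucket index of a remote node is the number of leading zero
bits of the XOR distance between the two Kademlia identifiers. A non-normative
sketch:

```go
package keyspace

import "math/bits"

// BucketIndex returns the bucket a remote identifier falls into: the
// length of the common prefix of the two keys, i.e. the number of
// leading zero bits of their XOR distance.
func BucketIndex(local, remote [32]byte) int {
	for i := range local {
		if d := local[i] ^ remote[i]; d != 0 {
			return 8*i + bits.LeadingZeros8(d)
		}
	}
	return 256 // identical keys: the local node itself
}
```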

In practice, buckets with smaller indices will typically be full, as many nodes
in the network share shorter prefix lengths with the local node. Conversely,
buckets beyond a certain index usually remain empty, since it's statistically
unlikely that any node will have an identifier sharing a very long common
prefix with the local node. For more information, see [bucket population
measurements](https://github.com/probe-lab/network-measurements/blob/master/results/rfm19-dht-routing-table-health.md#peers-distribution-in-the-k-buckets).

The IPFS Kademlia DHT uses a bucket size of `k = 20`. This corresponds to the
`k` value as defined in the original Kademlia paper [0]. The `k` value is also
used as a replication factor and defines how many peers are returned to a
lookup request.

Note that DHT Clients are never included in a Routing Table.

Each DHT Server MUST store the public
[multiaddresses](https://github.com/libp2p/specs/blob/master/addressing/README.md)
for every node in its Routing Table. DHT Servers MUST discard nodes with only
private and/or relay multiaddresses. Additionally, DHT Servers MUST verify that
these nodes are reachable and replace any nodes that are no longer accessible.

### Replacement Policy

Nodes MUST NOT be removed from the Routing Table as long as they remain online.
Therefore, the bucket replacement policy is based on seniority, ensuring that
the most stable peers are eventually retained in the Routing Table.

#### IP Diversity Filter

DHT Servers SHOULD implement an IP diversity filter.

FIXME:

### Routing Table Refresh

There are several strategies a DHT Server can use to verify that nodes in its
Routing Table remain reachable. Implementations may choose their own methods,
provided they avoid serving unresponsive nodes. One recommended strategy is to
periodically refresh the Routing Table.

DHT Servers SHOULD perform a Routing Table Refresh every `10` minutes. During
this process, the server sends a ping request to all nodes it hasn't heard from
recently (e.g. in the last 5 minutes). Any peer that fails to respond MUST be
removed from the Routing Table.

After removing unresponsive peers, any buckets that are not full MUST be
replenished with fresh, online peers. This can be accomplished by either adding
recently connected peers or by executing a `FIND_NODE` request with a randomly
generated Peer ID matching the bucket. `FIND_NODE` requests SHOULD only be run
for buckets up to the last non-empty bucket.

Finally, the refresh process concludes by executing a `FIND_NODE` request for
the local node's Peer ID, ensuring the DHT Server maintains up-to-date
information on its closest peers.
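
Putting the three steps together, one refresh pass might look like the sketch
below. The `RoutingTable` interface and all of its methods are hypothetical
placeholders used for illustration; they are not the API of any existing
implementation:

```go
package dht

import "time"

type (
	Key    [32]byte // 256-bit Kademlia identifier
	PeerID string
)

// RoutingTable lists the operations the refresh procedure relies on.
type RoutingTable interface {
	StaleNodes(notHeardFor time.Duration) []PeerID
	Ping(p PeerID) bool // false if the peer is unresponsive
	Remove(p PeerID)
	LastNonEmptyBucket() int
	BucketFull(i int) bool
	RandomKeyInBucket(i int) Key // random key with common prefix length i
	FindNode(target Key)         // runs a FIND_NODE lookup for target
	LocalKey() Key
}

// Refresh performs one Routing Table Refresh pass, run every 10 minutes.
func Refresh(rt RoutingTable) {
	// 1. Ping peers not heard from recently; evict non-responders.
	for _, p := range rt.StaleNodes(5 * time.Minute) {
		if !rt.Ping(p) {
			rt.Remove(p)
		}
	}
	// 2. Replenish non-full buckets, up to the last non-empty one, with
	// a FIND_NODE for a random key falling into each bucket.
	for i := 0; i <= rt.LastNonEmptyBucket(); i++ {
		if !rt.BucketFull(i) {
			rt.FindNode(rt.RandomKeyInBucket(i))
		}
	}
	// 3. Look up our own identifier to stay current on our closest peers.
	rt.FindNode(rt.LocalKey())
}
```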

## Lookup Process

Lookups are iterative rather than recursive: the querying node contacts
successively closer peers itself, rather than servers forwarding the request on
its behalf.

### Server behavior

In public DHT swarms, DHT Servers MUST never respond with private or loopback multiaddresses.

Should Server tell Client about Server? And about Client?

### Concurrency

The lookup concurrency is the number of simultaneous in-flight requests during a lookup. It is implementation specific; the recommendation is `10`.
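
To make the parameter concrete, the sketch below bounds the number of in-flight
requests at `alpha`. It is hypothetical and deliberately simplified: a real
lookup feeds peers learned from each response back into an evolving candidate
set rather than querying a fixed list:

```go
package lookup

import "sync"

type PeerID string

// queryAll sends a FIND_NODE request to every candidate while keeping at
// most alpha requests in flight. query stands in for the network call and
// returns the closer peers learned from the queried node.
func queryAll(candidates []PeerID, alpha int, query func(PeerID) []PeerID) [][]PeerID {
	sem := make(chan struct{}, alpha) // bounds in-flight requests
	results := make([][]PeerID, len(candidates))
	var wg sync.WaitGroup
	for i, p := range candidates {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot (blocks at alpha in flight)
		go func(i int, p PeerID) {
			defer wg.Done()
			results[i] = query(p)
			<-sem // release the slot
		}(i, p)
	}
	wg.Wait()
	return results
}
```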

### Lookup termination

This is hard

#### Resiliency

Resiliency is the number of closest peers that must have responded for a lookup to terminate. It is implementation specific; the recommendation is `3`.
## Peer Routing

DHT Clients that want to be routable must make sure they are in the peerstore
of the closest DHT Servers to their own PeerID.

When performing a `FIND_NODE` lookup, the client will converge to the closest
nodes in XOR distance to the requested PeerID. These nodes are expected to know
the multiaddrs of the target peer. The

### Routing to non-DHT Servers

### Signed Peer Records

## Content Routing

### Content Kademlia Identifier

Content is mapped into the Kademlia keyspace with SHA256, analogous to how node
identifiers are derived.

### Provider Records

#### Provide Validity

Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT
Server. DHT Servers MUST implement a Provide Validity of `48h`.

#### Provider Record Republish Interval

Because of the churn in the network, Provider Records need to be republished
more often than their validity period. DHT Clients SHOULD republish Provider
Records every `22h`
([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)).

#### Provider Addresses TTL

DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the
`PROVIDE` operation. This allows DHT Servers to serve the multiaddresses of the
content provider alongside the provider record, avoiding an additional DHT walk
for the Client
([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)).

### IPNS

### Validators

[…]

## Client Optimizations

### LAN DHT Swarms

It is fine to store private multiaddresses in the routing table and serve them to other nodes in the same LAN DHT swarm.

### Checking peer behaviour before adding to routing table

Make a `FIND_NODE` request and inspect the response before adding a node to the
Routing Table, following https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/.
