Commit 6265f16

dht: initial draft
1 parent 0e5c2f7 commit 6265f16

1 file changed

+176
-0
lines changed

src/routing/kad-dht.md

+176
@@ -0,0 +1,176 @@
1+
---
2+
title: Kademlia DHT
3+
description: >
4+
The IPFS Distributed Hash Table (DHT) specification defines a structured
5+
overlay network used for peer routing and content routing in the
6+
InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT
7+
specification, adapting and adding features to support IPFS-specific
8+
requirements.
9+
date: 2022-08-26
10+
maturity: reliable
11+
editors:
12+
- name: Guillaume Michel
13+
github: guillaumemichel
14+
affiliation:
15+
name: Shipyard
16+
url: https://ipshipyard.com
17+
tags: ['routing']
18+
order: 1
19+
---
20+
21+
The IPFS Distributed Hash Table (DHT) specification defines a structured
22+
overlay network used for peer routing and content routing in the
23+
InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT
24+
specification, adapting and adding features to support IPFS-specific
25+
requirements.
26+
27+
## Introduction
28+
29+
FIXME:
30+
31+
### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht)
32+
33+
The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT.
34+
35+
It is possible to use an alternative DHT specification alongside an IPFS
36+
implementation, rather than the one detailed here. This document specifically
37+
outlines all protocol customizations and adaptations required for participation
38+
in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new
39+
Kademlia-based DHT for use with IPFS, some details in this specification may
40+
appear overly specific or prescriptive.
41+
42+
### Relation to the [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino)
43+
44+
The Amino DHT is the swarm of peers also referred to as the _Public IPFS DHT_.
45+
It implements the IPFS Kademlia DHT specification and uses the protocol
46+
identifier `/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino
47+
DHT
48+
Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers).
49+
50+
The Amino DHT is utilized by multiple IPFS implementations, including
51+
[`kubo`](https://github.com/ipfs/kubo) and
52+
[`helia`](https://github.com/ipfs/helia). Multiple DHT swarms can coexist and
53+
nodes MAY participate in multiple DHT swarms. DHT swarms can be either public
54+
or private.
55+
56+
Note that there could be multiple distinct DHT swarms using the same protocol
57+
identifier as long as they don't have any common peers. This practice is
58+
discouraged as networks will immediately merge if they come into contact. Each
59+
DHT swarm SHOULD have a dedicated protocol identifier.
60+
61+
## Protocol Parameters
62+
63+
FIXME: move parameters to appropriate sections
64+
65+
The IPFS Kademlia DHT defines a number of Client and Server parameters that
66+
need to be set to ensure the DHT operates correctly as a system.
67+
68+
### Protocol Identifier
69+
70+
All nodes participating in the same DHT swarm MUST use the same protocol
71+
identifier. The protocol identifier uniquely identifies a DHT swarm. It follows
72+
the format `/<swarm-prefix>/kad/<version>`, e.g. `/ipfs/kad/1.0.0` for the Amino
73+
DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local DHT swarm.
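
The identifier format above can be sketched as a pair of helpers. This is a minimal illustration, not part of the specification: the function names and the regular expression are hypothetical, and the sketch assumes that a swarm prefix may span several path segments (as in `ipfs/lan`).

```python
import re

# Assumed pattern for "/<swarm-prefix>/kad/<version>". The prefix may contain
# several path segments (e.g. "ipfs/lan"); this is an assumption of the sketch.
_PROTOCOL_ID = re.compile(r"^/(?P<prefix>[^/]+(?:/[^/]+)*)/kad/(?P<version>[^/]+)$")


def format_protocol_id(swarm_prefix: str, version: str) -> str:
    """Build a protocol identifier from a swarm prefix and a version."""
    return f"/{swarm_prefix.strip('/')}/kad/{version}"


def parse_protocol_id(protocol_id: str) -> tuple[str, str]:
    """Split a protocol identifier into (swarm_prefix, version)."""
    match = _PROTOCOL_ID.match(protocol_id)
    if match is None:
        raise ValueError(f"not a kad protocol identifier: {protocol_id!r}")
    return match.group("prefix"), match.group("version")
```

Two nodes would then belong to the same swarm only if their identifiers compare equal as strings.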
74+
75+
### Routing Table Bucket Size
76+
77+
DHT Servers MUST have a routing table bucket size of `20` (see [Routing
78+
Table](#routing-table)). This corresponds to the `k` value as defined in the
79+
original Kademlia paper [0]. The `k` value is also used as a replication factor
80+
and defines how many peers are returned in response to a lookup request.
81+
82+
While DHT Clients technically don't need to store a routing table, DHT Clients
83+
MUST nonetheless use a replication factor of `20`. If Client implementations
84+
decide to include a routing table, they SHOULD use a bucket size of `20`.
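
To illustrate the bucket size, here is a minimal sketch of a single bucket capped at `k = 20` peers. The `Bucket` class and its behaviour when full (rejecting the new peer) are assumptions made for illustration; this section does not define an eviction policy.

```python
from collections import OrderedDict

K = 20  # routing table bucket size and replication factor from this section


class Bucket:
    """A single routing table bucket holding at most K peers (illustrative)."""

    def __init__(self, capacity: int = K) -> None:
        self.capacity = capacity
        self._peers: "OrderedDict[str, None]" = OrderedDict()

    def add(self, peer_id: str) -> bool:
        """Insert a peer; return False if the bucket is already full."""
        if peer_id in self._peers:
            self._peers.move_to_end(peer_id)  # mark as most recently seen
            return True
        if len(self._peers) >= self.capacity:
            return False  # eviction policy is out of scope for this sketch
        self._peers[peer_id] = None
        return True

    def __len__(self) -> int:
        return len(self._peers)
```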
85+
86+
### Provide Validity
87+
88+
Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT
89+
Server. DHT Servers MUST implement a Provide Validity of `48h`.
90+
91+
### Provider Record Republish Interval
92+
93+
Because of the churn in the network, Provider Records need to be republished
94+
more often than their validity period. DHT Clients SHOULD republish Provider
95+
Records every `22h`
96+
([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)).
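
The relation between the `48h` validity and the `22h` republish interval can be sketched as follows. The helper names are hypothetical, and treating a record as expired at exactly `48h` is an assumption of this sketch rather than something the text specifies.

```python
from datetime import datetime, timedelta, timezone

PROVIDE_VALIDITY = timedelta(hours=48)    # TTL enforced by DHT Servers
REPUBLISH_INTERVAL = timedelta(hours=22)  # recommended for DHT Clients


def is_record_valid(published_at: datetime, now: datetime) -> bool:
    """Server side: a Provider Record is served until its validity elapses."""
    return now - published_at < PROVIDE_VALIDITY


def next_republish(published_at: datetime) -> datetime:
    """Client side: when the record should be republished."""
    return published_at + REPUBLISH_INTERVAL


published = datetime(2024, 1, 1, tzinfo=timezone.utc)
# A record republished on schedule is still valid on the server at that time.
assert is_record_valid(published, next_republish(published))
```

With these values a Client gets two republish attempts (at `22h` and `44h`) before a record published once would expire.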
97+
98+
### Provider Addresses TTL
99+
100+
DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the
101+
`PROVIDE` operation. This allows DHT Servers to serve the multiaddresses of the
102+
content provider alongside the provide record, avoiding an additional DHT walk
103+
for the Client
104+
([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)).
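
A minimal sketch of the address TTL, assuming a hypothetical `StoredProvider` entry kept by a DHT Server: the provider's multiaddresses are returned only during the first `24h` after the `PROVIDE` operation, after which the Server would return the provider without addresses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

PROVIDER_ADDRS_TTL = timedelta(hours=24)


@dataclass
class StoredProvider:
    """Hypothetical server-side entry for one provider of one key."""
    peer_id: str
    addrs: list[str]
    provided_at: datetime


def addrs_to_serve(provider: StoredProvider, now: datetime) -> list[str]:
    """Return the provider's multiaddresses only while the TTL has not elapsed."""
    if now - provider.provided_at < PROVIDER_ADDRS_TTL:
        return list(provider.addrs)
    return []  # the Client has to resolve the addresses with a separate lookup
```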
105+
106+
### Concurrency
107+
108+
The lookup concurrency (the `α` parameter in the original Kademlia paper [0]) is implementation specific. The recommended value is `10`.
109+
110+
### Resiliency
111+
112+
Implementation specific. The recommended value is `3`.
113+
114+
### Routing Table Refresh Interval
115+
116+
The routing table SHOULD be refreshed every `10min`. Only peers that have been seen in the last 10 minutes should remain in the routing table. If a peer hasn't been seen recently, ping it to check whether it is still alive.
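
One way to read the rule above is the following sketch. The `refresh` function, the `last_seen` map and the `ping` callback are hypothetical names; dropping a peer that fails the ping is an assumption, since the text only says to check whether the peer is still alive.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict

REFRESH_INTERVAL = timedelta(minutes=10)


def refresh(
    last_seen: Dict[str, datetime],
    ping: Callable[[str], bool],
    now: datetime,
) -> Dict[str, datetime]:
    """Return the peers that stay in the routing table after one refresh."""
    kept: Dict[str, datetime] = {}
    for peer_id, seen_at in last_seen.items():
        if now - seen_at <= REFRESH_INTERVAL:
            kept[peer_id] = seen_at  # seen recently, keep without probing
        elif ping(peer_id):
            kept[peer_id] = now      # answered the ping, counts as seen now
        # otherwise the peer is dropped (assumption of this sketch)
    return kept
```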
117+
118+
## DHT Swarm
119+
120+
## Routing Table
121+
122+
### Routing Table Refresh
123+
124+
### Public addresses
125+
126+
### IP Diversity Filter
127+
128+
Implementations SHOULD implement an IP diversity filter.
129+
130+
## Lookup Process
131+
132+
### Lookup termination
133+
134+
This is hard
135+
136+
## Peer Routing
137+
138+
DHT Clients that want to be routable must make sure they are in the peerstore of the DHT Servers closest to their own PeerID.
139+
140+
When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. The
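
As an illustration of "closest in XOR distance", the sketch below ranks candidate peers against a target. It assumes, as in the libp2p kad-dht specification, that keys are hashed with SHA-256 before being compared. The function names are hypothetical, and real PeerIDs are multihash-encoded bytes rather than the placeholder byte strings used here.

```python
import hashlib


def kad_key(data: bytes) -> int:
    """Map a PeerID (or any key) into the 256-bit Kademlia keyspace."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def xor_distance(a: bytes, b: bytes) -> int:
    """XOR distance between two keys, interpreted as an unsigned integer."""
    return kad_key(a) ^ kad_key(b)


def closest_peers(target: bytes, peers: list[bytes], k: int = 20) -> list[bytes]:
    """Return up to k peers ordered by increasing XOR distance to the target."""
    return sorted(peers, key=lambda p: xor_distance(target, p))[:k]
```

A lookup repeatedly queries the closest peers known so far and merges their answers into the candidate set, so the set converges towards the peers returned by `closest_peers` over the whole swarm.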
141+
142+
### Signed Peer Records
143+
144+
## Content Routing
145+
146+
### Provider Records
147+
148+
### IPNS
149+
150+
### Validators
151+
152+
## Wire format
153+
154+
Currently the same as libp2p kad-dht.
155+
156+
Protobuf
157+
158+
## Backpressure
159+
160+
TBD
161+
162+
## Client Optimizations
163+
164+
### Checking peer behaviour before adding to routing table
165+
166+
Make a `FIND_NODE` request and inspect the response before adding a node to the routing table. This follows https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/
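
A rough sketch of such a check follows. What exactly to inspect in the response is not specified here, so the acceptance criterion (the request succeeds and returns at least one peer) is an assumption, and the `find_node` callback stands in for whatever RPC an implementation exposes.

```python
from typing import Callable, Sequence


def should_add_to_routing_table(
    peer_id: str,
    find_node: Callable[[str, bytes], Sequence[str]],
    probe_key: bytes,
) -> bool:
    """Probe a candidate with FIND_NODE before admitting it (illustrative)."""
    try:
        closer_peers = find_node(peer_id, probe_key)
    except OSError:
        # Connection errors and timeouts: treat the peer as unresponsive.
        return False
    # Assumed criterion: a useful DHT Server returns at least one peer.
    return len(closer_peers) > 0
```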
167+
168+
## libp2p Kademlia DHT Implementations
169+
170+
* Go: [`libp2p/go-libp2p-kad-dht`](https://github.com/libp2p/go-libp2p-kad-dht)
171+
* JS: [`libp2p/kad-dht`](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht)
172+
* Rust: [`libp2p-kad`](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad)
173+
174+
## References
175+
176+
[0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) [pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf)
