|
| 1 | +--- |
| 2 | +title: Kademlia DHT |
| 3 | +description: > |
| 4 | + The IPFS Distributed Hash Table (DHT) specification defines a structured |
| 5 | + overlay network used for peer routing and content routing in the |
| 6 | + InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT |
| 7 | + specification, adapting and adding features to support IPFS-specific |
| 8 | + requirements. |
| 9 | +date: 2022-08-26 |
| 10 | +maturity: reliable |
| 11 | +editors: |
| 12 | + - name: Guillaume Michel |
| 13 | + github: guillaumemichel |
| 14 | + affiliation: |
| 15 | + name: Shipyard |
| 16 | + url: https://ipshipyard.com |
| 17 | +tags: ['routing'] |
| 18 | +order: 1 |
| 19 | +--- |
| 20 | + |
| 21 | +The IPFS Distributed Hash Table (DHT) specification defines a structured |
| 22 | +overlay network used for peer routing and content routing in the |
| 23 | +InterPlanetary File System (IPFS). It extends the libp2p Kademlia DHT |
| 24 | +specification, adapting and adding features to support IPFS-specific |
| 25 | +requirements. |
| 26 | + |
| 27 | +## Introduction |
| 28 | + |
| 29 | +FIXME: |
| 30 | + |
| 31 | +### Relation to [libp2p kad-dht](https://github.com/libp2p/specs/tree/master/kad-dht) |
| 32 | + |
| 33 | +The IPFS Kademlia DHT specification is a specialization of the libp2p Kademlia DHT. |
| 34 | + |
| 35 | +It is possible to use an alternative DHT specification alongside an IPFS |
| 36 | +implementation, rather than the one detailed here. This document specifically |
| 37 | +outlines all protocol customizations and adaptations required for participation |
| 38 | +in the [Amino DHT](#relation-to-the-amino-dht). If you're designing a new |
| 39 | +Kademlia-based DHT for use with IPFS, some details in this specification may |
| 40 | +appear overly specific or prescriptive. |
| 41 | + |
| 42 | +### Relation to the [Amino DHT](https://blog.ipfs.tech/2023-09-amino-refactoring/#why-amino) |
| 43 | + |
| 44 | +The Amino DHT is the swarm of peers also referred to as the _Public IPFS DHT_. |
| 45 | +It implements the IPFS Kademlia DHT specification and uses the protocol |
| 46 | +identifier `/ipfs/kad/1.0.0`. The Amino DHT can be joined by using the [Amino |
| 47 | +DHT |
| 48 | +Bootstrappers](https://docs.ipfs.tech/concepts/public-utilities/#amino-dht-bootstrappers). |
| 49 | + |
| 50 | +The Amino DHT is utilized by multiple IPFS implementations, including |
| 51 | +[`kubo`](https://github.com/ipfs/kubo) and |
| 52 | +[`helia`](https://github.com/ipfs/helia). Multiple DHT swarms can coexist and |
| 53 | +nodes MAY participate in multiple DHT swarms. DHT swarms can be either public |
| 54 | +or private. |
| 55 | + |
| 56 | +Note that there could be multiple distinct DHT swarms using the same protocol |
| 57 | +identifier as long as they don't have any common peers. This practice is |
| 58 | +discouraged as networks will immediately merge if they come into contact. Each |
| 59 | +DHT swarm SHOULD have a dedicated protocol identifier. |
| 60 | + |
| 61 | +## Protocol Parameters |
| 62 | + |
| 63 | +FIXME: move parameters to appropriate sections |
| 64 | + |
| 65 | +The IPFS Kademlia DHT defines a number of Client and Server parameters that |
| 66 | +need to be set to ensure the DHT operates correctly as a system. |
| 67 | + |
| 68 | +### Protocol Identifier |
| 69 | + |
| 70 | +All nodes participating in the same DHT swarm MUST use the same protocol |
| 71 | +identifier. The protocol identifier uniquely identifies a DHT swarm. It follows |
| 72 | +the format `/<swarm-prefix>/kad/<version>`, e.g., `/ipfs/kad/1.0.0` for the Amino |
| 73 | +DHT protocol version `1.0.0`, or `/ipfs/lan/kad/1.0.0` for a local DHT swarm. |
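The identifier format above can be illustrated with a small sketch. The helper names `format_protocol_id` and `parse_protocol_id` are hypothetical and not part of any libp2p API; the sketch also assumes that the swarm prefix may itself contain slashes (as in `ipfs/lan`) and that the version does not.

```python
from typing import Tuple


def format_protocol_id(swarm_prefix: str, version: str) -> str:
    """Build an identifier of the form /<swarm-prefix>/kad/<version>."""
    return f"/{swarm_prefix.strip('/')}/kad/{version}"


def parse_protocol_id(protocol_id: str) -> Tuple[str, str]:
    """Split an identifier into (swarm_prefix, version).

    Assumption: the last "/kad/" separates the prefix from the version.
    """
    prefix, sep, version = protocol_id.rpartition("/kad/")
    if (
        not sep
        or not prefix.startswith("/")
        or len(prefix) < 2
        or not version
        or "/" in version
    ):
        raise ValueError(f"not a DHT protocol identifier: {protocol_id!r}")
    return prefix[1:], version
```

With these helpers, `parse_protocol_id("/ipfs/lan/kad/1.0.0")` returns `("ipfs/lan", "1.0.0")`, and two nodes would be treated as members of the same swarm only if their identifiers match exactly.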
| 74 | + |
| 75 | +### Routing Table Bucket Size |
| 76 | + |
| 77 | +DHT Servers MUST have a routing table bucket size of `20` (see [Routing |
| 78 | +Table](#routing-table)). This corresponds to the `k` value as defined in the |
| 79 | +original Kademlia paper [0]. The `k` value is also used as a replication factor |
| 80 | +and defines how many peers are returned in response to a lookup request. |
| 81 | + |
| 82 | +While DHT Clients technically don't need to store a routing table, DHT Clients |
| 83 | +MUST nonetheless use a replication factor of `20`. If Client implementations |
| 84 | +decide to include a routing table, they SHOULD use a bucket size of `20`. |
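As a rough illustration of how `k` bounds a lookup response, the sketch below selects the `k` closest peers to a target by XOR distance. It assumes, following the libp2p kad-dht specification rather than this document, that keys and Peer IDs are mapped into the keyspace with SHA-256; the function names are hypothetical and Peer IDs are represented as raw bytes for simplicity.

```python
import hashlib
from typing import Iterable, List

K = 20  # bucket size and replication factor from this section


def kad_key(raw: bytes) -> int:
    """Map raw bytes into a 256-bit keyspace (assumption: SHA-256)."""
    return int.from_bytes(hashlib.sha256(raw).digest(), "big")


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two keyspace positions."""
    return kad_key(a) ^ kad_key(b)


def closest_peers(target: bytes, peers: Iterable[bytes], k: int = K) -> List[bytes]:
    """Return at most k peers, ordered from closest to farthest."""
    return sorted(peers, key=lambda p: xor_distance(target, p))[:k]
```

A real routing table would not sort every known peer on each request; the full sort is used here only to keep the selection rule easy to read.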
| 85 | + |
| 86 | +### Provide Validity |
| 87 | + |
| 88 | +Provide Validity defines the time-to-live (TTL) of a Provider Record on a DHT |
| 89 | +Server. DHT Servers MUST implement a Provide Validity of `48h`. |
| 90 | + |
| 91 | +### Provider Record Republish Interval |
| 92 | + |
| 93 | +Because of the churn in the network, Provider Records need to be republished |
| 94 | +more often than their validity period. DHT Clients SHOULD republish Provider |
| 95 | +Records every `22h` |
| 96 | +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17-provider-record-liveness.md#42-alternative-k-values-and-their-performance-comparison)). |
| 97 | + |
| 98 | +### Provider Addresses TTL |
| 99 | + |
| 100 | +DHT Servers SHOULD persist the multiaddresses of providers for `24h` after the |
| 101 | +`PROVIDE` operation. This allows DHT Servers to serve the multiaddresses of the |
| 102 | +content provider alongside the provide record, avoiding an additional DHT walk |
| 103 | +for the Client |
| 104 | +([rationale](https://github.com/probe-lab/network-measurements/blob/master/results/rfm17.1-sharing-prs-with-multiaddresses.md)). |
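The three timing parameters above interact, and a short sketch may help make that concrete. The constant and function names are hypothetical, and whether each boundary is inclusive or exclusive is an assumption of this sketch, not something the specification states.

```python
from datetime import datetime, timedelta

PROVIDE_VALIDITY = timedelta(hours=48)    # TTL of a Provider Record on a DHT Server
REPUBLISH_INTERVAL = timedelta(hours=22)  # how often a Client republishes
PROVIDER_ADDRS_TTL = timedelta(hours=24)  # how long a Server keeps provider multiaddrs


def record_is_valid(published_at: datetime, now: datetime) -> bool:
    """Server side: the Provider Record can still be served."""
    return now - published_at < PROVIDE_VALIDITY


def addrs_are_fresh(published_at: datetime, now: datetime) -> bool:
    """Server side: provider multiaddrs can be returned with the record."""
    return now - published_at < PROVIDER_ADDRS_TTL


def republish_due(last_published_at: datetime, now: datetime) -> bool:
    """Client side: it is time to republish the Provider Record."""
    return now - last_published_at >= REPUBLISH_INTERVAL
```

Since two republish intervals (`44h`) fit within one validity period (`48h`), a record that misses a single republish would still be valid when the next one arrives. Between `24h` and `48h` after a `PROVIDE`, a Server in this sketch would still serve the record but without the provider's multiaddresses.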
| 105 | + |
| 106 | +### Concurrency |
| 107 | + |
| 108 | +Implementation-specific. The recommended value is `10`. |
| 109 | + |
| 110 | +### Resiliency |
| 111 | + |
| 112 | +Implementation-specific. The recommended value is `3`. |
| 113 | + |
| 114 | +### Routing Table Refresh Interval |
| 115 | + |
| 116 | +The routing table refresh interval SHOULD be `10min`. Only peers that have been seen in the last 10 minutes should remain in the routing table. If a peer hasn't been seen recently, ping it to check whether it is still alive. |
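One possible reading of this rule is sketched below. The routing table is reduced to a map from a peer identifier to its last-seen time, and `ping` is a hypothetical callback that returns whether the peer answered; neither is taken from an existing implementation.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict

REFRESH_INTERVAL = timedelta(minutes=10)


def refresh_routing_table(
    last_seen: Dict[str, datetime],
    now: datetime,
    ping: Callable[[str], bool],
) -> None:
    """Ping peers not seen within the refresh interval; evict those that fail."""
    for peer_id, seen_at in list(last_seen.items()):
        if now - seen_at <= REFRESH_INTERVAL:
            continue  # seen recently enough, nothing to do
        if ping(peer_id):
            last_seen[peer_id] = now  # peer is alive, record the contact
        else:
            del last_seen[peer_id]  # unresponsive peer leaves the table
```

Iterating over a copy of the items allows entries to be removed during the pass. An implementation would likely run this on a timer and issue the pings concurrently.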
| 117 | + |
| 118 | +## DHT Swarm |
| 119 | + |
| 120 | +## Routing Table |
| 121 | + |
| 122 | +### Routing Table Refresh |
| 123 | + |
| 124 | +### Public addresses |
| 125 | + |
| 126 | +### IP Diversity Filter |
| 127 | + |
| 128 | +Implementations SHOULD include an IP diversity filter for the routing table. |
| 129 | + |
| 130 | +## Lookup Process |
| 131 | + |
| 132 | +### Lookup termination |
| 133 | + |
| 134 | +This is hard |
| 135 | + |
| 136 | +## Peer Routing |
| 137 | + |
| 138 | +DHT Clients that want to be routable must make sure they are in the peerstore of the closest DHT Servers to their own PeerID. |
| 139 | + |
| 140 | +When performing a `FIND_NODE` lookup, the client will converge to the closest nodes in XOR distance to the requested PeerID. These nodes are expected to know the multiaddrs of the target peer. |
| 141 | + |
| 142 | +### Signed Peer Records |
| 143 | + |
| 144 | +## Content Routing |
| 145 | + |
| 146 | +### Provider Records |
| 147 | + |
| 148 | +### IPNS |
| 149 | + |
| 150 | +### Validators |
| 151 | + |
| 152 | +## Wire format |
| 153 | + |
| 154 | +Currently the same as libp2p kad-dht. |
| 155 | + |
| 156 | +Protobuf |
| 157 | + |
| 158 | +## Backpressure |
| 159 | + |
| 160 | +TBD |
| 161 | + |
| 162 | +## Client Optimizations |
| 163 | + |
| 164 | +### Checking peer behaviour before adding to routing table |
| 165 | + |
| 166 | +Make a `FIND_NODE` request and inspect the response before adding a node to the routing table (RT). This follows https://blog.ipfs.tech/2023-ipfs-unresponsive-nodes/ |
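A minimal sketch of this check follows. The acceptance criterion used here (the candidate answers and returns at least one peer) is an assumption made for illustration; this document does not define what a well-behaved response looks like. `find_node` is a hypothetical callback that sends a `FIND_NODE` request to the candidate and returns the peers listed in its response.

```python
from typing import Callable, List, Set


def maybe_add_peer(
    routing_table: Set[str],
    peer_id: str,
    find_node: Callable[[str], List[str]],
) -> bool:
    """Add peer_id to the routing table only if it answers FIND_NODE usefully."""
    try:
        closer_peers = find_node(peer_id)
    except Exception:
        return False  # no usable answer: treat the candidate as unresponsive
    if not closer_peers:
        return False  # assumption: an empty response is not acceptable
    routing_table.add(peer_id)
    return True
```

Keeping the network call behind a callback lets the acceptance rule be tested without a live swarm and swapped out if a stricter criterion is chosen.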
| 167 | + |
| 168 | +## libp2p Kademlia DHT Implementations |
| 169 | + |
| 170 | +* Go: [`libp2p/go-libp2p-kad-dht`](https://github.com/libp2p/go-libp2p-kad-dht) |
| 171 | +* JS: [`libp2p/kad-dht`](https://github.com/libp2p/js-libp2p/tree/main/packages/kad-dht) |
| 172 | +* Rust: [`libp2p-kad`](https://github.com/libp2p/rust-libp2p/tree/master/protocols/kad) |
| 173 | + |
| 174 | +## References |
| 175 | + |
| 176 | +[0]: Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-Peer Information System Based on the XOR Metric. In P. Druschel, F. Kaashoek, & A. Rowstron (Eds.), Peer-to-Peer Systems (pp. 53–65). Berlin, Heidelberg: Springer Berlin Heidelberg. [DOI](https://doi.org/10.1007/3-540-45748-8_5) [pdf](https://www.scs.stanford.edu/~dm/home/papers/kpos.pdf) |