diff --git a/connections/README.md b/connections/README.md index b8ce295cb..3d0404187 100644 --- a/connections/README.md +++ b/connections/README.md @@ -181,7 +181,7 @@ If the peers agree on a protocol, multistream-select's job is done, and future traffic over the channel will adhere to the rules of the agreed-upon protocol. If a peer receives a `"na"` response to a proposed protocol id, they can either -try again with a different protocol id or close the channel. +try again with a different protocol id or close the channel with error code `PROTOCOL_NEGOTIATION_FAILED` as defined in [libp2p error codes](./../error-codes/README.md) spec. ## Upgrading Connections diff --git a/error-codes/README.md b/error-codes/README.md new file mode 100644 index 000000000..22a6d14d1 --- /dev/null +++ b/error-codes/README.md @@ -0,0 +1,101 @@ +# Error Codes + +| Lifecycle Stage | Maturity | Status | Latest Revision | +| --------------- | ------------- | ------ | --------------- | +| 1A | Working Draft | Active | r0, 2023-01-23 | + +Authors: [@sukunrt] + +Interest Group: [@marcopolo], [@achingbrain] + +[@MarcoPolo]: https://github.com/MarcoPolo +[@achingbrain]: https://github.com/achingbrain + +## Introduction +When closing a connection or resetting a stream, it's useful to provide the peer +with a code that explains the reason for the closure. This enables the peer to +better respond to the abrupt closures. For instance, it can implement a backoff +strategy to retry _only_ when it receives a `RATE_LIMITED` error code. An error +code doesn't always indicate an error condition. For example, a node can terminate an idle connection, or close a connection because a connection to the same peer over a better transport is available. In both these cases, it can signal an appropriate error code to the other end. + +## Semantics +Error Codes can be signaled on Closing a connection or on resetting a Stream. Error Codes are unsigned 32-bit integers. 
The range 0 to 0xffff is reserved for +libp2p errors. Application specific errors can be defined for protocols from +integers outside of this range. + +From an application perspective, error codes provide a best effort guarantee. On resetting a libp2p stream or closing a connection with an error code, the error code may or may not be delivered to the application on the remote end. The specifics depend on the transport used. For example, WebTransport doesn't support error codes at all, while WebRTC doesn't support Connection Close error codes, but supports Stream Reset error codes. + +### Connection Close and Stream Reset Error Codes +Error codes are defined separately for Connection Close and Stream Reset. The namespace doesn't overlap as it is clear from the context whether the stream was reset by the other end, or it was reset as a result of a connection close. +Implementations MUST provide the Connection Close error code on streams that are reset as a result of remote closing the connection. + +Libp2p streams are reset unilaterally, calling `Reset` on a stream resets both the read and write end of a stream. For transports, like QUIC, which support cancelling the read and write ends of the stream separately, implementations MAY provide the ability to signal error codes separately on resetting either end. + +## Error Codes Registry +Libp2p connections are shared by multiple applications. The same connection used in the dht may be used for gossip sub, or for any other application. Any of these applications can close the underlying connection on an error, resetting streams used by the other applications. To correctly distinguish which application closed the connection, Connection Close error codes are allocated to applications from a central registry. + +For simplicity, we manage both Connection Close and Stream Reset error codes from a central registry. 
The libp2p error codes registry is maintained here with all the allocations so far listed in [error-codes.csv](./error-codes.csv). + +Error codes are allocated to applications in 8-bit chunks. To request an +allocation, raise a PR allocating 256 codes right after the last allocation. If +the last allocated range is 0x1900 - 0x19ff, add 0x1a00 - 0x1aff for your +application. + +### Libp2p Reserved Error Codes +Error code 0 signals that no error code was provided. Implementations MUST handle closing a connection with error code 0 as they handle closing a connection with no error code, and resetting a stream with error code 0 as they handle resetting a stream without any error code. + +Error codes from 1 to 0xfff are reserved for transport errors. These are used by the transports to terminate connections or streams on transport errors. + +Error codes from 0x1000 to 0xffff are reserved for libp2p. This includes multistream error codes, as multistream is necessary for libp2p connection establishment over TCP, but not kad-dht or gossip-sub error codes. See [Libp2p error codes](./libp2p-error-codes.md) for the libp2p reserved error codes. + +Some transports, like QUIC, support sending an error code that does not fit in an unsigned 32-bit integer. On receiving such a value, implementations MUST use `CODE_OUT_OF_RANGE` as the libp2p error code. + + +## Transport Specification and Wire Encoding +Different transports encode the 32-bit error code differently on the wire. For instance, Yamux uses big-endian encoding and QUIC uses a varint. They also provide different semantics: WebTransport doesn't define error codes, WebRTC doesn't support Connection Close error codes, and Yamux error codes on Connection Close cannot be reliably sent over the wire. + +### QUIC +QUIC provides the ability to send an error on closing the read end of the +stream, resetting the write end of the stream and on closing the connection.
+ +For stream resets, the error code MUST be sent on `RESET_STREAM` and `STOP_SENDING` frames using the `Application Protocol Error Code` field as per +the frame definitions in the +[RFC](https://www.rfc-editor.org/rfc/rfc9000.html#name-reset_stream-frames). + +For Connection Close, the error code MUST be sent on `CONNECTION_CLOSE` frame +using the Error Code field as defined in the +[RFC](https://www.rfc-editor.org/rfc/rfc9000.html#section-19.19-6.2.1). + +### Yamux +Yamux streams are reset unilaterally. Receiving a stream frame with `RST` flag set resets both the read and write end of the stream. So, there's no way to separately signal error code on closing the read end of the stream, or resetting the write end of the stream. + +For Connection Close, the 32-bit Length field is interpreted as the error code. + +For Stream Resets, the error code is sent in the `Window Update` frame, with the +32-bit Length field interpreted as the error code. See [yamux spec +extension](https://github.com/libp2p/specs/pull/622). + +TCP connections with Yamux may not deliver the error code to the peer depending on the TCP socket options used. In particular, setting the `SO_LINGER` socket option with timeout 0, the OS discards all the data in the send buffer and sends a TCP RST to immediately close the connection, preventing error code delivery. + +### WebRTC +A libp2p WebRTC connection is closed by closing the underlying WebRTC Peer Connection. As there's no way to provide any information to the peer on closing a WebRTC Peer Connection, it's not possible to signal error codes on Connection Close. + +For Stream Resets, the error code can be sent in the `errorCode` field of the +WebRTC message with `flag` set to `RESET_STREAM`. + +### WebTransport +Error codes for WebTransport will be introduced when browsers upgrade to draft-9 +of the spec. 
The current WebTransport spec implemented by Chrome and Firefox is +[draft-2 of WebTransport over +HTTP/3](https://www.ietf.org/archive/id/draft-ietf-webtrans-http3-02.html#section-4.3-2). +This version allows for only a 1-byte error code. 1 byte is too restrictive and +as the latest WebTransport draft, +[draft-9](https://www.ietf.org/archive/id/draft-ietf-webtrans-http3-02.html#section-4.3-2) +allows for a 4-byte error code to be sent on stream resets, we will introduce +error codes over WebTransport later. + +### HTTP +Protocols that work over http MUST use the response header `Libp2p-Error-Code` to send the error code. The grammar for the field is similar to `Content-Length` +``` +Libp2p-Error-Code: 1*DIGIT +``` diff --git a/error-codes/error-codes.csv b/error-codes/error-codes.csv new file mode 100644 index 000000000..793159127 --- /dev/null +++ b/error-codes/error-codes.csv @@ -0,0 +1,4 @@ +start_range,end_range,name +0, 0, No error code +1, 0xfff, Transport errors +0x1000, 0xffff, libp2p diff --git a/error-codes/libp2p-error-codes.md b/error-codes/libp2p-error-codes.md new file mode 100644 index 000000000..08e2b3890 --- /dev/null +++ b/error-codes/libp2p-error-codes.md @@ -0,0 +1,32 @@ +# Libp2p error codes + +## Connection Error Codes +| Name | Code | Description | +| --- | --- | --- | +| NO_ERROR | 0 | No reason provided for disconnection. This is equivalent to closing a connection or resetting a stream without any error code. | +| Reserved For Transport | 0x1 - 0xfff | Reserved for transport level error codes. | +| PROTOCOL_NEGOTIATION_FAILED | 0x1000 | Rejected because we couldn't negotiate a protocol. Used by multistream select for security negotiation | +| RESOURCE_LIMIT_EXCEEDED | 0x1001 | Rejected because we ran into a resource limit. Implementations MAY retry with a backoff | +| RATE_LIMITED | 0x1002 | Rejected because the connection was rate limited. 
Implementations MAY retry with a backoff | +| PROTOCOL_VIOLATION | 0x1003 | Peer violated the protocol | +| SUPPLANTED | 0x1004 | Connection closed because a connection over a better transport was available | +| GARBAGE_COLLECTED | 0x1005 | Connection was garbage collected | +| SHUTDOWN | 0x1006 | The node is shutting down | +| GATED | 0x1007 | The connection was gated. Most likely the IP / node is blacklisted. | +| CODE_OUT_OF_RANGE | 0x1008 | The error code received from the peer was greater than 4294967295 (max uint32). | + + +## Stream Error Codes +| Name | Code | Description | +| --- | --- | --- | +| NO_ERROR | 0 | No reason provided for disconnection. This is equivalent to resetting a stream without any error code. | +| Reserved For Transport | 0x1 - 0xfff | Reserved for transport level error codes. | +| PROTOCOL_NEGOTIATION_FAILED | 0x1000 | Rejected because we couldn't negotiate a protocol. Used by multistream select | +| RESOURCE_LIMIT_EXCEEDED | 0x1001 | Stream rejected because we ran into a resource limit. Implementations MAY retry with a backoff | +| RATE_LIMITED | 0x1002 | Rejected because the connection was rate limited. Implementations MAY retry with a backoff | +| PROTOCOL_VIOLATION | 0x1003 | Rejected because the stream protocol was violated. MAY be used interchangeably with `BAD_REQUEST` | +| SUPPLANTED | 0x1004 | Reset because a better transport is available for the stream | +| GARBAGE_COLLECTED | 0x1005 | Idle Stream was garbage collected | +| SHUTDOWN | 0x1006 | The node is shutting down | +| GATED | 0x1007 | The stream was gated. Most likely the IP / node is blacklisted. | +| CODE_OUT_OF_RANGE | 0x1008 | The error code received from the peer was greater than 4294967295 (max uint32). |
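As a reading aid for the two tables above and the reserved ranges in the error codes README, the following is a minimal Python sketch of how a receiver might normalize and classify a received code. The names `Libp2pErrorCode`, `normalize` and `classify` are hypothetical and are not part of this change or of any libp2p implementation. The sketch assumes the Connection and Stream tables share the same names and values, as they do in this revision.

```python
from enum import IntEnum

MAX_UINT32 = 0xFFFFFFFF


class Libp2pErrorCode(IntEnum):
    """Codes copied from the tables above (hypothetical class name)."""
    NO_ERROR = 0x0
    PROTOCOL_NEGOTIATION_FAILED = 0x1000
    RESOURCE_LIMIT_EXCEEDED = 0x1001
    RATE_LIMITED = 0x1002
    PROTOCOL_VIOLATION = 0x1003
    SUPPLANTED = 0x1004
    GARBAGE_COLLECTED = 0x1005
    SHUTDOWN = 0x1006
    GATED = 0x1007
    CODE_OUT_OF_RANGE = 0x1008


def normalize(code: int) -> int:
    """Replace a value that does not fit in uint32 with CODE_OUT_OF_RANGE."""
    if code < 0:
        raise ValueError("error codes are unsigned")
    if code > MAX_UINT32:
        return int(Libp2pErrorCode.CODE_OUT_OF_RANGE)
    return code


def classify(code: int) -> str:
    """Return which reserved range a (normalized) code falls into."""
    code = normalize(code)
    if code == 0:
        return "no error code"
    if code <= 0xFFF:
        return "transport"
    if code <= 0xFFFF:
        return "libp2p"
    return "application"


def should_retry_with_backoff(code: int) -> bool:
    """The tables say implementations MAY retry with a backoff on these two codes."""
    return normalize(code) in (
        Libp2pErrorCode.RESOURCE_LIMIT_EXCEEDED,
        Libp2pErrorCode.RATE_LIMITED,
    )
```

Whether to retry at all remains an implementation choice; the helper only reflects the two codes whose descriptions mention a backoff.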
diff --git a/error-codes/validate.py b/error-codes/validate.py new file mode 100644 index 000000000..23de204eb --- /dev/null +++ b/error-codes/validate.py @@ -0,0 +1,30 @@ +import csv + + +with open("error-codes.csv", "r") as f: + reader = csv.DictReader(f, skipinitialspace=True) + codes = [ + { + "start_range": int(row["start_range"], base=16), + "end_range": int(row["end_range"], base=16), + "name": row["name"].strip(), + } + for row in reader + ] + +def intersects(a, b): + return a["start_range"] <= b["end_range"] and a["end_range"] >= b["start_range"] + +if __name__ == "__main__": + for code in codes: + if code["start_range"] > code["end_range"]: + print(f"invalid range: \"{code['name']}\" has start greater than end") + exit(1) + + for (idx, code) in enumerate(codes): + for other in codes[:idx]: + if intersects(code, other): + print(f"overlapping ranges: \"{code['name']}\" intersects with \"{other['name']}\"") + exit(1) + + print("no errors found") \ No newline at end of file diff --git a/webrtc/README.md b/webrtc/README.md index c482dfccc..52dfa3dab 100644 --- a/webrtc/README.md +++ b/webrtc/README.md @@ -76,6 +76,9 @@ message Message { optional Flag flag=1; optional bytes message = 2; + + // errorCode is the reason for resetting the stream. This field is only meaningful when flag is set to RESET_STREAM or STOP_SENDING + optional uint32 errorCode = 3; } ```
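The `uint32 errorCode` field above carries the same 32-bit value that the error codes spec in this change encodes differently per transport. The following Python sketch illustrates two of those encodings under stated assumptions: a 32-bit big-endian value as used for the Yamux Length field, and a QUIC variable-length integer as defined in RFC 9000, section 16. The function names are hypothetical and the sketch builds only the integer encoding, not complete Yamux or QUIC frames.

```python
import struct

MAX_UINT32 = 0xFFFFFFFF
CODE_OUT_OF_RANGE = 0x1008  # value taken from the libp2p error codes table


def encode_yamux_length(code: int) -> bytes:
    """Pack an error code as a 32-bit big-endian integer (Yamux Length field)."""
    if not 0 <= code <= MAX_UINT32:
        raise ValueError("error code must fit in an unsigned 32-bit integer")
    return struct.pack(">I", code)


def encode_quic_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (RFC 9000, section 16)."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("value does not fit in a QUIC varint")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x8000_0000).to_bytes(4, "big")
    return (value | 0xC000_0000_0000_0000).to_bytes(8, "big")


def decode_quic_varint(data: bytes) -> int:
    """Decode a QUIC varint from the start of data."""
    if not data:
        raise ValueError("empty input")
    length = 1 << (data[0] >> 6)  # the top two bits select 1, 2, 4 or 8 bytes
    if len(data) < length:
        raise ValueError("truncated varint")
    value = int.from_bytes(data[:length], "big")
    return value & ((1 << (8 * length - 2)) - 1)  # clear the two length bits


def receive_quic_error_code(data: bytes) -> int:
    """Map a received QUIC code wider than uint32 to CODE_OUT_OF_RANGE."""
    value = decode_quic_varint(data)
    return value if value <= MAX_UINT32 else CODE_OUT_OF_RANGE
```

A QUIC varint can hold values up to 2^62 - 1, which is why the receive path needs the `CODE_OUT_OF_RANGE` mapping while the Yamux path can simply reject values that do not fit in 32 bits.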