Notes on ocap URLs
Assertion: Given libp2p as the network communications substrate, we can assume
the following:
- Each libp2p network node has a unique public PeerID that's derived via some
  kind of hash scheme from the node's private key, which is in turn securely
  generated as part of that node's initialization and thenceforth never
  disclosed to anyone.
- A libp2p connection between two nodes is bidirectionally authenticated, so
  each endpoint knows that its counterpart is genuinely the node identified by
  the PeerID it alleges itself to have, meaning neither node is in a position
  to falsify its identity to the other.
- A libp2p connection between two nodes is encrypted, meaning that nobody
  outside the two nodes themselves is able to know the contents of message
  traffic between them.
Any of these assumptions might be based on a mistaken understanding of libp2p on
my part, and so should be double checked by somebody else (hopefully somebody
who knows libp2p much better than I do). And of course all of this is subject to
the usual caveats that apply any time you are dealing with cryptographic stuff:
i.e., that the underlying cryptographic algorithms and protocols are soundly
designed and correctly implemented, that neither endpoint is externally
compromised outside the scope of the relevant security analyses, etc. However,
with these qualifications in mind, the rest of this document will proceed on the
basis that the above assertions are actually true. Furthermore, it should be
possible to (relatively painlessly, one hopes) recast the following design on
top of any other non-libp2p network communications substrate that satisfies the
above assertions. (Though for purposes of concreteness I will continue to use
some of libp2p's terminology rather than attempting to be 100% generic.)
Proceeding from the above:
Each Cluster is associated with a distinct network node and its unique PeerID.
This PeerID is derived from the private key generated by the Cluster's kernel
instance when it is first created (and thenceforth stored as part of the
kernel's private durable state accessible only to the kernel itself).
Consequently, all the properties asserted above concerning the relationships
between communicating libp2p network nodes map directly to the relationships
between communicating Clusters. In other words, for purposes of this discussion,
PeerID and ClusterID are the same thing.
A ClusterID uniquely identifies a Cluster, but does not itself tell us where
to direct network communications to actually establish a connection to it, i.e.,
it does not locate a Cluster. Since we expect Ocap Kernel instances to be
commonly hosted by web browsers, we will have to rely heavily on libp2p (or
something like it) to enable the establishment of inbound connections. In
particular, this most likely involves NAT hole punching, which necessarily
requires the cooperation of third parties to act as connection rendezvous points
or, if necessary, as traffic relays. Note that because the communications link
is end-to-end encrypted and bi-directionally authenticated, these third parties
need not be extended quite the same degree of trust that we require of the
inter-Cluster communication links themselves, even if they are part of such a link -- a
secure connection means that while an intermediary can block or disrupt
communication (and possibly do some amount of traffic analysis, the seriousness
of which shouldn't be discounted), it is not in a position to alter or eavesdrop
on the communication's contents. In the case where an intermediary is
interfering with message traffic, it should be possible, in principle at least,
for communicating Clusters to switch to using different, hopefully better
behaved intermediaries.
All this implies that intermediary location or identity is distinct from the information essential to designate object references across Cluster boundaries. Any such intermediary information can at best be regarded as hinting at location, and even in that capacity is not guaranteed to be stable. Therefore, I think the core of an externalizable object designator is technically a URI rather than a URL: while it identifies an object, additional external information sources will most likely be needed to actually locate and contact the object's host. Even if such additional information is not required to determine location (e.g., if that information is already known to the kernel attempting to establish communications), external services will probably still be needed to actually effectuate a connection.
The above suggests the following URL form:
ocap: oid @ clusterid [ , locationhint ]*
For example:
jo91waLQA1NNeBmZKUF@12D3KooWPjceQrSwdWXPyLLeABRXmuqt69Rg3sBYbU1Nft9HyQ6X,/dns4/example.org/tcp/9001/ws/p2p/12D3KooWJBDqsyHQF2MWiCdU4kdqx4zTsSTLRdShg7Ui6CRWB4uc
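For concreteness, a minimal parser for this proposed URL form might look like the following sketch (TypeScript; the type and function names are illustrative, not part of any settled specification):

```typescript
// Sketch of parsing the proposed ocap URL form: ocap:oid@clusterid[,locationhint]*
interface OcapRef {
  oid: string;             // encrypted, base58btc-encoded object designator
  clusterId: string;       // the hosting Cluster's PeerID
  locationHints: string[]; // zero or more libp2p multiaddrs (may be empty)
}

function parseOcapUrl(url: string): OcapRef {
  if (!url.startsWith("ocap:")) {
    throw new Error("not an ocap URL");
  }
  const body = url.slice("ocap:".length);
  const at = body.indexOf("@");
  if (at < 0) {
    throw new Error("missing @ separator between oid and clusterid");
  }
  // The oid comes first, so the object-specific portion separates cleanly
  // from the Cluster-specific remainder.
  const oid = body.slice(0, at);
  // Multiaddrs contain no commas, so a simple split suffices for the hints.
  const [clusterId, ...locationHints] = body.slice(at + 1).split(",");
  return { oid, clusterId, locationHints };
}
```

A reference with an empty `locationHints` array corresponds to the pure-URI case discussed above.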
The URL scheme ocap identifies this as an object reference. Note that the
specific choice of ocap is a trial balloon intended mainly for concreteness,
just to have something to write down in this document. If it collides with some
existing practice we should feel free to substitute a different string, as long
as we can settle on something definite before our specification is locked in.
oid designates a particular object in the context of the Cluster that hosts
it. It is generated by taking the hosting kernel's kref for the object, padding
to at least a minimum length (size and padding bits TBD), prepending a random
salt (length TBD), encrypting with the kernel's public key (either directly or
via a symmetric key derived from the public key, TBD), and then base58btc
encoding it. Whenever the hosting kernel vends an ocap URL to the outside world
it generates a new URL afresh, using a new random salt, so that different
references to the same object cannot be compared for equality by outsiders. The
oid comes first in the URL so that it will be simple to separate the
object-specific portion from the remainder of the URL, which pertains to the
Cluster.
clusterid encodes the ClusterID, which is in turn the PeerID generated for
the Cluster's kernel. It should be sufficient to simply use the PeerID directly
as it is encoded by libp2p: a base58btc encoded representation of a hash of
the libp2p node's public key.
locationhints are zero or more comma separated location hints, each taking
the form of a libp2p multiaddr designating the address of a potential contact
point. An ocap URL with no location hints is an ocap URI.
When a kernel wants to send a message to a Cluster that it's not currently in communications with, it iterates through a set of possible contact points, attempting to use each in turn to establish a connection, until it either succeeds or exhausts the possibilities and gives up.
These contact points can come from a variety of sources including (but not necessarily limited to): the last known (or most recently known) contact points via which the kernel was previously in communication with the Cluster, the location hints in an ocap URL referencing an object in the Cluster, well known widely-connected contact points, or contact points manually entered by the user.
When the kernel successfully opens a connection to one of these contact points, it queries for the ClusterID it is seeking connection to. The response will be one of the following:
- If the contact point is actually a direct address for the target Cluster's kernel (possible in the case of a non-browser server kernel that can listen directly on a public address), the two parties can simply begin communicating. This is a successful result and the search for a connection stops.
- If the contact point is a relay and/or rendezvous server that has a direct relationship to the target Cluster, it will perform the libp2p handshake associated with setting up a WebRTC handoff or a traffic relay, connecting the querier to the Cluster in question. This is also a successful result and terminates the search.
- The response can be an error indication that the Cluster of interest is either unknown to the contact point or is known but currently unavailable (we might want to support the latter case with an optional indicator of how long the querier should wait before trying again). The querying kernel resumes its iteration of candidate contact points, possibly scheduling a later retry of this one.
- The response can be a list of additional possible contact point hints. Any of these that refer to addresses not already tried can be added (in whatever order experience teaches us works best) to the list of remaining possibilities to iterate through. The kernel then resumes its search with a possibly expanded list of candidates. Offering additional location hints might be the sole service some contact points provide; that is, rather than relaying to or directly connecting with a desired endpoint, they instead act as a Cluster contacts directory (of course, a server could offer both services).
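The search procedure described by the cases above can be sketched as a simple work-queue loop (TypeScript; the `tryContact` transport call and the result shapes are stand-ins for whatever the real wire protocol ends up being):

```typescript
// Possible outcomes of querying one contact point, mirroring the cases above.
type ContactResult =
  | { kind: "connected"; conn: unknown }          // direct or relayed success
  | { kind: "error"; retryAfterMs?: number }      // unknown or unavailable
  | { kind: "hints"; hints: string[] };           // contacts-directory response

async function connectToCluster(
  clusterId: string,
  initialHints: string[],
  tryContact: (addr: string, clusterId: string) => Promise<ContactResult>,
): Promise<unknown | null> {
  const queue = [...initialHints];
  const tried = new Set<string>();
  while (queue.length > 0) {
    const addr = queue.shift()!;
    if (tried.has(addr)) continue;
    tried.add(addr);
    const result = await tryContact(addr, clusterId);
    switch (result.kind) {
      case "connected":
        return result.conn; // success terminates the search
      case "hints":
        // Add not-yet-tried addresses; ordering policy is TBD.
        for (const h of result.hints) if (!tried.has(h)) queue.push(h);
        break;
      case "error":
        // Could schedule a later retry of this contact point, honoring
        // retryAfterMs if provided; omitted in this sketch.
        break;
    }
  }
  return null; // exhausted all possibilities
}
```

The `tried` set prevents cycles when contacts directories hand back hints that point at each other, and the same loop accommodates contact points drawn from any of the sources listed above (last-known addresses, URL hints, well-known servers, user entry).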
While in an ideal world we'd like the participants in our emerging network of interacting Clusters to be completely autonomous, dependence on external, third-party services (for obtaining both location information and connectivity) is not per se a radical departure from existing networking norms. Current network infrastructure already relies on a variety of third-party services, such as DNS, certificate authorities, ISPs, and telecommunications service providers. However, our particular dependencies are still cause for concern: existing network infrastructure is relatively mature, whereas what we want to do requires the introduction of new species of services.
The services required are quite simple to implement and shouldn't pose much of a computational burden to run, but bootstrapping the necessary service ecosystem is likely to still be a challenge. Our plan for rolling out the Ocap Kernel in a mode that supports remote connectivity will need to incorporate a strategy for getting a critical mass of supporting services in place. This is likely more of a marketing and business challenge than a technological one.