GIP-0083: Substreams On The Network. #63

Merged (17 commits) on May 22, 2025
118 changes: 51 additions & 67 deletions gips/0083.md
@@ -30,6 +30,39 @@ IMPORTANT: Non-normative (or Informative) sections were added for the sake of cl

# Specification

## Permissionless Discovery

This section discusses how Indexers may be listed on the Payment Gateway UI.

Substreams providers MUST register their service on-chain by publishing a _Substreams Service Deployment_ manifest to IPFS, and [`allocate`ing](https://github.com/graphprotocol/contracts/blob/main/packages/contracts/contracts/staking/Staking.sol#L296) to it through the Staking contract on the Arbitrum chain.

Here is a sample _Substreams Service Deployment_ manifest:
```yaml
specVersion: 0.0.5
description: "Substreams Data Service for MyNetwork"
service:
  type: substreams-v1
  endpoint: https://mynetwork.substreams.example.com
  network: mynetwork
provider:
  id: my-company
  name: My Company
  logo: https://company.example.com/logo.png
```

The service type `substreams-v1` means that the endpoint exposes a gRPC service responding to these methods:

- sf.substreams.rpc.v2.Stream/Block
- sf.substreams.rpc.v2.EndpointInfo/Info

as specified by https://github.com/streamingfast/substreams/blob/develop/proto/sf/substreams/rpc/v2/service.proto, which is updated from time to time in backwards-compatible ways.

The Payment Gateway MUST listen to the Arbitrum Staking contract for such registrations (through a streamed Substreams module, a Subgraph or other means), and update its local view of the network.

The Payment Gateway CAN have systems to perform health checks (e.g. on `/health` or `/info` endpoints), or other checks to ensure active, up-to-date and properly configured providers are offered to end users. A Payment Gateway CAN check block height to ensure the backing provider is close to chain head before offering it to users.
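
The optional block-height check could look like the following minimal Python sketch. The `Provider` record, the `max_lag_blocks` threshold, and the way heights are obtained are assumptions made for illustration; none of them is defined by this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # Hypothetical record kept in the gateway's local view of the network.
    provider_id: str
    endpoint: str
    head_block: int  # latest block height reported by the provider


def providers_near_head(providers, chain_head, max_lag_blocks=10):
    """Return the providers whose reported head is within max_lag_blocks of chain_head."""
    return [p for p in providers if chain_head - p.head_block <= max_lag_blocks]
```

A gateway would run such a filter before offering providers to end users; the right threshold depends on the block time of the network being served.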



## Indexer Selection Algorithm (ISA)

To provide consumers with fair access to Substreams providers, an Indexer Selection Algorithm is needed.
@@ -62,18 +95,18 @@ An example of the payment flow today, which ends up abiding by this specificatio

https://github.com/streamingfast/network-payments-cli

It involves a small dance, of publishing a subgraph manifest, like:
It involves a small dance, of publishing a manifest similar to:

```yaml
specVersion: 0.0.5
description: "thegraph.market Payment Gateway usage"
description: "thegraph.market Payment Gateway Usage"
usage:
  serviceName: substreams
  namespace: sf.substreams.rpc.v2.Stream
  serviceType: substreams-v1
  network: mainnet
  nonce: some-uuid
```

then allocating to it, sharing the allocation ID with the Payment Gateway, in order for the Indexer to receive payment.
then allocating to it and sharing the allocation ID with the Payment Gateway, in order for the Indexer to receive payment. Note that this is not the same thing as the _Substreams Service Deployment_ manifest: this one is short-lived, so as not to be exposed to the curation fee market.
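
As a rough illustration of the nonce, the Python sketch below renders a usage manifest with a fresh random UUID on every call, so that each published manifest has different content. The function name and the choice of `uuid4` are assumptions; the field names are copied from the sample manifest, and publishing to IPFS and allocating are not shown.

```python
import uuid


def usage_manifest(network: str, service_type: str = "substreams-v1") -> str:
    """Render a short-lived usage manifest with a fresh random nonce."""
    nonce = uuid.uuid4()  # assumption: any unique value would serve as the nonce
    return (
        "specVersion: 0.0.5\n"
        'description: "thegraph.market Payment Gateway Usage"\n'
        "usage:\n"
        "  serviceName: substreams\n"
        "  namespace: sf.substreams.rpc.v2.Stream\n"
        f"  serviceType: {service_type}\n"
        f"  network: {network}\n"
        f"  nonce: {nonce}\n"
    )
```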

This flow is expected to be simplified in the future, with the Payment Gateway being able to automatically detect new Substreams services, and allocate to them, without manual intervention.

@@ -108,6 +141,7 @@ THIS SECTION IS NON-NORMATIVE
Curation can happen on the published Subgraph manifest above, although a nonce may be added to avoid paying curation fees. Because there are no indexer rewards, curation adds no value, so it is reasonable for participants to want to avoid it.



## Economic Security and Disputes

Economic security is achieved through slashing and disputes, similar to subgraphs as of Jan 24th 2025. Indexers providing incorrect data CAN have their stake slashed. Therefore, to be an Indexer recognized by the Payment Gateway, one must staked the Network token, just like for subgraphs.

Just a small typo here. "one must staked" -> "one must stake"

Have there been discussions on setting a minimum staking requirement for Substreams, or will it be the same as for subgraphs? Any clarifications on the slashing penalties?

This should be clarified to ensure that the slashing threat is credible. This being pre-Horizon, presumably the same 100k stake and 2.5% slash apply?

@@ -123,11 +157,20 @@ Having no queries like subgraphs have, Proofs of Indexing for Substreams are alw

Substreams endpoints provide signed attestations with each block's response, as per [the `attestation` field](https://github.com/streamingfast/substreams/blob/develop/proto/sf/substreams/rpc/v2/service.proto#L96) in the `BlockScopedData` response, matching the operator key specified in [the `attestation_public_key` field](https://github.com/streamingfast/substreams/blob/develop/proto/sf/substreams/rpc/v2/service.proto#L106) of the `SessionInit` response.

The attestation payload MUST be signed by a special _attestation key_, derived from the operator's key, so that disputes can be opened on a separate allocation destined for disputes.
The payload signed over is composed of:
- the Substreams output module's payload, hashed with SHA256 (32 bytes), followed by:
- the 'b' character, followed by:
- the block ID as a UTF-8 encoded string, followed by:
- the 'm' character, followed by:
- the module hash for the top-level Substreams module being attested (32 bytes)
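
The byte layout listed above can be sketched in Python as follows. This is a minimal illustration only: the function name, the argument types, and the block ID format are assumptions, and the signing step with the _attestation key_ is not shown because the signature scheme is not described in this section.

```python
import hashlib


def attestation_payload(module_output: bytes, block_id: str, module_hash: bytes) -> bytes:
    """Assemble the bytes to be signed for one block's attestation."""
    if len(module_hash) != 32:
        raise ValueError("module hash must be exactly 32 bytes")
    output_digest = hashlib.sha256(module_output).digest()  # 32 bytes
    return (
        output_digest
        + b"b"                      # separator before the block ID
        + block_id.encode("utf-8")  # block ID as a UTF-8 string
        + b"m"                      # separator before the module hash
        + module_hash               # hash of the top-level module being attested
    )
```

For a block ID of `n` UTF-8 bytes, the resulting payload is `32 + 1 + n + 1 + 32` bytes long.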

The attestation payload MUST be signed by a special _attestation key_, derived from the operator's key, so that disputes can be opened on the _Substreams Service Deployment ID_.

A valid _attestation key_, attached to an allocation MUST be verified by the Payment Gateway, before giving the assurance to Consumers that the Indexer can be slashed for misbehaving.
A valid _attestation key_, attached to such an allocation MUST be verified by the Payment Gateway, before giving the assurance to Consumers that the Indexer can be slashed for misbehaving.

An Arbiter MUST be able to validate payloads, through multiple providers and investigation, and after judgement, be able to slash the Indexer's stake, using the information provided in those payloads via an on-chain allocations.
An Arbiter MUST be able to validate payloads, through multiple providers and investigation, and after judgement, be able to slash the Indexer's stake, using the information provided in those payloads via an on-chain allocation.

> INFORMATIVE: The module hash covers everything needed for deterministic execution, and nothing more. It is easily computed by all known compliant Substreams libraries.


#### Methods of analysis
@@ -162,59 +205,6 @@ Assignment of Arbiters is not in scope for this specification, and should be dea

> INFORMATIVE: The Core Developers of Substreams committed to providing support for arbitration cases involving Substreams. Many third-party competent teams are also able and willing to help.

## Permissionless Discovery

This section discusses how Indexers will be listed on the Payments Gateway UI.

Substreams providers MUST register their service on-chain by publishing a registration payload via the DataEdge contract on the Arbitrum chain.

Here is a sample payload:

```json
{
  "type": "substreamsServiceRegistration",
  "data": {
    "endpoint": "https://mynetwork.substreams.example.com",
    "service": "substreams-v1",
    "network": "mynetwork",
    "providerLogo": "https://company.example.com/logo.png",
    "providerName": "My Company",
    "providerId": "my-company"
  }
}
```

Here is a sample revocation:

```json
{
  "type": "substreamsServiceRevocation",
  "data": {
    "service": "substreams-v1",
    "network": "mynetwork",
    "providerId": "my-company"
  }
}
```

The triplet `service`, `network` and `providerId` is what uniquely identifies a Substreams service published on the network.

The service `substreams-v1` means that the endpoint makes available a gRPC endpoint responding to these methods:

- sf.substreams.rpc.v2.Stream/Block
- sf.substreams.rpc.v2.EndpointInfo/Info

as specified by https://github.com/streamingfast/substreams/blob/develop/proto/sf/substreams/rpc/v2/service.proto updated from time-to-time in backwards compatible ways.

The Payment Gateway MUST listen to the Arbitrum DataEdge contract for registrations and revocations (through a streamed Substreams module, a Subgraph or other means), and update its local view of the network.

The Payment Gateway CAN have systems to perform health checks (e.g. on `/health` or `/info` endpoints), or other checks to ensure active, up-to-date and properly configured providers are offered to end users. A Payment Gateway CAN check block height to ensure the backing provider is close to chain head before offering it to users.


### DataEdge address

The exact address of the DataEdge will be decided by the Council, either reusing some of the listed addresses below, or deploying a new contract for this purpose.


### Tools

@@ -230,12 +220,6 @@ This command allows providers to register their service, specifying their operat



### References

- Current DataEdge deployment [addresses](https://github.com/graphprotocol/contracts/blob/main/packages/data-edge/addresses.json)
- The relevant contract sources are `DataEdge` ([DataEdge.sol](https://github.com/graphprotocol/contracts/blob/main/packages/data-edge/contracts/DataEdge.sol)) and `EventfulDataEdge` ([EventfulDataEdge.sol](https://github.com/graphprotocol/contracts/blob/main/packages/data-edge/contracts/EventfulDataEdge.sol)). `EventfulDataEdge` emits an event upon registration, while `DataEdge` does not.


# Meaning and commitment

Approving this GIP settles affirmatively the question of whether Substreams is "On The Network".