feat: cas scaling #1189
base: develop
Conversation
Can we split
I didn't think we'd want to apply the load of either publishing or storing commits to IPFS. I feel like that's just asking for (now gigantic) failed batches due to IPFS issues. Do we really care to spend time figuring out how IPFS does under either type of load? IMO, cutting IPFS out of the picture will be essential for scaling CAS.
I definitely agree that the goal is to stop publishing to IPFS entirely. I'm just saying let's wind down our use of IPFS in two phases. First, let's stop publishing notifications of the anchor commits to pubsub, while continuing to create them in IPFS so that they are available to bitswap. Then, if we see no problems with that, we can stop creating the anchor commits against the CAS IPFS entirely.
I'll add the second flag, though I'm quite certain we'll be turning it off right about the time we start benchmarking CAS with 1M-sized batches 😉 I'd also posit a different way of looking at this. Our decision to keep IPFS a part of the CAS architecture shouldn't be based on whether or not IPFS has issues during testing, but on whether it's more or less likely to have issues with larger batches. I'm certain that it is much more likely to run into issues with larger batches that we'll then have to mitigate (e.g. writing intermediate Merkle-tree nodes might be too much too fast and need additional pacing, etc.). In other words, even if it sails through all of our benchmarking, I would still recommend turning IPFS off completely for production. Even if it works now under high load (I'm skeptical), that doesn't mean it will remain stable in production under the same type of load.
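For illustration, pacing the writes of intermediate Merkle-tree nodes could look roughly like the sketch below: write the nodes in bounded chunks with a short pause between chunks. The function name, chunk size, and delay are all hypothetical and not part of this PR.

```typescript
// Hypothetical sketch of pacing IPFS writes for intermediate Merkle-tree nodes.
async function writeNodesPaced(
  nodes: Uint8Array[],
  putNode: (node: Uint8Array) => Promise<void>,
  chunkSize = 50,
  delayMs = 100
): Promise<void> {
  for (let i = 0; i < nodes.length; i += chunkSize) {
    const chunk = nodes.slice(i, i + chunkSize)
    // Bounded parallelism: at most `chunkSize` writes in flight at once.
    await Promise.all(chunk.map(putNode))
    if (i + chunkSize < nodes.length) {
      // Short pause between chunks so a large batch doesn't hammer IPFS all at once.
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}
```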
Sorry, when I talked about making decisions based on whether or not there are issues, I wasn't saying that we shouldn't remove IPFS unless it presents performance issues. I want to remove IPFS ASAP. I was saying that I'm concerned that once we remove it, we'll see issues in the field where users aren't getting anchor commits. I'm concerned about any lingering bugs in the anchor polling system that delivers anchor commits via CAR files, because up until now we've always had pubsub and bitswap as a backup, so if that isn't totally working we might never have noticed. I'm also worried about users in the wild running old versions of js-ceramic that don't even have the new polling and CAR file features, and who might not upgrade or do anything until their writes suddenly start failing with CACAO errors due to missing anchor commits once we turn off publishing of anchors to IPFS. That is the reason I want to do this in two phases: so that we have time for users in the wild to report issues to us if they stop getting anchor commits, and for us to get a clearer picture of where things are breaking down if that happens.
Yeah, that makes sense, thanks for clarifying. If we have to choose, we can also always set up separate (dedicated) CAS clusters for V' partners that don't use IPFS at all, and leave the current CAS with IPFS.
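To make the two-phase plan concrete, the wind-down could be gated by two independent flags, roughly as sketched below. CAS_USE_IPFS_STORAGE appears in this PR's description; the pubsub flag name, the defaults, and the surrounding function are assumptions for illustration only.

```typescript
// Sketch of the two-phase wind-down, assuming env-var flags.
interface IpfsLike {
  dag: { put(node: unknown): Promise<unknown> }
  pubsub: { publish(topic: string, data: Uint8Array): Promise<void> }
}

// Flag from this PR; the default shown here is illustrative.
const USE_IPFS_STORAGE = process.env.CAS_USE_IPFS_STORAGE !== 'false'
// Hypothetical second flag for phase 1 (stop announcing anchors on pubsub).
const PUBLISH_TO_PUBSUB = process.env.CAS_PUBSUB_PUBLISH !== 'false'

async function storeAnchorCommit(
  ipfs: IpfsLike,
  topic: string,
  commit: unknown,
  bytes: Uint8Array
): Promise<void> {
  if (!USE_IPFS_STORAGE) return // phase 2: IPFS fully out of the anchoring path
  await ipfs.dag.put(commit) // phase 1: commit still retrievable over bitswap
  if (PUBLISH_TO_PUBSUB) {
    await ipfs.pubsub.publish(topic, bytes) // this is what phase 1 turns off
  }
  // CAR-file delivery via the anchor polling API is unaffected by either flag.
}
```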
…ng-or-another-approach' into feature/ws1-1562-prototype-for-cas-scaling
@@ -55,6 +55,7 @@ export class AnchorRepository implements IAnchorRepository {
  async findByRequestId(requestId: string): Promise<StoredAnchor | null> {
    const row = await this.table.where({ requestId: requestId }).first()
    if (!row) return null
    console.log("IM HERE with ", row, requestId)
Should we take this log out, @JulissaDantes? Might flood the logs at high throughput.
You are completely right. Thank you for pointing this out!
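If some visibility at this spot is still useful, a level-gated debug log would avoid flooding production output at high throughput. A minimal sketch, assuming a LOG_LEVEL environment variable rather than the service's actual logger:

```typescript
// Minimal sketch of a level-gated debug log; LOG_LEVEL is an assumed env var.
const DEBUG_ENABLED = process.env.LOG_LEVEL === 'debug'

function debugLog(message: string): void {
  // No-op unless debug logging is explicitly enabled, so hot paths like
  // findByRequestId stay quiet under high request throughput.
  if (DEBUG_ENABLED) console.debug(message)
}

debugLog('no stored anchor found for request <requestId>') // example call
```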
This PR makes two main changes (one of them gated behind a new CAS_USE_IPFS_STORAGE flag). These changes are backwards compatible. Once ready, they should be merged before the changes from this PR for updating the Scheduler to generate/store larger batches.