Skip to content

[sql-44] firewalldb: add migration code for privacy mapper from kvdb to SQL #1092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ViktorTigerstrom
Copy link
Contributor

@ViktorTigerstrom ViktorTigerstrom commented Jun 17, 2025

Based on the branch of #1079, to only show the relevant commits.

This PR introduces the migration logic for transitioning the privacy mapper from kvdb to SQL.

Note that as of this PR, the migration is not yet triggered by any production code, i.e. only tests execute the migration logic.

I will rebase this on the new version of #1079, once the current feedback has been addressed after we agree on the approach there.

As this PR is heavily dependent on #1079, this is blocked on the dependent PR until it has been merged. This is therefore not ready for review.

Part of #917

@ViktorTigerstrom ViktorTigerstrom added the no-changelog This PR is does not require a release notes entry label Jun 17, 2025
@ViktorTigerstrom ViktorTigerstrom self-assigned this Jun 17, 2025
@ellemouton
Copy link
Member

ellemouton commented Jun 18, 2025

@ViktorTigerstrom - feel free to push a base branch to upstream & then target that so that this pr can just contain the commits that are relevant

@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-06-migrate-privacy-mapper branch from bbe2b90 to 9715313 Compare July 11, 2025 22:45
@ViktorTigerstrom
Copy link
Contributor Author

Rebased this on the latest version of #1079, as that PR feels good to go, and this PR is therefore close to not being blocked by that PR (and therefore ready to review).

@ViktorTigerstrom ViktorTigerstrom changed the base branch from master to kvstores-migration-PR July 11, 2025 22:50
@ViktorTigerstrom
Copy link
Contributor Author

Requested a review by both of you @ellemouton & @bitromortac, even though #1079 isn't merged yet, just so you know that it is ready for review once that's been merged.

Copy link
Member

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really really great work. Very clean 🔥

Comment on lines +509 to +514
privPairs, err := collectPrivacyPairs(ctx, kvStore, sqlTx)
if err != nil {
return fmt.Errorf("error migrating privacy mapper store: %w",
err)
}

// 2) Insert all collected privacy pairs into the SQL database.
err = insertPrivacyPairs(ctx, sqlTx, privPairs)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may want to tests the performance of this one as i think unlike the kv-store db, this one could be quite populated.

So for this one it might make sense to migrate on the fly instead of collecting everything in memory first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I will benchmark this as a follow-up and see if it makes sense to process every privacy pair individually, or if this version works as well. I've prepared a separate branch that changes the code here to processes each pair one by one here:
https://github.com/ViktorTigerstrom/lightning-terminal/tree/2025-06-migrate-privacy-mapper-individual-pair-processing

That branch adds one additional commit that changes this code. If you think it already makes sense to change the code to that version already before benchmarking, let me know and I'll fixup that last commit into this PR.

Comment on lines 686 to 689
_, err := sqlTx.GetPseudoForReal(
ctx, sqlc.GetPseudoForRealParams{
GroupID: groupID,
RealVal: realVal,
},
)
if err == nil {
return fmt.Errorf("duplicate privacy pair %s:%s: %w",
realVal, pseudoVal, ErrDuplicatePseudoValue)
} else if !errors.Is(err, sql.ErrNoRows) {
return err
}

_, err = sqlTx.GetRealForPseudo(
ctx, sqlc.GetRealForPseudoParams{
GroupID: groupID,
PseudoVal: pseudoVal,
},
)
if err == nil {
return fmt.Errorf("duplicate privacy pair %s:%s: %w",
realVal, pseudoVal, ErrDuplicatePseudoValue)
} else if !errors.Is(err, sql.ErrNoRows) {
return err
}

err = sqlTx.InsertPrivacyPair(
ctx, sqlc.InsertPrivacyPairParams{
GroupID: groupID,
RealVal: realVal,
PseudoVal: pseudoVal,
},
)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not potentially do this in one go with a single query that has a conflict rule?

Copy link
Contributor Author

@ViktorTigerstrom ViktorTigerstrom Jul 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I noticed that since the privacy_mapper SQL table already has unique constraints, those additional duplication checks becomes redundant. I included those as they were present in the sql_store.go insertion function. As those checks therefore also are redundant there, I included an extra commit 5506372 that removes them there as well. If you think we shouldn't remove them there as well, let me know and I'll drop that commit :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included an extra commit 5506372

Actually, I noticed that we have those checks explicitly in the kvdb stores as well, so I think this falls back to what we've discussed before: Don't change or optimize & change the definition in the SQL store before we've migrated, and instead mimic the kvdb store behaviour. Therefore I updated the PR again to remove the extra commit, as I believe we need to wait until after the migration has been shipped before we can change that.

Comment on lines +604 to +601
if realBkt := bkt.Bucket(realToPseudoKey); realBkt != nil {
realToPseudoRes, err = collectPairs(realBkt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just checking: do we have a test somewhere that makes sure that things dont fail on an empty kvdb?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "empty" test in every migration does exactly that? Note that the real kvdb to SQL migration will also initiate the actual kvdb stores by calling accounts.NewBoltStore, session.NewDB & firewalldb.NewBoltDB before the migration starts, which initiates the base buckets.

Comment on lines 735 to 734
kvEntries: fn.Some(insertedEntries),
// No privacy pairs are inserted in this test.
privPairs: fn.None[privacyPairs](),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have a test that includes both

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good idea! I added a randomFirewallDBEntries that combines the results of randomKVEntries & randomPrivacyPairs into one test :)

Copy link
Contributor

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Only have a couple of nits 🚀

@@ -34,6 +34,11 @@ var (
testEntryValue = []byte{1, 2, 3}
)

// expectedResult represents the expected result of a migration test.
type expectedResult struct {
kvEntries fn.Option[[]*kvEntry]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it necessary to have an option type here, as an empty []*kvEntry already signals that, not? (also for the pair data)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think it's a bit cleaner to use an option type here to make it explicitly clear if there's any expected kvEntries or privPairs set. I noticed that we have a few other cases where the option type is used in combination with a [] in litd & lnd. I therefore kept this as is for now. I don't have super strong preference though, so if you do have strong preference for not using the option type here, I'll update to remove it.

@@ -213,6 +218,20 @@ func TestFirewallDBMigration(t *testing.T) {
}
}

// The assertMigrationResults function will currently assert that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd rather not put comments of what will happen in the future, because it may add confusion for somebody who's not involved in the process and it also may never happen, or we forget to update the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah true, agree 🙏! I added a new commit 4d2a423 that removes (hopefully) all of those comments

Comment on lines 686 to 710
_, err := sqlTx.GetPseudoForReal(
ctx, sqlc.GetPseudoForRealParams{
GroupID: groupID,
RealVal: realVal,
},
)
if err == nil {
return fmt.Errorf("duplicate privacy pair %s:%s: %w",
realVal, pseudoVal, ErrDuplicatePseudoValue)
} else if !errors.Is(err, sql.ErrNoRows) {
return err
}

_, err = sqlTx.GetRealForPseudo(
ctx, sqlc.GetRealForPseudoParams{
GroupID: groupID,
PseudoVal: pseudoVal,
},
)
if err == nil {
return fmt.Errorf("duplicate privacy pair %s:%s: %w",
realVal, pseudoVal, ErrDuplicatePseudoValue)
} else if !errors.Is(err, sql.ErrNoRows) {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are those checks necessary as they look to be covered by collectGroupPairs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed with #1092 (comment)


// multiSessionsOnePrivPairs inserts 1 session with 1 privacy pair into the
// boltDB.
func oneSessionAndPrivPair(t *testing.T, ctx context.Context,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could use createPrivacyPairs directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these functions to avoid having to change the signature of the populateDB function, and to not to not have to define the populateDB function within the actual tests list definition. If you have an idea of how to do it in a cleaner way than having these separate functions, let me know :)

Comment on lines 794 to 795
realKey := fmt.Sprintf("real-%d-%d", i, j)
pseudoKey := fmt.Sprintf("pseudo-%d-%d", i, j)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it add anything if we'd let real/pseudo keys be the same for the sessions, to leave i away?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm interesting idea! I changed this so that the realKey values will be the same across sessions (i.e. removed the i part for the realKey), but kept the pseudoKey value as is. The motivation behind this is that in real world data, there will often be cases where the real value in the privacy pairs will be the same across sessions, while the pseudo value should differ. If you disagree with that reasoning and think we should also change the pseudoKey values to be the same across sessions, let me know :)

As the firewalldb migration will include more than just the migration
of the kvstore data, we rename the migration tests that only migrate
the kvstore data to make it clearer which tests only focus on migrating
kv entries.
Currently, the migration tests for firewalldb only migrates the kv
stores. In future commits, we will also migrate the privacy mapper
and the actions in the firewalldb,. Before this commit, the expected
results of the migrations tests could only be kv records, which will not
be the case when we also migrate the privacy mapper and the actions.

Therefore, we prepare the migration tests to expect more than just kv
records. This commit introduces a new type of `expectedResult` type
which the prep of the migration tests will use, which can specify more
than just one type of expected result.
This commit removes speculative comments from the firewalldb migration
docs that forecast future implementations. Such comments can create
confusion for developers looking at the current code base without
knowing our plans how the migration will be further developed.
@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-06-migrate-privacy-mapper branch from 9715313 to d440c12 Compare July 23, 2025 14:31
@ViktorTigerstrom ViktorTigerstrom changed the base branch from kvstores-migration-PR to master July 23, 2025 22:35
This commit introduces the migration logic for transitioning the
privacy mapper store from kvdb to SQL.

Note that as of this commit, the migration is not yet triggered by any
production code, i.e. only tests execute the migration logic.
@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-06-migrate-privacy-mapper branch 2 times, most recently from 5506372 to a4fc12e Compare July 23, 2025 23:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog This PR is does not require a release notes entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants