Hi @jpf91, you're welcome!

For deduplication, there is no need to "rewrite" chunks that already exist in the repository: the new archive simply references the existing chunk by its chunk id. This adds no overhead, borg has always worked this way, and the no-delete mode does not change it. I guess the only case where rewriting would be needed is recompressing (all) existing chunks with different compression settings. But if you just switch to new compression settings in the …

Yes, pruning and compacting will need delete permissions, and for that one would use a trusted borg client that is permitted to delete.

In borg2, prune and delete are currently very "quick and dirty": they just remove the single main reference to the archive(s), orphaning all chunks that are referenced only by the deleted archives. After that, compact cleans up and removes all orphaned (unreferenced) chunks.

prune and compact need to decrypt archive metadata, so they need the borg key (and passphrase) and should only be run from trusted machines. prune does not work with the "age of chunks/blocks"; it works with the archive timestamp contained in the encrypted archive metadata. compact also needs to decrypt the archives to determine which chunks are still in use.

So I guess most people will want to run prune and compact on one of their own machines, not on a provider machine. For some storage types (like cloud storage), the latter is usually not possible anyway.
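To illustrate the reference-then-compact idea described above, here is a minimal toy sketch in Python. It is not borg's actual code or data layout: the `ToyRepo` class, its method names, and the use of plain SHA-256 as the chunk id are all assumptions made for the example (borg derives chunk ids with a keyed function and stores archives encrypted).

```python
import hashlib


class ToyRepo:
    """Toy content-addressed store; a hypothetical model, not borg's real structures."""

    def __init__(self):
        self.chunks = {}    # chunk id -> chunk bytes
        self.archives = {}  # archive name -> list of chunk ids

    def create_archive(self, name, data_chunks):
        ids = []
        for data in data_chunks:
            # Assumption: plain SHA-256 stands in for the real chunk id function.
            chunk_id = hashlib.sha256(data).hexdigest()
            # Store only chunks the repo does not have yet; known chunks
            # are just referenced by id, nothing is rewritten.
            self.chunks.setdefault(chunk_id, data)
            ids.append(chunk_id)
        self.archives[name] = ids

    def delete_archive(self, name):
        # Only the archive reference is dropped; no chunk is touched here.
        del self.archives[name]

    def compact(self):
        # Collect every chunk id still referenced by some archive,
        # then remove the chunks that nothing references any more.
        in_use = {cid for ids in self.archives.values() for cid in ids}
        orphans = [cid for cid in self.chunks if cid not in in_use]
        for cid in orphans:
            del self.chunks[cid]
        return len(orphans)


repo = ToyRepo()
repo.create_archive("monday", [b"aaa", b"bbb"])
repo.create_archive("tuesday", [b"aaa", b"ccc"])  # b"aaa" is only referenced again
chunks_after_backups = len(repo.chunks)

repo.delete_archive("monday")
chunks_after_delete = len(repo.chunks)  # unchanged: deletion only orphans chunks

removed = repo.compact()
chunks_after_compact = len(repo.chunks)
```

In this model, creating an archive only ever adds chunks and references, which is why a client without delete permission loses nothing in deduplication. Only `compact` removes data, and it has to read every archive's chunk list to do so, which is the step that requires the key in the real, encrypted setting.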
Thanks for the detailed answer! So no-delete mode sounds like a good step forward. With it, a compromised client can also no longer mark archives as deleted so that an unattended compact would remove them? I guess prune would have to guard against timestamp manipulation (posting fake archives dated in the future so that the retention policy drops real, older archives). But apart from that, I guess it's safe?

As for compact needing the decryption key, that's a bit unfortunate 😅 I think in practice many users just want to host their backup somewhere without having to think about it too much. Borgbase with PikaBackup already does a great job at this, for example by sending emails when there has been no new backup. But as long as a compromised client can delete data, this is not foolproof. (And I think it's not even unlikely that crypto Trojans would actively delete backups.) So no-delete is required, but then how would a PikaBackup user with no technical background ensure regular compaction?

I think for those cases it's not acceptable to give the data key to storage providers. And even storing it on some vserver is not perfect. But if the provider could only read metadata, it should be less of an issue (I know, depending on the kind of metadata, it might leak file sizes and some more things; still, I think it would be acceptable). So is there any hope we could get separate keys for metadata (for compaction) and for data? I know it's probably a lot to ask for, but I really think needing fully trusted clients for compaction is a severe limitation.
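To make the timestamp concern concrete, here is a toy Python sketch of a "keep the newest N archives" rule. It is a simplification for illustration only, not borg's prune implementation: the `keep_last` function and the archive names are hypothetical, and the timestamps are whatever the client claims.

```python
from datetime import datetime, timedelta


def keep_last(archives, n):
    """Return the names of the n archives with the newest claimed timestamps."""
    newest_first = sorted(archives, key=lambda a: a[1], reverse=True)
    return [name for name, _ in newest_first[:n]]


now = datetime(2024, 1, 10)

# Three honest archives from the last three days.
real = [(f"real-{i}", now - timedelta(days=i)) for i in range(3)]

# A compromised client uploads archives that claim to be from the future.
fake = [(f"fake-{i}", now + timedelta(days=365 + i)) for i in range(3)]

kept_honest = keep_last(real, 3)
kept_attacked = keep_last(real + fake, 3)
```

In this model, `kept_attacked` contains only the fake archives, so a later prune and compact would discard every real one. A policy that ignores or rejects archives with timestamps later than the pruning machine's own clock would be one possible guard.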
Hi @ThomasWaldmann,
thanks for all the great work on borg!
One thing I always thought borg1 was lacking is a good model for potentially compromised clients (malware, crypto Trojans). The append-only mode fell short there because of the manual rollback.
So I wonder if borg2 addresses this fully. The no-delete mode seems to be the first step. But I wonder: if the client can't rewrite chunks, is dedup still fully effective? Of course it cannot delete old data, but in practice will it be able to reference old chunks just fine? Or will there be additional storage overhead even if no data was deleted in the backup?
And another thing: of course it would be necessary to run compact every now and then in this case to enforce the retention policy. I think the best setup would be one where a normal (potentially untrusted) client can't delete, and a second, partially trusted client (potentially hosted by the storage provider, e.g. borgbase) compacts the repo regularly.
But: can this be done without giving access to data encryption keys to the storage provider? Of course knowing the age of blocks through metadata can already leak some information, but I think this might be acceptable in some use cases. But giving data decryption keys to storage providers is not acceptable.
So I wonder: can borg2 support this use case?