GPII-3391: New dataloader (orig: Add force_update to gpii-dataloader) #163
|
Yes, the dataloader is idempotent, but it deletes all the data in CouchDB before uploading the data again. https://github.com/gpii-ops/gpii-dataloader/blob/master/loadData.sh#L45-L50 So I guess it will destroy all the empty preference sets that Javi will eventually be creating. |
|
Ok, after discussion at standup I'll leave this open and we'll bring it up at Wednesday's APCP meeting. Originally the dataloader was meant to be run every time there's a new version, but because its current implementation wipes all data, we might not really want to fix this :). |
|
This PR gpii-ops/gpii-dataloader#6 and GPII/universal#626 should be merged first. |
Thanks Alf, I'm going through those tickets to review them. There's also #83 in this repo that should be merged, I assume. |
|
I've updated the PR description with the DB cleanup steps required for the new dataloader. At the moment this is still waiting for the move of the dataloader from its own image to universal. |
|
@stepanstipl You mean "Delete":
|
Yea, I do :), thanks, fixed speling. |
…eferences to be consistent with dataloader
|
This now depends on GPII/universal#692 and gpii-ops/gpii-version-updater#10, replaces #83. See description above for details. |
62ccb59 to 4885846
|
LGTM |
|
LGTM |
|
@stepanstipl FYI, I've got a copy of this pull and #10 (although there I'm using my docker account for |
|
LGTM |
|
LGTM |
|
@stepanstipl Progress! I wrote:
With @mrtyler's help, my AWS dev cluster is back up and running. The dataloader did its thing and properly added the views and snapsets. Since this run involved creating a new database from scratch, that's all the dataloader did. I'll try adding some new views and 'user' PrefsSafes. But, since it's a weekend, I'll |
|
I killed the dataloader, and re-deployed it. It correctly erased and replaced the snapsets (I checked by looking at the creation dates). |
|
Added one 'user' PrefsSafe/GPII key by hand to the database, and re-deployed the data loader. Only the snapsets were updated (again, by checking creation dates). |
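The behaviour tested above (snapsets replaced, 'user' PrefsSafes left alone) can be sketched roughly as follows. This is only an illustration, not the dataloader's actual code: the field names `type`, `prefsSafeType` and `prefsSafeId`, and the function name `snapset_deletions`, are assumptions made for the example. The payload shape (`_deleted: true` entries under `docs`) is the one CouchDB's `_bulk_docs` endpoint accepts for deletions.

```python
def snapset_deletions(docs):
    """Build a CouchDB _bulk_docs payload that deletes snapset PrefsSafes and their keys.

    Assumed (hypothetical) document shape: PrefsSafes carry type "prefsSafe" and
    prefsSafeType "snapset" or "user"; GPII keys carry type "gpiiKey" and a
    prefsSafeId pointing at their PrefsSafe.
    """
    # Collect the ids of every snapset PrefsSafe.
    snapset_ids = {
        d["_id"] for d in docs
        if d.get("type") == "prefsSafe" and d.get("prefsSafeType") == "snapset"
    }
    # Select those PrefsSafes plus any GPII key that references one of them.
    doomed = [
        d for d in docs
        if d["_id"] in snapset_ids
        or (d.get("type") == "gpiiKey" and d.get("prefsSafeId") in snapset_ids)
    ]
    # Mark each selected document as deleted; everything else is left untouched.
    return {
        "docs": [
            {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True} for d in doomed
        ]
    }
```

Documents without a matching `type` (views, 'user' PrefsSafes and their keys) never enter the payload, which is why a re-run would only touch the snapsets.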
|
@klown thanks for testing further |
One more successful test this morning:
|
|
I found one oddity, @stepanstipl. The dataloader log shows that the "build" directory is:
So: why does the dataloader running inside the universal image not have access to the |
|
Actually, I think I see part of the problem. Line 6 of
|
|
Yes, that's on purpose - the files in the build directory get created when the convert script is run ( You're right that the build directory tree is created as part of postinstall, but it does not contain any files, just two empty dirs - |
Starting with no I wanted to say that when I create the universal docker image, I see the same result. But, actually an error occurs with the
Double checking CI and the last test run on Fri shows the same error message -- look for Still, you have a point about the distinction between preferences data from the docker build time vs. data generated when the dataloader runs. The latter has the latest versions of Any insights @cindyli? |
|
I think that there's actually no need to do the conversion during the dataloader runtime as the Docker image is immutable, therefore the result should always be the same - both during build time and runtime (I guess it was necessary to do that when the dataloader was cloning the repo during runtime). So if we fix the postinstall step, we should just be able to load the data directly from the I imagine the only difference would be in case you run the |
|
There's another difference. A developer can, similarly, run Beyond fixing the |
@klown you mean change the default value of that variable in the dataloader ( |
No, my mistake. I meant that it should default to
Yes, for development. Suppose a developer wants to make a tweak to one of the preferences in The developer quickly gets their new test preferences converted to a |
I was able to reliably reproduce the problem described in GPII-3391; it happens when the version of the dataloader is updated. The reason is that a batch/v1.Job's container image is not mutable.
The solution is to use helm's --force upgrade, which:
As the dataloader is expected to be idempotent, this should work just fine.
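To make the failure mode concrete, here is a minimal Python sketch. It assumes Job manifests are plain dicts shaped like a Kubernetes batch/v1 Job; the helper names, the Job name and the image tags are hypothetical and not taken from this repo. Since a Job's pod template cannot be changed in place, a changed image means the Job has to be deleted and recreated rather than patched.

```python
def container_images(job):
    """Return the container images listed in a Job manifest's pod template."""
    containers = job["spec"]["template"]["spec"]["containers"]
    return [c["image"] for c in containers]


def needs_recreate(old_job, new_job):
    """True if the images differ, i.e. an in-place update would be rejected."""
    return container_images(old_job) != container_images(new_job)


def make_job(image):
    # Hypothetical minimal manifest; a real chart carries many more fields.
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "dataloader"},
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": "dataloader", "image": image}],
                }
            }
        },
    }
```

For example, `needs_recreate(make_job("example/dataloader:1"), make_job("example/dataloader:2"))` is true, which is the case hit when the dataloader version is bumped.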
This depends on gpii-ops/exekube#19
Update:
As this is connected with the new dataloader release, it will require a manual cleanup of gpiiKey and corresponding prefsSafe records in the DB. And as we're dealing with the prod DB, I documented all the steps to be taken and the expected number of deleted docs:
For details see GPII/universal#626.
Update 2 - merge/release steps:
When merging and releasing, the order should be as follows:
(All the cleanup steps apply only to GCP, as behaviour for AWS will not change - the new versions of dataloader are not applied there)