This repository was archived by the owner on Nov 5, 2018. It is now read-only.
Open
Changes from 12 commits
34 commits
edfc334
GPII-3138: Modified snapset loading to remove existing snapsets, the…
klown Jun 29, 2018
760b795
GPII-3138: Script to update snapsets.
klown Jun 29, 2018
3583141
GPII-3138: Script to update snapsets.
klown Jul 4, 2018
f2374d4
GPII-3138: Script to update snapsets in the database.
klown Jul 4, 2018
c1a494f
GPII-3138: Script to update snapsets in the database.
klown Jul 6, 2018
a8a161f
GPII-3138: Script to update snapsets in the database
klown Jul 6, 2018
98e70d3
GPII-3138: Update snapsets in the database
klown Jul 6, 2018
d042026
GPII-3138: Update snapsets in the database
klown Jul 10, 2018
96bddc6
GPII-3138: Update snapsets in the database
klown Jul 10, 2018
7a290b2
GPII-3138: Update snapsets in the database
klown Jul 10, 2018
01908db
GPII-3138: Update snapsets in the database
klown Jul 11, 2018
372e9c4
GPII-3138: Update snapsets in the database
klown Jul 11, 2018
8118574
GPII-3138: Update snapsets in the database
klown Jul 16, 2018
d42e2a5
GPII-3138: Update snapsets in the database
klown Jul 20, 2018
a730327
GPII-3138: Update snapsets in the database
klown Jul 23, 2018
0e9577b
GPII-3138: Update snapsets in the database.
klown Jul 24, 2018
933d03f
GPII-3138: Docker image that updates snapset in the database
klown Jul 30, 2018
0ba06e4
GPII-3138: Docker image that updates snapset in the database
klown Jul 31, 2018
9bb3f46
GPII-3138: Docker image that updates snapset in the database
klown Jul 31, 2018
8c60cd8
GPII-3138: Docker image that updates snapset in the database
klown Aug 1, 2018
ab35aa1
GPII-3138: Docker image that updates snapset in the database
klown Aug 5, 2018
0034b4d
GPII-3138: Docker image that updates snapset in the database
klown Aug 7, 2018
2059195
GPII-3138: Docker image that updates snapset in the database
klown Aug 7, 2018
29b414e
GPII-3138: Docker image that updates snapset in the database
klown Sep 10, 2018
c9cf927
GPII-3138: Docker image that updates snapsets in the database
klown Sep 10, 2018
82aa358
GPII-3138: Docker image that updates snapsets in the database
klown Sep 12, 2018
c14879c
GPII-3138: Docker image that updates snapsets in the database
klown Sep 12, 2018
b7bea34
GPII-3138: Docker image that updates snapsets in the database
klown Sep 18, 2018
9b4b3f1
GPII-3138: Docker image that updates snapsets in the database
klown Sep 20, 2018
0ec50b3
GPII-3138: Docker image that updates snapsets in the database
klown Sep 20, 2018
05e8dcc
GPII-3138: Docker image that updates snapsets in the database
klown Sep 20, 2018
a147932
GPII-3138: Docker image that updates snapsets in the database
klown Sep 20, 2018
ab4d265
GPII-3138: Docker image that updates snapsets in the database
klown Sep 27, 2018
2680cd3
GPII-3138: Docker image that updates snapsets in the database
klown Oct 17, 2018
15 changes: 3 additions & 12 deletions Dockerfile
@@ -2,17 +2,8 @@ FROM node:8-alpine

 WORKDIR /home/node

-RUN apk add --no-cache curl git && \
-git clone https://github.com/GPII/universal.git && \
-cd universal && \
-rm -f package-lock.json && \
-npm install json5 && \
-npm install fs && \
-npm install rimraf && \
-npm install mkdirp && \
-node scripts/convertPrefs.js testData/preferences/ build/dbData/ && \
-apk del git
+RUN apk add --no-cache curl git

-COPY loadData.sh /usr/local/bin
+COPY deleteAndLoadSnapsets.sh /usr/local/bin/

-CMD ["/usr/local/bin/loadData.sh"]
+CMD ["/usr/local/bin/deleteAndLoadSnapsets.sh"]
11 changes: 9 additions & 2 deletions README.md
@@ -1,6 +1,12 @@
 # CouchDB Data Loader

-Builds a [sidecar container](http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html) with the CouchDB data from the GPII/universal repository baked in and a mechanism for loading them into a CouchDB database.
+Builds a [sidecar container](http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html) that contains the `git` command and a shell script for setting up a CouchDB database. When the docker image is run, this sequence is executed:
+1. Clones the latest version of [GPII universal](https://github.com/gpii/universal/),
+1. Converts the preferences in universal into snapset Prefs Safes and GPII Keys,
+1. Creates a CouchDB database if none exists,
+1. Optionally clears an existing database of all its records,
+1. Deletes any snapsets currently in the database,
+1. Loads the latest snapsets created at the second step into the database.

 ## Building

@@ -12,6 +18,7 @@ Builds a [sidecar container](http://blog.kubernetes.io/2015/06/the-distributed-s
 - `CLEAR_INDEX`: If defined, the database at $COUCHDB_URL will be deleted and recreated. (optional)
 - `STATIC_DATA_DIR`: The directory where the static data to be loaded into CouchDB resides. (optional)
 - `BUILD_DATA_DIR`: The directory where the data built from a npm step resides. (optional)
+- `NODE_PATH`: Universal's root directory. (optional)

 The use of environment variables for data directories is useful if you want to mount the database data using a Docker volume and point the data loader at it.

@@ -31,5 +38,5 @@ $ docker run -d -p 8081:8081 --name preferences --link couchdb -e NODE_ENV=gpii.
 Loading couchdb data from a different location (e.g. /home/vagrant/sync/universal/testData/dbData for static data directory and /home/vagrant/sync/universal/build/dbData for build data directory):

```
-$ docker run --name dataloader --link couchdb -v /home/vagrant/sync/universal/testData/dbData:/static_data -e STATIC_DATA_DIR=/static_data -v /home/vagrant/sync/universal/build/dbData:/build_data -e BUILD_DATA_DIR=/build_data -e COUCHDB_URL=http://couchdb:5984/gpii -e CLEAR_INDEX=1 gpii/gpii-dataloader
+$ docker run --name dataloader --link couchdb -v /home/vagrant/sync/universal/testData/dbData:/static_data -e STATIC_DATA_DIR=/static_data -v /home/vagrant/sync/universal/build/dbData:/build_data -e BUILD_DATA_DIR=/build_data -e COUCHDB_URL=http://couchdb:5984/gpii [-e CLEAR_INDEX=1] -v /home/vagrant/sync/universal:/universal -e NODE_PATH=/universal gpii/gpii-dataloader
Contributor:

I think [-e CLEAR_INDEX=1] is meant to indicate that this parameter is optional (hence the square brackets). Was this the intent?

I think this is going to confuse users, because when I copy/paste the line I get this unhelpful error:

docker: invalid reference format.
See 'docker run --help'.

I suggest removing the [-e CLEAR_INDEX=1] clause from this command and mentioning it in a bullet point afterward, or listing the complete command twice -- once with and once without the [-e CLEAR_INDEX=1].

Author:

Good idea @mrtyler. I've taken the second approach showing the two ways to run the image, one with CLEAR_INDEX set, and another with it not set.

```
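
The review thread above suggests showing the command twice instead of bracketing the optional flag. The bracketed form presumably fails because the shell passes `[-e` to docker literally, where it is read as the image name. A minimal sketch of building both variants follows; `build_run_command` is a hypothetical helper that is not part of this repository, and the volume and data-directory options from the README are left out for brevity.

```shell
#!/bin/sh
# Sketch only: build_run_command is a hypothetical helper, not part of gpii-dataloader.
# It prints a `docker run` command line and adds CLEAR_INDEX only when asked to.
build_run_command() {
    clear_index="$1"   # pass a non-empty value (e.g. 1) to clear the database first
    cmd="docker run --name dataloader --link couchdb"
    cmd="$cmd -e COUCHDB_URL=http://couchdb:5984/gpii"
    if [ -n "$clear_index" ]; then
        cmd="$cmd -e CLEAR_INDEX=$clear_index"
    fi
    echo "$cmd gpii/gpii-dataloader"
}

build_run_command ""   # variant without CLEAR_INDEX
build_run_command 1    # variant with CLEAR_INDEX
```

Printing the command instead of running it keeps the two variants easy to compare and to paste into documentation.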
19 changes: 19 additions & 0 deletions loadData.sh → deleteAndLoadSnapsets.sh
@@ -2,6 +2,7 @@

 STATIC_DATA_DIR=${STATIC_DATA_DIR:-/home/node/universal/testData/dbData}
 BUILD_DATA_DIR=${BUILD_DATA_DIR:-/home/node/universal/build/dbData}
+NODE_PATH=${NODE_PATH:-/home/node/universal}
Contributor:

I am wondering whether a variable name like UNIVERSAL_PATH would be better than NODE_PATH, because this path is defined for accessing scripts/ in the universal repo. What do you think?

Contributor (@mrtyler, Jul 12, 2018):

I agree that NODE_PATH isn't very clear.

It's a little ambiguous because "universal" can mean so many things, but I like UNIVERSAL_DIR or maybe UNIVERSAL_ROOT_DIR (_PATH isn't consistent with how the other variables are named).

Author:

There is a way to eliminate the NODE_PATH variable entirely. Related: I discovered that since the deleteSnapsets.js script uses infusion, an npm install infusion is required here. Simply cloning universal does not provide an instance of infusion within its node_modules folder. If infusion is installed this way, and a cd universal is executed after it is cloned (already done), that's enough for the node command to find infusion. No need for NODE_PATH.
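
For comparison, the renaming discussed in this thread could look roughly like the sketch below. `UNIVERSAL_DIR` is one of the names proposed by the reviewers, not a variable that exists in the script, and `universal_script` is a hypothetical helper introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: UNIVERSAL_DIR is a name proposed in review, not an existing variable.
# Fall back to the clone location used elsewhere in the script when it is unset.
UNIVERSAL_DIR="${UNIVERSAL_DIR:-/home/node/universal}"

# Hypothetical helper: print the path of a script inside the universal clone.
universal_script() {
    echo "$UNIVERSAL_DIR/scripts/$1"
}

universal_script deleteSnapsets.js
```

The `_DIR` suffix matches `STATIC_DATA_DIR` and `BUILD_DATA_DIR`, which is the consistency argument made above.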


log() {
echo "$(date +'%Y-%m-%d %H:%M:%S') - $1"
@@ -31,7 +32,24 @@ if [ -z "$COUCHDB_URL" ]; then
fi

log "Starting"
log "Clear index: $CLEAR_INDEX"
log "Static: $STATIC_DATA_DIR"
log "Build: $BUILD_DATA_DIR"
log "Node path: $NODE_PATH"
log "Working directory: `pwd`"

# Set up universal
git clone https://github.com/GPII/universal.git
Contributor:

Thanks for changing this to pull the universal repo dynamically at every run.

Author:

Thanks @cindyli.

You're right, the httpTests.js was just an investigative script to see how the http module worked, especially with CouchDB urls. I've removed it.

I've also updated the README.

Contributor:

I realize you are just moving what was there before (and I probably should have caught it when reviewing the change when Cindy made it ;)), but please add --depth 1. It will make the clone run faster, and will result in a (slightly) smaller Docker image.

Author:

Sure. Done.
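
A shallow clone along the lines requested here might look like the following sketch. `clone_universal` is a hypothetical wrapper and its default arguments are assumptions taken from the diff; only `git clone --depth 1` itself is the change under discussion.

```shell
#!/bin/sh
# Sketch only: clone_universal is a hypothetical wrapper, not part of the script.
# --depth 1 fetches just the most recent commit instead of the full history.
clone_universal() {
    repo_url="${1:-https://github.com/GPII/universal.git}"
    target_dir="${2:-universal}"
    git clone --depth 1 "$repo_url" "$target_dir"
}

# Usage (needs network access, so it is not run here):
#   clone_universal && cd universal
```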

cd universal
rm -f package-lock.json
Contributor:

This line is no longer necessary as package-lock.json has been removed from the universal repo.

Author:

Right; however, the subsequent npm install commands create a package-lock.json. I don't think it matters since the point is to set up a clone of universal and use it to load and modify the database. That is, I don't think package-lock.json affects the database operations. But, I'm not 100% sure, so, for the time being I've moved the rm command to after the npm install commands.

npm install json5
npm install fs
npm install rimraf
npm install mkdirp
node scripts/convertPrefs.js testData/preferences/ build/dbData/
cd -

# Initialize (possibly clear) data base
if [ ! -z "$CLEAR_INDEX" ]; then
log "Deleting database at $COUCHDB_URL"
Contributor:

Ditto.

if ! curl -fsS -X DELETE "$COUCHDB_URL"; then
@@ -45,5 +63,6 @@ if ! curl -fsS -X PUT "$COUCHDB_URL"; then
 fi

 # Submit data
+node $NODE_PATH/scripts/deleteSnapsets.js $COUCHDB_URL
 loadData $STATIC_DATA_DIR
 loadData $BUILD_DATA_DIR
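
The database initialization step that precedes the data submission can be summarized with the sketch below. `init_database` is a hypothetical function that only wraps the two `curl` calls visible in the diff. The error messages are assumptions, because the bodies of the original `if ! curl ...` branches are not shown here.

```shell
#!/bin/sh
# Sketch only: init_database is hypothetical; the messages below are assumed,
# since the original error handling is not visible in the diff.
init_database() {
    url="$1"           # e.g. http://couchdb:5984/gpii
    clear_index="$2"   # a non-empty value deletes the database first

    if [ -n "$clear_index" ]; then
        # -f makes curl exit non-zero on an HTTP error status
        curl -fsS -X DELETE "$url" || echo "Could not delete $url (it may not exist yet)" >&2
    fi

    # CouchDB rejects this PUT when the database already exists, so a failure
    # here does not necessarily mean the database is unusable.
    curl -fsS -X PUT "$url" || echo "Could not create $url (it may already exist)" >&2
}

# Usage (needs a running CouchDB, so it is not run here):
#   init_database "$COUCHDB_URL" "$CLEAR_INDEX"
```

Keeping creation unconditional and deletion behind `CLEAR_INDEX` mirrors the order described in the README: create the database if needed, optionally clear it, then delete and reload the snapsets.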