
Commit b52f7a0

update development environment
1 parent 10f8602 commit b52f7a0

File tree

13 files changed (+115, -47 lines)


.gitignore

Lines changed: 4 additions & 1 deletion
```diff
@@ -75,5 +75,8 @@ target/
 # Jupyter NB Checkpoints
 .ipynb_checkpoints/
 
+# bash history
+.bash_history
+
 # exclude data from source control by default
-/data/
+# /data/
```

.test_environment.py.swp

1 KB (binary file not shown)

Dockerfile

Lines changed: 42 additions & 10 deletions
```diff
@@ -1,18 +1,50 @@
-# Base image
-FROM python:3.6
 
-# Updating repository sources
-RUN apt-get update
+# Adapted from https://towardsdatascience.com/how-docker-can-help-you-become-a-more-effective-data-scientist-7fc048ef91d5
+FROM ubuntu:16.04
+
+# Add metadata to the image as a key-value pair
+LABEL maintainer="Simon Kassel <[email protected]>"
+
+RUN apt-get update --fix-missing && apt-get install -y wget bzip2 ca-certificates \
+    build-essential \
+    byobu \
+    curl \
+    git-core \
+    htop \
+    pkg-config \
+    python3-dev \
+    python-pip \
+    python-setuptools \
+    python-virtualenv \
+    unzip \
+    nano \
+    && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN echo 'export PATH=/opt/conda/bin:$PATH' > /etc/profile.d/conda.sh && \
+    wget --quiet https://repo.continuum.io/archive/Anaconda3-5.0.0.1-Linux-x86_64.sh -O ~/anaconda.sh && \
+    /bin/bash ~/anaconda.sh -b -p /opt/conda && \
+    rm ~/anaconda.sh
+
+ENV PATH /opt/conda/bin:$PATH
 
 # Copying requirements.txt file
 COPY requirements.txt requirements.txt
 
 # pip install
-RUN pip install --no-cache -r requirements.txt
+RUN pip install --no-cache -r requirements.txt && \
+    rm requirements.txt
+
+# Open port for Jupyter
+EXPOSE 8000
 
-# Exposing ports
-EXPOSE 8888
+# Set up the file system
+RUN mkdir /project
+ENV HOME=/project
+ENV SHELL=/bin/bash
+VOLUME /project
+WORKDIR /project
 
-# Running jupyter notebook
-# --NotebookApp.token ='demo' is the password
-# CMD ["jupyter", "notebook", "--no-browser", "--ip=0.0.0.0", "--allow-root", "--NotebookApp.token='demo'"]
+# Run a bash shell by default
+CMD ["/bin/bash"]
```
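For context, a minimal sketch of how this image might be built and run by hand. The `da-project-template` tag comes from the README below, and the flags mirror the `EXPOSE 8000` and `VOLUME /project` lines above; the exact invocation used by the project's `scripts/container.sh` is not shown in this commit:

```bash
# Build the image from the repository root (one-time step)
docker build -t da-project-template .

# Open an interactive shell in a throwaway container, mounting the
# current directory at /project and publishing the Jupyter port
docker run -it --rm \
    -v "$(pwd)":/project \
    -p 8000:8000 \
    da-project-template
```

From that shell, Jupyter would need to be started with `jupyter notebook --ip=0.0.0.0 --port=8000 --allow-root` to be reachable through the published port.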

README.md

Lines changed: 49 additions & 32 deletions
```diff
@@ -1,54 +1,71 @@
-Azavea data analytics team python project template
+Azavea Data Analytics team Python project template
 ==============================
 
-template for Azavea Data Analytics team Python projects
+A file structure template, development environment, and rule set for Python data analytics projects on the Data Analytics team
+
+Getting Started
+------------
+From within the root directory, first remove git tracking from the project:
+
+`rm -rf .git`
+
+If you have not already done so, build the Docker image (you will only need to do this once):
+
+`docker build -t da-project-template .`
+
+Run a Docker container:
+
+`./scripts/container.sh .`
+
+This will open a bash shell within the Docker container. Inside the container, the project directory on the host machine (specified as a parameter to `container.sh` above) maps to `/project`. You can now access the full file structure of this template from within the container.
+
+To exit:
+
+`exit`
 
 Project Organization
 ------------
 
-├── LICENSE
-├── Makefile           <- Makefile with commands like `make data` or `make train`
 ├── README.md          <- The top-level README for developers using this project.
 ├── data
-│   ├── external       <- Data from third party sources.
-│   ├── interim        <- Intermediate data that has been transformed.
-│   ├── processed      <- The final, canonical data sets for modeling.
-│   └── raw            <- The original, immutable data dump.
+│   ├── interim        <- Intermediate data that has been transformed
+│   ├── organized      <- Raw datasets that have been renamed or reorganized into a new folder structure but whose contents are unchanged
+│   ├── processed      <- The final, canonical data sets for modeling
+│   └── raw            <- The original, immutable data dump
+
+├── docs               <- A default Sphinx project; see sphinx-doc.org for details (currently not configured)
 
-├── docs               <- A default Sphinx project; see sphinx-doc.org for details
+├── guide              <- A set of markdown files documenting best practices, guidelines, and rules for collaborative projects
 
 ├── models             <- Trained and serialized models, model predictions, or model summaries
 
 ├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
-│                         the creator's initials, and a short `-` delimited description, e.g.
-│                         `1.0-jqp-initial-data-exploration`.
+│                         the creator's initials, and a short `-` delimited description, e.g.
+│                         `1.0-jqp-initial-data-exploration`
 
 ├── references         <- Data dictionaries, manuals, and all other explanatory materials.
 
 ├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
 │   └── figures        <- Generated graphics and figures to be used in reporting
 
-├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
-│                         generated with `pip freeze > requirements.txt`
-
-├── src                <- Source code for use in this project.
-│   ├── __init__.py    <- Makes src a Python module
-│   │
-│   ├── data           <- Scripts to download or generate data
-│   │   └── make_dataset.py
-│   │
-│   ├── features       <- Scripts to turn raw data into features for modeling
-│   │   └── build_features.py
-│   │
-│   ├── models         <- Scripts to train models and then use trained models to make
-│   │   │                 predictions
-│   │   ├── predict_model.py
-│   │   └── train_model.py
-│   │
-│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
-│       └── visualize.py
-
-└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org
+├── requirements.txt   <- The requirements file for reproducing the analysis environment
+
+└── src                <- Source code for use in this project.
+
+    ├── data           <- Scripts to download or generate data
+    │   └── make_dataset.py
+
+    ├── features       <- Scripts to turn raw data into features for modeling
+    │   └── build_features.py
+
+    ├── models         <- Scripts to train models and then use trained models to make
+    │   │                 predictions
+    │   ├── predict_model.py
+    │   └── train_model.py
+
+    └── visualization  <- Scripts to create exploratory and results oriented visualizations
+        └── visualize.py
+
 
 
 --------
```
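The `scripts/container.sh` helper referenced in the Getting Started section above is not included in this diff. A hypothetical sketch of what it might contain, assuming it takes the host project directory as its single argument and mounts it at `/project`:

```bash
#!/bin/bash
# Hypothetical reconstruction of scripts/container.sh (not in this diff).
# Usage: ./scripts/container.sh <path-to-project-directory>
set -e

# Resolve the directory argument to an absolute path for the bind mount
PROJECT_DIR=$(cd "${1:-.}" && pwd)

# Start an interactive container with the project mounted at /project;
# the image tag is assumed from the README's build step
docker run -it --rm \
    -v "$PROJECT_DIR":/project \
    -p 8000:8000 \
    da-project-template
```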

data/.gitignore

Lines changed: 4 additions & 0 deletions
```diff
@@ -0,0 +1,4 @@
+# Ignore everything in this directory
+*
+# Except this file
+!.gitignore
```
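This is the standard whitelist pattern for keeping an otherwise-ignored directory under version control: ignore everything, then re-include the `.gitignore` itself. A quick way to confirm the rules behave as intended (the data file path here is illustrative):

```bash
# -v prints the matching .gitignore rule; exit status 0 means "ignored"
git check-ignore -v data/raw/example.csv

# The .gitignore itself should NOT be ignored (non-zero exit status)
git check-ignore -v data/.gitignore || echo "data/.gitignore is tracked"
```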
File renamed without changes.
File renamed without changes.

docs/README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -0,0 +1 @@
+# [Work in Progress]
```

guide/README.md

Lines changed: 3 additions & 0 deletions
```diff
@@ -0,0 +1,3 @@
+# Rules & Best Practices
+
+Markdown documentation of rules, guidelines, and best practices for working on collaborative data analysis projects on the Data Analytics team
```

requirements.txt

Lines changed: 3 additions & 1 deletion
```diff
@@ -55,4 +55,6 @@ traitlets==4.3.2
 wcwidth==0.1.7
 Werkzeug==0.12.2
 widgetsnbextension==3.0.6
-jupyterlab==0.31.9
+jupyterlab==0.31.9
+geopandas==0.3.0
+descartes==1.1.0
```
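After rebuilding the image, one way to confirm the new geospatial packages installed cleanly might be a quick import check inside a container (image tag assumed from the README):

```bash
docker run --rm da-project-template \
    python -c "import geopandas, descartes; print(geopandas.__version__)"
```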
