new: generate diagrams from code
- Track mermaid code in separate files (*.mmd)
- Generate diagrams using mermaid cli courtesy of docker image: https://github.com/mermaid-js/mermaid-cli#use-dockerpodman
nickumia-reisys committed Sep 12, 2023
1 parent 5894911 commit 0e30190
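
The commit message describes rendering each `*.mmd` file with the mermaid-cli Docker image. A minimal Python sketch of that step follows. The image name and the `-i`, `-o`, `-t` and `-b` flags come from the command in `docs/harvesting.md`. The helper names `mermaid_command` and `render_all` are hypothetical, as is resolving the mount to an absolute path (done here because some Docker versions reject relative bind mounts). `os.getuid` and `os.getgid` are POSIX-only.

```python
import os
import subprocess
from pathlib import Path
from typing import List, Optional


def mermaid_command(
    mmd_file: Path,
    theme: Optional[str] = None,
    background: Optional[str] = None,
) -> List[str]:
    """Build the docker invocation that renders one .mmd file to an .svg beside it."""
    docs_dir = mmd_file.resolve().parent  # absolute host path for the bind mount
    cmd = [
        "docker", "run", "--rm",
        "-u", f"{os.getuid()}:{os.getgid()}",  # write output as the calling user
        "-v", f"{docs_dir}:/data",
        "minlag/mermaid-cli",
        "-i", mmd_file.name,
        "-o", mmd_file.with_suffix(".svg").name,
    ]
    # Optional styling flags, e.g. theme="dark", background="transparent".
    if theme:
        cmd += ["-t", theme]
    if background:
        cmd += ["-b", background]
    return cmd


def render_all(docs_dir: Path) -> None:
    """Render every .mmd file found directly inside docs_dir (hypothetical helper)."""
    for mmd_file in sorted(docs_dir.glob("*.mmd")):
        subprocess.run(mermaid_command(mmd_file), check=True)
```

Calling `render_all(Path("docs"))` from the repository root would regenerate both SVGs, assuming Docker is installed and the image can be pulled.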
Showing 5 changed files with 41 additions and 36 deletions.
43 changes: 7 additions & 36 deletions docs/harvesting.md
@@ -1,43 +1,14 @@
# Harvesting Pipeline Structure

To generate diagrams, run the mermaid-cli:
```bash
docker run --rm -u `id -u`:`id -g` -v ./:/data minlag/mermaid-cli -i new_harvesting.mmd -o new_harvesting.svg [-t dark -b transparent]
```

## Old Harvesting Logic
Unique to each file + schema format
```mermaid
flowchart LR
sc([SOURCE CREATION])
gs([GATHER STAGE])
fs([FETCH STAGE])
is([IMPORT STAGE])
sc --> gs
gs --> fs
fs --> is
```
![diagram](./old_harvesting.svg)

## New Harvesting Logic
Universal to all file + schema formats
```mermaid
flowchart TD
sc([SOURCE CREATION])
extract([Extract Catalog Source])
compare([Compare Source Catalog to Data.gov Catalog])
nochanges{No Changes?}
deletions{Datasets to Delete?}
updates{Datasets to Add or Update?}
load([Load into Data.gov Catalog])
validate([Validate Dataset])
transform([Transform Schema of Dataset])
completed([End])
sc --> extract
extract --> compare
compare --> deletions
compare --> updates
deletions --> load
updates --> validate
validate --> transform
transform --> validate
validate --> load
load --> completed
compare --> nochanges
nochanges --> completed
```
![diagram](./new_harvesting.svg)
24 changes: 24 additions & 0 deletions docs/new_harvesting.mmd
@@ -0,0 +1,24 @@
flowchart TD
sc([SOURCE CREATION])
extract([Extract Catalog Source])
compare([Compare Source Catalog to Data.gov Catalog])
nochanges{No Changes?}
deletions{Datasets to Delete?}
updates{Datasets to Add or Update?}
load([Load into Data.gov Catalog])
validate([Validate Dataset])
transform([Transform Schema of Dataset])
completed([End])

sc --> extract
extract --> compare
compare --> deletions
compare --> updates
deletions --> load
updates --> validate
validate --> transform
transform --> validate
validate --> load
load --> completed
compare --> nochanges
nochanges --> completed
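
The branching after the `compare` node in the flowchart can be read as a set difference. The sketch below is one hedged interpretation, not the harvester's actual implementation. Representing each catalog as a mapping from dataset identifier to content hash is an assumption, and `HarvestPlan` and `compare` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HarvestPlan:
    """Outcome of the compare step: the three branches in the flowchart."""
    to_delete: List[str] = field(default_factory=list)  # "Datasets to Delete?"
    to_upsert: List[str] = field(default_factory=list)  # "Datasets to Add or Update?"

    @property
    def no_changes(self) -> bool:
        # "No Changes?" branch: nothing to delete, add or update.
        return not self.to_delete and not self.to_upsert


def compare(source: Dict[str, str], catalog: Dict[str, str]) -> HarvestPlan:
    """Compare two {identifier: content_hash} mappings (assumed representation)."""
    # In the catalog but gone from the source: delete.
    to_delete = sorted(set(catalog) - set(source))
    # New in the source, or present with a different hash: add or update.
    to_upsert = sorted(i for i, h in source.items() if catalog.get(i) != h)
    return HarvestPlan(to_delete, to_upsert)
```

In the diagram, only the datasets in `to_upsert` would pass through the validate and transform nodes before loading. Deletions go straight to the load step.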
1 change: 1 addition & 0 deletions docs/new_harvesting.svg
8 changes: 8 additions & 0 deletions docs/old_harvesting.mmd
@@ -0,0 +1,8 @@
flowchart LR
sc([SOURCE CREATION])
gs([GATHER STAGE])
fs([FETCH STAGE])
is([IMPORT STAGE])
sc --> gs
gs --> fs
fs --> is

1 comment on commit 0e30190

@github-actions
Coverage

Coverage Report
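| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| `harvester/__init__.py` | 3 | 0 | 100% | |
| `harvester/db/models/__init__.py` | 5 | 0 | 100% | |
| `harvester/db/models/models.py` | 53 | 0 | 100% | |
| `harvester/extract/__init__.py` | 19 | 2 | 89% | |
| `harvester/extract/dcatus.py` | 11 | 2 | 82% | |
| `harvester/utils/__init__.py` | 0 | 0 | 100% | |
| `harvester/utils/json.py` | 22 | 6 | 73% | |
| `harvester/utils/pg.py` | 35 | 4 | 89% | |
| `harvester/utils/s3.py` | 24 | 6 | 75% | |
| `harvester/validate/__init__.py` | 0 | 0 | 100% | |
| `harvester/validate/dcat_us.py` | 24 | 0 | 100% | |
| **TOTAL** | 196 | 20 | 90% | |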

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 29 | 0 💤 | 0 ❌ | 0 🔥 | 13.901s ⏱️ |
