392 commits
9216761
fix: change warning message
t0maboro May 19, 2022
839497d
Update src/widgets/unfold_widgets/preprocessing_widget.py
t0maboro May 19, 2022
a875c70
Merge branch '@tboron/preprocessing-part-2' of https://github.com/mha…
t0maboro May 19, 2022
2007ce7
fix: removing rows with nulls automatically
t0maboro May 19, 2022
f2d8bf7
fix: reduce labels function fixed
t0maboro May 19, 2022
70b4a68
feat: Add place for description in visualization.
gregori0o May 19, 2022
bb31d77
style: update plots
t0maboro May 19, 2022
5ec732b
style: include at least one value in the pie plot
t0maboro May 19, 2022
7c429b6
refactor: Apply suggestion from review.
gregori0o May 19, 2022
9f6a3ec
chore: Update description
gregori0o May 19, 2022
9ca4d13
Merge pull request #15 from mhawryluk/@gregori0o/improve_algorithm_km…
gregori0o May 19, 2022
84cd274
chore: Start implementing ResultsWidget and ResultsEngine
mhawryluk May 19, 2022
4d06f05
style: Add styling to QTabBar
mhawryluk May 19, 2022
f23db0d
chore: Store results in State
mhawryluk May 19, 2022
6a82133
refactor: Turn some methods into common functions
mhawryluk May 20, 2022
ec94d32
feat: Create KMeansResultsWidget and utilize it in ResultsWidget
mhawryluk May 20, 2022
e9035a2
fix: Small changes and improvements to results widget
mhawryluk May 20, 2022
dc7ddf7
feat: Add centroid group to KMeansResultsWidget
mhawryluk May 20, 2022
a130729
style: Combine technique and algorithm TabWidgets
mhawryluk May 20, 2022
a495b70
style: Round values in the centroids QTableView
mhawryluk May 20, 2022
589a42b
feat: Add parameters groupbox to KMeansResultsWidget
mhawryluk May 21, 2022
e334b79
chore: Add mock buttons to PreprocessingWidget
mhawryluk May 21, 2022
2d6b1d1
refactor: Move KMeansResultsWidget to a new package
mhawryluk May 21, 2022
d238659
fix: Remove results widgets on dataset import
mhawryluk May 21, 2022
5be777a
fix: refresh spinbox max value when screen is loading
t0maboro May 22, 2022
2eb7022
refactor: remove unused engine from kmeans options
t0maboro May 22, 2022
4d2991b
Merge pull request #17 from mhawryluk/hotfix/fix-num-of-clusters-spinbox
t0maboro May 23, 2022
f387776
Merge pull request #16 from mhawryluk/@mhawryluk/results-widget
mhawryluk May 23, 2022
3065727
Merge branch 'develop' into @tboron/preprocessing-part-2
t0maboro May 23, 2022
140a6d4
Merge pull request #14 from mhawryluk/@tboron/preprocessing-part-2
t0maboro May 23, 2022
d5cd6ab
Merge branch 'develop' into @tboron/loading-screen
t0maboro May 23, 2022
1d5c872
fix: loading screen class during algorithm run and other minor debugs
t0maboro May 23, 2022
50c8b51
Merge pull request #12 from mhawryluk/@tboron/loading-screen
t0maboro May 25, 2022
202222d
fix: Delete 'src'
gregori0o May 26, 2022
5de72f8
Update requirements.txt
gregori0o May 26, 2022
f412f45
Update README.md
gregori0o May 26, 2022
904848b
feat: add button logic
t0maboro May 29, 2022
ae36d10
feat: cluster preview on double click
t0maboro May 29, 2022
1a89ed3
feat: plot cluster after checking it
t0maboro May 29, 2022
5fae2da
feat: save cluster to csv file
t0maboro May 30, 2022
3075340
style: Change visualization style of step with changed centroid.
gregori0o Jun 1, 2022
45b6130
fix: keep showing selected cluster on axes change
t0maboro Jun 4, 2022
4a762a7
Change text to suggested
t0maboro Jun 4, 2022
0a5b2e9
fix: correct error logic on cluster saving
t0maboro Jun 4, 2022
0434b89
style: add layout for saving all clusters button
t0maboro Jun 4, 2022
4843fe7
feat: save data to csv with cluster index
t0maboro Jun 4, 2022
c96448a
Merge pull request #20 from mhawryluk/@gregori0o/change_style_step_vi…
gregori0o Jun 5, 2022
329ebb3
Update src/widgets/results_widgets/k_means_results.py
t0maboro Jun 6, 2022
04bc9b2
Update src/widgets/results_widgets/k_means_results.py
t0maboro Jun 6, 2022
5d49d64
fix: remove selected indexes loop
t0maboro Jun 6, 2022
883cfcb
Merge pull request #19 from mhawryluk/@tboron/results-clusters-view
t0maboro Jun 6, 2022
460820f
fix: Fix small bugs in visualization. Change k-means for numeric data…
gregori0o Jun 7, 2022
d0c1baf
chore: Add raw_data variable and change result visualization.
gregori0o Jun 8, 2022
6ef0bb5
feat: Create customer data generating script
mhawryluk Jun 9, 2022
54295b5
chore: Use random data generator API
mhawryluk Jun 9, 2022
f8e6df0
refactor: Remove commented lines. Change Null to None.
gregori0o Jun 9, 2022
72af82f
fix: Check selected columns.
gregori0o Jun 9, 2022
ecf20f3
fix: Fix bugs.
gregori0o Jun 9, 2022
045e027
Merge pull request #21 from mhawryluk/@gregori0o/optimize-kmeans-algo…
gregori0o Jun 9, 2022
2898088
Merge branch 'develop' of https://github.com/mhawryluk/data-mining-to…
mhawryluk Jun 9, 2022
0118e73
chore: Add products to customers data and noise values
mhawryluk Jun 9, 2022
8f39e8d
Implement initial reducer logic and PCA algorithm
t0maboro Aug 21, 2022
95b65e0
Automatic reduction flow
t0maboro Aug 21, 2022
4b83d33
feat: Create class for extra trees algorithm.
gregori0o Sep 4, 2022
bdbf151
feat: Implementation of Extra Trees algorithm.
gregori0o Sep 5, 2022
afa6b6e
feat: Implement Options widget for Extra Trees algorithm.
gregori0o Sep 5, 2022
da31f3a
feat: Add selecting file for input with QFileDialog
mhawryluk Sep 18, 2022
5f32c08
feat: Implement tree to dot format converter.
gregori0o Sep 25, 2022
5464d85
feat: Implement A-Priori algorithm for finding association rules
mhawryluk Oct 3, 2022
f386e62
feat: Implement visualization of random forest by QGraphViz.
gregori0o Oct 4, 2022
f3bc369
Apply suggestions
t0maboro Oct 6, 2022
01722a5
Debug columns selection on preprocessing
t0maboro Oct 6, 2022
c5f77a9
Better columns naming
t0maboro Oct 6, 2022
f6d82b2
feat: Implement result panel for ExtraTrees algorithm.
gregori0o Oct 6, 2022
d495b42
fix: Fix problems with errors, when create graphs.
gregori0o Oct 7, 2022
5adcfd1
feat: prepare schema
t0maboro Oct 7, 2022
49cf3a7
feat: manual and automatic estimation
t0maboro Oct 7, 2022
ae01084
refactor: Optimize and clean up A-priori algorithm
mhawryluk Oct 9, 2022
0f8fa08
Fix issues
t0maboro Oct 10, 2022
f0310f4
Fix some issues
t0maboro Oct 10, 2022
ce4167d
refactor: Small changes from reviews.
gregori0o Oct 10, 2022
62da5c6
fix: Review fixes
mhawryluk Oct 11, 2022
a98c025
fix: Small fix
mhawryluk Oct 11, 2022
ccc82e9
Replace JS naming conventions and apply suggestions from review
t0maboro Oct 11, 2022
d756b19
remove logs
t0maboro Oct 11, 2022
83c5a4b
Replace render with spinbox adjustment
t0maboro Oct 11, 2022
9165d32
feat: Add sorting rules by confidence + bug fix
mhawryluk Oct 12, 2022
e634d04
feat: First version of creation steps in ExtraTrees.
gregori0o Oct 13, 2022
bf90646
feat: Finish creation steps.
gregori0o Oct 15, 2022
4ce3877
feat: Add QtImageViewer
gregori0o Oct 16, 2022
d37c7b4
feat: Add animation
gregori0o Oct 16, 2022
940a528
fix: Fix problems with buttons on animation settings.
gregori0o Oct 16, 2022
7410d62
feat: Add visualization of the apriori results in a form of scatter plot
mhawryluk Oct 17, 2022
1131063
feat: Add bar plot showing support or confidence of a set/rule
mhawryluk Oct 17, 2022
71fcfd5
refactor: Review suggestion.
gregori0o Oct 22, 2022
9873cd4
fix: Fix zooming in QtImageViewer.
gregori0o Oct 22, 2022
4fa67f7
fix: Review suggestion.
gregori0o Oct 22, 2022
2ca95b4
Merge branch '@t0ms0n00/pca-reduction' into @t0ms0n00/missing-value-e…
t0maboro Oct 22, 2022
61b4e09
Merge pull request #25 from mhawryluk/@t0ms0n00/missing-value-estimation
t0maboro Oct 22, 2022
196c5b4
feat: Extend the APriori results widget
mhawryluk Oct 22, 2022
c90e82b
feat: Add step by step walk through for apriori visualization widget
mhawryluk Oct 23, 2022
91de4e4
feat: Add animation + refactor + small visual improvements
mhawryluk Oct 23, 2022
7850ec9
Merge pull request #24 from mhawryluk/@gregori0o/decision-tree-algorithm
gregori0o Oct 23, 2022
36fa4a1
refactor: Review fixes
mhawryluk Oct 23, 2022
7caf527
feat: Create abstract classes and adjust to extra trees and kmeans im…
gregori0o Oct 23, 2022
24c0c3b
fix: Set the latest result as default results widget
gregori0o Oct 23, 2022
067adbd
refactor: Small refactor.
gregori0o Oct 23, 2022
863d01c
fix: Change PyQt5-sip version in requirements.txt
mhawryluk Oct 24, 2022
b5320a9
Merge branch 'develop' into @mhawryluk/a-priori
mhawryluk Oct 24, 2022
e2b72c9
Merge pull request #26 from mhawryluk/@mhawryluk/a-priori
mhawryluk Oct 24, 2022
182deb0
Implement gmm algorithm
t0maboro Oct 15, 2022
933fd6c
Copy common visualisation parts
t0maboro Oct 15, 2022
48013aa
First version of plot
t0maboro Oct 15, 2022
17b4d0d
Visualisation done
t0maboro Oct 15, 2022
0b17f3c
Show results
t0maboro Oct 15, 2022
1fb61f2
Show final mean and variance on results screen
t0maboro Oct 19, 2022
981bd4c
Fix dumb naming
t0maboro Oct 20, 2022
f082815
Fix review issues pt.1
t0maboro Oct 22, 2022
14d0bdb
Fix review issues pt.2
t0maboro Oct 22, 2022
db0ed46
Remove duplicated import
t0maboro Oct 24, 2022
fa16740
Merge pull request #27 from mhawryluk/@t0ms0n00/gmm-clustering
t0maboro Oct 24, 2022
f9e624a
Concatenate reduced columns with dataset
t0maboro Oct 24, 2022
979da8b
feat: Creat config file for Algorithm References.
gregori0o Oct 25, 2022
4ecb517
Merge branch 'develop' into @gregori0o/abstract-class-for-new-algorithms
gregori0o Oct 25, 2022
10d591e
style: Modify layout in import section
mhawryluk Oct 25, 2022
9f0f6f4
Merge branch 'develop' into @mhawryluk/file-selector
mhawryluk Oct 25, 2022
ef3092f
feat: Implement usage of algorithms_config.py file.
gregori0o Oct 25, 2022
98e7fa8
fix: Adjust algorithms to new classes
gregori0o Oct 26, 2022
4c16710
refactor: Move steps visualization to widgets.steps_widgets
gregori0o Oct 26, 2022
bede648
refactor: Separate canvas from steps widgets and move to visualizatio…
gregori0o Oct 26, 2022
c6f654a
fix: Review suggestions.
gregori0o Oct 26, 2022
5ddc9d8
fix: Review suggestions.
gregori0o Oct 26, 2022
0243733
feat: Create DataGeneratorWidget
mhawryluk Oct 30, 2022
d0572a0
style: Add styling to DataGeneratorWidget
mhawryluk Oct 30, 2022
b648c76
Merge pull request #28 from mhawryluk/@gregori0o/show-last-algorithm-…
gregori0o Oct 31, 2022
f71c03c
Update src/widgets/unfold_widgets/preprocessing_widget.py
t0maboro Nov 2, 2022
49eca13
fix: Review fixes
mhawryluk Nov 3, 2022
585400a
Merge pull request #29 from mhawryluk/@mhawryluk/file-selector
mhawryluk Nov 3, 2022
9be4963
Fix reviewed issues
t0maboro Nov 6, 2022
56fc04c
Merge branch 'develop' into @t0ms0n00/pca-reduction
t0maboro Nov 6, 2022
ecb303b
Return min 2 columns on automatic reduction
t0maboro Nov 6, 2022
6011f08
fix: Review suggestion
gregori0o Nov 7, 2022
b60aab6
refactor: Add pre-commit
gregori0o Nov 7, 2022
292a0c3
fix: Problems with imports
gregori0o Nov 8, 2022
f19cff3
refactor: Small changes
gregori0o Nov 8, 2022
08d5b9b
refactor: Remove check_numeric and use delect_dtypes (as in GMM)
gregori0o Nov 8, 2022
a6cf914
refactor: Merge clustering canvas
gregori0o Nov 8, 2022
c43f9e0
refactor: Create component for step visualization
gregori0o Nov 9, 2022
f78fdbc
refactor: Use ParametersGroupBox
gregori0o Nov 9, 2022
daa5be8
refactor: Add signals to clustering template
gregori0o Nov 10, 2022
a748883
refactor: Separate ClustersTable class
gregori0o Nov 10, 2022
0af9259
fix: Not change section if linalg error in gmm
gregori0o Nov 10, 2022
611b984
refactor: Use widget to samples data and column choice
gregori0o Nov 10, 2022
2e9fa50
refactor: Remove fig and axes argument from canvas.
gregori0o Nov 11, 2022
712264c
feat: Add scatter plot in preprocessing section.
gregori0o Nov 11, 2022
3405ecf
fix: Fix problems with pre-commit.
gregori0o Nov 11, 2022
d0372a3
upgrade: Upgrade requirements
gregori0o Nov 11, 2022
61cc262
Merge remote-tracking branch 'origin/@mhawryluk/customers-data-gen' i…
mhawryluk Nov 12, 2022
f9e5c3a
feat: Create tooltip widget.
gregori0o Nov 13, 2022
7463ae4
feat: Continue data generator implementation
mhawryluk Nov 13, 2022
e454980
feat: Prepare places to add descriptions.
gregori0o Nov 13, 2022
24f1f54
refactor: All algorithms in config was done.
gregori0o Nov 13, 2022
3ef6e7a
Merge branch 'develop' into @gregori0o/refactor
gregori0o Nov 13, 2022
a40021d
Merge branch '@gregori0o/refactor' into @gregori0o/improve-descriptions
gregori0o Nov 16, 2022
078a7b5
feat: Prepare place for descriptions in config file.
gregori0o Nov 16, 2022
98865dc
chore: Add descriptions of algorithms.
gregori0o Nov 16, 2022
087e662
chore: Add descriptions of parameters in options section.
gregori0o Nov 16, 2022
04a20c9
feat: Implement clustering_blobs_generator and its options widget
mhawryluk Nov 19, 2022
03aa965
Apply @gregori0o suggestions
t0maboro Nov 20, 2022
3b7fa56
chore: Add annotation and group by to Scatter plot in preprocessing.
gregori0o Nov 20, 2022
a9cef4f
fix: Add sort to scatter plot.
gregori0o Nov 20, 2022
8d3553f
feat: Add option validation, tooltips, random seed option
mhawryluk Nov 20, 2022
1865a1d
fix: Set minimum number of reduced dimensions to 1
t0maboro Nov 20, 2022
6fbc8ff
fix: Bounds for enabling reduction
t0maboro Nov 20, 2022
726201c
chore: Add descriptions to extra trees step visualization.
gregori0o Nov 20, 2022
24cbb3b
feat: Add descriptions to AssociationRulesOptions
mhawryluk Nov 21, 2022
bbb6cce
Merge pull request #23 from mhawryluk/@t0ms0n00/pca-reduction
t0maboro Nov 21, 2022
714a4b4
style: Change layout in setup widget.
gregori0o Nov 21, 2022
38e1493
Merge branch 'develop' into @gregori0o/refactor
gregori0o Nov 21, 2022
caa8020
fix: Apply review suggestion.
gregori0o Nov 22, 2022
937683f
chore: Change method of loading data. Select columns remove columns.
gregori0o Nov 22, 2022
e311b4a
fix: Apply review suggestion.
gregori0o Nov 22, 2022
8caed8e
feat: Add limitation of data.
gregori0o Nov 22, 2022
d4a691e
fix: Deny limit to zero.
gregori0o Nov 22, 2022
d5521f0
fix: Save valid columns if click save to database.
gregori0o Nov 22, 2022
ab99626
style: Add svg info icon for macOS compatibility
mhawryluk Nov 27, 2022
443b7a5
fix: Small changes from review.
gregori0o Nov 29, 2022
81d57b5
style: Apply pre-commit
gregori0o Nov 29, 2022
28ddba7
fix: Add reset_index.
gregori0o Nov 29, 2022
43c07be
Merge pull request #34 from mhawryluk/@gregori0o/limit-input-data
gregori0o Nov 29, 2022
86b467d
Merge branch 'develop' into @gregori0o/refactor
gregori0o Nov 29, 2022
ec0cd7f
Merge pull request #30 from mhawryluk/@gregori0o/refactor
gregori0o Nov 29, 2022
f188f3c
style: Apply style suggestion.
gregori0o Nov 29, 2022
ece0cc0
Merge branch 'develop' into @gregori0o/improve-descriptions
gregori0o Nov 29, 2022
699e5bf
feat: Add tooltips to reduction.
gregori0o Nov 29, 2022
a18b7ed
Update src/widgets/unfold_widgets/algorithm_setup_widget.py
gregori0o Nov 29, 2022
e6212e0
fix: Fix problems with columns in GMM results.
gregori0o Nov 30, 2022
d0d6b6e
feat: Add metrics info to algorithm and results.
gregori0o Nov 30, 2022
6750a1a
feat: Add dunn index in kmeans.
gregori0o Nov 30, 2022
61b5ea1
feat: Add accuracy in extra trees.
gregori0o Nov 30, 2022
0539f81
Merge pull request #33 from mhawryluk/@gregori0o/improve-descriptions
gregori0o Dec 3, 2022
1ba6ce5
merge: Merge develop
mhawryluk Dec 3, 2022
3fb2058
feat: Add scatter plot to data generator
mhawryluk Dec 3, 2022
5080262
fix: Data generator bug fixes
mhawryluk Dec 3, 2022
a3ba34e
feat: Implement Dunn index and Davies Bouldin index.
gregori0o Dec 3, 2022
171155a
Merge pull request #35 from mhawryluk/@gregori0o/fix-error-gmm-result
gregori0o Dec 4, 2022
7683d1f
fix: Fix - if test_data not exist.
gregori0o Dec 4, 2022
9b70d7b
Merge branch 'develop' into @gregori0o/add-metrics
gregori0o Dec 4, 2022
87af0e7
feat: Add option of including noise in the clustering generated data
mhawryluk Dec 4, 2022
ddae0b8
feat: Add option to specify parameters of normal distribution for eac…
mhawryluk Dec 4, 2022
2942d91
fix: Swap normal distribution for multivariate normal distribution in…
mhawryluk Dec 4, 2022
e999211
feat: Implement Noncentral F distribution blobs generator
mhawryluk Dec 4, 2022
94943f4
feat: Add validating whether provided degrees of freedom are positive
mhawryluk Dec 4, 2022
ff4b7c6
fix: Fix Dunn Index, feat: Add Silhouette Coefficient
gregori0o Dec 7, 2022
0ba7d53
fix: Review fixes for data generator
mhawryluk Dec 8, 2022
94cbe09
fix: Remove 'null' name from group by in preprocessing widget.
gregori0o Dec 9, 2022
d86b493
fix: Disable big data support in readers.
gregori0o Dec 10, 2022
f54641b
fix: Final data generator improvements
mhawryluk Dec 10, 2022
0bba375
Merge pull request #32 from mhawryluk/@mhawryluk/data-generator
mhawryluk Dec 10, 2022
a8411a7
Merge pull request #38 from mhawryluk/disable-chunks-reading
gregori0o Dec 10, 2022
cfc09d5
feat: Introduce merge screen layout
t0maboro Oct 31, 2022
a9cddae
style: improve screen look
t0maboro Oct 31, 2022
2578258
feat: Handle data import on the new screen
t0maboro Oct 31, 2022
f00a1ee
feat: Handle onClose event
t0maboro Oct 31, 2022
8be60fe
refactor: Make methods private
t0maboro Oct 31, 2022
77814fc
fix: rendering and data importing
t0maboro Oct 31, 2022
063e0df
feat: drag and drop columns
t0maboro Oct 31, 2022
41b0a95
fix: columns section layout
t0maboro Nov 11, 2022
11d8cfd
feat: Merge sets
t0maboro Nov 11, 2022
d8c3da4
fix: review issues pt.1
t0maboro Nov 20, 2022
4a3135b
refactor: introduce class for initializing data reader
t0maboro Nov 20, 2022
cd75a41
feature: add option for equalizing columns with nulls
t0maboro Nov 20, 2022
7b521b8
fix: use precommit
t0maboro Dec 3, 2022
288a18c
feat: removing columns
t0maboro Dec 3, 2022
d6a375f
refactor: remove dead code
t0maboro Dec 3, 2022
6f69f8a
feat: sorting alphabetically with key
t0maboro Dec 3, 2022
d529195
fix: update columns selection box when merge succeeded
t0maboro Dec 3, 2022
7968141
fix: drop rejected columns from new_data
t0maboro Dec 10, 2022
0ff91cd
fix: better column naming
t0maboro Dec 10, 2022
49324a9
feature: new merge logic
t0maboro Dec 10, 2022
039509d
Add covariance regulator
t0maboro Dec 9, 2022
1249b35
Merge pull request #37 from mhawryluk/@t0ms0n00/heuristic-for-inverta…
t0maboro Dec 10, 2022
6c6ad4c
fix: Apply suggestion from review.
gregori0o Dec 10, 2022
f5675dc
Merge branch 'develop' into @gregori0o/add-metrics
gregori0o Dec 10, 2022
e3358a4
fix: review comments
t0maboro Dec 10, 2022
bff3f4f
Merge pull request #31 from mhawryluk/@t0ms0n00/merging-datasets
t0maboro Dec 10, 2022
0d4852d
fix: Fix mistakes in metrics.
gregori0o Dec 10, 2022
31bf60e
fix: Improve performance of dunn index.
gregori0o Dec 11, 2022
2364c39
Merge pull request #36 from mhawryluk/@gregori0o/add-metrics
gregori0o Dec 11, 2022
70c8e16
style: Small style improvements
mhawryluk Dec 14, 2022
160 changes: 160 additions & 0 deletions .gitignore
@@ -0,0 +1,160 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
29 changes: 29 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,29 @@
repos:
  - repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
      - id: isort
        name: isort (python)
        args:
          - "--profile=black"

  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black
        args:
          - --line-length=88
          - --include='\.pyi?$'

  - repo: https://github.com/pycqa/flake8
    rev: 5.0.4
    hooks:
      - id: flake8
        args:
          - "--max-line-length=88"
          - "--max-complexity=18"
          - "--select=B,C,E,F,W,T4,B9,c90"
          - "--ignore=E203,E266,E501,W503,F403,F401,E402"

default_language_version:
  python: python3.10
11 changes: 8 additions & 3 deletions README.md
@@ -1,13 +1,18 @@
# Data Mining Tool

+## Requirements
+- Python >= 3.10
+- dependencies listed in the file [requirements.txt](requirements.txt)
+
## Getting started

```commandline
git clone https://github.com/mhawryluk/data-mining-tool data-mining-tool
cd data-mining-tool
virtualenv .venv
source .venv/bin/activate
-python3 -m pip install -r requirements.txt
-
-python3 src/app.py
+python -m pip install -r requirements.txt
+export MONGO_PASS=<password to mongo db>
+cd src
+python app.py
```
```
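
The setup steps above depend on two preconditions: a Python 3.10+ interpreter and a `MONGO_PASS` environment variable. The sketch below shows one way these could be checked before launching the app. It is not part of the repository: `check_environment` and `MIN_PYTHON` are hypothetical names, and how `app.py` actually reads `MONGO_PASS` is not shown in this diff.

```python
import os
import sys

# Assumption: mirrors the README's "Python >= 3.10" requirement.
MIN_PYTHON = (3, 10)


def check_environment(environ=None, version_info=None):
    """Return a list of problems; an empty list means the preconditions hold.

    Hypothetical helper for illustration only. Both arguments default to the
    real process environment and interpreter version, and can be overridden
    to make the check easy to exercise.
    """
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info

    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python %d.%d or newer is required" % MIN_PYTHON
        )
    # An unset or empty variable is treated as missing.
    if not environ.get("MONGO_PASS"):
        problems.append("MONGO_PASS is not set; export it before running app.py")
    return problems


if __name__ == "__main__":
    found = check_environment()
    for problem in found:
        print(problem, file=sys.stderr)
    sys.exit(1 if found else 0)
```

Returning a list of messages rather than raising on the first failure lets a caller report every missing precondition in one pass.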