TFoS Support #63

Open: wants to merge 32 commits into base: master

@allwefantasy allwefantasy commented Oct 18, 2017

What changes are proposed in this pull request?

This PR adds a runningMode parameter. For now, it supports two modes: "Normal" and "TFoS".

The following code shows how it works:

estimator = TFTextFileEstimator(inputCol="sentence_matrix", outputCol="sentence_matrix", labelCol="preds",
                                fitParam=[{"epochs": 1, "cluster_size": 2, "batch_size": 1, "model": "/tmp/model"}],
                                runningMode="TFoS",
                                mapFnParam=map_fun)

When runningMode is set to Normal, TensorFlow is invoked once per entry in fitParam. Every run uses the same training data but a different fitParam entry, so multiple models are saved.

When runningMode is set to TFoS, a single TensorFlow cluster is launched and only one model is saved.
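
To make the difference between the two modes concrete, here is a minimal pure-Python sketch of the runs each mode implies. It does not call sparkdl or TensorFlowOnSpark. The function name `plan_training_runs` and the shape of the returned dictionaries are hypothetical and exist only for illustration. It also assumes that TFoS mode reads its settings from the first fitParam entry, which this PR description does not state.

```python
def plan_training_runs(fit_params, running_mode):
    """Describe the training runs implied by a running mode.

    Hypothetical helper for illustration only; not part of sparkdl.
    """
    if running_mode == "Normal":
        # One independent TensorFlow run per fitParam entry. Every run sees
        # the same training data and saves its own model.
        return [
            {"run": i, "params": params, "cluster_size": None,
             "model_path": params["model"]}
            for i, params in enumerate(fit_params)
        ]
    if running_mode == "TFoS":
        # Assumption: a single cluster configured from the first entry.
        params = fit_params[0]
        return [
            {"run": 0, "params": params,
             "cluster_size": params["cluster_size"],
             "model_path": params["model"]}
        ]
    raise ValueError("unknown running mode: %r" % (running_mode,))


fit_params = [
    {"epochs": 1, "cluster_size": 2, "batch_size": 1, "model": "/tmp/model_a"},
    {"epochs": 2, "cluster_size": 2, "batch_size": 1, "model": "/tmp/model_b"},
]

normal_runs = plan_training_runs(fit_params, "Normal")
tfos_runs = plan_training_runs(fit_params, "TFoS")
print(len(normal_runs), "run(s) in Normal mode")
print(len(tfos_runs), "run(s) in TFoS mode")
```

In this sketch, Normal mode yields one planned run and one model path per fitParam entry, while TFoS mode yields a single planned run whose cluster_size comes from the parameter map.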

How is this patch tested?

Manual tests

Because TFoS can only run in standalone mode, unit tests are not supported for now. I have provided TFoSTest.py for manual testing.

Fixes issue #52.

allwefantasy and others added 30 commits October 13, 2017 17:22
2. Introduce Kafka to avoid broadcast huge tranning data

codecov-io commented Oct 18, 2017

Codecov Report

Merging #63 into master will decrease coverage by 5.02%.
The diff coverage is 55.75%.

@@            Coverage Diff             @@
##           master      #63      +/-   ##
==========================================
- Coverage   83.06%   78.04%   -5.03%     
==========================================
  Files          23       25       +2     
  Lines        1234     1512     +278     
  Branches        5        5              
==========================================
+ Hits         1025     1180     +155     
- Misses        209      332     +123
| Impacted Files | Coverage Δ | |
|---|---|---|
| python/sparkdl/__init__.py | 100% <ø> (ø) | ⬆️ |
| python/sparkdl/transformers/utils.py | 100% <100%> (ø) | ⬆️ |
| ...ython/sparkdl/estimators/tf_text_file_estimator.py | 42.62% <42.62%> (ø) | |
| python/sparkdl/transformers/tf_text.py | 78.26% <78.26%> (ø) | |
| python/sparkdl/param/shared_params.py | 81.25% <83.33%> (+1.04%) | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 328e51e...bbfcb20.
