Add support for Reading GIF files #46

Open: 7 commits to be merged into base: master

@seanpquig commented Aug 26, 2017

We have been using this library over at GIPHY and love it. We had to adapt it to work with GIFs, and we thought we'd share some of the changes with the community. It includes:

  • readGifs function that reads a directory of GIFs and splits them out into individual frames/images that can be fed into InceptionV3 and other models.
  • supporting unit tests mimicking tests in TestReadImages
  • updates to .gitignore
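
The frame-splitting step described in the first bullet can be sketched roughly as follows. This is not the code from the PR: it is a minimal illustration that assumes Pillow is installed, and the names `FrameRow` and `split_gif_frames` are hypothetical. It decodes one GIF from bytes and yields one record per frame, tagged with its position in the animation.

```python
import io
from typing import Iterator, NamedTuple

from PIL import Image, ImageSequence  # assumption: Pillow is installed


class FrameRow(NamedTuple):
    """Hypothetical per-frame record; not the schema used in the PR."""
    file_path: str
    frame_num: int
    height: int
    width: int
    data: bytes  # raw RGB bytes, 3 bytes per pixel


def split_gif_frames(file_path: str, gif_bytes: bytes) -> Iterator[FrameRow]:
    """Yield one FrameRow per frame of a GIF, in playback order."""
    with Image.open(io.BytesIO(gif_bytes)) as gif:
        for frame_num, frame in enumerate(ImageSequence.Iterator(gif)):
            # GIF frames are usually palette-based; convert to plain RGB
            # so every frame has the same 3-channel layout.
            rgb = frame.convert("RGB")
            width, height = rgb.size
            yield FrameRow(file_path, frame_num, height, width, rgb.tobytes())
```

Each yielded record could then be turned into one DataFrame row per frame, which is the shape the description above suggests.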

@codecov-io commented Aug 27, 2017

Codecov Report

Merging #46 into master will increase coverage by 0.11%.
The diff coverage is 89.47%.

@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
+ Coverage   85.06%   85.18%   +0.11%     
==========================================
  Files          19       19              
  Lines         991     1026      +35     
  Branches        5        5              
==========================================
+ Hits          843      874      +31     
- Misses        148      152       +4

Impacted Files                     Coverage Δ
python/sparkdl/image/imageIO.py    89.92% <89.47%> (-0.51%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a5a6e07...f86ac03.

@thunterdb (Contributor) commented

Hello @seanpquig, thank you very much for this contribution. We will be happy to add support for GIFs to Deep Learning Pipelines.

I have some design questions about the new schema added for GIF which we should be able to resolve without too much change on your side.

To give some context, we are in the process of consolidating different image processing solutions around the image schema described in python/sparkdl/image/imageIO.py, and I believe that we can add some extra fields to the image schema to handle GIFs without having to create a separate schema. Before proposing changes, I would like to understand the use case a bit better: from looking at gifSchema, it looks like you do not use the fact that frames in a GIF are ordered, and you simply store them independently in a DataFrame. Do you foresee a use case for keeping all the frames together?

@seanpquig (Author) commented

Hey @thunterdb. I tried to take a minimal and flexible approach: the GIF schema is per frame and identical to the image schema, with an additional frameNum field to keep track of ordering. This has allowed us to write our own custom functions and processing for things like frame sampling, averaging model predictions across frames, and investigating ordering effects.
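
To illustrate what a per-frame layout with a frameNum field makes possible, here is a minimal pure-Python sketch of frame sampling and of averaging model predictions across the frames of each GIF. The record shape and the function names are assumptions made for illustration; they are not code from the PR.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical per-frame prediction: (file_path, frame_num, class scores).
FramePrediction = Tuple[str, int, List[float]]


def sample_every_nth(rows: Iterable[FramePrediction], n: int) -> List[FramePrediction]:
    """Keep frames whose frame_num is a multiple of n (simple frame sampling)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return [row for row in rows if row[1] % n == 0]


def average_scores_per_gif(rows: Iterable[FramePrediction]) -> Dict[str, List[float]]:
    """Average the class scores over all frames belonging to the same GIF."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for path, _frame_num, scores in rows:
        if path not in sums:
            sums[path] = [0.0] * len(scores)
        # Accumulate element-wise so each class keeps its own running total.
        sums[path] = [s + x for s, x in zip(sums[path], scores)]
        counts[path] += 1
    return {path: [s / counts[path] for s in total] for path, total in sums.items()}
```

Because every record carries its own frame number, sampling and aggregation stay simple filters and group-bys, and ordering can still be recovered by sorting on that field.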

I think combining into a common schema could be great as long as it doesn't sacrifice any information or the ability to access individual frames. Perhaps images could be loosely modeled as a special case of a GIF that has a single frame. Your thoughts?
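
One way to picture the "image as a single-frame GIF" idea is a shared record type in which a still image always carries frame number 0. The sketch below is hypothetical: it uses plain Python dataclasses rather than the Spark schema in imageIO.py, and all field and function names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Hypothetical unified record for still images and GIF frames."""
    file_path: str
    frame_num: int  # always 0 for a still image
    height: int
    width: int
    data: bytes


def still_image(file_path: str, height: int, width: int, data: bytes) -> Frame:
    """Wrap a still image as frame 0 of a one-frame sequence."""
    return Frame(file_path, 0, height, width, data)


def frames_in_order(frames: List[Frame], file_path: str) -> List[Frame]:
    """Return the frames of one file sorted by frame_num, so ordering is never lost."""
    return sorted((f for f in frames if f.file_path == file_path),
                  key=lambda f: f.frame_num)
```

Under this shape, code that consumes single images and code that consumes GIF frames can share one path, and individual frames remain directly addressable.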
