CNN + Required DualArray implementation #11
+472
−76
Sorry for the big pull request. I just realised that, for some reason, GitHub no longer recognised my copy of DualArrays.jl as a fork, so I created a new fork and moved everything over.
This is a working CNN implementation, together with the DualArray/DualMatrix implementations it requires. I have added comments explaining the motivation for things like multiplication. I have a couple of questions:
- Right now it seems a bit slow: after a few hundred epochs the network guesses the MNIST digits fairly consistently, but each epoch takes 1-2 seconds (I can implement the same network in Lux and compare speeds). I suspect this is because the sparsity of the Jacobian is not preserved when converting to and from 4-tensors. What tools are available to help with this?
- I have changed the partials of the Dual type to be an array instead of a vector, so that the directions of perturbations are preserved when indexing into a DualMatrix. Is this the right thing to do?
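
To make both questions concrete, here is a minimal sketch using only the SparseArrays standard library. It does not use the DualArrays.jl types: the names `X`, `J`, `partials`, and `dense4` are hypothetical, and it assumes the conversion goes through a dense 4-tensor, which may not match what the PR actually does.

```julia
using SparseArrays

# Hypothetical 2×2 input. Every entry is perturbed independently, so the
# Jacobian of vec(X) with respect to vec(X) is the 4×4 identity.
X = [1.0 2.0; 3.0 4.0]
J = sparse(1:4, 1:4, ones(4), 4, 4)

# Partials of the single entry X[i, j]: take its row of J and reshape it to
# the shape of X, so the position of the nonzero records which input entry
# was perturbed (the "direction" a flat vector would lose).
i, j = 2, 1
row = LinearIndices(X)[i, j]
partials = reshape(Vector(J[row, :]), size(X))

# Assumed dense round trip: materialising J as a 2×2×2×2 tensor stores every
# entry, whereas the sparse Jacobian stores only the nonzeros.
dense4 = reshape(Matrix(J), 2, 2, 2, 2)

println(partials)
println((nnz(J), length(dense4)))
```

In this toy case the array-shaped partials keep the nonzero at position `(i, j)`, and the sparse Jacobian stores 4 values where the dense 4-tensor stores 16. If the real code takes a similar dense detour, that gap grows quickly with image size.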