@max-vassili3v
Collaborator

Sorry for the big pull request. I just realised that, for some reason, GitHub no longer recognised my copy of DualArrays.jl as a fork, so I created a new fork and moved everything over.

This is a working CNN implementation, together with the DualArray/DualMatrix implementations it requires. I have added comments explaining the motivation for operations such as multiplication. I have a couple of questions:

- Right now it seems a bit slow. After a few hundred epochs the network guesses the MNIST digits pretty consistently, but each epoch takes 1-2 seconds (I can implement the same network in Lux and compare speeds). I suspect this is because the sparsity of the Jacobian is not preserved when converting to and from 4-tensors. What tools are available to help with this?

- I have changed the partials of the Dual type from a vector to an array, so that the directions of the perturbations are preserved when indexing into a DualMatrix. Is this the right thing to do?
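
For illustration, both questions can be sketched in a few lines. The types `SketchDual` and `SketchDualMatrix` below are hypothetical and exist only for this example; they are assumptions, not the definitions in this PR. The sketch assumes the Jacobian of a matrix-valued dual is stored as a dense 4-tensor. It shows that indexing one entry then yields matrix-shaped partials, and that a round trip from a sparse Jacobian through a dense 4-tensor stores every zero explicitly.

```julia
using SparseArrays

# Hypothetical types for illustration only; not the definitions used in this PR.
struct SketchDual{T, N}
    value::T
    partials::Array{T, N}      # same shape as the perturbed input
end

struct SketchDualMatrix{T}
    value::Matrix{T}
    jacobian::Array{T, 4}      # jacobian[i, j, k, l] = d value[i, j] / d input[k, l]
end

# Indexing one entry keeps the partials as a matrix, so each partial
# can still be matched to the input entry (k, l) it belongs to.
Base.getindex(d::SketchDualMatrix, i::Int, j::Int) =
    SketchDual(d.value[i, j], d.jacobian[i, j, :, :])

m, n = 4, 5
N = m * n

# Jacobian of the identity map on an m×n matrix, stored sparsely as N×N.
Jsparse = spdiagm(0 => ones(N))

# Round trip through a dense 4-tensor: every zero is now stored explicitly.
Jdense = reshape(Array(Jsparse), m, n, m, n)

X = SketchDualMatrix(rand(m, n), Jdense)
x = X[2, 3]

println(size(x.partials))                # (4, 5): partials keep the input's shape
println((nnz(Jsparse), length(Jdense)))  # (20, 400): stored entries, sparse vs dense
```

In this toy case the dense tensor stores N^2 entries where the sparse matrix stores N, which is the kind of growth that could plausibly account for the slow epochs if the conversion happens at every layer.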
