Reshape Layer implementation for the GPU Architecture #8

Open
wants to merge 400 commits into
base: master


@steremma steremma commented May 25, 2018

This PR implements and tests all the functions of the Reshape Layer in CUDA. Those are:

  1. The Flatten function.
  2. The Deflatten function.
  3. The Reshape function.

I additionally refactored the respective testing suite to remove code duplication between the same tests implemented in different architectures.
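The Flatten and Deflatten functions listed above can be illustrated with a small CPU-side reference in plain C++. This is only a sketch under assumptions: the `Matrix` type, the function names, and the row-major layout are hypothetical stand-ins and not the code in this PR. It assumes Flatten maps a batch of `nRows x nCols` matrices to one matrix with one row per batch element, and that Deflatten is the inverse.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a matrix type, stored row-major.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> data;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
    float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Flatten: a batch of (nRows x nCols) matrices becomes one
// (batchSize x nRows*nCols) matrix; row i holds batch element i.
Matrix flatten(const std::vector<Matrix> &batch, std::size_t nRows, std::size_t nCols) {
    Matrix out(batch.size(), nRows * nCols);
    for (std::size_t i = 0; i < batch.size(); ++i)
        for (std::size_t j = 0; j < nRows; ++j)
            for (std::size_t k = 0; k < nCols; ++k)
                out.at(i, j * nCols + k) = batch[i].at(j, k);
    return out;
}

// Deflatten: the inverse mapping, one (nRows x nCols) matrix per input row.
std::vector<Matrix> deflatten(const Matrix &flat, std::size_t nRows, std::size_t nCols) {
    assert(flat.cols == nRows * nCols);
    std::vector<Matrix> batch(flat.rows, Matrix(nRows, nCols));
    for (std::size_t i = 0; i < flat.rows; ++i)
        for (std::size_t j = 0; j < nRows; ++j)
            for (std::size_t k = 0; k < nCols; ++k)
                batch[i].at(j, k) = flat.at(i, j * nCols + k);
    return batch;
}
```

In a GPU version, each `(i, j, k)` triple would typically be handled by one thread instead of the nested loops; a reference like this one is mainly useful for checking the index arithmetic in tests.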

@steremma steremma requested a review from lmoneta as a code owner May 25, 2018 09:43
@steremma steremma changed the title Flatten implementation for the GPU Architecture [DNM] Flatten implementation for the GPU Architecture May 27, 2018
@steremma
Author

I renamed this PR to add the [DNM] (Do Not Merge) label because I intend to implement the whole ReshapeLayer in this PR rather than opening new ones for subsequent methods. I will let everyone know on Mattermost when this should be merged (hopefully within a week)!

@steremma steremma changed the title [DNM] Flatten implementation for the GPU Architecture Flatten implementation for the GPU Architecture May 29, 2018
@steremma
Author

I believe this is ready to be merged :)

couet and others added 25 commits June 14, 2018 11:36
- use auto
both in the sequential and MT case.
The sequential case has also been fixed since a non-triggered lazy snapshot
caused a crash.
The Bloom filter in the header section of .so files is well described by:
https://blogs.oracle.com/solaris/gnu-hash-elf-sections-v2
and
lld/ELF/SyntheticSections.cpp

The point is that the static linker puts the Bloom filter value into the .gnu.hash
section of .so files. We just have to read this value and compare it to the hash of
the mangled_name that we're looking for. A Bloom filter is a probabilistic data
structure that can give false positives, so it might say "yes" to a library which
doesn't contain mangled_name, but it won't say "no" to a library which
does contain mangled_name.

Modules without this patch
```
Processing tutorials/hsimple.C...
hsimple   : Real Time =   0.04 seconds Cpu Time =   0.03 seconds
(TFile *) 0x562b37a14fe0
Processing /home/yuka/CERN/ROOT/memory.C...
cpu  time = 0.362307 seconds
sys  time = 0.039741 seconds
res  memory = 278.215 Mbytes
vir  memory = 448.973 Mbytes
```

Modules With this patch
```
Processing tutorials/hsimple.C...
hsimple   : Real Time =   0.05 seconds Cpu Time =   0.05 seconds
(TFile *) 0x564410677780
Processing /home/yuka/CERN/ROOT/memory.C...
 cpu  time = 0.356471 seconds
 sys  time = 0.079519 seconds
 res  memory = 266.73 Mbytes
 vir  memory = 423.59 Mbytes
```

This difference becomes bigger when we need to look up more libraries in
experiments.
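The Bloom filter check described in this commit message can be sketched as follows. This is a minimal illustration, not the patch itself: the `GnuBloom` struct and the function names are hypothetical, the filter is built in memory rather than read from the .gnu.hash section of a real file, and 64-bit filter words are assumed. The hash function and the two-bit test follow the GNU hash scheme described in the references above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// GNU symbol hash: start at 5381 and compute h = h * 33 + c for each byte.
uint32_t gnuHash(const std::string &name) {
    uint32_t h = 5381;
    for (unsigned char c : name)
        h = h * 33 + c;
    return h;
}

// Hypothetical in-memory view of the Bloom filter part of .gnu.hash.
struct GnuBloom {
    std::vector<uint64_t> words; // filter words (must be non-empty)
    uint32_t shift2;             // shift used to derive the second bit
};

constexpr uint32_t kWordBits = 64; // assumption: 64-bit ELF filter words

// Bit mask with the two bits that a given hash sets or tests.
uint64_t bloomMask(const GnuBloom &b, uint32_t h) {
    return (uint64_t{1} << (h % kWordBits)) |
           (uint64_t{1} << ((h >> b.shift2) % kWordBits));
}

// What the static linker does conceptually when it records a symbol.
void bloomInsert(GnuBloom &b, uint32_t h) {
    b.words[(h / kWordBits) % b.words.size()] |= bloomMask(b, h);
}

// false: the symbol is definitely not in the library.
// true:  the symbol may be there (false positives are possible).
bool bloomMayContain(const GnuBloom &b, uint32_t h) {
    const uint64_t word = b.words[(h / kWordBits) % b.words.size()];
    const uint64_t mask = bloomMask(b, h);
    return (word & mask) == mask;
}
```

A lookup would call `bloomMayContain(filter, gnuHash(mangled_name))` for each candidate library and only fall through to the full symbol search when the answer is `true`.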
rather than re-creating the full set every time it's needed.
This is a performance optimisation. The routine for finding branch
names is not lightweight, especially when dealing with trees with
tens of branches. The names of the branches are now cached in the
RInterface and their ownership is shared among the RInterface<T> instances.
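The caching idea in this commit message can be sketched with a shared cache object. This is an assumption-laden illustration, not the RInterface code: `Node`, `BranchNameCache`, and `scanBranchNames` are hypothetical names, the scan returns a fixed list, and the sketch is not thread-safe.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Names = std::vector<std::string>;

// Hypothetical stand-in for the expensive branch-name scan.
Names scanBranchNames() { return {"x", "y", "z"}; }

// Cache state shared by all nodes derived from the same root.
struct BranchNameCache {
    Names names;
    bool filled = false;
    int scans = 0; // how many times the expensive scan ran
};

class Node {
public:
    Node() : fCache(std::make_shared<BranchNameCache>()) {}

    // A derived node shares ownership of the same cache.
    Node derive() const { return Node(fCache); }

    // Runs the scan on first use only; later calls reuse the stored names.
    const Names &branchNames() const {
        if (!fCache->filled) {
            fCache->names = scanBranchNames();
            fCache->filled = true;
            ++fCache->scans;
        }
        return fCache->names;
    }

    int scanCount() const { return fCache->scans; }

private:
    explicit Node(std::shared_ptr<BranchNameCache> cache) : fCache(std::move(cache)) {}
    std::shared_ptr<BranchNameCache> fCache;
};
```

Because every derived node holds a `std::shared_ptr` to the same cache, the names stay alive as long as any node does, and the scan runs at most once no matter which node asks first.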
- use auto
- formatting
- use auto
- spell check
- spell check
- formatting
- formatting
linev and others added 26 commits July 4, 2018 13:04
If the current pointer changes due to reallocation, the offset can be used as is
…tectures.

The tests include overlapping local views, as well as the case where depth > 0.
For some time now, the gradient boosting option of TMVA has been called
"BoostType=Grad", not "GradBoost". This updates the textual output to use
the new name.
…or arguments"

This change needs more thought

This reverts commit 48fc442.