Reshape Layer implementation for the GPU Architecture #8

steremma · 2018-05-25T09:43:18Z

This PR implements and tests all the functions of the Reshape Layer in CUDA. Those are:

- The Flatten function.
- The Deflatten function.
- The Reshape function.
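To make the three operations concrete, here is a minimal host-side C++ sketch of what they are assumed to compute. The `Matrix` struct and the free-function signatures are hypothetical and exist only for this illustration; they are not the types or signatures used in this PR, and the row-major layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix, used only for this illustration.
struct Matrix {
   std::size_t rows, cols;
   std::vector<double> data;
   Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
   double &operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
   double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Flatten: pack a batch of nRows x nCols matrices into a single matrix
// that holds one row per batch element.
Matrix Flatten(const std::vector<Matrix> &in, std::size_t nRows, std::size_t nCols)
{
   Matrix out(in.size(), nRows * nCols);
   for (std::size_t i = 0; i < in.size(); ++i) {
      assert(in[i].rows == nRows && in[i].cols == nCols);
      for (std::size_t j = 0; j < nRows; ++j)
         for (std::size_t k = 0; k < nCols; ++k)
            out(i, j * nCols + k) = in[i](j, k);
   }
   return out;
}

// Deflatten: the inverse of Flatten, unpacking each row into its own matrix.
std::vector<Matrix> Deflatten(const Matrix &in, std::size_t nRows, std::size_t nCols)
{
   assert(in.cols == nRows * nCols);
   std::vector<Matrix> out(in.rows, Matrix(nRows, nCols));
   for (std::size_t i = 0; i < in.rows; ++i)
      for (std::size_t j = 0; j < nRows; ++j)
         for (std::size_t k = 0; k < nCols; ++k)
            out[i](j, k) = in(i, j * nCols + k);
   return out;
}

// Reshape: keep the elements in row-major order, change only the dimensions.
Matrix Reshape(const Matrix &in, std::size_t newRows, std::size_t newCols)
{
   assert(in.rows * in.cols == newRows * newCols);
   Matrix out(newRows, newCols);
   out.data = in.data;
   return out;
}
```

In this sketch every output element is copied from exactly one input element, so a CUDA version could plausibly assign one thread per output element; only the index arithmetic is shown here.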

I also refactored the corresponding test suite to remove code duplication between the same tests implemented for different architectures.

steremma · 2018-05-27T11:47:40Z

I renamed this PR to add the [DNM] (Do Not Merge) label. The reason is that I intend to implement the whole ReshapeLayer in this PR rather than opening new ones for subsequent methods. I will let everyone know on Mattermost when this should be merged (within a week, hopefully)!

steremma · 2018-05-29T08:27:46Z

I believe this is ready to be merged :)


The Bloom filter in the header section of .so files is well described by https://blogs.oracle.com/solaris/gnu-hash-elf-sections-v2 and lld/ELF/SyntheticSections.cpp. The point is that the static linker writes the Bloom filter value to the .gnu.hash section of .so files. We just have to read this value and check it against the hash of the mangled_name that we're looking for. A Bloom filter is a probabilistic data structure that allows false positives but not false negatives: it might say "yes" for a library that doesn't contain mangled_name, but it won't say "no" for a library that does contain mangled_name.

Modules without this patch:

```
Processing tutorials/hsimple.C...
hsimple : Real Time = 0.04 seconds Cpu Time = 0.03 seconds
(TFile *) 0x562b37a14fe0
Processing /home/yuka/CERN/ROOT/memory.C...
cpu time = 0.362307 seconds
sys time = 0.039741 seconds
res memory = 278.215 Mbytes
vir memory = 448.973 Mbytes
```

Modules with this patch:

```
Processing tutorials/hsimple.C...
hsimple : Real Time = 0.05 seconds Cpu Time = 0.05 seconds
(TFile *) 0x564410677780
Processing /home/yuka/CERN/ROOT/memory.C...
cpu time = 0.356471 seconds
sys time = 0.079519 seconds
res memory = 266.73 Mbytes
vir memory = 423.59 Mbytes
```

This difference becomes bigger when we need to look up more libraries in experiments.
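The membership test described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: `BloomFilter` is a hypothetical in-memory stand-in for the filter words that would be read from a library's .gnu.hash section, the 64-bit word size assumes a 64-bit ELF file, and `Insert` exists only so the example can build a filter without parsing a real file.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of bits per Bloom filter word (assumption: 64-bit ELF).
constexpr unsigned kBloomBits = 64;

// GNU symbol hash: start from 5381 and apply h = h * 33 + c per character.
std::uint32_t GnuHash(const std::string &name)
{
   std::uint32_t h = 5381;
   for (unsigned char c : name)
      h = h * 33 + c;
   return h;
}

// Hypothetical stand-in for the Bloom filter of one shared library.
struct BloomFilter {
   std::vector<std::uint64_t> words; // filter words (must not be empty)
   unsigned shift2;                  // shift used to derive the second bit

   // Each symbol sets (or tests) two bits inside a single word.
   std::uint64_t Mask(std::uint32_t h) const
   {
      return (std::uint64_t{1} << (h % kBloomBits)) |
             (std::uint64_t{1} << ((h >> shift2) % kBloomBits));
   }

   // What a static linker would do when emitting the filter.
   void Insert(const std::string &name)
   {
      const std::uint32_t h = GnuHash(name);
      words[(h / kBloomBits) % words.size()] |= Mask(h);
   }

   // false: the symbol is definitely absent, so the library can be skipped.
   // true: the symbol may be present; a full lookup is still required.
   bool MayContain(const std::string &name) const
   {
      const std::uint32_t h = GnuHash(name);
      const std::uint64_t mask = Mask(h);
      return (words[(h / kBloomBits) % words.size()] & mask) == mask;
   }
};
```

A caller would run the full symbol lookup only on libraries for which `MayContain` returns true, which is where the saving comes from when many libraries have to be searched.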

rather than re-creating the full set every time it's needed. This is a performance optimisation. The routine for finding branch names is not lightweight, especially when dealing with trees with tens of branches. The branch names are now cached in the RInterface, and ownership of the cache is shared among the RInterface<T> instances.
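The caching idea can be illustrated with a small sketch. The `Interface` class below is a hypothetical stand-in, not the actual RInterface: it assumes the expensive lookup runs once when the first node is created and that every derived node holds a `std::shared_ptr` to the same immutable list of names.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Names = std::vector<std::string>;

// Hypothetical stand-in for an interface node that needs the branch names.
class Interface {
public:
   // Runs the (expensive) scan exactly once and caches the result.
   explicit Interface(const std::function<Names()> &scan)
      : fNames(std::make_shared<const Names>(scan()))
   {
   }

   // A derived node shares ownership of the cache instead of re-scanning.
   Interface Derive() const { return Interface(fNames); }

   const Names &GetBranchNames() const { return *fNames; }

private:
   explicit Interface(std::shared_ptr<const Names> names) : fNames(std::move(names)) {}

   std::shared_ptr<const Names> fNames; // shared, read-only cache
};
```

Because the cache is reference counted, it stays alive for as long as any node that uses it exists, regardless of the order in which the nodes are destroyed.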


steremma requested a review from lmoneta as a code owner May 25, 2018 09:43

steremma changed the title ~~Flatten implementation for the GPU Architecture~~ [DNM] Flatten implementation for the GPU Architecture May 27, 2018

steremma changed the title ~~[DNM] Flatten implementation for the GPU Architecture~~ Flatten implementation for the GPU Architecture May 29, 2018

couet and others added 25 commits June 14, 2018 11:36

- spell check

9ca988b

- use auto

use auto

7eca657

Add operator<< for RSha256Hash, also exploiting ADL

117657e

[IO] Add printout about non-treatment of streamer info record

a159306

in case gDebug is greater than 0.

[DF] Warn the user in case a lazy snapshot was booked but not triggered

cfd707f

both in the sequential and MT case. The sequential case has also been fixed since a non-triggered lazy snapshot caused a crash.

[DF] Add tests for non-triggered lazy snapshot in the sequential and MT cases

8231746

[DOC] Uniform doxygen doc for TTree::Write overloads

1899523

[DF] Clean after test cast

b420f8f

Make header standalone, encapsulate better sha related routines.

4555217

Release Notes

b3cf3ff

- Spell check

0990957

- use auto

- Use auto

c022c71

- use auto

38ae8bc

- use auto

c4b5f0e

- formatting

- spell check

132ee19

- use auto

- use auto

35702ff

Use auto

dd561f6

- use auto

0030a48

- spell check

- use auto

59fd301

- spell check - formatting

- new picture with the new default palette

7970ab8

- Use auto

b40c624

- formatting

new image corresponding to the new default palette

de12875

new image corresponding to the new default palette

bf56fa7

linev and others added 26 commits July 4, 2018 13:04

xml: use only offset from current position in LocateValue

01912c5

If the current pointer changes due to reallocation, the offset can be used as is

xml: fix in bugfix

a3af87a

xml: use variable reference in ExpandStream

d06150b

Introduce dummy for it

xml: no need to reassign curr variable - not a dummy now

47dd373

Added CUDA implementation for Downsample

87cf4b8

Added tests for the CUDA case and refactored the testing suite to eliminate code duplication

35d65f3

Added new testing entries to CMakeLists under the CUDA_ENABLED condition

2251cb0

Added unit tests for back-propagation, running on all available architectures. The tests include overlapping local views, as well as the case where depth > 0.

927fda5

Unified the worker method API calls to make sure all implementation expect the same arguments

59a2c72

Fixed a bug on back-propagation detected using my unit tests

936d5bc

Added CUDA implementation for the back-propagation

b859101

[TMVA] ROOT-9081 -- Replace GradBoost with BoostType=Grad

8a547fb

Since some time back the gradient boosting option of TMVA is called "BoostType=Grad", not "GradBoost". This updates the textual output to use the new name.

[DOCS] Add me and Massimo to contributor list

24df538

Less verbose output when reading GDML auxiliary info.

904a0cd

Added protection for ComputeLradTsaiFactor when Z=0

4f477ee

Fix for ROOT-7917

67f3043

Tested implementation of Flatten on the GPU

87ba961

refactored tests to remove code duplications when running the same test on different architectures

ca6bdbd

Added documentation for the new functionality including docstrings and in-method comments

f4768ed

Added the CUDA implementation for Deflattening. Added tests for both architectures to prove correctness

31e74a5

Simplified ReshapeLayer API by removing redundant constructor arguments

f6ab6bf

fix indexing bug in the Reference and Cpu Reshape implementations

753154a

Added CUDA implementation for the Reshape method

1aa5bd0

Renamed all tests to reflect the fact that they test every method of the `ReshapeLayer` class

5052ce4

Added a test-case for the reshape method

58a66f1

Revert "Simplified ReshapeLayer API by removing redundant constructor arguments"

6c2ce53

This change needs more thought. This reverts commit 48fc442.

steremma force-pushed the reshape branch from f24172c to 1177c51 Compare July 5, 2018 14:28

steremma added 2 commits July 6, 2018 13:23

Fixed all tests to return status code - now CI will catch broken tests. Added tests for the CPU architecture

15193dc

fix warnings

e9bc604

steremma force-pushed the reshape branch from 1177c51 to e9bc604 Compare July 6, 2018 12:04
