Interfaces continuum bg [ckarwin's]#498
Conversation
|
As far as I can tell, this is almost ready to merge, it just need the extra |
|
Might close #382. Let's remember to check once the unit tests are running. |
Codecov Report❌ Patch coverage is
🚀 New features to boost your workflow:
|
|
Updates: The conflicts with develop have been solved, so the unit tests are running. Now there are unit tests for ubuntu and macOS (with an M chip), as well as w/wo installing pytorch (_ml). There are 4 checks in total based on their combination. All of them are working in the develop branch, but the one with pytorch and running in the M chip is failing. This was expected. Also, note that the unit test with pytorch running on ubuntu is working fine. Hopefully the changes suggested by @mabp1992 will fix that and it can be merged. Note: I know that there's going to be an exception raise for macOS, just the graceful one I requested. Pytest allows you to check for a specific exception https://docs.pytest.org/en/7.1.x/how-to/assert.html |
|
Thanks, @israelmcmc. I will get to this soon. |
0073560 to
4b43a59
Compare
|
The discussion will continue on #519 |
I'm bringing @ckarwin PR from my fork to the main repo. See discussion in:
israelmcmc#10
The continuum BG estimation class is ready to be checked. Below I give a high-level summary of what is contained in this PR and some of the key points.
I have added an example notebook:
BG_estimationNN_example.ipynb. There you will find an explanation of the algorithm, as well as examples of the functionality. The tutorial also includes a spectral fit using the estimated BG. The spectral fit follows the same procedure as outlined in the spectral fit example indocs/api/interfaces/examples/crab.The previous BG estimation class has been removed entirely. The simple interpolation method that was used for the in-painting in the old class has been added as a subclass in the new class (
ContinuumEstimationNN.py). The code for the old algorithm has been optimized significantly. It used to take on the order of hours to run this method, and it now takes on the order of seconds. This is nice to have as a baseline of comparison to the NN-based in-painting methods (more on this below).I updated the tutorial auto-run so that the new tutorial can be ran automatically from the terminal.
This class requires pytorch and torch geometric. For now, I am assuming that these are optional dependencies for cosipy, and so likewise, the class is setup so that nothing should break if the user doesn't have these libraries.
I added unit tests, but they will only run if pytorch and torch geometric are installed. I believe this is coded properly so that the test should just skip if the dependencies are not installed. I'm not sure how this will work out for the coverage requirement of the pull request, but please keep this in mind.
As shown in the tutorial, the NN-based algorithm (using the supervise mode) performs decently: the recovered Crab spectrum is within ~10% of the recovered spectrum using the ideal BG model. The NN-based in-painting performs better than the simple in-painting algorithm, in terms of the recovered Crab spectrum. However, the simple algorithm results in a better likelihood. I don't quite understand this at the moment. In general, the algorithm needs to be tested further and developed to get the optimum results. The good news is that most of the fundamental code is in place now, and so this will make it easier to test and refine.
I spent a lot of time looking over the new refactoring, and trying to understand the logic better. As part of this, I started using pycharm to browse the code, which is helpful. However, for the continuum BG estimation class, I decided not to make this an independent BG protocol. My reasoning here is that I don't think it's really necessary. As shown in the tutorial, I can use the
FreeNormBinnedBackgroundinterface and just replace the input BG (h5) file.I think this might touch on a more general consideration which should be discussed further (maybe not necessarily as part of this PR). It's indeed necessary for the code to be modular, however, at a certain point I think we also run the risk of things becoming unnecessarily complicated. This is something we should avoid.