Investigate combined honest trees + isotonic calibration #2
Comments
I saw your email. Will work on calibration. I'm still learning about honest trees and trying to understand them. Looking to make my first commit this week.
Dear Ronan, I will now rerun the overlapping Gaussian simulation using this method. Audrey
Dear Ronan, The isotonic-calibrated HF works fine and performs slightly better than HF itself. However, when comparing it to the isotonic-calibrated RF on the overlapping Gaussian simulation, the curves look almost identical. I will run the CC18 data experiments now. Regards,
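For context, the kind of curve comparison described above can be reproduced with scikit-learn's reliability diagrams. This is a minimal sketch, using a plain random forest and its isotonic-calibrated counterpart as runnable stand-ins, since the honest forest estimator itself lives in the fork; the dataset parameters are illustrative assumptions.

```python
# Sketch: overlay reliability diagrams for a forest and its isotonic-
# calibrated counterpart (stand-ins for HF / Iso HF from the fork).
import matplotlib.pyplot as plt
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=2, n_informative=2,
                           n_redundant=0, class_sep=0.8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
iso_rf = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    method="isotonic", cv=5,
).fit(X_tr, y_tr)

fig, ax = plt.subplots()
for name, clf in [("RF", rf), ("Iso RF", iso_rf)]:
    prob_pos = clf.predict_proba(X_te)[:, 1]          # P(class 1)
    frac_pos, mean_pred = calibration_curve(y_te, prob_pos, n_bins=10)
    ax.plot(mean_pred, frac_pos, marker="o", label=name)
ax.plot([0, 1], [0, 1], "k--", label="perfect calibration")
ax.set_xlabel("Mean predicted probability")
ax.set_ylabel("Fraction of positives")
ax.legend()
plt.show()
```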
Okay awesome, I'll try to take a look by Friday. I honestly didn't expect a huge difference, but I think the difference between HF and Iso HF is probably the most interesting thing to see in simulations. By the way, CC18 is a big set of experiments; I used a computing cluster, and it took a long time even with many cores running in parallel. I'm not sure it's worth running if the simulations say what they say now; it would be a waste of energy/time/compute unless there is a big reason you think you should.
Dear Ronan, Audrey
You could try individual CC18 datasets. You can probably find the original CSV result files in my repo and see on which datasets IRF and HF did well compared to RF. But if the curves don't look different on a toy dataset, I wouldn't imagine they would look different on a real dataset. Time would be better spent thinking about reasons why they are the same or different, explaining that, and maybe testing through new simulations whether there is a meaningful difference.
What exactly do you mean by new simulations?
So the initial honest paper asked: how does the honest forest compare to other forests and calibration methods? The overlapping Gaussian simulation showed that RF wasn't calibrated well, and that honest forests improved calibration, as did the other methods. The simulation was designed such that RF would fail. The initial question posed in this GitHub issue was "is there a difference between Iso HF and either HF or Iso RF alone?" The simulation you added answers that with: no, there isn't a difference in this example. If you think there is a difference, you should (1) come up with how you think the methods differ and (2) design an experiment where one method fails and the other succeeds.
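For illustration, here is a minimal sketch of a two-class overlapping-Gaussian setup of the kind described above. The means, covariance, and sample sizes are assumptions, not the paper's exact parameters; the point is that with known class-conditionals the true posterior is available in closed form, so any estimated posterior can be scored against it directly.

```python
# Sketch of a two-class overlapping-Gaussian task (illustrative parameters).
import numpy as np

def make_overlapping_gaussians(n_per_class=1000, sep=1.0, seed=0):
    """Two 2-D spherical Gaussians with means at (-sep, 0) and (+sep, 0)."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=[-sep, 0.0], scale=1.0, size=(n_per_class, 2))
    X1 = rng.normal(loc=[+sep, 0.0], scale=1.0, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.r_[np.zeros(n_per_class), np.ones(n_per_class)].astype(int)
    return X, y

def true_posterior(X, sep=1.0):
    """Closed-form P(y=1 | x) for equal priors and identity covariance:
    the log-odds reduce to 2 * sep * x1."""
    return 1.0 / (1.0 + np.exp(-2.0 * sep * X[:, 0]))

X, y = make_overlapping_gaussians()
# Estimated posteriors from any forest variant can now be compared
# against true_posterior(X), e.g. via mean squared error.
```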
In the setting of the overlapping Gaussian clusters, there is a visible difference between Iso HF and HF (and between Iso RF and HF), but there isn't a noticeable difference between Iso RF and Iso HF. I see what you mean now. Do you have any suggestions for papers or readings on other common testing methods for performance differences in posterior probability estimates (which is what I think the paper was trying to address by proposing HF)? Thank you.
I don't think it is an issue with the metric; I think isotonic leads to better calibration (which is what we showed in the HF paper). In the HF paper, I cite a Guo 2017 paper which was seminal. You can also google papers and tutorials on conformal prediction; those are related.
Sorry for any confusion my previous wording may have caused. Just to clarify the objective of the future experiments I'm designing: are we trying to show the superiority of isotonic-calibrated HF over HF, or the superiority of HF over RF when both are isotonically calibrated? As for the Guo paper you mentioned, is it this one: https://arxiv.org/abs/1706.04599 (On Calibration of Modern Neural Networks)?
That's the paper. My HF paper showed that HF is less accurate than RF but improves calibration. It also showed that IRF is generally still more calibrated than HF. The question of this GitHub issue was: is the combination of HF and isotonic calibration even more calibrated than IRF (and hence more than HF)? My instinct was no, and your results suggest no. Future experiments could address this, but honestly I don't think it's an interesting enough question to spend more time on. There are more interesting questions regarding forest methods.
Background
Honest decision trees build upon conventional decision trees by splitting the samples into two sets: one for learning the decision tree structure and the other for learning the classification posterior probabilities. In practice, this provides better calibration (i.e. the estimated probabilities are closer to the true probabilities). See this paper for details.
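As an illustration of the splitting idea, here is a hedged sketch built on scikit-learn. It is a minimal reimplementation for exposition only, not the fork's actual code, and it assumes integer class labels 0..K-1.

```python
# Hedged sketch of an honest tree: half the data decides the tree
# structure, the other half re-estimates the leaf posteriors.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def fit_honest_tree(X, y, random_state=0):
    # Split samples into a structure set and a posterior-estimation set.
    X_str, X_est, y_str, y_est = train_test_split(
        X, y, test_size=0.5, random_state=random_state, stratify=y)

    # Learn the splits only from the structure set.
    tree = DecisionTreeClassifier(random_state=random_state).fit(X_str, y_str)

    # Re-estimate each leaf's class posterior from the held-out set.
    n_classes = len(np.unique(y))
    leaf_ids = tree.apply(X_est)
    posteriors = {}
    for leaf in np.unique(leaf_ids):
        counts = np.bincount(y_est[leaf_ids == leaf], minlength=n_classes)
        posteriors[leaf] = counts / counts.sum()
    return tree, posteriors, n_classes

def predict_proba_honest(tree, posteriors, n_classes, X):
    # Leaves that no estimation sample reached fall back to uniform.
    uniform = np.full(n_classes, 1.0 / n_classes)
    return np.array([posteriors.get(leaf, uniform) for leaf in tree.apply(X)])
```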
The code and experiments for the above paper are located in a fork of ProgLearn. A minimal working example and tutorial can be seen in this notebook.
This issue has been copied over from the ProgLearn repository, but some of the requirements there have already been met by the creation of this repository.
Request
An issue was opened in sklearn, and the simulations and paper attracted developer interest. The paper explored the performance of honest decision forests against the traditional forest as well as two other calibration methods, sigmoid and isotonic. A developer expressed interest in the results of combining honest trees with isotonic calibration, given that isotonic calibration seems to do better than honest posteriors alone. The request is thus to run the simulations and CC18 experiments from the paper with the added honest + isotonic forest method, to see whether this combined approach gives better calibration results than either approach alone.
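In scikit-learn terms, the requested combination could look like the sketch below. RandomForestClassifier is a runnable stand-in; the honest forest estimator from the fork would be dropped in its place. One design note: CalibratedClassifierCV already holds out folds for the isotonic fit, so honest sample-splitting would stack on top of the calibration split, and whether that double split helps or hurts is something the experiments would reveal.

```python
# Sketch of the requested combination: isotonic calibration wrapped
# around a forest (RandomForestClassifier stands in for the honest forest).
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, random_state=0)

iso_forest = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, random_state=0),  # swap in HF
    method="isotonic",
    cv=5,
)
iso_forest.fit(X, y)
calibrated_posteriors = iso_forest.predict_proba(X)
```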
Proposed Workflow
As the current honest forest code and experiments live on a fork, it may be worthwhile to first create a new repository for just the optimized honest forest code and experiments, as a separate entity from ProgLearn. Either way, the rough workflow would be: