Saving and loading a model repeatedly causes it to break #3820
Comments
Hey @lscheinkman and @scottpurdy, this might be another report similar to #3783. @melon3r Can you perhaps attach some code we can run to replicate this?
@melon3r Can you try this...it's working fine for our project. We also found that you can compress the binary data here quite a bit...
Also, there is a minor bug in NuPIC right now (#3805): if you attempt to serialize and deserialize without processing any samples in between, it will error out.
Hey @kyle-sorensen, thank you for the tip, but it didn't work out for me. The model breaks at the exact same point.
@rhyolight I'll try to build a small script to reproduce it and share it ;)
Thanks @melon3r. Numenta engineer @lscheinkman is updating our regression test suite so that we serialize our models in the middle of running the NAB data set, then continue after deserialization. We hope to see this test fail so we can fix the issue and update the source code. Your script might still be helpful, so please continue with it if you can.
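The round-trip check described above can be sketched in miniature. This is not NuPIC code: `ToyModel` is a hypothetical stand-in with a little internal state, and its `save`/`load` methods (JSON here) only stand in for whatever serialization the real model uses. The point is just the shape of the test: a run that is interrupted by save and load after every batch should produce the same outputs as an uninterrupted run.

```python
import json


class ToyModel(object):
    """Hypothetical stand-in for a stateful model (not a NuPIC class)."""

    def __init__(self, count=0, total=0.0):
        self.count = count
        self.total = total

    def run(self, value):
        # Return the running mean so the output depends on accumulated state.
        self.count += 1
        self.total += value
        return self.total / self.count

    def save(self):
        # Stand-in for writing the model to disk.
        return json.dumps({"count": self.count, "total": self.total})

    @classmethod
    def load(cls, blob):
        # Stand-in for reading the model back from disk.
        state = json.loads(blob)
        return cls(count=state["count"], total=state["total"])


def run_uninterrupted(records):
    model = ToyModel()
    return [model.run(r) for r in records]


def run_with_round_trips(records, batch_size):
    """Save and reload the model at the end of every batch."""
    model = ToyModel()
    outputs = []
    for start in range(0, len(records), batch_size):
        for r in records[start:start + batch_size]:
            outputs.append(model.run(r))
        model = ToyModel.load(model.save())
    return outputs


records = [float(i % 7) for i in range(50)]
expected = run_uninterrupted(records)
actual = run_with_round_trips(records, batch_size=8)
assert actual == expected
```

With a real model, the same comparison would fail at the first record where the restored state differs from the in-memory state, which is what makes it useful for pinning down which record breaks.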
I found the "issue". 🤦‍♂️ While trying to replicate it, I found that it always failed at the same record, the 2184th, with this config in the model parameters:
I just copied it from the HotGym example, so I don't even understand it... Can you help?
@melon3r Can you try either removing it from the configuration or (if that doesn't work) making it extremely large? Then try again? If it works at least we know what to fix.
Hi @rhyolight, removing it from the configuration gave it a default value of 4000. I could configure it to be very high, but I don't think that's how it's supposed to be run in production? Are models not supposed to run indefinitely? What is this configuration actually doing? Debugging the error, I found that after processing this number of records the flow changes and it starts doing something with a knn anomaly classification region, which it didn't do before. What's the difference between the processing before and after this threshold is reached?
It has to do with something unrelated to HTM. It is a legacy setting that is just causing trouble, and we should remove it. It is not affecting how the HTM runs, it's just expressing a bug. Set it to 999999999.
Alright, thanks. 999999999 makes for about 1900 years of records at one record per minute, so I guess it'll be good :)
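For anyone checking the arithmetic, here is a quick sketch. It assumes one record per minute and a 365.25-day year; both constant names are just illustrative.

```python
RECORDS = 999999999
MINUTES_PER_YEAR = 60 * 24 * 365.25  # assumes one record per minute

# How long it takes to reach the threshold at that rate.
years = RECORDS / MINUTES_PER_YEAR
print("about %d years" % round(years))
```

That comes out a little over 1900 years, so the threshold is effectively never reached.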
@lscheinkman found that this was still happening when he started writing more tests for #3808.
Hi!
I'm feeding data to a model in small batches, saving the model to disk at the end of each batch, and loading it again for the next one. After a few batches, the model stops working and throws the following error when calling model.run(input):
Here's the code used to load and store the model:
I've tried using a model generated from a previous batch and skipping some batches of data, to find out if it was the data that was somehow generating a bad model, but after the same number of batches, no matter their contents, I get to a broken model again. Thus, I suspect a bug is being triggered at readFromFile or writeToFile (or maybe I'm just doing it wrong).
This is with Python 2.7.9, and nupic 1.0.3 from PyPI.