Add Regressor Weather Forecast Data #1410

Closed
c3-VincentBrisse opened this issue Mar 30, 2020 · 2 comments

Comments

@c3-VincentBrisse

To help with forecasting water consumption, I would like to add a regressor with weather data. I have three years of data.
During cross-validation, I first train on the first two years and predict the next two weeks, then train on the first two years plus one week and predict the next two weeks, and so on.
To avoid data leakage in validation, I should use only weather forecast data for the prediction window, while for better training I should use actual weather data. Is there a way to implement this with Prophet?

@bletham
Contributor

bletham commented Mar 31, 2020

This came up in a discussion a couple of years ago. I made an issue for it in #442, but there isn't any real interface for it. I think the best thing to do will be to write out the CV loop yourself.

Fortunately, I do not think that is too difficult. This is all of the code for doing cross-validation for a single cutoff:

# Generate new object with copying fitting options
m = prophet_copy(model, cutoff)
# Train model
history_c = df[df['ds'] <= cutoff]
if history_c.shape[0] < 2:
    raise Exception(
        'Less than two datapoints before cutoff. '
        'Increase initial window.'
    )
m.fit(history_c, **model.fit_kwargs)
# Calculate yhat
index_predicted = (df['ds'] > cutoff) & (df['ds'] <= cutoff + horizon)
# Get the columns for the future dataframe
columns = ['ds']
if m.growth == 'logistic':
    columns.append('cap')
    if m.logistic_floor:
        columns.append('floor')
columns.extend(m.extra_regressors.keys())
columns.extend([
    props['condition_name']
    for props in m.seasonalities.values()
    if props['condition_name'] is not None])
yhat = m.predict(df[index_predicted][columns])
# Merge yhat(predicts), y(df, original data) and cutoff
return pd.concat([
    yhat[predict_columns],
    df[index_predicted][['y']].reset_index(drop=True),
    pd.DataFrame({'cutoff': [cutoff] * len(yhat)})
], axis=1)

Right now it calls predict on df[index_predicted][columns]. You would just need to change that to plug in the forecasted values for the regressor in the window (cutoff, cutoff + horizon].
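
As a rough sketch of that substitution, the helper below builds the frame passed to predict for one cutoff and overwrites the regressor column with forecasted values. Everything about the forecast table is an assumption for illustration: `forecasts` is a hypothetical DataFrame with an `issued` column (when the weather forecast was made), a `ds` column (the date it targets), and the regressor column. The function name and the rule of taking the most recently issued forecast per date are likewise illustrative, not part of Prophet.

```python
import pandas as pd


def future_with_forecast(df, forecasts, cutoff, horizon, columns, regressor):
    """Build the predict() input for one cutoff using forecasted regressor values.

    df        : full history with 'ds', 'y', and the actual regressor values
    forecasts : hypothetical frame with 'issued', 'ds', and the regressor column,
                one row per (forecast issue date, target date)
    columns   : columns the model needs at predict time (must include regressor)
    """
    # Same window as the single-cutoff code: (cutoff, cutoff + horizon]
    in_window = (df['ds'] > cutoff) & (df['ds'] <= cutoff + horizon)
    future = df.loc[in_window, columns].reset_index(drop=True)

    # Only forecasts issued at or before the cutoff were known at that time.
    known = forecasts[forecasts['issued'] <= cutoff]
    # Keep the most recently issued forecast for each target date.
    latest = (known.sort_values('issued')
                   .drop_duplicates('ds', keep='last')
                   .set_index('ds')[regressor])

    # Overwrite the actual weather values with the forecasted ones.
    future[regressor] = future['ds'].map(latest).to_numpy()
    if future[regressor].isna().any():
        raise ValueError('Missing forecast for some dates in the window.')
    return future
```

With something like this in place, the predict call would become `m.predict(future_with_forecast(...))`, while `m.fit` still sees the actual weather values in `history_c`.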

That does CV for a single cutoff, and then you would just loop over cutoffs and concatenate all of the results.
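
The outer loop could look roughly like the sketch below. The names `make_cutoffs`, `cross_validate`, and `run_single_cutoff` are hypothetical; `run_single_cutoff(df, cutoff)` stands in for whatever per-cutoff function you write around the code above and is assumed to return a DataFrame. The cutoffs here step forward from the start of the data, matching the scheme described in the question (an initial training window, extended by one period each time), which is not necessarily how Prophet's built-in cross-validation picks them.

```python
import pandas as pd


def make_cutoffs(df, initial, period, horizon):
    """Cutoffs starting at first date + initial, stepping by period,
    stopping while a full horizon of data still follows the cutoff."""
    cutoff = df['ds'].min() + initial
    last_allowed = df['ds'].max() - horizon
    cutoffs = []
    while cutoff <= last_allowed:
        cutoffs.append(cutoff)
        cutoff = cutoff + period
    return cutoffs


def cross_validate(df, cutoffs, run_single_cutoff):
    """Run the per-cutoff forecast for each cutoff and stack the results."""
    results = [run_single_cutoff(df, cutoff) for cutoff in cutoffs]
    return pd.concat(results, axis=0).reset_index(drop=True)
```

For the setup in the question, this might be called with `initial=pd.Timedelta(days=730)`, `period=pd.Timedelta(weeks=1)`, and `horizon=pd.Timedelta(weeks=2)`.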

@c3-VincentBrisse
Author

Thank you, that answers my question!
