Releases · FilippoAiraldi/mpc-reinforcement-learning
v1.3.1
Changes
- implemented `mpcrl.util.geometry.ConvexPolytopeUniformSampler`, which allows for uniformly sampling points from the interior or surface of n-dimensional convex polytopes (a conceptual sketch follows this list)
- improvements to `mpcrl.wrappers.env.MonitorEpisodes`
- modified `mpcrl.util.control.cbf` and the other Control Barrier Function methods so that they do not need to return a `casadi.Function`
- improvements to docs
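
Conceptually, uniform sampling over a convex polytope can be achieved by triangulating it into simplices and sampling each simplex with probability proportional to its volume. The snippet below is a minimal NumPy/SciPy sketch of that idea; it does not reproduce the `ConvexPolytopeUniformSampler` API.

```python
import numpy as np
from math import factorial
from scipy.spatial import Delaunay

def sample_polytope_interior(vertices, n, seed=None):
    """Uniformly samples `n` points from the convex hull of `vertices` (shape (m, d))."""
    rng = np.random.default_rng(seed)
    d = vertices.shape[1]
    tri = Delaunay(vertices)             # triangulate the polytope into simplices
    simplices = vertices[tri.simplices]  # (n_simplices, d + 1, d)
    edges = simplices[:, 1:, :] - simplices[:, :1, :]
    volumes = np.abs(np.linalg.det(edges)) / factorial(d)
    # pick a simplex proportionally to its volume, then sample it uniformly
    # via Dirichlet(1, ..., 1) barycentric weights
    idx = rng.choice(volumes.size, size=n, p=volumes / volumes.sum())
    weights = rng.dirichlet(np.ones(d + 1), size=n)           # (n, d + 1)
    return np.einsum("nij,ni->nj", simplices[idx], weights)   # (n, d)

# e.g., 1000 uniform samples from the unit square
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
points = sample_polytope_interior(square, 1000, seed=42)
```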
v1.3.0
Changes
Major
- heavily improved documentation, though some portions are still work-in-progress
- updated to `csnlp==1.6.1`
- improved `WarmStartStrategy` to provide initial conditions for non-warmstarted variables
- implemented continuous-time, discrete-time and input-constrained Control Barrier Functions in `mpcrl.util.control.cbf`, `dcbf`, and `iccbf`, respectively (a conceptual sketch of the continuous-time condition follows this list)
- implemented `mpcrl.util.geometry.ConvexPolytopeUniformSampler`
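
For reference, a continuous-time Control Barrier Function `h(x)` for dynamics `xdot = f(x, u)` enforces `dh/dx * f(x, u) + alpha(h(x)) >= 0` for a class-K function `alpha`. The snippet below builds this condition symbolically with CasADi as an illustration only; the variable names and the choice `alpha(h) = gamma * h` are assumptions, not the `mpcrl.util.control.cbf` signature.

```python
import casadi as cs

# simple 2D single-integrator dynamics: xdot = f(x, u) = u
x = cs.SX.sym("x", 2)
u = cs.SX.sym("u", 2)
f = u

# barrier function keeping the state inside the unit disc: h(x) >= 0
h = 1 - cs.sumsqr(x)

# continuous-time CBF condition: dh/dx * f(x, u) + gamma * h(x) >= 0
gamma = 0.5
cbf_condition = cs.jacobian(h, x) @ f + gamma * h

# the symbolic expression can be imposed as a constraint in an MPC problem; a
# casadi.Function is handy for quick numerical checks at sample state-action pairs
cbf_fn = cs.Function("cbf", [x, u], [cbf_condition])
print(cbf_fn([0.5, 0.0], [1.0, 0.0]))
```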
Minor
- added the `is_training` property to agents
- improvements to the `mpcrl.util.control.dlqr` and `lqr` methods (a minimal sketch of the discrete-time case follows this list)
- adjusted dependencies
- fixed tests, warnings, and deprecation messages
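
As a refresher, the infinite-horizon discrete-time LQR gain follows from the discrete algebraic Riccati equation. The snippet below is a small SciPy-based sketch of that computation, not the `mpcrl.util.control.dlqr` implementation.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def dlqr_sketch(A, B, Q, R):
    """Infinite-horizon discrete-time LQR: returns the gain K (u = -K x) and the
    solution P of the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
    return K, P

# double-integrator example with sampling time 0.1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K, P = dlqr_sketch(A, B, Q=np.eye(2), R=np.array([[0.1]]))
```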
v1.2.1
Changes
- Updated dependency to `csnlp >= 1.6.0`
- In case of additive exploration, implemented clipping of the action based on `env.action_space` (if this is a `gymnasium.spaces.Box` instance)
- Now passing an `exploration` in `LstdDpgAgent` is mandatory (otherwise, mathematically, the agent won't learn because the advantage function is always zero)
- `LstdDpgAgent` now supports `hessian_type = natural`
- Implemented `wrappers.agents.Evaluate`, which allows periodically evaluating the performance of the training agent
- Implemented a new exploration class, `OrnsteinUhlenbeckExploration` (a conceptual sketch of such noise with action-space clipping follows this list)
- Implemented `bound_consistency` for `GradientBasedOptimizer` instances (when `True`, ensures that the values of the parameters are clipped within their bounds)
- Minor computation simplifications in `LstdQLearningAgent`
- Fixed some deprecation warnings from `numpy` and `gymnasium`
- Fixed tests and docstrings
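
Ornstein-Uhlenbeck noise yields temporally correlated exploration perturbations. The snippet below is a minimal NumPy sketch of such a process, with the perturbed action clipped to a `gymnasium.spaces.Box`; the class and its parameters are illustrative and do not mirror the `OrnsteinUhlenbeckExploration` API.

```python
import numpy as np
from gymnasium.spaces import Box

class OUNoiseSketch:
    """Mean-reverting Ornstein-Uhlenbeck process: dx = theta * (mu - x) * dt + sigma * dW."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.state = np.full(size, mu, dtype=float)
        self.rng = np.random.default_rng(seed)

    def sample(self):
        drift = self.theta * (self.mu - self.state) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.state.shape)
        self.state = self.state + drift + diffusion
        return self.state

# additive exploration, clipped to the action space bounds
action_space = Box(low=-1.0, high=1.0, shape=(2,))
noise = OUNoiseSketch(size=2, seed=0)
nominal_action = np.zeros(2)
explored_action = np.clip(nominal_action + noise.sample(), action_space.low, action_space.high)
```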
v1.2.0.post1
Changes
Major
- implemented a base gradient-free agent, `GlobOptLearningAgent`, and a corresponding Bayesian Optimization example based on BoTorch
- implemented off-policy Q-learning (see the `train_offpolicy` method; a generic sketch of the idea follows this list)
- implemented `WarmStartStrategy` (allows for finer control over how the multistart MPC is fed random initial points)
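
Off-policy Q-learning updates the value-function parameters from transitions collected under a different behavioural policy. The snippet below is a generic sketch of such a temporal-difference update on a linear Q-function over a fixed dataset; it is only meant to illustrate the concept and is unrelated to the MPC-based parametrization used by `train_offpolicy`.

```python
import numpy as np

def offpolicy_q_update_sketch(theta, transitions, features, gamma=0.99, lr=1e-2):
    """One pass of TD(0) Q-learning over pre-collected (s, a, r, s') transitions,
    where Q(s, a) = features(s, a) @ theta is a linear action-value approximation."""
    actions = (0, 1)  # assume a small discrete action set for the greedy max
    for s, a, r, s_next in transitions:
        q_sa = features(s, a) @ theta
        q_next = max(features(s_next, a_) @ theta for a_ in actions)
        td_error = r + gamma * q_next - q_sa          # Bellman residual
        theta = theta + lr * td_error * features(s, a)  # semi-gradient step
    return theta

# toy usage: 1D state, 2 actions, random behavioural data
rng = np.random.default_rng(0)
feats = lambda s, a: np.array([1.0, s, s * a, a])
data = [(rng.normal(), rng.integers(2), rng.normal(), rng.normal()) for _ in range(100)]
theta = offpolicy_q_update_sketch(np.zeros(4), data, feats)
```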
Minor
- reworked the internal structure of optimizers (introduced `BaseOptimizer`)
- reworked the internal structure of agents
- better sensitivity computations in Q-learning
- updated tests and docstrings
v1.1.9
Changes
Major
- improvements to `Agent`'s hooking mechanism: it now uses a `dict` to keep track of callbacks instead of nested function wrappers (a minimal sketch of the pattern follows this list)
- improved seeding
- allowed `csnlp.ScenarioBasedMpc` to be used as the MPC by `Agent`
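
Storing callbacks in a dict keyed by hook name (rather than wrapping methods in nested closures) makes it easy to inspect what runs at each event. The snippet below is a minimal, library-agnostic sketch of the pattern; the names are illustrative.

```python
from collections import defaultdict

class HookedAgentSketch:
    """Keeps callbacks in a plain dict, so registered hooks are easy to inspect."""

    def __init__(self):
        self.hooks = defaultdict(list)  # hook name -> list of callbacks

    def register_hook(self, name, callback):
        self.hooks[name].append(callback)

    def _run_hooks(self, name, *args):
        for callback in self.hooks[name]:
            callback(*args)

    def on_episode_start(self, episode):
        self._run_hooks("on_episode_start", episode)

agent = HookedAgentSketch()
agent.register_hook("on_episode_start", lambda ep: print(f"starting episode {ep}"))
agent.on_episode_start(0)
print(dict(agent.hooks))  # the registered callbacks are directly visible for debugging
```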
Minor
- improvements to internal files
- switched to prehooks
- improved readability of the full hessian calculation in `LstdQLearningAgent`
- improvements to imports and docstrings
- updated tests
v1.1.8
Changes
Major
- upgraded dependency to `csnlp==1.5.8`
- reworked inner computations in both `LstdQLearningAgent` and `LstdDpgAgent` for performance and adherence to theory
- reworked the inner workings of callbacks: they are now stored in an internal dict, which makes them easier to debug
- fixed a disruptive bug in the computation of the parameters' bounds for a constrained update
- implemented the `mpcrl.optim` sub-module, containing different optimizers such as
  - Stochastic Gradient Descent
  - Newton's Method
  - Adam
  - RMSprop
- moved the parameters' constrained-update solver to OSQP (QRQP was having scaling issues)
- removed the `LearningRate` class
- implemented `schedulers.Chain`, which allows chaining multiple schedulers into a single one (a conceptual sketch follows this list)
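
Chaining schedulers means running one schedule for a given number of steps and then handing over to the next. The snippet below is a conceptual sketch of that idea; the class and method names are illustrative, not the `schedulers.Chain` API.

```python
class ConstantSketch:
    def __init__(self, value):
        self.value = value

    def step(self):
        pass  # constant schedule: nothing to update

class ExponentialDecaySketch:
    def __init__(self, value, factor):
        self.value, self.factor = value, factor

    def step(self):
        self.value *= self.factor

class ChainSketch:
    """Runs each (scheduler, n_steps) pair in sequence, exposing a single `value`."""

    def __init__(self, schedulers_and_lengths):
        self.segments = list(schedulers_and_lengths)
        self.index = self.steps_in_segment = 0

    @property
    def value(self):
        return self.segments[self.index][0].value

    def step(self):
        scheduler, length = self.segments[self.index]
        scheduler.step()
        self.steps_in_segment += 1
        if self.steps_in_segment >= length and self.index < len(self.segments) - 1:
            self.index += 1            # hand over to the next scheduler
            self.steps_in_segment = 0

# keep the learning rate at 0.1 for 50 updates, then decay it exponentially
chain = ChainSketch([(ConstantSketch(0.1), 50), (ExponentialDecaySketch(0.1, 0.99), 1000)])
for _ in range(100):
    lr = chain.value
    chain.step()
```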
Minor
- added the possibility of passing an integer argument as `experience`, which creates a buffer with the specified size
- improvements to `mpcrl.util.math`
- improvements to `wrappers.agents.Log` (now uses lazy logging)
- fixed bugs in the `on_episode_end` and `on_episode_start` callback hooks
- improvements to examples
v1.1.7
Changes
Major
- further reworked the QP problem that solves for the RL update
- added the flag `remove_bounds_on_initial_action` to remove the bounds on the first action in `Q(s,a)`, so as to avoid LICQ problems
Minor
- added `use_last_action_on_fail` to let the agent use the last successful action if the MPC fails
- small updates to examples
- small improvements to docstrings and testing
v1.1.6
Changes
Major
- removed support for Python 3.8, and added Python 3.11
- implemented the `StepWiseExploration` strategy, a wrapper for exploration strategies (a minimal sketch follows this list)
- added the possibility of using first-order Q-learning
- improved DPG: removed the hessian (it was wrong), changed the default linear solver to `csparse` due to the sparse nature of the system, and added ridge regression
- changed the QP update problem to take the hessian information directly into account in the hessian of the QP
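
A step-wise exploration wrapper holds the perturbation drawn by the underlying strategy constant for a fixed number of steps before drawing a new one. The snippet below is a minimal sketch of this behaviour; the names are illustrative, not the `StepWiseExploration` API.

```python
import numpy as np

class GaussianExplorationSketch:
    def __init__(self, scale, seed=None):
        self.scale = scale
        self.rng = np.random.default_rng(seed)

    def perturbation(self):
        return self.rng.normal(scale=self.scale)

class StepWiseWrapperSketch:
    """Re-uses the wrapped strategy's perturbation for `hold_steps` consecutive steps."""

    def __init__(self, base, hold_steps):
        self.base = base
        self.hold_steps = hold_steps
        self.counter = 0
        self.current = None

    def perturbation(self):
        if self.counter % self.hold_steps == 0:
            self.current = self.base.perturbation()  # draw a fresh perturbation
        self.counter += 1
        return self.current

explorer = StepWiseWrapperSketch(GaussianExplorationSketch(scale=0.1, seed=0), hold_steps=5)
perturbations = [explorer.perturbation() for _ in range(10)]  # changes every 5 steps
```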
Minor
- improved numba usage with caching and parallel computations where possible
- better seeding with the recommended `np.random.SeedSequence` (a short example follows this list)
- simplified the hooking mechanism with lambdas
- fixed a bug in the examples, where the reward was computed on the next state instead of the current state
- updated README and docstrings
- updated tests
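
NumPy's recommended seeding pattern spawns independent child sequences from a single `SeedSequence`, so that, e.g., each agent, environment, or parallel run gets its own statistically independent generator:

```python
import numpy as np

# one entropy source for the whole experiment
root = np.random.SeedSequence(42)

# spawn independent child sequences, e.g. one per parallel training run
children = root.spawn(3)
generators = [np.random.default_rng(child) for child in children]

samples = [rng.standard_normal(2) for rng in generators]  # independent streams
```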
v1.1.4
Changes
- moved and added some control methods, e.g., `dlqr` and `rk4`, to the dedicated file `util.control` (a minimal RK4 sketch follows this list)
- added a flag to disable updates in learning-based agents during evaluation
- fixed bugs
  - callback hooking in learning-based agents
  - rollout experience never consolidated in the DPG agent
- added an additional check on the shape of learnable parameters to avoid unwanted broadcasting
- updated dependency to `csnlp==1.5.6`
- fixed some tests
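
The classic fourth-order Runge-Kutta scheme integrates `xdot = f(t, x)` over a step `h` via four slope evaluations. The snippet below is a generic sketch of that scheme, not the `util.control.rk4` signature.

```python
import numpy as np

def rk4_step(f, t, x, h):
    """One fourth-order Runge-Kutta step for the ODE dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# e.g., integrate the harmonic oscillator x'' = -x, written in first-order form
f = lambda t, x: np.array([x[1], -x[0]])
x = np.array([1.0, 0.0])
for _ in range(100):
    x = rk4_step(f, 0.0, x, h=0.05)
```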
v1.1.3
Changes
Major
- learnable parameters (i.e., as per `LearnableParameter` and `LearnableParametersDict`) no longer need to be 1D/flattened vectors, but can also be matrices
- added `skip_first` to `UpdateStrategy` to allow skipping the first `n` updates, e.g., to build enough experience before updating (a minimal sketch follows this list)
- updated to `csnlp==1.5.4` and `casadi==3.6.0`
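
Skipping the first few update triggers lets the experience buffer fill up before any parameter update takes place. The snippet below is a minimal sketch of such gating logic; the names are illustrative, not the `UpdateStrategy` API.

```python
class UpdateGateSketch:
    """Triggers an update every `frequency` steps, but skips the first `skip_first` triggers."""

    def __init__(self, frequency, skip_first=0):
        self.frequency = frequency
        self.skip_first = skip_first
        self.steps = 0
        self.triggers = 0

    def can_update(self):
        self.steps += 1
        if self.steps % self.frequency != 0:
            return False
        self.triggers += 1
        return self.triggers > self.skip_first  # ignore the first `skip_first` triggers

gate = UpdateGateSketch(frequency=10, skip_first=3)
update_steps = [t for t in range(100) if gate.can_update()]  # first update at step 39
```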
Minor
- streamlined rollout operations in `LstdDpgAgent`
- renamed the attribute `include_last` to `include_latest` in `ExperienceReplay`
- better type hints and tests