Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Race condition using redis with cooloff enabled #1144

Open
cpoppema opened this issue Dec 1, 2023 · 1 comment
Open

BUG: Race condition using redis with cooloff enabled #1144

cpoppema opened this issue Dec 1, 2023 · 1 comment

Comments

@cpoppema
Copy link

cpoppema commented Dec 1, 2023

Describe the bug

After updating axes to v6+, these errors started to pop up every now and then:

    AxesProxyHandler.user_login_failed(sender=None, request=request, credentials=credentials)
  File "axes/helpers.py", line 614, in inner
    return func(*args, **kwargs)
  File "axes/handlers/proxy.py", line 119, in user_login_failed
    return cls.get_implementation().user_login_failed(
 File "axes/handlers/cache.py", line 123, in user_login_failed
    failures.append(self.cache.incr(key=cache_key, delta=1))
  File "django_redis/cache.py", line 32, in _decorator
    return method(self, *args, **kwargs)
  File "django_redis/cache.py", line 116, in incr
    return self.client.incr(*args, **kwargs)
  File "django_redis/client/default.py", line 428, in incr
    return self._incr(key=key, delta=delta, version=version, client=client, ignore_key_check=ignore_key_check)
  File "django_redis/client/default.py", line 400, in _incr
    raise ValueError("Key '%s' not found" % key)

AxesProxyHandler.user_login_failed is called manually in an authentication layer for api endpoints in this case. It's likely this wil increase the chances of this race condition to happen when users are making api calls at a fixed interval.

To Reproduce

Steps to reproduce the behavior:

  1. set a AXES_COOLOFF_TIME, eg. 15 minutes
  2. on an empty cache, do a first request which you know will be unauthenticated
  3. repeat the request exactly AXES_COOLOFF_TIME later
  4. observe the ValueError from the stacktrace above.

Since I believe this to be a race condition, this is difficult to reproduce naturally. I think there is always a risk of this happening when axes is configured with a cooloff time. I have only tested with redis cache, I don't know if other backends are affected as well.
In user_login_failed the method self.cache.add is called first, followed by self.cache.incr - if the added value is False. Source code snippet:

        for cache_key in cache_keys:
            added = self.cache.add(key=cache_key, value=1, timeout=cache_timeout)
            if added:
                failures.append(1)
            else:
                failures.append(self.cache.incr(key=cache_key, delta=1))
                self.cache.touch(key=cache_key, timeout=cache_timeout)

Because these two operations are not atomic/in a transaction together, it should be possible for the key to expire after self.cache.add and before self.cache.incr.

To always get this error on the second request I modified user_login_failed by expiring the key before the call toincr:

        for cache_key in cache_keys:
            added = self.cache.add(key=cache_key, value=1, timeout=cache_timeout)
            if added:
                failures.append(1)
            else:
                self.cache.expire(cache_key, timeout=-1)  # flush key
                failures.append(self.cache.incr(key=cache_key, delta=1))
                self.cache.touch(key=cache_key, timeout=cache_timeout)

Expected behavior

I expect django-axes to not cause ValueErrors.

Your environment
python version: 3.9
django version: 3.2.23
django-axes version: 6.1.1
Operating system: debian bullseye

Additional context

This system is running Django 3.x and uses django_redis as the caching backend. That package has an additional flag for incr() that skips the exists-check and avoids raising a ValueError: ignore_key_check found here https://github.com/jazzband/django-redis/blob/d7f1292d60ba9d2176149a9ed46b1e1caffb3c0b/django_redis/client/default.py#L557

Even with the call to self.cache.expire passing ignore_key_check=True no longer raises any ValueErrors:

failures.append(self.cache.incr(key=cache_key, delta=1, ignore_key_check=True))

Since this is a flag specific to django_redis I did not create a merge request.

Possible implementation

@aleksihakli
Copy link
Member

Good catch @cpoppema, it does seem like there is a race condition in the code. One option would be to implement a try-except with a ValueError handler to cope with the exception?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants