Summary
Similar to how Redis failures are handled, if an async job is running and NATS suddenly becomes unreachable, or the job's task queue doesn't exist, the job can be marked as failed so that the user gets feedback. It can then be retried once the system is back up.
See ami/jobs/management/commands/chaos_monkey.py for testing NATS flushing.
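
A minimal sketch of the failure handling described above, for reviewers who want the shape of the logic without reading the diff. It is not the code in this PR: `Job`, `JobState`, `QueueMissingError`, `fail_job_if_queue_unavailable`, and the `check_queue` callable are all hypothetical names, and the real NATS client calls are replaced by a plain callable so the example runs with the standard library only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class JobState(Enum):
    RUNNING = "running"
    FAILED = "failed"


class QueueMissingError(Exception):
    """Hypothetical error: the task queue for the job does not exist."""


@dataclass
class Job:
    # Hypothetical stand-in for the real job model.
    pk: int
    state: JobState = JobState.RUNNING
    errors: list[str] = field(default_factory=list)


def fail_job_if_queue_unavailable(job: Job, check_queue: Callable[[int], None]) -> bool:
    """Fail the job if the queue check raises; return True if the job was failed.

    `check_queue` is an assumed hook that raises when NATS is unreachable
    or when the job's task queue is missing.
    """
    try:
        check_queue(job.pk)
    except (ConnectionError, TimeoutError, QueueMissingError) as exc:
        # Record the reason so the user gets feedback, then mark the job failed.
        job.state = JobState.FAILED
        job.errors.append(f"Task queue unavailable: {exc!r}")
        return True
    return False
```

Because a failed job keeps its error message and nothing else is torn down in this sketch, retrying once the system is up again would amount to resetting the state and re-queuing the work.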