Emergency Runbook
Tabatha D Zeitke edited this page Dec 21, 2022 · 11 revisions
      
This site is hosted on Gatsby Cloud and is maintained and supported by New Relic's Docs Team.
- Troubleshooting dashboard
- #help-documentation (for engineering and content requests)
- #doc_eng_bots (alert and deployment updates)
- Alert policy
- Architecture notes
 
| Scenario | Severity | Resolution |
|---|---|---|
| Site is not loading | ❗ High | Roll back a release |
| All localized pages are throwing 500s | ❗ High | Roll back a release |
| Functionality is broken | | Roll back a release |
| Alert has been triggered | | Respond to an incident |
| Copy needs to be adjusted | 👀 Unknown | Ping @hero in #help-documentation, or leave a comment in the feedback form on the relevant doc page to generate a Jira ticket |

If the site is not loading, or a piece of functionality is broken, you will likely need to roll back to a stable release. There are two ways to roll back a release. The first is through Gatsby Cloud:
- Log in to Gatsby Cloud with GitHub two-factor authentication.
- Select the `docs-website - main` site.
- Scroll down to Build history to see all the previous builds that have been published.
- Find the appropriate build to roll back to.
- Click `Publish` to deploy that build of the site.
If you do not have access to Gatsby Cloud, you can perform a rollback using GitHub:
- Find the pull request (into `main`) that you would like to roll back.
- Click `Revert` to create a new pull request that undoes this work.
- Have someone review the rollback and approve the pull request.
- Once the necessary checks have passed, merge into `main`.
- A build will be triggered in Gatsby Cloud. Once it completes, the rollback will be released.
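
For engineers who prefer the command line, the GitHub rollback above can be approximated locally. This is a minimal sketch, not the team's documented procedure: the helper name `build_revert_commands`, the `revert-<sha>` branch naming, and the `origin` remote are all assumptions. The commands are only built and printed so they can be reviewed before anything runs, and the resulting branch still needs a reviewed pull request into `main`.

```python
from typing import List


def build_revert_commands(
    commit_sha: str,
    is_merge_commit: bool,
    base_branch: str = "main",
    remote: str = "origin",  # assumption: the default remote name
) -> List[List[str]]:
    """Return the git commands that prepare a revert branch for review.

    Hypothetical helper: nothing is executed here, so the caller decides
    whether and how to run the commands.
    """
    if not commit_sha:
        raise ValueError("commit_sha must not be empty")

    revert_branch = f"revert-{commit_sha[:7]}"  # assumed naming scheme

    revert = ["git", "revert", "--no-edit"]
    if is_merge_commit:
        # A merge commit has two parents; "-m 1" reverts relative to the
        # first parent, which is the base branch side of the merge.
        revert += ["-m", "1"]
    revert.append(commit_sha)

    return [
        ["git", "fetch", remote],
        ["git", "switch", "-c", revert_branch, f"{remote}/{base_branch}"],
        revert,
        ["git", "push", "-u", remote, revert_branch],
    ]


if __name__ == "__main__":
    # Placeholder SHA for illustration only.
    for command in build_revert_commands("0123abc4567", is_merge_commit=True):
        print(" ".join(command))
```

The `is_merge_commit` flag matters because a squash-merged pull request lands as an ordinary commit, and passing `-m` for a commit that is not a merge makes `git revert` fail.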
 
The following steps are for on-call engineers working at New Relic:
- Don't panic, you've got this!
- Check to see if there is already an ongoing incident in #emergency-room (or in 2, 3, and 4).
- If there is not an ongoing incident, start one by following the steps in the Incident Commander Runbook.
- Refer to the troubleshooting dashboard to get an idea of what could be going on.
- Look at the recent deployments to production to identify a PR that can be reverted.
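
The last step, narrowing recent production deployments down to a revert candidate, can be sketched as a filter over merged pull requests. This is an illustrative sketch only: `revert_candidates` and the six-hour lookback are hypothetical, and the input is assumed to be a list of dictionaries shaped like pull request records from the GitHub REST API (`number`, `title`, `merged_at`, `base.ref`). The sample data is made up.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def _parse_utc(timestamp: str) -> datetime:
    """Parse a UTC timestamp formatted like 2024-01-01T10:00:00Z."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def revert_candidates(
    pulls: List[Dict[str, Any]],
    incident_start: datetime,
    lookback: timedelta = timedelta(hours=6),  # assumed window, tune as needed
    base_branch: str = "main",
) -> List[Dict[str, Any]]:
    """Return PRs merged into base_branch shortly before the incident,
    newest first, since the most recent merge is the likeliest culprit."""
    window_start = incident_start - lookback
    candidates = []
    for pr in pulls:
        merged_at = pr.get("merged_at")
        if not merged_at:
            continue  # closed without being merged
        if pr.get("base", {}).get("ref") != base_branch:
            continue  # did not target the production branch
        if window_start <= _parse_utc(merged_at) <= incident_start:
            candidates.append(pr)
    return sorted(candidates, key=lambda pr: _parse_utc(pr["merged_at"]), reverse=True)


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"number": 1, "title": "Update nav", "merged_at": "2024-01-01T11:30:00Z", "base": {"ref": "main"}},
        {"number": 2, "title": "Fix typo", "merged_at": "2024-01-01T09:00:00Z", "base": {"ref": "main"}},
        {"number": 3, "title": "Abandoned", "merged_at": None, "base": {"ref": "main"}},
    ]
    incident = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    for pr in revert_candidates(sample, incident):
        print(f"#{pr['number']} {pr['title']} (merged {pr['merged_at']})")
```

The output is a list of candidates to inspect, not a verdict: the troubleshooting dashboard and the pull request diffs still decide which one to revert.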