2 changes: 1 addition & 1 deletion README.md
@@ -210,7 +210,7 @@ Not every single piece of infrastructure needs every single item on the list but
| &#9744; | <details><summary>Track server metrics</summary> <p> Metrics around what your hardware is doing, such as CPU, memory, and disk usage. Useful tools: [Application Insights](https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability) which is part of [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/overview). </p> </details> |
| &#9744; | <details><summary>Configure services for observability</summary> <p> Record events and stream data from all services. Slice and dice it using tools such as [Kafka](https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-introduction), [Honeycomb](https://www.honeycomb.io/), and of course [Application Insights](https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability) which is part of [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/overview). </p> </details> |
| &#9744; | <details><summary>Store logs</summary> <p> To prevent log files from taking up too much disk space, configure log rotation on every server. To be able to view and search all log data from a central location (i.e., a web UI), set up log aggregation using tools such as [Azure Monitor](https://docs.microsoft.com/en-us/azure/azure-monitor/platform/data-sources-custom-logs), [Filebeat](https://www.elastic.co/products/beats/filebeat), [Logstash](https://www.elastic.co/products/logstash) etc.</p> </details> |
-| &#9744; | <details><summary>Set up alerts</summary> <p> Configure alerts when critical metrics cross pre-defined thresholds, such as CPU usage getting too high or available disk space getting too low. Most of the metrics and log tools listed earlier in this section support alerting. Set up an on-call rotation using tools such as [PagerDuty](https://www.pagerduty.com/), [Opsgenie](https://www.opsgenie.com/) and [VictorOps](https://victorops.com/).</p> </details> |
+| &#9744; | <details><summary>Set up alerts</summary> <p> Configure alerts when critical metrics cross pre-defined thresholds, such as CPU usage getting too high or available disk space getting too low. Most of the metrics and log tools listed earlier in this section support alerting. Set up an on-call rotation using tools such as [PagerDuty](https://www.pagerduty.com/), [Opsgenie](https://www.opsgenie.com/), [Squadcast](https://www.squadcast.com/) and [VictorOps](https://victorops.com/).</p> </details> |

### **Cost optimization**
