~sircmpwn/sr.ht-dev

metrics.sr.ht: Add an alert for high rate of server errors v1 PROPOSED

Ignas Kiela: 2
 Add an alert for high rate of server errors
 Add an alert for high rate of server errors

 2 files changed, 22 insertions(+), 0 deletions(-)
Thanks!

To git@git.sr.ht:~sircmpwn/metrics.sr.ht
   9c1389a..3435b2c  master -> master
Looks like this rule got lost somewhere during the deployment.

https://builds.sr.ht/~sircmpwn/job/242956##task-package-23

-- Email domain proudly hosted at https://migadu.com
Nah, new rules just need manual intervention to enable (they have to be
added to prometheus.yml). I just took care of it.
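
For reference, enabling a new rule file usually means listing it under
`rule_files` in prometheus.yml. A minimal sketch, assuming the rule file sits
next to prometheus.yml (the actual path on the metrics.sr.ht host is not
stated in this thread):

```yaml
# prometheus.yml (fragment)
# Assumption: service_rules.yml lives alongside this config file;
# adjust the path to match the real deployment.
rule_files:
  - "service_rules.yml"
```

Prometheus only picks up the new entry after its configuration is reloaded.
Running `promtool check rules service_rules.yml` beforehand validates the
rule file.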

Copy & paste the following snippet into your terminal to import this patchset into git:

curl -s https://lists.sr.ht/~sircmpwn/sr.ht-dev/patches/11360/mbox | git am -3

[PATCH metrics.sr.ht] Add an alert for high rate of server errors

---
 service_rules.yml | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 service_rules.yml

diff --git a/service_rules.yml b/service_rules.yml
new file mode 100644
index 0000000..26fedf0
--- /dev/null
+++ b/service_rules.yml
@@ -0,0 +1,11 @@
+# vim: tw=2 sw=2 :
+groups:
+- name: service
+  rules:
+  - alert: High rate of 500 errors
+    expr: rate(http_requests_total{status="500"}[10m]) > 5 / 60
+    for: 2m
+    labels:
+      severity: important
+    annotations:
+      summary: "{{ $labels.instance }} has a high rate of 500 errors"
-- 
2.17.1


Thanks! Can you bump this to "urgent" severity?
And increase the interval to 5m as well.

[PATCH metrics.sr.ht v2] Add an alert for high rate of server errors

---
 service_rules.yml | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 service_rules.yml

diff --git a/service_rules.yml b/service_rules.yml
new file mode 100644
index 0000000..2a71244
--- /dev/null
+++ b/service_rules.yml
@@ -0,0 +1,11 @@
+# vim: tw=2 sw=2 :
+groups:
+- name: service
+  rules:
+  - alert: High rate of 500 errors
+    expr: rate(http_requests_total{status="500"}[10m]) > 5 / 60
+    for: 5m
+    labels:
+      severity: urgent
+    annotations:
+      summary: "{{ $labels.instance }} has a high rate of 500 errors"
-- 
2.17.1

