Dev0ps

☁️ Yet another TUI utility, this one for working with Amazon EC2 instances: https://github.com/dutchcoders/cloudman

#tui #amazon #фидбечат

12:28

Dev0ps

Forwarded from DevOps&SRE Library

Distributed systems

http://book.mixu.net/distsys/single-page.html

12:32

Dev0ps

Forwarded from DevOps&SRE Library

ddosify

High-performance load testing tool, written in Golang.

https://github.com/ddosify/ddosify

12:32

Dev0ps

Forwarded from DevOps&SRE Library

Service Status Monitoring Using WhatsApp, Notion, and Python

https://www.twilio.com/blog/service-status-monitoring-whatsapp-notion-python

12:47

Dev0ps

Forwarded from DevOps&SRE Library

How we’re building a production readiness review process at Grafana Labs

https://grafana.com/blog/2021/10/13/how-were-building-a-production-readiness-review-process-at-grafana-labs

12:49

Dev0ps

Forwarded from DevOps&SRE Library

ottr

Ottr is a serverless framework for Public Key Infrastructure (PKI) that aims to provide a robust and scalable method to manage end-to-end certificate rotations using an agentless approach.

https://github.com/airbnb/ottr

19:42

Dev0ps

Forwarded from DevOps&SRE Library

apiclarity

Reconstruct Open API Specifications from real-time workload traffic seamlessly.

https://github.com/apiclarity/apiclarity

19:42

Dev0ps

Forwarded from DevOps&SRE Library

The road to world-class monitoring at Azimo

https://medium.com/azimolabs/the-road-to-world-class-monitoring-at-azimo-bb7dfd358441

19:43

Dev0ps

Forwarded from DevOps&SRE Library

Federating Prometheus Effectively

Federation allows a Prometheus server to scrape selected time series from another Prometheus server. Prometheus federation can be used to scale to hundreds of clusters or to pull related metrics from one service’s Prometheus into another.

https://levelup.gitconnected.com/federating-prometheus-effectively-4ccd51b2767b
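A minimal sketch of what "scraping selected time series" means in practice: the pulling server requests the source server's `/federate` endpoint with one or more `match[]` selectors. The host name and the selectors below are illustrative assumptions, not taken from the article.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(base_url: str, matchers: list) -> str:
    """Build a /federate URL that selects series via match[] parameters."""
    if not matchers:
        # Prometheus requires at least one match[] selector on /federate.
        raise ValueError("at least one match[] selector is required")
    query = urlencode([("match[]", m) for m in matchers])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical source server and selectors, for illustration only.
selectors = ['{job="api"}', '{__name__=~"job:.*"}']
url = federate_url("http://prometheus.example:9090", selectors)
print(url)

# Round-trip check: the selectors survive URL encoding unchanged.
decoded = parse_qs(urlsplit(url).query)["match[]"]
print(decoded == selectors)
```

In a real setup the same selectors would normally go under `params` in a scrape job with `metrics_path: /federate` rather than being requested by hand.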

19:44

Dev0ps

Forwarded from DevOps&SRE Library

A different and (often) better way to downsample your Prometheus metrics

https://blog.timescale.com/blog/a-different-and-often-better-way-to-downsample-your-prometheus-metrics

19:44

Dev0ps

Forwarded from DevOps&SRE Library

Five-P factors for root cause analysis

https://cloudpundit.com/2021/10/28/five-p-factors-for-root-cause-analysis

19:45

Dev0ps

https://github.com/Noovolari/leapp

GitHub: Noovolari/leapp. Leapp is the DevTool to access your cloud.

19:49


Dev0ps

Forwarded from Мониторим ИТ

PostgreSQL Monitoring for App Developers: Alerts & Troubleshooting

If you choose only one thing to alert on in your PostgreSQL cluster (and as I hope this article makes clear, you should alert on multiple things), it should be availability. If your application is unable to connect to or transact with your database, you're probably in for a bad day. Read more.

Crunchy Data

When should you be alerted about issues in your PostgreSQL clusters? How do you troubleshoot them? What are some typical solutions?

06:33

Dev0ps

Forwarded from Мониторим ИТ

Percona introduces a new plugin for PostgreSQL monitoring: pg_stat_monitor.

The project is on GitHub.

GitHub: percona/pg_stat_monitor. Query Performance Monitoring Tool for PostgreSQL.

06:35

Dev0ps

https://grafana.com/blog/2020/06/23/how-to-visualize-prometheus-histograms-in-grafana/

Grafana Labs

How to visualize Prometheus histograms in Grafana | Grafana Labs

Learn how to turn a Prometheus histogram into a stat panel, bar gauge, or heat map in Grafana

06:41


Dev0ps

https://www.w3.org/TR/trace-context/

www.w3.org

Trace Context

This specification defines standard HTTP headers and a value format to propagate context information that enables distributed tracing scenarios. The specification standardizes how context information is sent and modified between services. Context information…
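A rough sketch of the value format the specification defines for its `traceparent` header: four lowercase-hex fields joined by dashes (version, trace-id, parent-id, trace-flags). The function names are hypothetical, and the parser is deliberately strict: it accepts only the version `00` layout.

```python
import re
import secrets
from typing import Optional

# version(2 hex)-trace-id(32 hex)-parent-id(16 hex)-trace-flags(2 hex)
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_traceparent(sampled: bool = True) -> str:
    """Create a version-00 traceparent value with random identifiers."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"  # lowest bit is the "sampled" flag
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(value: str) -> Optional[dict]:
    """Parse a version-00 traceparent value; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # This sketch handles only version 00; all-zero identifiers are invalid.
    if version != "00" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


# Example value in the shape shown by the specification.
example = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(example))
```

A service that continues a trace would keep the incoming `trace_id`, generate a fresh `parent_id` for its own span, and send the new value downstream.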

08:27


Dev0ps

https://www.softether.org/

16:09


Dev0ps

Forwarded from Записки админа

📟 Save your engineers' sleep: best practices for on-call processes. The title says it all: practical advice on setting up a sane on-call process.

#напочитать #support #oncall

21:36

Dev0ps

Forwarded from Грефневая Кафка (pro.kafka)

From time to time people ask how to build applications so that they don't go down when Kafka does. That reminded me of an article by Jakub Korab in which he examines different approaches to this problem.

https://www.confluent.io/blog/how-to-survive-a-kafka-outage/

Confluent

Apache Kafka® Broker Failures & Other Outages

Learn common causes of Apache Kafka® broker failures, as well as how to recover from outages and ensure high availability and resilience in your Kafka cluster.

21:51

Dev0ps

Forwarded from Updates rtfm.co.ua 🇺🇦 (rtfmcoua)

Prometheus: Recording Rules and tags, splitting alerts in Slack

We have been using Opsgenie since 2018. It receives alerts from Prometheus, CloudWatch, and Uptrends and then forwards them to us in Slack through its Slack integration. The Slack integrations currently look like this: Each one has a filter by severity level, for example the integration P1, P2 > Slack #devops-alarms-warning: But there is a problem: because the channels are shared, all alerts…

https://rtfm.co.ua/prometheus-recording-rules-i-tegi-razdelyaem-alerty-v-slack/

RTFM: Linux, DevOps, and system administration | DevOps engineering and system administration. Cases from practice.

Using Prometheus Recording Rules and Tags to select a Slack channel via Opsgenie

19:12
