SRE Bytes: The Four Golden Signals of Monitoring
https://medium.com/@chaoskyle/sre-bytes-the-four-golden-signals-of-monitoring-317420631db6
An Incident Command Training Handbook
https://blog.danslimmon.com/2019/06/24/an-incident-command-training-handbook
Effective SRE: SLO Engineering and Error Budget
https://medium.com/@info_51889/effective-sre-slo-engineering-and-error-budget-cc1ce142274b
pg_activity
pg_activity is a top-like application for PostgreSQL server activity monitoring.
https://github.com/dalibo/pg_activity
agnos
Obtain (wildcard) certificates from Let's Encrypt using dns-01 without the need for API access to your DNS provider.
https://github.com/krtab/agnos
How to Handle Kubernetes Health Checks
https://doordash.engineering/2022/08/09/how-to-handle-kubernetes-health-checks
tracetest
Tracetest is an OpenTelemetry-based tool that helps you develop and test your distributed applications. It assists you in the development process by enabling you to trigger your code and see the trace as you add OTel instrumentation. It also lets you create trace-based tests from the data contained in your OpenTelemetry trace. You can verify against both the triggering transaction's response and any of the information contained deep in a span in your trace.
https://github.com/kubeshop/tracetest
Introducing the official ClickHouse plugin for Grafana
https://grafana.com/blog/2022/05/05/introducing-the-official-clickhouse-plugin-for-grafana
Observability Best Practices when running FastAPI in a Lambda
https://www.eliasbrange.dev/posts/observability-with-fastapi-aws-lambda-powertools
k8spacket
k8spacket - packet traffic visualization for Kubernetes
https://github.com/k8spacket/k8spacket
bindplane-op
BindPlane OP is an open source observability pipeline that gives you the ability to collect, refine, and ship metrics, logs, and traces to any destination. BindPlane OP provides the controls you need to reduce observability costs and simplify the deployment and management of telemetry agents at scale.
https://github.com/observIQ/bindplane-op
6 Best Practices for Effective Readiness and Liveness Probes
https://www.datree.io/resources/kubernetes-readiness-and-liveness-probes-best-practices
Optimizing TCP for high WAN throughput while preserving low latency
https://blog.cloudflare.com/optimizing-tcp-for-high-throughput-and-low-latency
Slowing Down to Speed Up – Circuit Breakers for Slack’s CI/CD
How Slack increased developer productivity and prevented cascading internal failures by implementing orchestration-level circuit breakers.
https://slack.engineering/circuit-breakers
gprofiler
gProfiler is a system-wide profiler, combining multiple sampling profilers to produce a unified visualization of what your CPU is spending time on.
https://github.com/Granulate/gprofiler
jc
CLI tool and Python library that converts the output of popular command-line tools, file types, and common strings to JSON, YAML, or dictionaries. This allows piping of output to tools like jq and simplifies automation scripts.
https://github.com/kellyjonbrazil/jc
A Comprehensive Guide to Terraform
A series of posts that will teach you best practices for using Terraform in the real world.
https://blog.gruntwork.io/a-comprehensive-guide-to-terraform-b3d32832baca
Update, Sep 28, 2022
paranoia
Paranoia is a tool to analyse and export trust bundles (e.g., "ca-certificates") from container images. These certificates identify the certificate authorities that your container trusts when establishing TLS connections. By design, TLS allows any certificate authority that your container trusts to issue a certificate for any domain. This means that a malicious or compromised certificate authority could issue a certificate to impersonate any other service, including your internal infrastructure.
https://github.com/jetstack/paranoia