DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
kubernetes-autoscaling-mixin

A set of Grafana dashboards and Prometheus alerts for Kubernetes Autoscaling using the metrics from Kube-state-metrics, Karpenter, and Cluster-autoscaler.


https://github.com/adinhodovic/kubernetes-autoscaling-mixin
phoenix

Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting.


https://github.com/Arize-ai/phoenix
Building Production-Ready Multi-Agent Systems on Kubernetes: Real Lessons from Deploying 11 Specialized AI Agents

https://aws.plainenglish.io/building-production-ready-multi-agent-systems-on-kubernetes-real-lessons-from-deploying-11-b01976cd4236
Stretching a Layer 2 network over multiple KubeVirt clusters

https://kubevirt.io/2025/Stretched-layer2-network-between-clusters.html
How we cut Kubernetes costs by ~60% for Feature Environments with KEDA and Prometheus

https://pierreraffa.medium.com/reducing-feature-environment-costs-with-keda-and-prometheus-in-kubernetes-307d0dcc3264
Build Your Own Kubernetes based SaaS Cloud Platform with Kamaji and GitOps

Think building a SaaS platform is out of reach? With Kamaji, GitOps, and Kubernetes, it’s simpler — and more powerful — than it seems.


https://itnext.io/build-your-own-saas-cloud-platform-with-kamaji-and-gitops-aeec1b5f17fd
gateway-api-bench

This repo aims to provide a comprehensive test suite that goes far beyond the conformance tests, to help users better understand the real-world behavior of implementations.


https://github.com/howardjohn/gateway-api-bench
helm-unittest

Unit tests for Helm charts, written in YAML, to keep your charts consistent and robust.


https://github.com/helm-unittest/helm-unittest
k8skonf

Write Kubernetes manifests in TypeScript.


https://github.com/konfjs/k8skonf
keel

Kubernetes Operator to automate Helm, DaemonSet, StatefulSet & Deployment updates


https://github.com/keel-hq/keel
kaniko

kaniko is a tool to build container images from a Dockerfile, inside a container or Kubernetes cluster.

This is a supported replacement of the original GoogleContainerTools/kaniko repository, which was archived in June of 2025.


https://github.com/chainguard-forks/kaniko
You probably don't need Oh My Zsh

Oh My Zsh is still getting recommended a lot. The main problem with Oh My Zsh is that it adds a lot of unnecessary bloat that affects shell startup time.

Since OMZ is written in shell scripts, every time you open a new terminal tab, it has to interpret all those scripts. Most likely, you don't need OMZ at all.


https://rushter.com/blog/zsh-shell
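
The startup-time claim in the Oh My Zsh item above is easy to check on your own machine. The following is a minimal sketch, not taken from the linked post: it times how long a command takes to start and exit, and the helper name `startup_seconds` is hypothetical. It assumes that running `zsh -i -c exit` (an interactive shell that reads your `.zshrc` and then exits) is a fair proxy for opening a new terminal tab.

```python
import statistics
import subprocess
import time


def startup_seconds(cmd, runs=5):
    """Return the median wall-clock time, in seconds, for cmd to start and exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output and detach stdin so the child never waits on the terminal.
        subprocess.run(
            cmd,
            check=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Example call (assumes zsh is installed; not executed here):
#   startup_seconds(["zsh", "-i", "-c", "exit"])
```

Running it once with your current configuration and once with the framework disabled gives a rough before/after comparison; the median is used so that a single slow run does not skew the result.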
I Cannot SSH Into My Server Anymore (And That’s Fine)

https://soap.coffee/~lthms/posts/i-cannot-ssh-into-my-server-anymore.html
terraform-mcp-server

The Terraform MCP Server is a Model Context Protocol (MCP) server that provides seamless integration with Terraform Registry APIs, enabling advanced automation and interaction capabilities for Infrastructure as Code (IaC) development.


https://github.com/hashicorp/terraform-mcp-server
What came first: the CNAME or the A record?

On January 8, 2026, a routine update to 1.1.1.1 aimed at reducing memory usage accidentally triggered a wave of DNS resolution failures for users across the Internet. The root cause wasn't an attack or an outage, but a subtle shift in the order of records within our DNS responses.


https://blog.cloudflare.com/cname-a-record-order-dns-standards
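
To illustrate why the order of records in a DNS answer can matter at all, here is a minimal sketch of a hypothetical client that reads the answer section in a single pass. It is not Cloudflare's code or any particular resolver library; the `Record` type, both function names, and the example names and addresses are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    name: str   # owner name of the record
    rtype: str  # "CNAME" or "A"
    value: str  # alias target for CNAME, IPv4 address for A


def resolve_single_pass(qname: str, answer: List[Record]) -> Optional[str]:
    """Hypothetical order-sensitive client: walks the answer once, top to bottom."""
    expected = qname
    for rec in answer:
        if rec.name != expected:
            continue  # not the name currently being followed, so it is skipped
        if rec.rtype == "CNAME":
            expected = rec.value  # follow the alias to the next name
        elif rec.rtype == "A":
            return rec.value
    return None


def resolve_any_order(qname: str, answer: List[Record]) -> Optional[str]:
    """Order-independent variant: index all records first, then follow the chain."""
    cnames = {r.name: r.value for r in answer if r.rtype == "CNAME"}
    addrs = {r.name: r.value for r in answer if r.rtype == "A"}
    name = qname
    seen = set()
    while name not in addrs:
        if name in seen or name not in cnames:
            return None  # alias loop or dead end
        seen.add(name)
        name = cnames[name]
    return addrs[name]


cname_first = [
    Record("www.example.com", "CNAME", "cdn.example.net"),
    Record("cdn.example.net", "A", "192.0.2.10"),
]
a_first = list(reversed(cname_first))  # same records, A record listed first
```

With `cname_first`, both functions return the address. With `a_first`, the single-pass version returns `None` because it sees the A record before it has learned the alias, while the order-independent version still resolves the name.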
The Biggest Time Sinks During Outages

https://uptimelabs.io/articles/time-outages