DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
kubernetes-autoscaling-mixin

A set of Grafana dashboards and Prometheus alerts for Kubernetes Autoscaling using the metrics from Kube-state-metrics, Karpenter, and Cluster-autoscaler.


https://github.com/adinhodovic/kubernetes-autoscaling-mixin
phoenix

Phoenix is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting.


https://github.com/Arize-ai/phoenix
Building Production-Ready Multi-Agent Systems on Kubernetes: Real Lessons from Deploying 11 Specialized AI Agents

https://aws.plainenglish.io/building-production-ready-multi-agent-systems-on-kubernetes-real-lessons-from-deploying-11-b01976cd4236
Stretching a Layer 2 network over multiple KubeVirt clusters

https://kubevirt.io/2025/Stretched-layer2-network-between-clusters.html
How we cut Kubernetes costs by ~60% for Feature Environments with KEDA and Prometheus

https://pierreraffa.medium.com/reducing-feature-environment-costs-with-keda-and-prometheus-in-kubernetes-307d0dcc3264
Build Your Own Kubernetes based SaaS Cloud Platform with Kamaji and GitOps

Think building a SaaS platform is out of reach? With Kamaji, GitOps, and Kubernetes, it’s simpler — and more powerful — than it seems.


https://itnext.io/build-your-own-saas-cloud-platform-with-kamaji-and-gitops-aeec1b5f17fd
gateway-api-bench

This repo aims to provide a comprehensive test suite that goes far beyond the conformance tests, to help users better understand the real-world behavior of implementations.


https://github.com/howardjohn/gateway-api-bench
helm-unittest

Unit tests for Helm charts, written in YAML, to keep your charts consistent and robust.


https://github.com/helm-unittest/helm-unittest
k8skonf

Write Kubernetes manifests in TypeScript.


https://github.com/konfjs/k8skonf
keel

Kubernetes Operator to automate Helm, DaemonSet, StatefulSet & Deployment updates


https://github.com/keel-hq/keel
kaniko

kaniko is a tool to build container images from a Dockerfile, inside a container or Kubernetes cluster.

This is a supported replacement of the original GoogleContainerTools/kaniko repository, which was archived in June of 2025.


https://github.com/chainguard-forks/kaniko
You probably don't need Oh My Zsh

Oh My Zsh is still getting recommended a lot. The main problem with Oh My Zsh is that it adds a lot of unnecessary bloat that affects shell startup time.

Since OMZ is written in shell scripts, every time you open a new terminal tab, it has to interpret all those scripts. Most likely, you don't need OMZ at all.


https://rushter.com/blog/zsh-shell
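
To check whether startup time is actually a problem on a given machine, it can be measured directly. The sketch below is not from the linked post; `startup_seconds` is a hypothetical helper that times any command and reports the median over a few runs.

```python
import statistics
import subprocess
import time


def startup_seconds(command: list[str], runs: int = 5) -> float:
    """Return the median wall-clock time, in seconds, to run `command` to completion."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard all output; only the elapsed time matters here.
        subprocess.run(
            command,
            check=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Assuming zsh is installed, `startup_seconds(["zsh", "-i", "-c", "exit"])` starts an interactive shell that loads the user's configuration and exits immediately. Running it with and without the framework enabled gives a rough before/after comparison.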
I Cannot SSH Into My Server Anymore (And That’s Fine)

https://soap.coffee/~lthms/posts/i-cannot-ssh-into-my-server-anymore.html
terraform-mcp-server

The Terraform MCP Server is a Model Context Protocol (MCP) server that provides seamless integration with Terraform Registry APIs, enabling advanced automation and interaction capabilities for Infrastructure as Code (IaC) development.


https://github.com/hashicorp/terraform-mcp-server
What came first: the CNAME or the A record?

On January 8, 2026, a routine update to 1.1.1.1 aimed at reducing memory usage accidentally triggered a wave of DNS resolution failures for users across the Internet. The root cause wasn't an attack or an outage, but a subtle shift in the order of records within our DNS responses.


https://blog.cloudflare.com/cname-a-record-order-dns-standards
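
One way a client can avoid depending on record order is to index the answer section first and then follow the CNAME chain by name. The sketch below is illustrative only and is not Cloudflare's code or any real resolver's. `Record` and `resolve_addresses` are hypothetical names, and the answer section is assumed to be already parsed into simple tuples.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str   # owner name of the record
    rtype: str  # "CNAME" or "A" in this sketch
    value: str  # CNAME target or IPv4 address


def resolve_addresses(query_name: str, answers: list[Record]) -> list[str]:
    """Follow CNAME links from query_name and return the A values at the end of the chain."""
    # Build the alias map up front so the position of each record is irrelevant.
    cname_of = {
        r.name.lower(): r.value.lower() for r in answers if r.rtype == "CNAME"
    }
    name = query_name.lower()
    seen = set()
    # Walk the chain by name; `seen` guards against CNAME loops.
    while name in cname_of and name not in seen:
        seen.add(name)
        name = cname_of[name]
    return [r.value for r in answers if r.rtype == "A" and r.name.lower() == name]
```

With this approach, a response that lists the A record before the CNAME resolves to the same addresses as one that lists the CNAME first.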
The Biggest Time Sinks During Outages

https://uptimelabs.io/articles/time-outages
qmd

An on-device search engine for everything you need to remember. Index your markdown notes, meeting transcripts, documentation, and knowledge bases. Search with keywords or natural language. Ideal for your agentic flows.


https://github.com/tobi/qmd