DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor) registration: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Atlantis with Terragrunt – Automate Terraform Workflows

https://spacelift.io/blog/atlantis-terragrunt
What are Terraform Lock Files

Learn what a Terraform lock file is and why it's important.


https://scalr-cdn.com/what-are-terraform-lock-files
zensical

A modern static site generator by the creators of Material for MkDocs.


https://github.com/zensical/zensical
unisondb

UnisonDB is an open-source database designed specifically for Edge AI and Edge Computing.

It is a reactive, log-native and multi-model database built for real-time and edge-scale applications. UnisonDB combines a B+Tree storage engine with WAL-based (Write-Ahead Logging) streaming replication, enabling near-instant fan-out replication across hundreds of nodes — all while preserving strong consistency and durability.


https://github.com/ankur-anand/unisondb
From Signals to Reliability: SLOs, Runbooks and Post-Mortems

https://fatihkoc.net/posts/sre-observability-slo-runbooks
SRE math every engineer should know: a practical guide

https://one2n.io/blog/sre-math-every-engineer-should-know-a-practical-guide
Trixter: A Chaos Proxy for Simulating Network Faults

Trixter is a high-performance chaos proxy designed for injecting network faults at the TCP layer. In essence, it’s a TCP proxy that sits between a client and server, forwarding traffic but intentionally sabotaging it according to your specifications.


https://biriukov.dev/posts/trixter-chaos-proxy

https://github.com/brk0v/trixter
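
To make the "TCP proxy that forwards traffic but sabotages it" idea concrete, here is a minimal sketch of a latency-injecting proxy. It is not Trixter's code or API: the names `pump` and `start_delay_proxy` are hypothetical, and a fixed delay per chunk stands in for the richer fault types a real chaos proxy would offer.

```python
import asyncio


async def pump(reader, writer, delay_s):
    """Copy bytes from reader to writer, pausing before each chunk
    to simulate added network latency."""
    try:
        while True:
            chunk = await reader.read(4096)
            if not chunk:  # EOF: the peer closed its side
                break
            await asyncio.sleep(delay_s)  # the injected fault
            writer.write(chunk)
            await writer.drain()
    finally:
        writer.close()


async def start_delay_proxy(listen_port, upstream_host, upstream_port, delay_s):
    """Start a local TCP proxy that forwards to the upstream server
    and delays every chunk in both directions by delay_s seconds.
    (Hypothetical helper for illustration only.)"""

    async def handle(client_reader, client_writer):
        up_reader, up_writer = await asyncio.open_connection(
            upstream_host, upstream_port
        )
        # Forward client -> upstream and upstream -> client concurrently.
        await asyncio.gather(
            pump(client_reader, up_writer, delay_s),
            pump(up_reader, client_writer, delay_s),
        )

    return await asyncio.start_server(handle, "127.0.0.1", listen_port)
```

Pointing a client at the proxy port instead of the real server is enough to see how the application behaves under slow links. Dropping, corrupting, or throttling chunks would slot into the same `pump` loop.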
Your Brain on Incidents

My on-call experience started by accident in the mid-2000s. It was 5pm on the Friday at the end of my first week of employment as a software engineer at a financial services company in London. I was in the process of closing down my IDE for the weekend when my boss sauntered up to my desk, grinning awkwardly. He was carrying an IBM Thinkpad, a Blackberry, and the unmistakable burden of a man who needed to call in a favour.


https://uptimelabs.io/your-brain-on-incidents
Kubernetes Networking Tutorial: A Guide for Developers

https://www.freecodecamp.org/news/kubernetes-networking-tutorial-for-developers
KubeElasti

Kubernetes-native scale-to-zero with zero traffic loss, no code changes, and direct integration with Kubernetes resources.


https://github.com/truefoundry/KubeElasti
opentelemetry-operator

The OpenTelemetry Operator is an implementation of a Kubernetes Operator.


https://github.com/open-telemetry/opentelemetry-operator
zarf

Zarf eliminates the complexity of airgap software delivery for Kubernetes clusters and cloud-native workloads using a declarative packaging strategy to support DevSecOps in offline and semi-connected environments.


https://github.com/zarf-dev/zarf
sriov-network-device-plugin

The SR-IOV Network Device Plugin is a Kubernetes device plugin for discovering and advertising networking resources in the form of:

- SR-IOV virtual functions (VFs)
- PCI physical functions (PFs)
- Auxiliary network devices, in particular Subfunctions (SFs)

which are available on a Kubernetes host.


https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin
Upgrading PostgreSQL with no data loss and minimal downtime

https://palark.com/blog/postgresql-upgrade-no-data-loss-downtime
push-from-k8s-back-to-docker-registry

"Oops, I accidentally deleted my Docker registry. Can I get my images back?" YES. This tool does exactly that.


https://github.com/tazhate/push-from-k8s-back-to-docker-registry
The $1,000 AWS mistake

A cautionary tale about AWS VPC networking, NAT Gateways, and how a missing VPC Endpoint turned our S3 data transfers into an expensive lesson.


https://www.geocod.io/code-and-coordinates/2025-11-18-the-1000-aws-mistake
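
A rough sketch of why routing S3 traffic through a NAT Gateway adds up. The price constant is an assumption, not a figure from the post: $0.045 per GB is the commonly listed NAT Gateway data processing rate for us-east-1, so check current pricing for your region. The function name and the 20,000 GB volume are hypothetical.

```python
# Assumed rate (not from the post): NAT Gateway data processing, us-east-1.
NAT_PROCESSING_USD_PER_GB = 0.045


def nat_processing_cost(gb_transferred, usd_per_gb=NAT_PROCESSING_USD_PER_GB):
    """Estimate the NAT Gateway data processing charge for a given volume.
    Hourly gateway charges are deliberately left out of this sketch."""
    return gb_transferred * usd_per_gb


# Hypothetical volume: 20,000 GB of S3 transfers routed through the NAT Gateway.
print(round(nat_processing_cost(20_000), 2))  # prints 900.0
```

An S3 gateway VPC endpoint carries no data processing charge, so the same traffic sent through one would not incur this line item.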
5 steps to handle peak-season load without overpaying for infrastructure

1️⃣ Determine the maximum load your infrastructure can handle and how much traffic growth to expect during the sale.

2️⃣ Optimize the service and set up fast recovery from backup.

3️⃣ Prepare the infrastructure for scaling. In particular, connect a CDN to speed up content delivery.

4️⃣ Monitor the situation during peak load and check that all nodes are working correctly.

5️⃣ Once traffic drops, return the system to normal operation.

You are awesome! To make your infrastructure even more cost-effective, connect the CDN from Selectel with up to 50% off additional traffic. Register and apply by December 31: https://slc.tl/bs9tj

Advertisement. Selectel JSC. erid: 2W5zFHM31VS