DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

Roskomnadzor (RKN): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Karpenter at Beekeeper by LumApps: Fun Stories

At the beginning of this year, we (Beekeeper by LumApps Engineering) decided to adopt Karpenter for our EKS (Kubernetes/K8s) workloads, replacing our previous node autoscaling setup that used cluster-autoscaler with a managed autoscaling group (ASG). We made this decision before the release and hype of EKS Auto Mode, which is why we chose to implement a self-managed Karpenter solution.


https://medium.com/beekeeper-technology-blog/karpenter-at-beekeeper-by-lumapps-fun-stories-7c55656f02b8
Extracting JVM Data from Crash-Looping Java Containers in Kubernetes

https://medium.com/@zelldon91/getting-data-out-of-burning-java-containers-6e0c8bb53eec
Intelligent Kubernetes Load Balancing at Databricks

Real-Time, Client-Side Load Balancing for Internal and Ingress Traffic in Kubernetes


https://www.databricks.com/blog/intelligent-kubernetes-load-balancing-databricks
Strengthen Kubernetes Security with Vault Agent Injector

https://hackernoon.com/strengthen-kubernetes-security-with-vault-agent-injector
build

Shipwright is an extensible framework for building container images on Kubernetes.


https://github.com/shipwright-io/build
kexa

Kexa is an open-source compliance management tool that simplifies security and compliance across multiple cloud platforms including Azure, Google Cloud, AWS, and more.


https://github.com/kexa-io/kexa
clickhouse-operator

The Altinity Kubernetes Operator for ClickHouse creates, configures and manages ClickHouse clusters running on Kubernetes.


https://github.com/Altinity/clickhouse-operator
ch-vmm

ch-vmm is a Kubernetes add-on for running Cloud Hypervisor virtual machines. By using Cloud Hypervisor as the underlying hypervisor, ch-vmm enables a lightweight and secure way to run fully virtualized workloads in a canonical Kubernetes cluster.


https://github.com/nalajala4naresh/ch-vmm
Enroll

Enroll inspects a Debian-like or RedHat-like system, harvests the state that matters, and generates Ansible roles/playbooks so you can bring snowflakes under management fast.


https://enroll.sh
PHP 8.5 benchmarks: The state of PHP performance across major CMSs and frameworks

PHP 8.5 has now been officially released, and developers naturally want to know what kind of performance improvements they can expect across popular CMSs and frameworks.

To find out, we benchmarked 13 widely used CMSs and frameworks, including WordPress, WooCommerce, Drupal, Joomla, Laravel, Symfony and CodeIgniter, on PHP 8.2, 8.3, 8.4, and 8.5 under identical conditions. WordPress was also tested on PHP 7.4, since a notable share of sites still run on that version.

Our intention is to provide a clear, practical look at how performance shifts across recent PHP releases and what you can expect when upgrading.


https://kinsta.com/blog/php-benchmarks
Finding the grain of sand in a heap of Salt

How do you find the root cause of a configuration management failure when you have a peak of hundreds of changes in 15 minutes on thousands of servers?

That was the challenge we faced as we built the infrastructure to reduce release delays due to failures of Salt, a configuration management tool. (We eventually reduced such failures on the edge by over 5%, as we’ll explain below.) We’ll explore the fundamentals of Salt, and how it is used at Cloudflare. We then describe the common failure modes and how they delay our ability to release valuable changes to serve our customers.

By first solving an architectural problem, we provided the foundation for self-service mechanisms to find the root cause of Salt failures on servers, datacenters and groups of datacenters. This system is able to correlate failures with git commits, external service failures and ad hoc releases. The result of this has been a reduction in the duration of software release delays, and an overall reduction in toilsome, repetitive triage for SRE.

To start, we will go into the basics of the Cloudflare network and how Salt operates within it. And then we’ll get to how we solved the challenge akin to finding a grain of sand in a heap of Salt.


https://blog.cloudflare.com/finding-the-grain-of-sand-in-a-heap-of-salt
Rethinking QA: From DevOps to Platform Engineering and SRE

A wake‑up call for QA to upskill for platform engineering and SRE, including cloud‑native practices, automation mastery, and system reliability at scale.


https://dzone.com/articles/rethinking-qa-from-devops-to-platform-engineering
Queue-Based Autoscaling Without Flapping: Rethinking App Scaling with K8s, KEDA, and RabbitMQ

https://blog.stackademic.com/autoscaling-with-message-queues-why-everyone-gets-it-wrong-with-kubernetes-keda-rabbitmq-and-f1a4c38e0df4
helm-controller

A simple way to manage helm charts with Custom Resource Definitions in k8s.
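
As a rough illustration of what managing a chart through a custom resource can look like, here is a minimal sketch of a HelmChart manifest. The chart name, repository URL, namespaces, and values are hypothetical placeholders, and which namespace the controller watches depends on how it is deployed.

```yaml
# Hypothetical example: every name, URL, and value below is a placeholder.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: example-app              # placeholder resource name
  namespace: kube-system         # assumption: the namespace the controller watches
spec:
  repo: https://charts.example.com   # placeholder chart repository
  chart: example-app                 # placeholder chart name
  targetNamespace: default           # namespace the release is installed into
  valuesContent: |-
    replicaCount: 2
```

Applying a manifest like this lets the controller install or upgrade the release, so the chart is handled like any other Kubernetes object.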


https://github.com/k3s-io/helm-controller
nxs-universal-chart

nxs-universal-chart is a Helm chart you can use to install any of your applications into Kubernetes/OpenShift and other orchestrators compatible with native Kubernetes API.


https://github.com/nixys/nxs-universal-chart
percona-xtradb-cluster-operator

Percona Operator for MySQL based on Percona XtraDB Cluster (PXC) automates the creation and management of highly available, enterprise-ready MySQL database clusters on Kubernetes.


https://github.com/percona/percona-xtradb-cluster-operator
k8z

A lightweight, modern mobile and desktop application for managing Kubernetes. Easy to use, fast, and secure.


https://github.com/k8zdev/k8z
opentelemetry-host-metrics

When you're monitoring infrastructure with OpenTelemetry, the Host Metrics Receiver (hostmetrics) is one of the most relevant components to reach for.

It fully replaces traditional agents (like Prometheus Node Exporter), and collects essential system metrics such as CPU, memory, disk, and network usage directly from the machine where the Collector is running.

Because this receiver needs direct access to the underlying system, it's intended to be used when the Collector is deployed as an Agent. For example, as a DaemonSet on Kubernetes nodes or as a service on a VM or bare-metal host, not as a centralized gateway.

In this guide, you'll learn how to configure it as a Node Exporter alternative for monitoring your server infrastructure.
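
For orientation, a minimal Collector configuration using this receiver might look like the sketch below. The collection interval, the choice of scrapers, and the debug exporter are assumptions made for illustration and are not taken from the guide.

```yaml
# Minimal sketch, assuming a Collector build that includes the
# hostmetrics receiver and the debug exporter.
receivers:
  hostmetrics:
    collection_interval: 30s   # assumed interval; tune to your needs
    scrapers:
      cpu:
      memory:
      disk:
      filesystem:
      network:
      load:

exporters:
  debug:                       # prints metrics to the Collector's own output

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      exporters: [debug]
```

In a real deployment the debug exporter would typically be swapped for one that ships metrics to a backend.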


https://www.dash0.com/guides/opentelemetry-host-metrics
Migrating Kubernetes out of the Big Cloud Providers

“Move to Kubernetes to save costs,” they said in the early days of the k8s frenzy. The promise rested on efficient bin packing of pods onto nodes saving on node costs (there is also autoscaling, but regular cloud services already offer that).

The reality is that the overhead of running the control plane and auxiliary services on each node (DNS, metric and log collectors, etc.), plus the many easy ways to make costly mistakes, turns most Kubernetes installations into a more expensive proposition than running the same workloads without it.

For the record, the great thing about k8s and the reason for its success (besides resume-driven technology) is the standardization it provides and its extensibility, or modularity (this plug-in advantage is also the reason mediocre software like WordPress is successful, for example).

For a startup, managed k8s from the “big three” public cloud providers is expensive: Amazon Elastic Kubernetes Service (EKS) on AWS, Google Kubernetes Engine (GKE) on GCP, and Azure Kubernetes Service (AKS) on Azure.

On the other hand, I don’t want to manage k8s master nodes and essential services on “bare metal” (funny that nowadays that means a virtual machine, or VM), so I was looking for an intermediate solution between expensive fully managed k8s and the cheapest (in dollars, not in time) option, completely self-managed k8s.


https://medium.com/@duran.fernando/migrating-kubernetes-out-of-the-big-cloud-providers-45a378943d5c