DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN: https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
Terraform Patterns

This is purely my perspective as a practitioner with firsthand visibility into several working solutions in my career as a software consultant. Much of the vocabulary used in this series is of my own imagination and will surely cede to better nomenclature from the community. Moreover, many implementations I have seen in practice include multiple types of patterns discussed in this series.

Part 1: Module Types: https://medium.com/devoops-discourse/terraform-observed-part-1-module-types-9dec5aa9dc9f

Part 2: Module Arrangement: https://medium.com/devoops-discourse/terraform-observed-part-2-module-arrangement-109d2cf517e1

Part 3: State Misconceptions & Pitfalls: https://medium.com/devoops-discourse/terraform-patterns-observed-part-3-state-misconceptions-pitfalls-e051ca1b7be9
terravision

Terravision renders Terraform code as live, professional cloud architecture diagrams by analysing the code dynamically. It supports AWS, Google Cloud, and Azure.

https://github.com/patrickchugh/terravision
The Dark Side of SRE

Site Reliability Engineering has emerged as one of the hottest career paths in tech in recent years. SREs get to tackle technical challenges on complex systems at scale, and are well compensated for their specialized skill set.

From the outside, the life of an SRE might seem prestigious and full of opportunity. But behind the curtain you can often find a reality full of chronic stress, career stagnation, and occupational hazards.

By exploring the flip side of SRE, we can make more informed decisions about our engineering careers and set realistic expectations. Whether you're an aspiring or current SRE, let's discuss the darker aspects of the role.

https://www.codereliant.io/the-dark-side-of-sre
Being The First SRE

I have been the first Site Reliability Engineer (SRE) several times as a consultant or full-time employee. I’ve been the tech lead on three SRE teams and the only SRE on two others. I’ve succeeded (growing from one SRE to a team of five twice) and failed (quitting without another SRE being found). Here’s what I’ve learned about being the first SRE.

https://medium.com/@hans.knechtions/being-the-first-sre-7866a22975b4
GKE (Google Kubernetes Engine) Review

What if Kubernetes was idiot-proof?

https://matduggan.com/gke-google-kubernetes-engine-review
Understanding the Terraform Check Block Feature

We dive into one of Terraform's most recent features to leverage infrastructure validation.

https://masterpoint.io/updates/understanding-terraform-check
Traffic 101: Packets Mostly Flow

Slack handles billions of inbound network requests per day, all of which traverse our edge network and ingress load balancing tiers. In this blog post, we’ll talk about how a request flows, from a Slack user's perspective, across the vast ether of the network to reach AWS and then Slack’s internal services. Let’s dive in!

https://slack.engineering/traffic-101-packets-mostly-flow
beyla

eBPF-based auto-instrumentation of HTTP/HTTPS/gRPC Go services, as well as HTTP/HTTPS services written in other languages (by intercepting kernel-level socket operations as well as OpenSSL invocations).

https://github.com/grafana/beyla
Backup-and-Restore of Containers with Kubernetes Checkpointing API

Kubernetes v1.25 introduced the Container Checkpointing API as an alpha feature. This provides a way to back up and restore containers running in Pods, without ever stopping them.

This feature is primarily aimed at forensic analysis, but general backup-and-restore is something any Kubernetes user can take advantage of.

So, let's take a look at this brand-new feature and see how we can enable it in our clusters and leverage it for backup-and-restore or forensic analysis.

https://martinheinz.dev/blog/85
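
As a rough illustration of what calling this API involves: to my understanding of the Kubernetes documentation, the kubelet exposes checkpointing as `POST /checkpoint/{namespace}/{pod}/{container}` on its HTTPS port (10250 by default), and only when the `ContainerCheckpoint` feature gate is enabled. The sketch below only builds that request. The helper names and the example node name are hypothetical, and actually sending the request would also need kubelet client credentials, which are left out here.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: the kubelet listens on its default HTTPS port.
KUBELET_PORT = 10250


def checkpoint_url(node, namespace, pod, container, port=KUBELET_PORT):
    """Build the kubelet checkpoint URL for one container (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the path.
    path = "/".join(quote(part, safe="") for part in (namespace, pod, container))
    return f"https://{node}:{port}/checkpoint/{path}"


def checkpoint_request(node, namespace, pod, container):
    """Return an unsent POST request; authentication is deliberately omitted."""
    return Request(checkpoint_url(node, namespace, pod, container), method="POST")


if __name__ == "__main__":
    # "node-1", "default", "web" and "nginx" are made-up example values.
    req = checkpoint_request("node-1", "default", "web", "nginx")
    print(req.get_method(), req.full_url)
```

Keeping URL construction separate from sending makes the sketch easy to check without a cluster. See the linked article for how to enable the feature and where the checkpoint archive ends up.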
Benchmarking Kubernetes node initialization

In this benchmark we compared initialization time across 8 managed Kubernetes providers.

https://symbiosis.host/blog/comparing-node-launch-times
Write your Kubernetes Infrastructure as Go code — Manage AWS services

Deploy DynamoDB and a client app using cdk8s along with AWS Controller for Kubernetes

https://itnext.io/write-your-kubernetes-infrastructure-as-go-code-manage-aws-services-815ecd4d1af8
etcd-backup-restore

Etcd-backup-restore is a collection of components to back up and restore etcd. It also provides the ability to validate the data directory, so that we know the data directory is in good shape to bootstrap etcd successfully.

https://github.com/gardener/etcd-backup-restore
kubectl-foreach

Run kubectl commands in all/some contexts in parallel (similar to GNU xargs+parallel)

https://github.com/ahmetb/kubectl-foreach
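
To make the idea concrete, here is a minimal sketch of fanning one kubectl command out to several contexts in parallel. It is not how kubectl-foreach itself is implemented; the function names are hypothetical. It assumes `kubectl` is on the PATH and relies only on kubectl's standard `--context` flag.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_kubectl(context, args):
    """Run one kubectl command against a single context; return (exit code, stdout)."""
    proc = subprocess.run(
        ["kubectl", "--context", context, *args],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def foreach_context(contexts, args, runner=run_kubectl, max_workers=8):
    """Run the same command in every context concurrently and collect results by context."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the commands overlap, then gather the results.
        futures = {ctx: pool.submit(runner, ctx, args) for ctx in contexts}
        return {ctx: future.result() for ctx, future in futures.items()}


if __name__ == "__main__":
    # "dev" and "prod" are made-up context names.
    for ctx, (code, out) in foreach_context(["dev", "prod"], ["get", "nodes"]).items():
        print(f"[{ctx}] exit={code}\n{out}")
```

The `runner` parameter exists so the fan-out logic can be exercised with a stub instead of a real cluster.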
Deploying non-deployable things on ArgoCD with Kustomize, handling edge cases

https://faun.pub/deploying-non-deployable-things-on-argocd-with-kustomize-handling-edge-cases-aa51d24b3e4d
Full CI/CD workflow with Skaffold for your application

A modern way to build a complete workflow from local to production, with Skaffold and GitLab on a Kubernetes cluster, to reduce cognitive load and operational complexity in application stacks.

https://blog.equationlabs.io/series/workflow-with-skaffold
ClickHouse Keeper: A ZooKeeper alternative written in C++

In this post, we describe the motivation, advantages, and development of ClickHouse Keeper and preview our next planned improvements. Moreover, we introduce a reusable benchmark suite, which allows us to simulate and benchmark typical ClickHouse Keeper usage patterns easily. Based on this, we present benchmark results highlighting that ClickHouse Keeper uses up to 46 times less memory than ZooKeeper for the same volume of data while maintaining performance close to ZooKeeper.

https://clickhouse.com/blog/clickhouse-keeper-a-zookeeper-alternative-written-in-cpp
launchpad

Launchpad is a command-line tool that lets you easily create applications on Kubernetes.

In practice, Launchpad works similarly to Heroku or Vercel, except everything is on Kubernetes.

https://github.com/jetpack-io/launchpad
etcdadm

etcdadm is a command-line tool for operating an etcd cluster. It makes it easy to create a new cluster, add a member to, or remove a member from an existing cluster. Its user experience is inspired by kubeadm.

https://github.com/kubernetes-sigs/etcdadm