How to be on-call
https://hart-michael.medium.com/how-to-be-on-call-034e3a202729
I have been on-call for most of my career and have led teams with on-call rotations, so I have a lot of experience with the negative impact of on-call on my personal life and the lives of my colleagues. I’ve missed Christmas dinner (years later my Mom still brings it up), worked through weekends and nights, missed many kids’ events, and once juggled a fussy baby and an incident call at the same time. My goal is to make being on-call as sane as possible, balancing what the business needs with our collective personal lives.
Use of HTTPS Resource Records
https://www.netmeister.org/blog/https-rrs.html
Good news, everybody -- we have new DNS resource records! Well, not new new, but, you know, newish. You've probably heard of them, or even seen them actively in use, even though they moved from internet draft to formal adoption as RFC 9460 literally while I was working on this blog post during the last few weeks: the SVCB and HTTPS resource records.
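To give a feel for what such a record carries, here is a minimal sketch that parses the presentation-format RDATA of an HTTPS record (priority, target name, then key=value service parameters). The `HttpsRecord` class and `parse_https_rdata` function are hypothetical names made up for this illustration and are not taken from the linked post; quoted values and escaped commas are deliberately not handled.

```python
from dataclasses import dataclass, field


@dataclass
class HttpsRecord:
    """Hypothetical container for one parsed HTTPS/SVCB record."""
    priority: int
    target: str
    params: dict[str, list[str]] = field(default_factory=dict)

    @property
    def is_alias(self) -> bool:
        # In RFC 9460, a priority of 0 means AliasMode; anything else is ServiceMode.
        return self.priority == 0


def parse_https_rdata(rdata: str) -> HttpsRecord:
    """Parse simple RDATA such as '1 . alpn=h2,h3 ipv4hint=192.0.2.1'.

    Simplification (assumption): no quoted values, no escaped commas.
    """
    parts = rdata.split()
    if len(parts) < 2:
        raise ValueError(f"expected '<priority> <target> [params...]', got {rdata!r}")
    priority, target, *raw_params = parts
    params: dict[str, list[str]] = {}
    for item in raw_params:
        key, sep, value = item.partition("=")
        # A key without '=' (for example 'no-default-alpn') carries no value.
        params[key] = value.split(",") if sep else []
    return HttpsRecord(int(priority), target, params)


if __name__ == "__main__":
    record = parse_https_rdata("1 . alpn=h2,h3 ipv4hint=192.0.2.1")
    print(record.priority, record.target, record.params)
```

A real resolver library would also validate parameter names and value formats; this sketch only splits the text.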
teks
https://github.com/particuleio/teks
tEKS is a set of Terraform / Terragrunt modules designed to get you everything you need to run a production EKS cluster on AWS. It ships with sensible defaults and adds a lot of common addons, with configurations that work out of the box.
Where Did All The Terraform Testing Go?
https://landadevopsjob.com/blog/where-did-all-the-terraform-testing-go
Terraform documentation made easy with terraform-docs
https://medium.com/@akhilesh-mishra/terraform-documentation-made-easy-with-terraform-docs-096014b00ecf
A complete guide to Terraform documentation with terraform-docs
tfprovidercheck
https://github.com/suzuki-shunsuke/tfprovidercheck
CLI to prevent malicious Terraform Providers from being executed. You can define an allow list of Terraform Providers and their versions, and check that no disallowed providers are used.
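The allow-list idea can be sketched in a few lines. This is only an illustration of the concept, not tfprovidercheck's actual configuration format or implementation: the `find_disallowed` function, the data shapes, and the exact-version matching are all assumptions made for this example.

```python
def find_disallowed(
    used: dict[str, str],
    allowed: dict[str, set[str]],
) -> list[str]:
    """Return one message per provider that violates the allow list.

    used:    provider source address -> version in use (hypothetical shape)
    allowed: provider source address -> set of permitted versions
    Assumption: versions must match exactly; real tools may support ranges.
    """
    violations = []
    for name, version in sorted(used.items()):
        if name not in allowed:
            violations.append(f"{name}: provider is not in the allow list")
        elif version not in allowed[name]:
            violations.append(f"{name}: version {version} is not allowed")
    return violations


if __name__ == "__main__":
    allowed = {"registry.terraform.io/hashicorp/aws": {"5.20.0", "5.21.0"}}
    used = {
        "registry.terraform.io/hashicorp/aws": "5.21.0",
        "registry.terraform.io/example/unknown": "0.1.0",  # made-up provider
    }
    for message in find_disallowed(used, allowed):
        print(message)
```

In a CI job, a non-empty result would typically fail the build before any provider code runs.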
terraform-local
https://github.com/localstack/terraform-local
Terraform CLI wrapper to deploy your Terraform applications directly to LocalStack
liqo
https://github.com/liqotech/liqo
Liqo is an open-source project that enables dynamic and seamless Kubernetes multi-cluster topologies, supporting heterogeneous on-premise, cloud and edge infrastructures.
k8gb
https://github.com/k8gb-io/k8gb
A Global Service Load Balancing solution with a focus on having cloud native qualities and working natively in a Kubernetes context.
Organizing multiple Git identities
https://garrit.xyz/posts/2023-10-13-organizing-multiple-git-identities
Here's a quick tip on how to manage multiple Git identities (e.g. personal, work, client1, client2).
flagd
https://github.com/open-feature/flagd
Flagd is a feature flag daemon with a Unix philosophy. Think of it as a ready-made, open source, OpenFeature-compliant feature flag backend system.
Kubernetes And Kernel Panics
https://netflixtechblog.com/kubernetes-and-kernel-panics-ed620b9c6225
How Netflix’s Container Platform Connects Linux Kernel Panics to Kubernetes Pods
Terraform Security Best Practices
https://sysdig.com/blog/terraform-security-best-practices
In this article we want to explain the benefits of using Terraform, and provide guidance for using Terraform in a secure way by reference to some security best practices.
trivy-operator
https://github.com/aquasecurity/trivy-operator
The Trivy Operator leverages Trivy to continuously scan your Kubernetes cluster for security issues. The scans are summarised in security reports as Kubernetes Custom Resource Definitions, which become accessible through the Kubernetes API. The Operator does this by watching Kubernetes for state changes and automatically triggering security scans in response. For example, a vulnerability scan is initiated when a new Pod is created. This way, users can find and view the risks that relate to different resources in a Kubernetes-native way.
Exploring the OpenTelemetry Collector
https://blog.frankel.ch/opentelemetry-collector
In this post, I explore the different aspects of the Collector:
- The kinds of data: logs, metrics, and traces
- Push and pull models
- Operations: reads, transformations, and writes
Monoliths, Service Architecture, and Microservices
https://architecturenotes.co/granularity-of-systems
There are many discussions about which level of system granularity is the best. We went from monoliths to microservices and back again.
DevOps&SRE Library
Learning From Google SRE Team (part-1)
https://www.codereliant.io/20-sre-lessons…
In this blog post, we aim to expand on the first 5 lessons shared by Google's Site Reliability Engineering team, offering a closer look at practical implementation examples.
Learning From Google SRE Team (part-2)
https://www.codereliant.io/learning-from-google-sre-team-part-2
Source Code Analysis — A Comprehensive Understanding of Kubelet
https://addozhang.medium.com/source-code-analysis-a-comprehensive-understanding-of-kubelet-7a9083514ff0
This article primarily delves into a source code analysis of the kubelet’s functions, key components, and its booting process, summarizing the working principle of kubelet.