DevOps&SRE Library

12 Factor App Revisited

The Twelve-Factor App is a methodology for building software-as-a-service applications, written by Adam Wiggins. We cover how the factors have evolved since, what we can learn from them today, and how they changed the status quo of yesteryear.

https://architecturenotes.co/12-factor-app-revisited


Navigating the Storm: Strategies for Managing Production Incidents

Streamlining Your Team’s Incident Response

https://medium.com/saas-infra/navigating-the-storm-strategies-for-managing-production-incidents-e92ece8315c


Relational Databases Explained

How relational databases work: this post explains how indexes and transactions work inside relational databases.

https://architecturenotes.co/things-you-should-know-about-databases


goreplay

GoReplay is an open-source tool for capturing and replaying live HTTP traffic into a test environment in order to continuously test your system with real data. It can be used to increase confidence in code deployments, configuration changes and infrastructure changes.

https://github.com/buger/goreplay


Writing Terraform for unsupported resources

TerraCurl is a utility Terraform provider that allows engineers to make managed and unmanaged API calls in their Terraform code.

https://www.hashicorp.com/blog/writing-terraform-for-unsupported-resources


terraform-workspaces-terragrunt-ansible

There are multiple ways to configure environment settings in Terraform. This repo initially evaluated four of them but has since branched out to more methods, with the aim of writing DRY, easy-to-maintain code.

https://github.com/neilpricetw/terraform-workspaces-terragrunt-ansible


Health Checking

https://blog.eightnoteight.dev/p/health-checking


The yaml document from hell

For a data format, yaml is extremely complicated. It aims to be a human-friendly format, but in striving for that it introduces so much complexity that I would argue it achieves the opposite result. Yaml is full of footguns and its friendliness is deceptive. In this post I want to demonstrate this through an example.

This post is a rant, and more opinionated than my usual writing.

https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell


Graceful Shutdown

https://blog.eightnoteight.dev/p/graceful-shutdown


Can We Stop With Those Horrible “System Overview” Dashboards Already?

https://betterprogramming.pub/can-we-stop-with-those-horrible-system-overview-dashboards-already-5ea10a28fecf


Autoscaling Thread Pools

https://blog.eightnoteight.dev/p/autoscaling-threadgoroutine-pools


Rundown of LinkedIn’s SRE practices

https://www.srepath.com/rundown-of-linkedins-sre-practices


Recruiting developers into Site Reliability Engineering (SRE)

https://www.srepath.com/recruiting-developers-site-reliability-engineering-sre-guide


Rundown of Uber’s SRE practice

https://www.srepath.com/rundown-of-uber-sre-practice


Bad Observability

Observability has become a bit of a buzzword in the industry over the last few years. Exactly what "observability" means depends on who you ask, but most people would agree it's about both:

- Being able to observe customer experience and behavior

- Being able to observe and understand what's happening within our technology solutions

There's plenty of content out there telling you how to implement observability, or what good looks like. But what about bad observability? What are some anti-patterns to watch out for?

https://squaredup.com/blog/slight-reliability/bad-observability


What Are Structured Logs and How Do They Improve Performance?

Logging information in a structured format for better analysis and processing of log data

https://betterprogramming.pub/why-you-should-use-structured-logging-format-47a388711316
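
To illustrate what a structured format means in practice, here is a minimal sketch using only Python's standard library. It is not taken from the linked article; the logger name `checkout`, the `context` key, and the field names are hypothetical choices made for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed through `extra=` become attributes on the record.
        # "context" is a hypothetical key chosen for this sketch.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# Log one event into an in-memory buffer, then parse it back.
buffer = io.StringIO()
log = build_logger(buffer)
log.info("payment failed", extra={"context": {"order_id": 42, "retry": 3}})
event = json.loads(buffer.getvalue().strip())
```

Because each line is a JSON object, a log pipeline can filter on a field such as `order_id` directly instead of parsing free text with regular expressions.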


SRE Transformation: our thoughts

https://layeraleph.com/advice/2022/11/15/sre-transformation


Need your own incident post-mortem template? Here’s ours

Having a dedicated incident post-mortem is just as important as having a robust incident response plan. The post-mortem is key to understanding exactly what went wrong, why it happened in the first place, and what you can do to avoid it in the future.

It’s an essential document but many organizations either haphazardly put together post-incident notes that live in disparate places or don’t know where to start in creating their own post-mortems. To help, we’re sharing the incident post-mortem template that we use internally.

This template outlines our “sensible default” for documenting any incident, technical or otherwise. We believe it strikes a healthy balance between raw data, human interpretation, and concrete actions. And we say “sensible default” because it’s rare that this will perfectly cover the specific needs of your organization, and that’s fine. Think of this as a launching off point for your own incident post-mortem document.

Within each section, we’ve outlined the background on what it’s for, why it’s important, and how we advise you to complete it.

https://incident.io/blog/incident-post-mortem-template


Seamless critical traffic migration with CoreDNS request rewrite feature

https://engineering.mercari.com/en/blog/entry/20221213-seamless-critical-traffic-migration-with-coredns-request-rewrite-feature


Site Reliability Engineer (SRE) Interview Preparation Guide

This repository is an attempt to consolidate useful resources for Site Reliability Engineer (SRE) interview preparation.

https://github.com/mxssl/sre-interview-prep-guide
