DevOps&SRE Library
A library of articles on DevOps and SRE.

Advertising: @ostinostin
Content: @mxssl

RKN (Roskomnadzor): https://www.gosuslugi.ru/snet/67704b536aa9672b963777b3
kubebrain

A High Performance Metadata System for Kubernetes


https://github.com/kubewharf/kubebrain
hardeneks

Runs checks to see if an EKS cluster follows EKS Best Practices.

https://github.com/aws-samples/hardeneks
k8s-wait-for

A simple script that lets you wait for a k8s service, job, or pods to enter a desired state.

https://github.com/groundnuty/k8s-wait-for
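The "wait until a resource reaches a desired state" idea can be sketched as a plain polling loop. This is a minimal illustration, not how k8s-wait-for itself is implemented: the function name `wait_for`, its parameters, and the `check` callable are all hypothetical, and the caller is assumed to supply whatever readiness check fits the resource.

```python
import time
from typing import Callable


def wait_for(
    check: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds elapse.

    `check` is a hypothetical caller-supplied function that reports whether
    the resource (service, job, pod) is in the desired state.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True   # desired state reached
        if clock() >= deadline:
            return False  # gave up: the timeout expired first
        sleep(interval)   # back off before the next poll
```

A caller would pass a `check` that queries the cluster (for example through a Kubernetes client) and treat a `False` result as a failed startup dependency.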
k9s

K9s provides a terminal UI to interact with your Kubernetes clusters. The aim of this project is to make it easier to navigate, observe and manage your applications in the wild. K9s continually watches Kubernetes for changes and offers subsequent commands to interact with your observed resources.


https://github.com/derailed/k9s
superedge

SuperEdge is an open source container management system for edge computing to manage compute resources and container applications in multiple edge regions. These resources and applications, in the current approach, are managed as one single Kubernetes cluster.


https://github.com/superedge/superedge
k3s-ansible

The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more


https://github.com/techno-tim/k3s-ansible
addon-controller

Sveltos Kubernetes add-on controller programmatically deploys add-ons and applications in tens of clusters. Support for ClusterAPI-powered clusters, Helm charts, kustomize, and YAMLs. Sveltos has built-in support for multi-tenancy.


https://github.com/projectsveltos/addon-controller
kine

Run Kubernetes on MySQL, Postgres, sqlite, dqlite, not etcd.


https://github.com/k3s-io/kine
linstor-server

High Performance Software-Defined Block Storage for containers, cloud, and virtualisation. Fully integrated with Docker, Kubernetes, OpenStack, Proxmox, etc.


https://github.com/LINBIT/linstor-server
RedisInsight

RedisInsight is a visual tool that provides capabilities to design, develop and optimize your Redis application. Query, analyse and interact with your Redis data.


https://github.com/RedisInsight/RedisInsight
ScratchDB

Scratch is an open-source alternative to BigQuery, Redshift, and Snowflake. Runs on ClickHouse.


https://github.com/scratchdata/ScratchDB
terraform-provider-namecheap

A Terraform Provider for Namecheap domain DNS configuration.


https://github.com/namecheap/terraform-provider-namecheap
Argo Workflows - Proven Patterns from Production

Argo Workflows provides an excellent platform for infrastructure automation, and has replaced Jenkins as my go-to tool for running scheduled or event-driven automation tasks.

In growing my experience with Argo Workflows, I’ve killed clusters, broken workflows and generally made a mess of things. I’ve also built a lot of workflows that needed refactoring as they became difficult to maintain.

This blog post aims to share some of the lessons I’ve learned, and some of the patterns I’ve developed, to help you avoid the same mistakes I’ve made.


https://hodgkins.io/argo-workflow-proven-patterns-from-production
Top 10 common Dockerfile linting issues

We've added the ability to lint Dockerfiles on demand in Depot. This post covers the top 10 most common Dockerfile linting issues we've seen flowing through Depot.


https://depot.dev/blog/dockerfile-linting-issues
Scaling Elasticsearch by Cleaning the Cluster State

We often get questions like:

- How much data can I put in an Elasticsearch cluster?
- How many nodes can an Elasticsearch cluster have?
- What’s the biggest cluster that you’ve seen?

And while the 14-year-old in me is proud to say that we’ve done 24/7 support for clusters of 1000+ nodes holding many PB of data, I am quick to add that:

1. It doesn’t mean it’s a good idea to have clusters that big.
2. Such generic questions deserve more nuanced answers, which is exactly what this blog post provides. It applies to OpenSearch as well as Elasticsearch and, for the most part, to Solr (where the cluster state is stored in ZooKeeper).


https://sematext.com/blog/elasticsearch-scaling-cluster-state
Learning From Google SRE Team (part-1)

In this blog post, we aim to expand on the first 5 lessons shared by Google's Site Reliability Engineering team, offering a closer look at practical implementation examples.


https://www.codereliant.io/20-sre-lessons-from-google-part1
SRE Interview Prep Plan (Week 3)

This week, we're taking another significant step forward as we get into the critical stack of monitoring and alerting. Now, it's time to equip yourself with the knowledge and tools needed to keep an eye on systems, analyze performance, and respond quickly to any issues that may come up.


https://www.codereliant.io/sre-interview-prep-plan-week-3
tailspin

A log file highlighter

https://github.com/bensadeh/tailspin
The costs of microservices

The microservices architecture adds more moving parts to the overall system, and this doesn’t come for free. The cost of fully embracing microservices is only worth paying if it can be amortized across dozens of development teams.


https://robertovitillo.com/costs-of-microservices