Introductions
So this is my collection of thoughts, ideas and craziness documented for everyone to see. I have a lot of different interests that I cycle through irregularly. It's hard to keep track of the bits and pieces so I figure I should write some of this down.
I don't like blogging; it's too much like trying to keep a journal or a diary ( not that there is anything wrong with that ). Instead I have specific, task-focused documents that I want to keep around so that when I revisit a topic, I can pick up where I left off.
If others find my documents and notes of interest, then that's great. That is not the reason why I'm doing this though.
If folks have suggestions or corrections, then great! Create a fork, branch and submit a PR. I don't mind being corrected and giving credit where credit is due. I'm not right all the time and there are much more knowledgeable folks out there than me.
Kubernetes
Kubernetes is a big deal, and is something that I enjoy using.
I have decided to document my kubernetes setup so that other folks can follow along and see what I have going on, and maybe learn a thing or two along the way. It is also documented because I tend to rebuild the mess every couple of months and I usually forget something along the way!
Order of Operations
Installing Kubernetes
This is for a single host installation. I'll include instructions for adding an additional host down the line if you are so inclined.
I have consolidated a number of different documents, mostly for the troubleshooting.
Setup
This is for installation on an Ubuntu 24.04 LTS machine. I would recommend a machine with at least 8GiB of RAM, 16GiB of hard drive space and at least two cores. The more resources you have, the better things will run and the more stuff you can cram into Kubernetes.
Make sure your server is up to date before we get started.
sudo apt update && sudo apt full-upgrade -y 
Step By Step
Step One: Disable Swap
Kubeadm will complain if swap is enabled, so let's disable that.
sudo swapoff -a
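Note that swapoff -a only lasts until the next reboot. Here is a minimal sketch for keeping swap off, assuming the swap entry lives in /etc/fstab ( the usual place on Ubuntu ). The function name disable_swap_in_fstab is made up for this example.

```shell
# Comment out every active swap entry in an fstab-style file.
# A backup of the original is kept next to it with a .bak suffix.
disable_swap_in_fstab() {
    fstab="${1:-/etc/fstab}"
    sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Run it from a root shell, or apply the same sed expression with sudo sed. Have a look at the file afterwards to confirm that only the swap line changed.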
Step Two: Kernel Parameters
There are some parameters that need to be tuned in the Linux kernel for Kubernetes to work properly.
sudo tee /etc/modules-load.d/containerd.conf << EOF
overlay
br_netfilter
EOF
sudo tee /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
EOF
sudo modprobe overlay && sudo modprobe br_netfilter && sudo sysctl --system
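It's worth confirming that the settings actually took before moving on. A small sketch that reads the values back with sysctl -n; the function name check_kube_sysctls is just for illustration.

```shell
# Print OK/FAIL for each kernel setting the cluster depends on.
# Returns non-zero if any of them is not set to 1.
check_kube_sysctls() {
    status=0
    for key in net.bridge.bridge-nf-call-iptables \
               net.bridge.bridge-nf-call-ip6tables \
               net.ipv4.ip_forward; do
        value="$(sysctl -n "$key" 2>/dev/null)"
        if [ "$value" = "1" ]; then
            echo "OK   $key = $value"
        else
            echo "FAIL $key = ${value:-unset}"
            status=1
        fi
    done
    return "$status"
}
```

If a bridge setting shows up as unset, the br_netfilter module is probably not loaded; lsmod will tell you.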
Step Three: Container Runtime Installation
Let's install containerd.
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmour -o /etc/apt/trusted.gpg.d/docker.gpg && \
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" && \
sudo apt update && \
sudo apt install -y containerd.io
Containerd needs a configuration file. Luckily, we can generate one.
containerd config default | sudo tee /etc/containerd/config.toml >/dev/null 2>&1 
sudo sed -i 's/SystemdCgroup \= false/SystemdCgroup \= true/g' /etc/containerd/config.toml
Then we'll need to enable and restart containerd.
sudo systemctl enable containerd && \
sudo systemctl restart containerd
Step Four: Kubernetes Runtime Installation
This will install Kubernetes 1.34 on the machine. At the time of writing, it is the most current version.
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key | sudo gpg  --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg && \
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.34/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list && \
sudo apt update && \
sudo apt install -y kubelet kubeadm kubectl
Step Five: Kubeadm Execution
Now for the part that we've all been waiting for! The prerequisites are in place so now it's time to get kubernetes up and running.
sudo kubeadm init
This will take a little while to run.
Step Six: Tool Configuration
Once the installation is complete, you will need to configure kubectl. Execute the following:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
This will copy the configuration to your home directory. This will allow you to use kubectl from anywhere on the system.
Step Seven: Remove the Taints
This is a single node system, which means that it's also running as the control plane. Normally, Kubernetes doesn't want additional pods on the control plane, which makes for an interesting catch-22. Thankfully, we can fix that.
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Step Eight: Container Networking
In order for the node to become ready, it will need networking installed. For simple, one node installs I personally think Calico is perfectly reasonable.
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.30.3/manifests/calico.yaml
It will take a little bit for the networking jitters to settle.
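Rather than polling by hand, kubectl can block until the node is ready. A minimal sketch; the wrapper name wait_for_nodes and the five minute default are my own choices.

```shell
# Block until every node reports Ready, or give up after the timeout.
wait_for_nodes() {
    timeout="${1:-300s}"
    kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}
```

Call it as wait_for_nodes, or wait_for_nodes 60s for a shorter wait. A non-zero exit means the timeout ran out before the node came up.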
Step Nine: Does it work?
Let's pull up the cluster information
kubectl cluster-info
Do all nodes show Ready?
kubectl get nodes -o wide
What do the pods look like?
kubectl get pods -o wide -A
Step Ten: Install Something!
You can also create a very basic nginx deployment just to see if things work.
kubectl apply -f https://gist.githubusercontent.com/bbrietzke/c59b6132c37ea36f9b84f1fee701a642/raw/952524cec7892e9db350fc62773c32ddfd9ab867/kubernetes-test.yaml
Then:
open http://kubernetes-host.local:30081/
And you should see a website in the browser of your choice.
Local Setup
There are some tools that I like to have on my local machine that make working with Kubernetes much easier. This document will go through the installation and configuration of them.
What to install?
Since I'm on a Mac, everything is installed through [Homebrew](https://brew.sh/).
brew install k9s helm kubernetes-cli
Configuration
Kubernetes Tools
The only thing that really needs configuration is the kubernetes configuration tools. For that, you need to get a copy of the kube config file from the control-plane.
mkdir -p ~/.kube
scp ubuntu@basil.local:.kube/config ~/.kube/config
This configures both k9s and kubectl, so that bit is done. Both of the tools should work as you would expect.
Helm
You don't have to configure helm, but it's not a bad idea either. My personal chart repository is configured as follows.
helm repo add bbrietzke http://bbrietzke.github.io/charts
I have a few charts that I tend to use, in particular for setting up namespaces. I have three: prod, dev, and infra. There isn't much else to customize, so this just works.
helm install namespaces bbrietzke/namespaces
It's also the first helm chart I created.
Adding a Worker Node
On the control plane, all you need is:
kubeadm token create --print-join-command
Just copy and paste that over to the new worker node and it will do the rest.
Managing Certificates in Kubernetes
Dealing with TLS certificates is a pain in the butt!
This document is just a rehashed, shortened view with my specific configuration. You can find the full documentation over at cert-manager.io.
Installation
Helm
helm repo add jetstack https://charts.jetstack.io && helm repo update
There are a number of ancillary resources that have to be installed. You can do it manually, or let the helm chart do it ( which is what I did ).
helm install \
  cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.11.0 --set installCRDs=true
Go over the verify section of the official docs to make sure it's working.
Configuration
You have to create issuers per namespace that will actually create and distribute the certificates. It's one of those resources that you created when you installed the helm charts.
Self-Signed
I created self-signed certificates for my namespaces just because.
Here is an example Issuer resource:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: dev-selfsigned-issuer
  namespace: dev
spec:
  selfSigned: {}
I'm sure I'll write up a Helm chart at some point with the issuers that I need.
Using the Certificate Manager
The certificates are mostly used by your ingress controllers to prove that the domain is valid and encrypt the communications between the origin and the client. I'm sure they can be used elsewhere, but this is the scenario that I use them for.
You will need to modify the ingress resource definition to be similar to:
...
kind: Ingress
metadata:
    namespace: dev
    annotations:
        cert-manager.io/issuer: dev-selfsigned-issuer
...
The namespace of the Ingress must match the namespace of the issuer, since an Issuer only serves its own namespace.
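The fragment above leaves out the rest of the resource. Here is a fuller sketch, assuming a hypothetical service called demo listening on port 80 and published as demo.local. Those names and the demo-tls secret are placeholders and not part of my setup. The tls block matters: cert-manager uses the hosts and secretName listed there to decide what certificate to create and where to store it.

```shell
# Emit a complete Ingress manifest that asks cert-manager for a certificate.
# Everything named "demo" below is a placeholder.
render_demo_ingress() {
    cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  namespace: dev
  annotations:
    cert-manager.io/issuer: dev-selfsigned-issuer
spec:
  tls:
    - hosts:
        - demo.local
      secretName: demo-tls
  rules:
    - host: demo.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo
                port:
                  number: 80
EOF
}
```

Pipe the output into kubectl apply -f - once the names match your own service.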
NFS Persistent Volumes
Do your pods need to have persistent volumes for your home Kubernetes cluster? Turns out, NFS is an option.
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
Helm Charts
You can add the helm chart with:
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
Then always pull the values, since you will need to customize the NFS server IP and path.
helm show values nfs-subdir-external-provisioner/nfs-subdir-external-provisioner
An example customized one looks like the following:
nfs:
  server: 10.0.0.155
  path: /srv/nfs_shares/kubernetes
And of course, we have to provision:
helm install -f nfs_prov.yml nfs-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner -n infra
Metrics server is not quite as easy to install as advertised, at least on systems installed with kubeadm. But the fix to make it work is pretty simple, assuming you do it the insecure way.
The correct way is more complicated, but also secure.
Installing
The code and instructions to install metrics server can be found here. Again, I'm not repeating them here so that they don't go stale.
The instructions work perfectly, things get installed and then don't work. At all.
The metrics pod just never becomes ready and the logs complain about SSL certificates being invalid.
Cheating and Being Insecure
The easiest option is to add --kubelet-insecure-tls to the container's args array ( spec.template.spec.containers[0].args in the Deployment ).  I added it as the last one.
You can make this edit either in the deployment or prior to doing a kubectl apply if you download the manifest files.
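If you would rather not edit by hand, a JSON patch can append the flag for you. This is a sketch that assumes the stock manifest, where the Deployment is named metrics-server and lives in kube-system; the wrapper name is made up.

```shell
# Append --kubelet-insecure-tls to the first container's args.
# Assumes the stock manifest: Deployment "metrics-server" in kube-system.
patch_metrics_server_insecure() {
    kubectl patch deployment metrics-server -n kube-system --type=json \
        -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'
}
```

The trailing /- in the path means "append to the end of the array", which matches adding it as the last argument.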
RAID Arrays
Let's build a few different kinds of RAID arrays and then use them.
I'm not going to go into detail about what RAID is, or the different levels since there is plenty of documentation out there already. These are the commands to set up software RAID for Linux and the general workflow to follow.
Do we have any now?
Let's just double check
cat /proc/mdstat
Okay, which Devices?
Insert the new drives into the machine. You should know what they come up as, but if you don't, try:
lsblk
That should get you a list of all the block devices on the system and where they are being used. Some of them don't make sense, but you should see the ones that you just added. If needed, run the command and copy down the results. Then insert the drives and execute it again.
I'll be using the following:
/dev/sda
/dev/sdb
/dev/sdc
/dev/sdd
Create the ARRAYS!
RAID 0
We'll start with RAID 0, since it will allow us to use all the drives as one big ( though not redundant ) block device.
sudo mdadm --create --verbose /dev/md0 --level=raid0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
RAID 1
Simple mirroring. The easiest to use, a decent redundancy package and not all that wasteful.
sudo mdadm --create --verbose /dev/md0 --level=raid1 --raid-devices=2 /dev/sda /dev/sdb
RAID 5
Probably the best all around choice. Best use of capacity and good performance.
sudo mdadm --create --verbose /dev/md0 --level=raid5 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
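To compare what each level leaves you with, here is the capacity arithmetic as a small sketch. It assumes equal-sized drives, and the function name raid_usable is just for illustration.

```shell
# Usable capacity for the three levels used here, given N equal drives.
#   raid0: all drives striped together        -> N * size
#   raid1: every drive holds the same data    -> size
#   raid5: one drive's worth goes to parity   -> (N - 1) * size
raid_usable() {
    level="$1"; drives="$2"; size="$3"
    case "$level" in
        raid0) echo $(( drives * size )) ;;
        raid1) echo "$size" ;;
        raid5) echo $(( (drives - 1) * size )) ;;
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
}
```

With four 4 TB drives, raid_usable raid5 4 4 prints 12, while raid_usable raid0 4 4 prints 16.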
Creating the FileSystem
sudo mkfs -t ext4 /dev/md0
Mounting
You should have these drives come up every time you want to use them, so add the entries to /etc/fstab.
First you need the UUID of the array.
sudo blkid /dev/md0
Take the UUID from that output, then open up /etc/fstab and add something along the lines of:
UUID=655d2d3e-ab31-49c7-9cc3-583ec81fd316 /srv ext4 defaults 0 0
Then you can execute sudo mount -a and have the array appear where you wanted it.
Update config
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
then
sudo update-initramfs -u
Destroying RAID Arrays
You create RAID drives, so you should know how to tear them down.
So we built a few RAID arrays and mounted them, so that's grand. Now let's tear them down.
Unmount everything
Make sure you have the arrays unmounted from their normal ( or abnormal ) paths. If you don't, not much of this is going to work out well.
Check mdstat
Let's check to see which arrays we currently have configured.
cat /proc/mdstat
The above should return any arrays you currently have built and what state they are in. In this case, we need to know the names of the arrays so that we can remove them.
Remove Arrays
Okay, let's stop the arrays.
sudo mdadm --stop /dev/md127  # or whatever was returned above
Now that we don't have working arrays, we can zero out the superblocks on the member drives so that mdadm no longer sees them as part of an array.
sudo mdadm --zero-superblock /dev/sda /dev/sdb /dev/sdc /dev/sdd
Trust, but Verify
Rerun the mdstat and make sure those pesky things are gone.
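If you only care about the array names, a one-line filter does the job. A minimal sketch, assuming the arrays use the usual mdN naming; the function name list_md_arrays is made up.

```shell
# Print the name of every md array found in an mdstat-style file.
# Defaults to /proc/mdstat when no file is given.
list_md_arrays() {
    awk '/^md[0-9]+ :/ { print $1 }' "${1:-/proc/mdstat}"
}
```

No output means no arrays are left.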
Custom Host Publishing
If you're like me, you probably have Avahi running on all your servers just to make name resolution simpler. What you probably didn't know is you can do some neat tricks with Avahi.
Neat Trick Number One
There is a command called avahi-publish that will publish a hostname on your network.  This is pretty cool, because it means you don't have to remember to add the hostname to your hosts file.  It also means you can use this to publish a hostname for a server that doesn't have Avahi installed.
For example:
avahi-publish -a -R pandora.local 192.168.1.10 # or whatever IP address that you want...
Now you can ping pandora.local and it will respond!
What good is this? Imagine giving your router an avahi registered name and you can log into it without having to remember the IP. If you're on Xfinity, you can do:
avahi-publish -a -R router.local 10.0.0.1
You will be able to ping router.local and it will respond!
Neat Trick Number Two
So pinging a router by name is nice and all, but not really that exciting.
What you can do is combine the above with Kubernetes' ingress resource definitions to have multiple ingresses on the same host without having to do anything magical to DNS.
Neat Trick Number Three
Again, neat-o and all, but now you have a terminal up and running hosting names and that's just a waste of energy. What if the terminal window closes or the machine resets? Then you have to manually execute the commands to get the network back online.
Systemd to the rescue!
[Unit]
Description=Avahi OwnCloud
[Service]
ExecStart=/usr/bin/avahi-publish -a -R pandora.local 10.0.0.238
Restart=always
[Install]
WantedBy=default.target
Save the file in /etc/systemd/system ( i.e. /etc/systemd/system/avahi-pandora.service ).  Then treat it as any normal service.
sudo systemctl enable avahi-pandora
sudo systemctl start avahi-pandora
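Once you publish more than one name ( say, one per ingress ), writing the unit files by hand gets old. Here is a sketch that prints a unit in the same shape as the one above for any name and address. The function name make_avahi_unit and the grafana.local name used below are hypothetical.

```shell
# Print a systemd unit that keeps one Avahi address record published.
#   $1 = hostname to publish, $2 = IP address it should resolve to
make_avahi_unit() {
    name="$1"; ip="$2"
    cat <<EOF
[Unit]
Description=Avahi publish $name

[Service]
ExecStart=/usr/bin/avahi-publish -a -R $name $ip
Restart=always

[Install]
WantedBy=default.target
EOF
}
```

For example, make_avahi_unit grafana.local 10.0.0.238 piped into sudo tee /etc/systemd/system/avahi-grafana.service gives you a second service to enable and start the same way.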
Systems and Application Monitoring with Prometheus
Trying to figure out what is going on when something is broken can be hard, so it's nice to have tooling to help with that. Prometheus is one such tool. It can also, with proper tuning and work, warn you before something breaks.
And it makes pretty graphs. Everybody loves pretty graphs!
Prometheus?
Prometheus monitoring solution is a free and open-source solution for monitoring metrics, events, and alerts. It collects and records metrics from servers, containers, and applications. In addition to providing a flexible query language (PromQL), and powerful visualization tools, it also provides an alerting mechanism that sends notifications when needed.
Prerequisites
A machine that can run Ubuntu 22.04 ( or other LTS ).
You should also have basic administrative knowledge and an account that has sudo access on the above box.
Installation
Update the system
sudo apt update && sudo apt -y upgrade
Create the Prometheus User Account
sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus
Create Directories
These are for configuration files and libraries.
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
Install Prometheus
Now for the fun part!
The tarball used for this is the latest ( as of this writing ) LTS for Prometheus. You can change it to be what you need.
wget https://github.com/prometheus/prometheus/releases/download/v3.5.0/prometheus-3.5.0.linux-amd64.tar.gz
tar zvxf prometheus*.tar.gz
cd prometheus*/
sudo mv prometheus /usr/local/bin
sudo mv promtool /usr/local/bin
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo mv prometheus.yml /etc/prometheus
sudo chown prometheus:prometheus /etc/prometheus
sudo chown -R prometheus:prometheus /var/lib/prometheus
cd ..
rm -rf prom*
Configuring
The configuration guide can be found over here. It will step you through almost every option in the most confusing way possible. I am including a sample of my personal configuration in the hopes that a real world example makes more sense.
sudo nano /etc/prometheus/prometheus.yml
Here is what my simple configuration file looks like:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
           - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
        labels:
          groups: 'monitors'
  - job_name: 'servers'
    static_configs:
      - targets:
          - 'atlas.faultycloud.lan:9182'
          - 'coeus.faultycloud.lan:9182'
          - 'gaia.faultycloud.lan:9182'
          - 'hyperion.faultycloud.lan:9182'
        labels:
          groups: 'win2022'
  - job_name: 'gitlab'
    static_configs:
      - targets:
          - '192.168.1.253:9090'
        labels:
          groups: 'development'
Run at Startup
sudo nano /etc/systemd/system/prometheus.service
with
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ 
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
sudo systemctl status prometheus
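Besides systemctl status, Prometheus answers on a /-/healthy endpoint, which is handy for scripts. A minimal sketch, assuming the default port of 9090 and that curl is installed; the function name check_healthy is my own.

```shell
# Succeeds when the server at the given base URL answers its health endpoint.
# Defaults to a local Prometheus on the standard port.
check_healthy() {
    curl -fsS "${1:-http://localhost:9090}/-/healthy" >/dev/null
}
```

Something like check_healthy && echo "prometheus is up" works well at the end of a rebuild script.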
Systems and Application Notifications with Alert Manager
So you have Prometheus, the next step is AlertManager, which will notify you when something goes awry.
AlertManager?
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.
Prerequisites
A machine that can run Ubuntu 22.04 ( or other LTS ).
You should also have basic administrative knowledge and an account that has sudo access on the above box.
Installation
Update the system
sudo apt update && sudo apt -y upgrade
Create the AlertManager User Account
sudo groupadd --system alertmanager
sudo useradd -s /sbin/nologin --system -g alertmanager alertmanager
Create Directories
These are for configuration files and libraries.
sudo mkdir /etc/alertmanager
sudo mkdir /var/lib/alertmanager
Install AlertManager
Now for the fun part!
# https://prometheus.io/download/#alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz
tar zvxf alertmanager*.tar.gz
cd alertmanager*/
sudo mv alertmanager /usr/local/bin
sudo mv amtool /usr/local/bin
sudo chown alertmanager:alertmanager /usr/local/bin/alertmanager
sudo chown alertmanager:alertmanager /usr/local/bin/amtool
sudo chown alertmanager:alertmanager /var/lib/alertmanager
sudo mv alertmanager.yml /etc/alertmanager
cd ..
rm -rf alert*
Run at Startup
sudo nano /etc/systemd/system/alertmanager.service
with
[Unit]
Description=AlertManager
Wants=network-online.target
After=network-online.target
[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
    --config.file /etc/alertmanager/alertmanager.yml \
    --storage.path=/var/lib/alertmanager
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
sudo systemctl status alertmanager
Node Exporter
Runbooks
Runbooks are a way of documenting procedural Information Technology information that is repetitive in nature.
Most of the time, you see runbooks as a way of troubleshooting a problem, diagnosing an issue or documenting a procedure.
This is a collection of the runbooks that I have decided to document for my area.
Table of Contents
Alerts
Create User for PostgreSQL
Overview
Prerequisites
- Access to log in to the database server via SSH.
- sudo access on the database server
Steps
- Log into database server via SSH.
- Sudo into the postgres user and execute the psql command line tool.
- sudo -u postgres psql
 
- Create the user with an encrypted password:
- We recommend using a long and complex password since the values will only be saved here and can be input into the target system at the same time.
- The user name should reflect the database that they are primarily accessing.
- CREATE USER db01 WITH ENCRYPTED PASSWORD 'R3@llyL0ngP@ssw0rd123456790';
 
- Grant the new user the CREATEDB privilege:
- ALTER USER db01 CREATEDB;
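The steps above can also be scripted. Here is a minimal sketch, assuming head and base64 are available on the database server. The function names are made up, and the SQL is the same two statements from the steps.

```shell
# Generate a 32-character password from 24 random bytes.
# Base64 output never contains quotes, so it is safe inside '...' in SQL.
gen_password() {
    head -c 24 /dev/urandom | base64
}

# Print the two statements from the steps above for a given user.
#   $1 = user name, $2 = password
create_user_sql() {
    user="$1"; password="$2"
    printf "CREATE USER %s WITH ENCRYPTED PASSWORD '%s';\n" "$user" "$password"
    printf "ALTER USER %s CREATEDB;\n" "$user"
}
```

Save the password first ( for example pw="$(gen_password)" ), record it in the target system, and then pipe create_user_sql db01 "$pw" into sudo -u postgres psql.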
 
Troubleshooting
- If the target system cannot log into the database after the database/user creation process, you can simply re-run the above steps to make sure that they are correct.
- If the user is present, but the password has been forgotten, you may reset the password as follows:
- ALTER USER db01 WITH ENCRYPTED PASSWORD '3^On9D4p59^4';
 
Completion and Verification
You can log into the target database on callisto.lan to verify that everything is set up correctly.
psql -h 'callisto.lan' -U 'db01' -d 'db01_prod'
It will prompt you for the password, which we have available and should provide. If the login occurs, then the database/user/permissions should be okay.
Contacts
Not Applicable
Appendix
Changelog
* 2023/04/06 - Created