Category: Uncategorized

  • Long K8s ContainerCreating Times

    I’ve had an issue for quite a while with a handful of my pods on k3s – when they get recreated, they sit in “ContainerCreating” for upwards of 10-15 minutes after being properly assigned to a node. After that, the container starts normally and continues on its way.

    Well, I finally found what the issue was after digging into the logs via journalctl -u k3s, since this started happening on my new server all of a sudden too!

    Oct 02 13:02:10 mini01 k3s[1128132]: W1002 13:02:10.887356 1128132 volume_linux.go:49] Setting volume ownership for /var/lib/kubelet/pods/7ce5e4e3-8688-4b72-96f0-c1ff95508fa1/volumes/kubernetes.io~csi/pvc-0305f670-29a9-4be1-9b69-f9213369dde8/mount and fsGroup set. If the volume has a lot of files then setting volume ownership could be slow, see https://github.com/kubernetes/kubernetes/issues/69699

    TBH, these K8s 1.20 release notes would be a better reference than the GitHub issue, but I’m sure that log line predates 1.20.

    Turns out, by default, fsGroupPolicy is enabled in the NFS CSI Helm chart. The affected pods both have a ton of files in their NFS PVCs, and the kubelet was trying to check/change ownership on all of them, taking upwards of 20 minutes! Disabled that in the Helm chart, and TADA!
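
    For reference, the knob in question ends up on the CSIDriver object that the chart creates (the exact Helm value name varies by chart version, so check the chart’s values.yaml). A minimal sketch of the resulting object, assuming the standard nfs.csi.k8s.io driver name:

    apiVersion: storage.k8s.io/v1
    kind: CSIDriver
    metadata:
      name: nfs.csi.k8s.io
    spec:
      attachRequired: false
      # "None" tells the kubelet to skip the recursive ownership change on mount,
      # which is what was eating 10-20 minutes on volumes with lots of files.
      fsGroupPolicy: None
      volumeLifecycleModes:
        - Persistent

    Alternatively, Kubernetes 1.20+ lets you set fsGroupChangePolicy: OnRootMismatch in a pod’s securityContext, so the recursive chown only happens when the volume root doesn’t already match.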

  • Degoogling Location Timeline with Owntracks

    Over the past few years, I’ve been moving to more of a self-hosted model instead of using free services where I am the product (aka my data). After receiving my latest google timeline email, I realized this was a good next area to focus on. I had already set the auto-delete to any activity over 3 months, but I really do like seeing all that data. Hence my quest to find a replacement.

    Ultimately I think I’ve landed on Owntracks, primarily for two reasons: a) decent update cadence, and b) a mobile app that only reports when it needs to, saving battery. I do feel as if I need to take another look at Traccar, but the comparison is below – you can mix and match the clients and the backends to a certain extent:

    Client apps:
    • Overland
      Pros: Pretty basic; reports based on location change
      Cons: Doesn’t seem to be updated anymore
    • Owntracks
      Pros: Reports based on location change; friends/family function; map visualization; supports POST and MQTT
      Cons: ?
    • Traccar
      Pros: Basic, no frills
      Cons: Only does time-based reporting

    Backends:
    • Phonetrack (NextCloud)
      Pros: Already had Nextcloud; lots of filtering and sharing features
      Cons: Nextcloud is clunky; only supports POST
    • Owntracks
      Pros: Supports POST and MQTT; containerized backend and frontend
      Cons: Recorder is complex for advanced usage; documentation is rough
    • Traccar
      Pros: Supports lots of clients OOTB
      Cons: Only supports POST; PWA frontend not containerized

    There are a lot of complaints about how difficult it is to get Owntracks set up. I’m not going to lie, the documentation definitely leaves a bit for the reader to figure out. Once I’ve gotten my setup a bit more production-ready I’ll probably post the code. In the interim, here are some quick things I learned along the way:

    • If you want to use the HTTP POST method instead of MQTT on the Owntracks recorder, set OTR_PORT=0
    • For versions > 2.0 of Eclipse Mosquitto, you need to define a config file to allow access to the broker from anywhere other than localhost (see the sketch after this list)
    • On the Owntracks client, flipping between MQTT and HTTP changes all of the settings – including the locationDisplacement and locationInterval settings
    • When using MQTT, be sure to set the Tracker ID in the Identification section to something you want. Otherwise it defaults to something random
    • The Owntracks script to import your Google Timeline doesn’t work anymore. See this PR for a working script. It appears they changed the timestamp name and/or no longer include the unix epoch timestamp
    • Unless you want realtime tracking, you don’t need to expose your MQTT broker to the internet. The Owntracks client will buffer updates until it can sync.
    • The nginx-ingress controller allows exposing TCP ports, but doesn’t have a way to secure them with TLS (via ACME/Let’s Encrypt automation)
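
    As an example of the Mosquitto point above, here’s a minimal config sketch. Mosquitto 2.0+ binds to localhost only unless you explicitly define a listener; the paths below are placeholders for wherever your container mounts its config and data:

    # /mosquitto/config/mosquitto.conf (placeholder paths)
    listener 1883
    # Either allow anonymous access (fine for a quick LAN test only)...
    # allow_anonymous true
    # ...or point at a password file created with mosquitto_passwd
    allow_anonymous false
    password_file /mosquitto/config/passwd
    persistence true
    persistence_location /mosquitto/data/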

  • Kubernetes ‘exec’ DNS failure – Updated

    UPDATE: While the below definitely works, the correct way to do this is to properly add a DNS suffix. This should be set in your DHCP configuration if your nodes are getting their IP info from DHCP. If you’re using static IP addresses, run the following commands on each node, replacing <ifname> with the name of your network interface (e.g. eno1, eth0) and <domain.name> with the domain suffix you want appended.

    # This change is immediate, but not persistent
    sudo resolvectl domain <ifname> <domain.name>
    # This makes it permanent
    ## Turns out, this sets the global search domain, but still fails
    ## echo "Domains=<domain.name>" | sudo cat /etc/systemd/resolved.conf -
    ## Netplan is what is setting the interface info, so be sure to edit its configuration
    sudo sed -i 's|search: \[\]|search: \[ <domain.name> \]|' /etc/netplan/<netplan file>
    

    From https://askubuntu.com/a/1211705
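
    For reference, after the sed (or editing by hand), the relevant netplan stanza ends up looking something like this – the interface name, addresses, and domain below are placeholders:

    network:
      version: 2
      ethernets:
        eno1:
          dhcp4: false
          addresses: [ 192.168.1.11/24 ]
          nameservers:
            addresses: [ 192.168.1.1 ]
            # the search domain that lets short hostnames resolve
            search: [ home.example.com ]

    Then apply it with sudo netplan apply (or reboot).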


    I have finally migrated all of my containers from my docker-ce server to kubernetes (microk8s server). The point was so that I could wipe the docker-ce server and make a microk8s cluster – which has been done and was super easy!

    However, after getting the cluster set up, I wasn’t able to exec into certain pods from a remote machine with kubectl. The error I was getting is below:

    Error from server: error dialing backend: dial tcp: lookup <node-name>: Temporary failure in name resolution
    

    As I originally had only a single node, my kubectl config referenced the original node’s IP address directly. Additionally, I noticed that this error happened when the pod was located on a node other than the API server I was accessing. By changing my kubeconfig API server to the node that hosted the pod, it then worked.

    After a lot of playing with kube-dns and coredns, it really came down to something easy/obvious. When I was on one node, I couldn’t resolve the shortname of the other node, and therefore node1 couldn’t proxy to node2 to run the exec.

    While there are multiple ways I could have fixed this (and I did get the right DNS suffixes added to DHCP too), I ended up editing the /etc/hosts on each node and ensuring there was an entry for the other node. Tada, exec works across nodes now.
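
    In case it helps, the /etc/hosts entries are as simple as it sounds – the hostnames and IPs below are made up:

    # /etc/hosts on node1
    192.168.1.12  node2
    # /etc/hosts on node2
    192.168.1.11  node1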

  • Using Kubernetes Ingress for non-K8 Backends

    TL;DR – Make sure you name your ports when you create external endpoints.

    In my home environment, I need a reverse proxy that serves all port 80 and 443 requests and can interface easily with Let’s Encrypt to ensure all those endpoints are secure. Originally I was using Docker and jwilder’s nginx-proxy for this. As it’s just nginx underneath, you can use it to proxy to backends that aren’t in Docker pretty easily (like the few physical things I run). However, I’ve been transitioning over to Kubernetes and need a similar way to have a single endpoint on those ports that all services can use.

    Well, the good news is that the internet is awash with articles about this. However, after attempting to implement any of them, I was consistently getting 502 errors – no live upstreams. This was happening on an Ubuntu 20.04 LTS system running microk8s v1.19.5.

    My original endpoint, service, and ingress configs were the following:

    apiVersion: v1
    kind: Endpoints
    metadata:
      name: external-service
    subsets:
      - addresses:
          - ip: <<IP>>
        ports:
          - port: <<PORT>>
            protocol: TCP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: external-service
    spec:
      ports:
        - name: https
          protocol: TCP
          port: <<PORT>>
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: external-ingress
      annotations:
        kubernetes.io/ingress.class: "nginx"    
        cert-manager.io/cluster-issuer: letsencrypt-prod
        cert-manager.io/acme-challenge-type: http01
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    spec:
      tls:
      - hosts:
        - external.rebelpeon.com
        secretName: external-prod
      rules:                           
      - host: external.rebelpeon.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: external-service
                port: 
                  number: <<PORT>>
    

    This YAML deployed successfully but, as mentioned, did not work. With it deployed, describing the Endpoints object showed:

    $ kubectl describe endpoints -n test
    Name:         external-service
    Namespace:    test
    Labels:       <none>
    Annotations:  <none>
    Subsets:
      Addresses:          <<IP>>
      NotReadyAddresses:  <none>
      Ports:
        Name     Port  Protocol
        ----     ----  --------
        <unset>  443   TCP
    
    Events:  <none>
    

    When describing the service:

    $ kubectl describe services -n test
    Name:              external-service
    Namespace:         test
    Labels:            <none>
    Annotations:       <none>
    Selector:          <none>
    Type:              ClusterIP
    IP Families:       <none>
    IP:                10.152.183.182
    IPs:               <none>
    Port:              https  443/TCP
    TargetPort:        443/TCP
    Endpoints:
    Session Affinity:  None
    Events:            <none>
    

    Wait a minute – the service lists its Endpoints as blank, not undefined and not properly populated like other services. When I describe the Endpoints of a working, K8s-managed service, I see that the port has a name, and that’s the only difference.

    $ kubectl describe endpoints -n test
    Name:         external-service
    Namespace:    test
    Labels:       <none>
    Annotations:  <none>
    Subsets:
      Addresses:          <<IP>>
      NotReadyAddresses:  <none>
      Ports:
        Name   Port  Protocol
        ----   ----  --------
        https  443   TCP
    

    So, I changed my config to the following (one line change):

    apiVersion: v1
    kind: Endpoints
    metadata:
      name: external-service
    subsets:
      - addresses:
          - ip: <<IP>>
        ports:
          - port: <<PORT>>
            protocol: TCP
            name: https
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: external-service
    spec:
      ports:
        - name: https
          protocol: TCP
          port: <<PORT>>
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: external-ingress
      annotations:
        kubernetes.io/ingress.class: "nginx"    
        cert-manager.io/cluster-issuer: letsencrypt-prod
        cert-manager.io/acme-challenge-type: http01
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    spec:
      tls:
      - hosts:
        - external.rebelpeon.com
        secretName: external-prod
      rules:                           
      - host: external.rebelpeon.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: external-service
                port: 
                  number: <<PORT>>
    

    And, tada, everything works! I can now access physical hosts outside of K8s via the K8s ingress! Sadly, that took about 4 hours of head-bashing to realize…

  • WireGuard

    I’ve been using OpenVPN for a few things and I’ve been very interested in setting up WireGuard instead as it has a lot less overhead and is less cumbersome than OpenVPN. Well I finally took the plunge last night and it was surprisingly easy after only a few missteps!

    One of my use cases is to tunnel all traffic to the VPN server, so it appears as if my internet traffic originates from the VPN server. Here is how I set it up (with thanks to a few other articles).

    On the Server (Ubuntu 18.04 LTS)

    Install WireGuard on the server. I am running Ubuntu 18.04 and so I had to add the repository.
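
    At the time, the WireGuard PPA was the documented route for 18.04 (newer Ubuntu releases ship WireGuard in the main repos); roughly:

    sudo add-apt-repository ppa:wireguard/wireguard
    sudo apt update
    sudo apt install wireguard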

    Move to the /etc/wireguard directory (you may need to sudo su)

    Generate the public and private keys by running the following commands. This will create two files (privatekey and publickey) in /etc/wireguard so you can reference them while building out the config.

    $ umask 077  # This makes sure credentials don't leak in a race condition.
    $ wg genkey | tee privatekey | wg pubkey > publickey

    Create the server config file (/etc/wireguard/wg0.conf). Things to note:

    1. The IP space is meant to come from the shared address space reserved by RFC6598 (100.64.0.0/10) – note that 100.62.0.0/24 as written actually falls just outside that block, so you may prefer something like 100.64.0.0/24
    2. I only care about IPv4. It is possible to add IPv6 address and routing capabilities into the configuration
    3. For routing, my server’s local interface name is eth0.
    4. You can choose any port number for ListenPort, but note that it is UDP.
    5. Add as many peer sections as you have clients.
    6. Use the key in the privatekey file in place of <Server Private Key>. Wireguard doesn’t support file references at this time.
    7. We haven’t generated the Client public keys yet, so those will be blank.
    [Interface]
    Address = 100.62.0.1/24
    PostUp = iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
    PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
    ListenPort = 51820
    PrivateKey = <Server Private Key>
    
    [Peer]
    PublicKey = <Client1 Public Key>
    AllowedIPs = 100.62.0.2/32
    
    [Peer]
    PublicKey = <Client2 Public Key>
    AllowedIPs = 100.62.0.3/32

    Test the configuration with wg-quick

    root@wg ~# wg-quick up wg0
    [#] ip link add wg0 type wireguard
    [#] wg setconf wg0 /dev/fd/63
    [#] ip address add 100.62.0.1/24 dev wg0
    [#] ip link set mtu 1420 up dev wg0

    Remove the interface with wg-quick

    root@wg ~# wg-quick down wg0
    [#] ip link delete dev wg0

    Use a systemd service to start the interface automatically at boot

    systemctl start wg-quick@wg0
    systemctl enable wg-quick@wg0

    To forward the client’s traffic through the server, we need to enable IP forwarding on the server

    echo "net.ipv4.ip_forward = 1" > /etc/sysctl.d/wg.conf
    sysctl --system

    On the Client (Android)

    1. Install the WireGuard App from the Play store
    2. Open the app and create a new profile (click the +)
    3. Create from scratch (you could also import a pre-created config file – see the sketch after this list)
      1. Give the interface a name
      2. Generate a private key
      3. Set the address to the address listed in the peer section of your server config – 100.62.0.2/32
      4. (Optionally) Set DNS servers, as your local DHCP-provided DNS will no longer work once all packets are encrypted and sent across the VPN
      5. Click Add Peer
        1. Enter the Server’s public key
        2. Set Allowed IPs to 0.0.0.0/0 to send all traffic across the VPN
        3. Set the endpoint to the IP address you’ll access the server on, along with the port (i.e. <InternetIP/Name>:51820)
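
    If you’d rather import a config file than click through the app, the equivalent client-side config looks something like this – keys, DNS server, and endpoint are placeholders:

    [Interface]
    Address = 100.62.0.2/32
    PrivateKey = <Client1 Private Key>
    DNS = 1.1.1.1

    [Peer]
    PublicKey = <Server Public Key>
    # 0.0.0.0/0 sends all traffic across the VPN
    AllowedIPs = 0.0.0.0/0
    Endpoint = <InternetIP/Name>:51820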

    Revisit the Server Config

    Now that the client has a public key, you need to update /etc/wireguard/wg0.conf

    [Peer]
    PublicKey = <INSERT PUBLIC KEY>
    AllowedIPs = 100.62.0.2/32 

    Restart the wireguard service

    systemctl restart wg-quick@wg0 

    Connect to the Server from the Client

    Within the wireguard app, enable the VPN.

    You can validate by visiting ipleak.net to verify that traffic is going through the VPN.

  • Edge Beta to Stable

    As you may know, the new Chromium-based Edge went stable last week. Unfortunately, there is no automated way to move your settings from the Beta channel to Stable, so those of us who were using the beta need to set everything up again in Stable.

    However, as it is based on Chromium, all the information is stored in a profile (or multiple profiles). That means you can move all your profile data from the Beta folder to the Stable folder (the paths and a copy-command sketch are below). The only issue I ran into: if you run multiple profiles with custom images, the taskbar profile icon will retain the “BETA” tag, as those icons are generated during profile creation and stored in the profile location. Unfortunately, deleting the icon in the profile folder does not seem to reset it.

    Stable Microsoft Edge
    %LocalAppData%\Microsoft\Edge\User Data
    
    Microsoft Edge Beta
    %LocalAppData%\Microsoft\Edge Beta\User Data
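
    A quick sketch of the copy itself (close both Edge channels first, and keep a backup of the Stable folder just in case):

    robocopy "%LocalAppData%\Microsoft\Edge Beta\User Data" "%LocalAppData%\Microsoft\Edge\User Data" /E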

    UPDATE – If you have Edge profiles tied to a Microsoft account where your image comes from O365 or another account, I found a way to regenerate the taskbar icons after doing the above steps.

    Just go to edge://settings/profiles, sign out of the account, and then sign back in; it will recreate the profile icons. Make sure you do not check the box to clear all your settings though! For profiles not linked to a Microsoft account, just change the profile image.

    Tada!

  • Backup Decision

    Tl;dr, I’m using Duplicacy with the new Web UI. This is hosted in a docker image, and currently pushes data to an Azure storage account.

    Also, wow, I just had a slight heart attack while writing this, as I removed Docker from my NAS, which blew away a whole share of my Docker data (14 different containers, including all my NextCloud personal files!). They were all backed up with Duplicacy, and while I had tested restores before with a few files, you never know. It wasn’t as painless as I’d like – partially my fault for mounting drives to the container read-only, partially because the GUI isn’t super great yet, and really because the Azure connections kept getting reset and the underlying CLI doesn’t account for that – but it’s all back and humming along again. Phew!

    Options Considered

    I’ve only included the main contenders below. In particular, I was interested in using non-proprietary storage backends that allowed me multiple options (B2, AWS, Azure, etc). The ones that were quickly removed and not tested:

    Now for the ones that were tested.

    CrashPlan

    CrashPlan has served me well for a large number of years. I have used it from two different continents successfully. There are definitely some good things about it: continuous backup, dedupe at the block level, compression, and you can provide your own encryption key. However, with the changes a while ago (and the continual changes I get emailed about), I knew it was time to look for other options. Plus, even with 1 device, the price was going to jump from $50/year to $120 – while not horrible, definitely a motivator.

    Synology’s Hyper Backup

    I store most of my data on my Synology NAS, and it comes with some built-in tools (Glacier Backup, Hyper Backup, and Cloud Sync). I actually was running CrashPlan in a docker image on the NAS prior to doing this assessment. Of the 3 tools, Hyper Backup was really the only one I considered, as Glacier Backup is for snapshots and Cloud Sync isn’t really a backup product. Hyper Backup can back up to multiple different storage providers, including Azure, which was my preference. Like CrashPlan it can do dedupe at the block level and compression, and allows you to specify your own encryption. Unlike CrashPlan it isn’t continuous (hourly at best), though it will send failure emails; it also won’t automatically include new folders in a root if only some of the subfolders are selected. The software is free; you only pay for the storage you use.

    Duplicati

    With Duplicati I ran it from a docker image on my NUC. This meant I had access to some files that Hyper Backup could not access, which was good. Plus, you can back up to multiple different storage providers, including Azure. Like CrashPlan it can do dedupe at the block level and compression, and allows you to specify your own encryption. Unlike CrashPlan it isn’t continuous (hourly at best), and I was getting lots of errors when adding new folders. Plus the database is notorious for becoming corrupt, which is not something you want from your backups. The software is free; you only pay for the storage you use.

    CloudBerry Linux

    With CloudBerry I ran it from a docker image on my NUC. This meant I had access to some files that Hyper Backup could not access, which was good. Plus, you can back up to multiple different storage providers, including Azure. Like CrashPlan it can do dedupe at the block level and compression, and allows you to specify your own encryption. Unlike CrashPlan it isn’t continuous (hourly at best), though I could receive notification emails. One of the really neat features is that CloudBerry understands Azure storage tiers (hot, cool, and archive) and can manage the lifecycle with regard to those. However, while the files are encrypted in the blob storage (you can’t open them), they retain their folder structure and names. Additionally, the GUI isn’t great and I was getting a few errors. The software is not free ($30), and you pay for the storage you use.

    Restic

    I tried to use Restic, but was never able to get it to work. I tried to run it in docker, but the CLI and I just never got along (there’s no GUI). It can use different storage providers including Azure, and it can dedupe and encrypt. However, it can’t compress, which means backups will be larger. The software is free; you only pay for the storage you use.

    Duplicacy

    With Duplicacy I ran it from a docker image on my NUC. The web UI was still in beta when I was testing it, but it fundamentally met my needs, plus it has a functional CLI (the UI basically just drives the CLI anyway). This meant I had access to some files that Hyper Backup could not access, which was good. Plus, you can back up to multiple different storage providers, including Azure. Like CrashPlan it can do dedupe at the block level and compression, and allows you to specify your own encryption. Unlike CrashPlan it isn’t continuous (it can run as often as every 15 minutes), but I could receive notification emails. It’s also blazingly fast and can dedupe across machines if I were backing up more than one. The software is not free ($10), and you pay for the storage you use.

    Choosing

    For each of the ones listed above (except for Restic, simply because I couldn’t get it to go), I set up test storage accounts on my Azure account and began backing up the same 50GB with each product. The key things I was looking for were: ease of use and setup, time to back up on an hourly basis, storage and transactions consumed (to get an idea of ongoing costs), and any issues I ran into.

    Duplicati was the first to go, simply because of the errors I was getting while it backed up the files. It was fast, though, at 1:02 for the incremental hourly scan and upload.

    CloudBerry Linux was the next to go. This was due to it being more expensive to run (storage costs), a few errors, it being second to last in speed at 1:23, and the folder/file-name issue noted above.

    Hyper Backup stuck it out the longest. Out of the box, it was definitely one of the easiest to set up. However, it was also the slowest to scan and back up (probably because it runs on the NAS and not on my NUC) at 1:32, and it was uploading more data than Duplicacy. In addition, to keep multiple copies, Hyper Backup would have to run 2 separate jobs that do the exact same thing.

    Duplicacy is what I am now using. It is incredibly fast (0:16 in the test, and only 2-5 minutes every hour to scan and upload my actual ~900GB of backups), and it had the best cost profile on Azure. Additionally, I can easily clone to another online provider without having to rerun the drive scan; it just copies the new backup chunks. I have also set up a versioning schedule that runs weekly to prune the hourly snapshots. This is based on the same pruning schedule CrashPlan was using, and I’m seeing negligible storage increases month over month. The biggest risk is that it is a newer piece of software that may have some bugs/issues. As mentioned in the tl;dr, my restore took way longer than it should have due to improper retries and timeouts with Azure (all the data is there though, and I can access it anywhere I install the Duplicacy CLI), but otherwise I’ve been very happy and have actually cancelled my CrashPlan account.
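
    For the curious, the weekly prune is just the Duplicacy CLI’s -keep flags run on a schedule; a sketch of the general shape (not necessarily my exact retention numbers):

    # keep 1 snapshot per day for snapshots older than 7 days,
    # 1 per week older than 30 days, 1 per month older than 180 days
    duplicacy prune -keep 30:180 -keep 7:30 -keep 1:7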

    Note: Technically, using Azure is more expensive than if I had stuck with CrashPlan. The monthly storage costs for my backup storage account are $15-20. However, with credits, it works out to $0 for me. Plus, I’m now in more control of my backups than I was before, and I can choose whichever storage provider minimizes costs.

  • Powershell on Ubuntu on Windows

    Because you can (but you probably don’t want to). Party!

    [Screenshot: PowerShell running inside Ubuntu Bash on Windows]

  • Nginx + WordPress + Infinite Redirects

    As I was migrating my websites to a new host (I may blog about that later as it’s been an interesting ride), I had this lovely issue where one of my websites would go into an infinite redirect loop when sitting behind the Azure CDN (custom origin).

    Of course, it worked fine for all pages except the root. And it also worked fine when it wasn’t behind the Azure CDN. For whatever reason, adding a bit of code to the theme’s functions.php seemed to work.

    remove_filter('template_redirect', 'redirect_canonical');

    I then had to add a manual redirect in nginx, shown below. Still no idea why it doesn’t just “work” as it did before, but whatever. Now that it’s working, I should go back and figure out what was going on with redirect_canonical…

    server {
       listen 80;
       server_name test.com;
       rewrite ^ $scheme://www.test.com$request_uri? permanent;
    }
    
    
  • SQL AlwaysOn Avail Group Failover and Client Disconnects

    We had an issue recently where an application was not properly being disconnected from SQL during a failover of an AlwaysOn Availability Group (AOAG). Some background: the application was accessing the primary node, and after the failover the application continued to access the same node. Unfortunately, as that node was now read-only, the app was not very happy.

    Turns out it was due to the readable-secondary configuration of the secondary replica. We had it set to “Yes”, which allows existing connections to continue accessing the secondary on the assumption that the application is smart enough to know it can only read. It appears that with this setting, connections aren’t forcefully closed, causing all sorts of issues.

    Setting it to either “No” or “Read-Intent Only” properly severed the connections for us.  Yay!
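
    For reference, that dropdown maps to the ALLOW_CONNECTIONS option on the replica’s secondary role; a T-SQL sketch with placeholder availability group and replica names:

    -- "Read-Intent Only" = READ_ONLY, "Yes" = ALL, "No" = NO
    ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON 'SQLNODE2'
    WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));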

    For more info.