Replacing my ARM server, setting up a new NAS and migration

The server I run right now to host all of my services, including this blog, is around 6 years old, built on an experimental ARM development board. I bought it because, at the time, 10G networking was expensive and the board itself was basically a proper 10G NIC. Well, non-standard hardware comes with non-standard hardware issues.

See the write-up for setting up that machine.

I have outgrown its capacity. It started out as my NAS and data analysis machine. Later on it became the center of my development. Then I started running Nextcloud and Jellyfin, and the ARM Cortex-A72s (even with 16 of them!) began running into issues. They can't get anywhere close to transcoding 1080p H.264 in real time. And loading my photos page on Nextcloud takes 40 seconds thanks to the weak single-thread performance. Plus, between my photo backup, anime archive, YouTube archive and, well.. research data, it ran out of storage space. That's on top of the HDDs showing signs of degradation, and the hardware mods I had to do to make it quiet and cool - which are hacky and had failed once in the past.

It has to be replaced. The main board and the SSD are still good - I can do something with them. But I need a new box and more storage.

So I did.

The spec is way overkill for just a NAS - because it is not just a NAS. The 2x SSDs are used separately: one as the system drive, the other for ZFS SLOG and L2ARC - yes, 1TB is a lot of L2ARC. I got the SSDs cheap, and the machine is designed as a gaming/video editing/large file serving machine (a sketch of the cache setup follows the parts list):

Part   Item (x number)
CPU    Intel Core Ultra 245K
RAM    64GB
SSD    Kingston KC3000 1TB x2
HDD    Toshiba N300 12TB x4
Case   Cooler Master N400
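
Wiring the SLOG and L2ARC onto the second SSD boils down to two zpool commands. A rough sketch, not my exact setup - the pool name "storage" and the partition split are assumptions, and the device paths are illustrative:

# Small partition for the SLOG, the rest as L2ARC
zpool add storage log /dev/disk/by-id/nvme-KINGSTON_KC3000-part1
zpool add storage cache /dev/disk/by-id/nvme-KINGSTON_KC3000-part2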

What can I say - there is no eventful setup process. It's plain old x64. I installed Arch Linux and went with Btrfs instead of ext4 for its snapshot capability, and to avoid dealing with ZFS modules on the boot drive.
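
The snapshot part really is a one-liner once subvolumes are set up. For example, a read-only snapshot of the root subvolume (assuming a /.snapshots directory exists on the same filesystem):

# Read-only snapshot of / named after today's date
btrfs subvolume snapshot -r / /.snapshots/root-$(date +%F)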

The reason for going Intel instead of AMD, even given AMD's reputation and performance right now, is severalfold:

  • More bandwidth to chipset (PCIe Gen4 x16 instead of Gen4 x8 like AMD)
  • Lower idle power
  • Better hardware encoding quality (for Jellyfin)

Here's a photo of the machine's internals (without the HDDs and with a Tenstorrent p100a, which I used temporarily for development).

Image: The internals of the machine (-HDDs, +p100a)

Oh right, I haven't introduced you to the machines. The old system is called Nina, after Nina, the loli android in Plastic Memories - because that machine was going to help me in my daily life. Plastic Memories is a good series; watch it, now! It is the reason I got myself into AI.

Image: Nina, the character in Plastic Memories

And the new machine is called Snowdrop, from Beatless. I wanted to call her Lacia (the main character) but I already used that name for my main research rig. Well, in the show, Snowdrop has the ability to mass-control other androids. Good enough for a rig at the center of my workflow and other machines.

Image: Snowdrop, the android of mass control in Beatless

The ZFS article on the Arch Wiki did not say which version of the ZFS kernel module should be used. I tried the prebuilt binary kernel modules, so pacman could update both the kernel and the ZFS module at the same time without running into issues. Turns out that's a bad idea: mirrors often carry a different kernel version than the one the binary module was built against.

After searching.. people on Reddit agreed that you should use the DKMS module, as it gets rebuilt each time the kernel updates:

yay -S zfs-dkms
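
One thing to note: DKMS builds against the headers of the installed kernel, so those need to be in place too (linux-headers for the stock kernel, linux-lts-headers if you run LTS):

yay -S linux-headers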

With both Nina and Snowdrop running, the next problem was transferring 4TB of data from Nina to Snowdrop. Easier said than done. The A72 cores that run Nina are very slow. Even with just a 1G link (I don't want to mess with NXP's restool to set up 10G, and the 5400RPM HDDs in Nina can't go much faster anyway), the A72 can't saturate 1G with rsync while transferring large files. My friends recommended rclone with its SFTP backend. It is much better: it supports multithreading and can easily saturate the 1G link. However - it failed to sync the Nextcloud folder and only detected 600G out of several TBs of data.
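
For reference, the rclone attempt looked roughly like this (assuming a remote named nina already configured with the SFTP backend via rclone config):

rclone sync nina:/media/storage /media/storage --transfers 8 --progress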

I decided not to risk that. Back to rsync. It managed 30MB/s for small files and 85MB/s for large files, with the SSH process sitting at 65% CPU on the A72. Threw that into tmux and waited. It took a whole 12 hours to copy everything over. I can already imagine that copying Snowdrop's 48TB of total storage someday will be a nightmare.

To make things easier for me - my data/workspace and the Nextcloud folder run under different users - I had to install Nextcloud beforehand, make sure both my user and the nextcloud user have the same UIDs as on Nina, and temporarily enable root login so a single rsync pass could copy both folders. Archive mode preserves the owner and group of files. There's probably a better and safer way, but I am lazy. The following command, issued from Snowdrop, copies the data:

sudo rsync -aP --info=progress2 root@nina.localdomain:/media/storage/ /media/storage/

With the data copied, the problem now is migrating all the services. It happens that this is the first time I've migrated web services. I know most of them are an RDBMS plus data/config folders, so copying the folders over and migrating the DB should work (sketched below). And indeed, that part went by quite uneventfully. I also decided to dump Nginx for Caddy. Nginx has served me well, but its configuration is complicated, and I heard Caddy is a lot easier: automatic TLS certificates (though certbot is easy), simple configuration, good error messages, etc..
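The DB part is just a dump and restore between the two machines. A sketch of what that looks like, assuming MariaDB and a database named nextcloud (swap in pg_dump/psql for PostgreSQL):

# On Nina: dump the database in a consistent state
mysqldump --single-transaction nextcloud > nextcloud.sql
# On Snowdrop: recreate the database and load the dump
mysql -e "CREATE DATABASE nextcloud"
mysql nextcloud < nextcloud.sql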

That's what I heard about Caddy, anyway. Actually experiencing how easy it is is something else. The following is all that's needed for my blog's homepage:

clehaxze.tw {
    reverse_proxy 127.0.0.1:4001
}

That's it. Really. Compare that to Nginx (well, at least a portion of it.. my full configuration also handles Tor and other stuff..). Caddy also gives you HTTP/3 by default - and getting that to work on Nginx is an entire journey.

http {
    server {
        listen 443 ssl;
        listen [::]:443 ssl;
        server_name clehaxze.tw;

        location / {
            add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; ";
            add_header X-Content-Type-Options "nosniff";
            proxy_pass http://localhost:4001;
            proxy_http_version 1.1;
            proxy_pass_header Server;
        }
        ssl_certificate /etc/letsencrypt/live/clehaxze.tw/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/clehaxze.tw/privkey.pem; # managed by Certbot
        http2 on;
    }
}

Anyway, most services moved over easily (I won't get into them for privacy reasons). Nextcloud was fun. You really need to follow the Arch Wiki to the letter for Nextcloud to work. Somehow Nextcloud failed to access its configuration folder. I did all the usual UNIX permission debugging and even learned enough PHP to see what Nextcloud was seeing.. It took me an entire day of bouncing ideas around with my friends and LLMs to figure out that I had missed a single systemd setting for php-fpm.. sigh..
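
The likely culprit on Arch: the php-fpm unit ships with filesystem sandboxing enabled, so Nextcloud's directories have to be whitelisted in a drop-in. Something along these lines - the first two paths assume the default Arch locations, and the data path is a guess based on my storage layout:

# /etc/systemd/system/php-fpm.service.d/nextcloud.conf
[Service]
ReadWritePaths=/usr/share/webapps/nextcloud/apps
ReadWritePaths=/etc/webapps/nextcloud/config
ReadWritePaths=/media/storage/nextcloud/data

Followed by a systemctl daemon-reload and restarting php-fpm.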

Besides that, since both servers store the data on the same path, nothing had to be changed after the DB and configuration were migrated. It took me a while reading my old Nginx config to make a Caddy equivalent. It is MUCH cleaner than Nginx:

cloud.mydomain.com {
    root * /usr/share/webapps/nextcloud
    php_fastcgi unix//run/nextcloud/nextcloud.sock
    file_server

    header {
        Strict-Transport-Security "max-age=15552000;"
        X-Content-Type-Options "nosniff"
        X-XSS-Protection "1; mode=block"
        X-Robots-Tag "noindex,nofollow"
        X-Frame-Options "SAMEORIGIN"
        -Server
    }

    # Redirects for Nextcloud well-known URLs
    redir /.well-known/carddav /remote.php/dav/ 301
    redir /.well-known/caldav /remote.php/dav/ 301
    redir /.well-known/webfinger /index.php/.well-known/webfinger 301
    redir /.well-known/nodeinfo /index.php/.well-known/nodeinfo 301

    # Ban non web paths
    @forbidden {
        path /.htaccess
        path /data/*
        path /config/*
        path /db_structure
        path /.xml
        path /README
        path /3rdparty/*
        path /lib/*
        path /templates/*
        path /occ
        path /console.php
    }

    respond @forbidden 404
}

Feel free to take it if you are also running Nextcloud on Arch with Caddy and php-fpm.

Snowdrop with its Intel 245K is so much faster than Nina's LX2160. With hardware video encoding, Jellyfin is smooth. No more lag, no more buffering, no more waiting half a minute before a video starts. The load time of my photos folder on Nextcloud dropped 4x.
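
If you're doing the same on Arch, getting Intel hardware encode working for Jellyfin roughly amounts to installing the VA-API media driver and giving the jellyfin user access to the GPU render node - a sketch, not my exact steps:

# VA-API driver for modern Intel iGPUs, plus vainfo for verification
sudo pacman -S intel-media-driver libva-utils
# Let the jellyfin service user at the GPU render node
sudo usermod -aG render jellyfin
# Check that the encode entry points show up
vainfo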

I don't know what I want to get out of this post. But here's a photo of all the machines I run. The black box on the left is Snowdrop, the new system. On the right is Nina, the old system. And in the back is a Tenstorrent QuietBox I use for related development, named Isla (the main character from Plastic Memories, again).

Image: View of my homelab :)

I guess it's just incoherent rambling. Snowdrop, welcome to the family. And Nina, thank you for the past few years. You will still have an important role.
