These are my self-amusing posts, written in Gemini's text/gemini format and rendered to HTML. You can find the same content on my Gemini capsule gemini://gemini.clehaxze.tw [what is gemini://?]. In fact, I recommend viewing through a Gemini browser (like this one) for the best experience. An Atom feed is also available.

Latest Posts

2024-12-28: Building new GGML backends for novel accelerators, how, challenge and opportunities (FOSDEM 2025 draft)

GGML has emerged as one of the leading frameworks for efficient inference, particularly for large language models. GGML is engineered with performance in mind and has great quantization support. More importantly, its flexibility and growing community support make it an ideal platform for exploring...

2024-12-15: Announcing TurboQOA - Streaming QOA Codec

This is something I have wanted to write but never got around to. QOA (Quite OK Audio) is a very simple, lossy audio codec by the same person who brought you QOI, the Quite OK Image format. There was no real reason to use QOA, until I found one and found that no one had implemented it yet. So I did. The use ca...

2024-12-07: Setting up Perplexica on Arch Linux locally

Perplexica is an open source replica of the (quite famous) Perplexity AI-based search engine, which I use from time to time when I have had enough of parsing articles. The AI just does it faster (though it can be wrong at times). I find it to be a great tool to quickly establish some understanding of a topic. ...

2024-12-05: Mikrotik is Awesome

Little-known fact: I was a network admin back in my college days. I was a part of the Campus Dorm-net Promotion Association (CDPA) at National Sun Yat-sen University, Taiwan. The group is defunct now. But I had my fair share of messing with Juniper, Zyxel, Aruba switches and an Arista router. We a...

2024-11-13: Full LLMs running on Tenstorrent + GGML

See my last post for context. The wins have been coming fast since my last post, less than a month ago. This is such a fun project. My GGML backend was able to run a single layer of TinyLLaMA on a Tenstorrent Wormhole n300. More layers caused the model to start spilling out gibberish. With lots of ...

2024-10-28: Getting the 1st token generated on Tenstorrent with GGML

I've been working on integrating Tenstorrent cards into GGML for a while, since early this year, on and off. I was at MOPCON 2024 in Kaohsiung over the weekend. I was there more to meet people than to attend the actual talks. Not knowing what else to do late at night, after a meetup with the Op...

2024-10-08: Attack analysis of my mom's Facebook account

Recently, my mom's Facebook account was hacked. I want to document the process of discovering the attack, how it was done and what we do to prevent it. I think the last part is particularly important, as measures accepted by my mom should be accessible to _everyone_. The case is simple. But a goo...

2024-10-05: BOOM! I blew up my home directory

I was copying some files from my phone to my computer. I made a temporary mount point in my home directory. When I was done, I was going to delete the directory, but I accidentally moved a layer above it and a single tab made me run `rm -rf marty`. I noticed my mistake 2 seconds later, but it w...

2024-09-08: (Reupload) eBike for intra-city traveling in Taiwan

I live in Taiwan, and UBike, the bike sharing system in Taiwan, has been a godsend. You just need to register once, and most cities in Taiwan have UBike stations. Not just in popular places, but literally everywhere, like every 300m. It is the major reason I don't need a car or a scooter for my daily co...

2024-07-30: Understanding the Gemini request and response (COSCUP 2024 draft)

The Gemini protocol is a simple document transport protocol published by Solderpunk in 2021 and has gained some traction in the tech community. Users of the protocol, among others, form what's referred to as the "small web" or the "small internet". The protocol itself is described across 3 ...

2024-07-11: Fixing electron apps and Steam not working on Wayland (Arch Linux, RDNA3, AMDGPU, broken Vulkan)

I have been bothered by a problem on my laptop, a Framework 13 AMD - Steam and every Electron application (VSCode, Element, Electron itself, etc.) won't run on Wayland. They run but hang without creating a window. I can run Steam with X11, but it fails to run any game. No matter which version of Proton...

2024-07-07: A gentle guide on getting your Tenstorrent card running on Arch Linux (with the Metalium stack)

Recently I got a message from Tenstorrent's community manager asking for help improving the installation documents, to make them easier for everyone. While that is still in progress, I wanted to document how I got my Tenstorrent card running on Arch Linux (since Tenstorrent officially only supports ...

2024-06-18: Slides for April talk at NSYSU for AI inference

I was invited to give a talk at National Sun Yat-sen University (NSYSU) in Kaohsiung, Taiwan on April 26, 2024. The talk was about AI inference, and the invitation came from the professor of my master's degree. I totally forgot to upload the slides, so here they are. The slides are made with reveal.js so you'll need a web s...

2024-06-09: Compare nlohmann/json to Glaze

I've been wanting to replace nlohmann/json with something else in my codebase for a while now. Recently Glaze appeared on my radar and I decided to give it a try. Here are my thoughts after writing a PoC program comparing the two libraries. Glaze beats nlohmann/json. Just look at the numbers on glaze...

2024-06-04: My stance on (GitHub) Copilot

I saw Brodie Robertson's video about NetBSD banning AI generated code on YouTube and I wanted to share my thoughts on the topic. I think current AI generated code is fine. Here's why. For simple cases, like printing "Hello, World!" 10 times or centering a div, the code has shown up many, many tim...

2024-06-03: Pledging OpenGL applications on OpenBSD

I spent an hour or two trying to make the Lagrange browser more secure on OpenBSD by using the pledge system call. For those who don't know, pledge is a system call that allows a process to restrict the system calls it can make. This vastly reduces the attack surface of the process, and makes it mu...

2024-06-02: Thoughts and logs after messing with Tenstorrent Grayskull

I got my Tenstorrent card last week or so, and I set it up and gave it a test drive. My end goal is to develop its software stack and applications so that it can be used as a replacement for Nvidia GPUs, cheaply and at lower power consumption. But for now, it's time to get my hands dirty and see...

2024-05-27: The Librem 5 is.. not that bad?

Recently Purism announced they are dropping the price of their Librem 5 from USD $999 to $699, slightly higher than the original price the Librem 5 launched at. I preordered one 5 years ago when it was new and never received mine. So I wrote an email asking for my device. And received the device in abou...

2024-05-15: Asynchronously read STDIN in Drogon/Trantor

I haven't blogged in quite a while. I've been busy with other stuff, but it's just not bloggable. At least for now. I hope one day I can publish it. Anyway, I was working on some web-related code in Drogon and I needed to read stdin so the user can control the application. Problem being, C++'s `std...

2024-03-31: My Vodka tier list

I've been trying different Vodka brands for a while now and I've come up with a tier list of my favorites. My preference for Vodka is that it should be smooth and not have a strong taste whatsoever. I drink Vodka straight and use it as fuel for writing stupid code that is too hard for me usually....

2024-03-19: Is it just me, or does GTC 2024 feel like a cyberpunk dystopia?

I watched the GTC 2024 keynote and I was like. Gosh.. really.. this sounds so .. wrong! You can watch the short 16-minute version by CNET here. The core announcements are: The entire keynote gives me a vibe of Nvidia turning into what IBM was - mainframes and proprietary solutions. But worse. Al...

2024-03-17: LLM Inference - the state and outlook

This post is a draft for a lecture I'll be giving later this year at National Sun Yat-sen University, Kaohsiung, Taiwan, where I'm a guest lecturer in a course by Prof. Chang, who was my advisor during my college years. Making a slide deck is hard without a script, hence this post where I blabber ...

2024-03-06: This page contains QOI images

I'm working on adding support for QOI images in Lagrange out of boredom. QOI is a very simple lossless image compression format that is comparable to PNG and very fast to compress. This page contains a few QOI images encoded by different encoders to test my integration into the Lagrange browser. I...

2024-03-03: Learning the joy of buying outfits

I've been an advocate of reducing consumption to the absolute minimum ever since I started considering climate change a crisis that needs addressing immediately. I didn't get why people buy so much clothing that their closets are full of clothes they never wear. But an order I made half a year ag...

2024-03-02: CORS on the Drogon web framework

CORS, the problem that every web developer has faced at least once in their life. Drogon does not come with built-in middleware to handle CORS, but it is easy to implement it yourself. Usually, you'll want to expose your APIs to the world. This code will allow any request to the `/api` path to go ...

2024-03-01: Optimizing SoCs for Large Language Models on the edge

Around December 2023, I had a quick talk with a chip maker about porting and what hardware features are needed for fast LLM inference, owing to my work on porting llama.cpp to the RK3588 NPU. I started writing this post as a condensation of my views and recommendations. But.. I got busy and forgo...

2024-02-18: In the name of sustainability, I bought a power meter

It's probably not difficult to guess that I own several computers. It's a part of my job and there's no real way around it. Remember, the cloud is just someone else's computer. Today, as I was walking through a local mall, I saw an extension cord with a power meter on it. You know what, that's a good ...

2024-02-17: Tenstorrent first thoughts

I've looked into alternative AI accelerators to continue my saga of running GGML on lower power-consumption hardware. The most promising - and the only one that ever replied to my emails - was Tenstorrent. This post is me thinking deeply about whether buying their hardware for development is a good inve...

2024-02-14: Benchmarking RK3588 NPU matrix multiplication performance EP3

Today is the last day of CNY and, to be honest, I have nothing to do. Out of nowhere, I decided to look deeper into the RK3588's NPU performance characteristics, to figure out what it actually needs to be performant. Like how batch size and native/normal layout affect the performance. I haven't done an...

2024-02-11: RE: Dialectics and “Artificial Intelligence”

This post is my reply to some ideas in Roderic's post on altesq.net. It got me interested, but I was abroad during CNY and I didn't have much time to actually write down my thoughts. This post should have gone out a few days back. But that just gave me more time to think through my words, eh? By no me...

Author's profile. Photo taken in VRChat by my friend Tast+
Martin Chang
Systems software, HPC, GPGPU and AI. I mostly write stupid C++ code. Sometimes I do AI research. Chronic VRChat addict.

I run TLGS, a major search engine on Gemini. Used by Buran by default.


  • marty1885 \at protonmail.com
  • Matrix: @clehaxze:matrix.clehaxze.tw
  • Jami: a72b62ac04a958ca57739247aa1ed4fe0d11d2df