These are my self-amusing posts, written in Gemini's text/gemini format and rendered to HTML. You can find the same content on my Gemini capsule gemini://gemini.clehaxze.tw [what is gemini://?]. In fact, I recommend viewing through a Gemini browser (like this one) for the best experience. An Atom feed is also available.

Latest Posts

2024-03-31: My Vodka tier list

I've been trying different Vodka brands for a while now and I've come up with a tier list of my favorites. My preference for Vodka is that it should be smooth and not have a strong taste whatsoever. I drink Vodka straight and use it as fuel for writing stupid code that is too hard for me usually....

2024-03-19: Just me or GTC 2024 feels like cyberpunk dystopia

I watched the GTC 2024 keynote and I was like. Gosh.. really.. this sounds so .. wrong! You can watch the short 16-minute version by CENT here. The core announcements are: The entire keynote gives me a vibe of Nvidia turning into what IBM was - mainframe and proprietary solution. But worse. Al...

2024-03-17: LLM Inference - the state and outlook

This post is a draft for a lecture I'll be giving later this year at National Sun Yat-sen University, Kaohsiung, Taiwan, where I'm a guest lecturer in a course by Prof. Chang, who was my advisor during my college years. Making a slide deck is hard without a script, hence this post where I blabber ...

2024-03-06: This page contains QOI images

I'm working on adding support for QOI images in Lagrange out of boredom. QOI is a very simple lossless image compression format that is comparable to PNG and very fast to compress. This page contains a few QOI images encoded by different encoders to test my integration into the Lagrange browser. I...

2024-03-03: Learning the joy of buying outfits

I've been an advocate of reducing consumption to the absolute minimum ever since I started considering climate change a crisis that needs addressing immediately. I didn't get why people can buy so much clothing that their closets are full of clothes they never wear. But an order I made half a year ag...

2024-03-02: CORS on the Drogon web framework

CORS, the problem that every web developer has faced at least once in their life. Drogon does not come with built-in middleware to handle CORS, but it is easy to implement it yourself. Usually, you'll want to expose your APIs to the world. This code will allow any request to the `/api` path to go ...

2024-03-01: Optimizing SoCs for Large Language Models on the edge

Around December 2023, I had a quick talk with a chip maker about porting and what hardware features are needed for fast LLM inference, due to my work on porting llama.cpp to the RK3588 NPU. I started writing this post as the condensation of my view and recommendations. But.. I got busy and forgo...

2024-02-18: In the name of sustainability, I bought a power meter

It's probably not difficult to guess that I own several computers. It's a part of my job and there's no real way around it. Remember, the cloud is just someone else's computer. Today, as I was walking through a local mall, I saw an extension cord with a power meter on it. You know what, that's a good ...

2024-02-17: Tenstorrent first thoughts

I've looked into alternative AI accelerators to continue my saga of running GGML on lower power-consumption hardware. The most promising - and the only one that ever replied to my emails - was Tenstorrent. This post is me thinking deeply about whether buying their hardware for development is a good inve...

2024-02-14: Benchmarking RK3588 NPU matrix multiplication performance EP3

Today is the last day of CNY and, being honest, I have nothing to do. Out of nowhere, I decided to look deeper into the RK3588's NPU performance characteristics, to figure out what it actually needs to be performant. Like how batch size and native/normal layout affect the performance. I haven't done an...

2024-02-11: RE: Dialectics and “Artificial Intelligence”

This post is my reply to some ideas in Roderic's post on altesq.net. It got me interested but I was abroad during CNY and I didn't have much time to actually write down my thoughts. This post should have gone out a few days back. But that just gave me more time to think through my words, eh? By no me...

2024-02-05: Are we seeing diminishing returns on large language models?

Bla bla bla.. I'll ignore the introduction to LLMs since everyone knows what they are. Even my mom asked me about them. The question I want to propose is: are we seeing diminishing returns on large language model scaling? The key observation that kick-started GPT-2 evolving to GPT-3 was that with ...

2024-02-02: More reflection on the climate and myself

2023 was horrible. It was HOT. This made me very worried about global warming. And I've gone through the 5 stages of grief: denial, anger, bargaining, depression and acceptance. Well, not acceptance but the realization that I am not a part of the problem due to my (lack of) habits. And that I have ...

2024-01-26: Slides for porting Piper to RK3588 at local meetup

I'm invited to give a talk about porting Piper to the RK3588 at a local group. The talk is given in Chinese, but I've translated the slides into English, and both versions are available at the end of this post. For reference, the slides are about a previous post of mine. There will not be any video reco...

2024-01-03: My journey setting up my Framework 13 AMD, Arch Linux and OpenBSD

On December 16th, 2023, I got my shipment of the Framework 13 that I had ordered 3 days earlier. Wow, that's fast! But getting it working to my needs was another story. Things just.. go wrong at every turn. I don't blame Framework, I do have some specific needs that I recognized I will have to solve myself. I...

2024-01-02: Rethinking the usability of correct horse battery staple

I was a fan of XKCD 936, correct horse battery staple. I agree that passphrases are much stronger than passwords. But there's a massive usability concern with passphrases. Namely, entering them is a pain in the butt. Recently I replaced one major password I use with a passphrase. Just 4 words long...

2023-12-31: For the greater good, let's do it the hard way

I wasn't planning on this being my 2024 resolution. I am not overly certain why I made the title. Must have felt like it last night. Nevertheless, this is the mood I'm in right now, at the end of 2023 going into 2024. Many things happened this year. A lot that I dislike. From climate change gettin...

2023-12-24: Accelerating Piper text-to-speech on the RK3588 NPU

Ho ho ho. Happy Holidays! With Rockchip releasing rknn-toolkit2 1.6.0, the feature set becomes more and more complete. In this release, it's enough to be used to accelerate the Piper text-to-speech system. I want to document what I've done to make it work, what's my vision for it and what's more t...

2023-12-18: Using dynamic input shapes on RKNN/RK3588

Quick documentation for myself. Today, I dabbled in accelerating TTS using the RK3588's NPU. It works really well! I'm seeing a real time factor (RTF) of 0.15 during my initial tests, and I believe I can push it even further. One thing I had to do was to use dynamic input shapes. RKNN tradition...

2023-12-17: Update on GGML RKNPU2 backend and RKNPU2 1.6.0

Recently Rockchip released a new version of the RKNPU2 SDK. It enabled larger matrix multiplication of up to K=10240 and int4 support. My last post describes briefly how I built the RKNPU2 backend for GGML. This time, I want to share what I am able to achieve with the new SDK release. Before I sta...

2023-12-14: Finally deciding to buy a new laptop

Here's the post where I justify buying myself a new laptop. I know exactly 0 humans on the internet care. But dedicating a post forces me to actually think about it and to not buy crap I don't NEED (that's the key word here). Even if no one reads it, publishing it on the internet feels like some...

2023-12-13: Be aware of Energy Transition doom prophecy

So, I came across this Simon Michaux guy on YouTube yesterday. He is a geologist and claims that the energy transition is doomed to fail because of the lack of (rare earth) metals. Off by multiple orders of magnitude. His presentation was convincing to me at first. While I have some but very limit...

2023-12-08: My feeling of powerlessness towards climate change

This post is not my usual rambling about tech. But I want to share my frustration with how I can't help with climate change. First of all, I'm not virtue signaling, but I think I'm likely one of the least consumerist people in the population - I host my blog on Gemini. Boring jokes aside, I truly...

2023-12-02: Status Report: Building the HTTP/2 Client for Drogon

I left my last job at the start of November 2023 and had a week of free time before joining the new one. With time on hand, I decided to finish a feature request I made years ago in Drogon. An HTTP/2 client - and server, but that'll happen later. I never expected to get it almost finished in like 5...

2023-11-04: pledge(2)-ing and unveil(2)-ing the Drogon web application framework

I'm drunk on a Friday night again. This time I decided to investigate how to use OpenBSD's pledge(2) and unveil(2) to enhance the security of the Drogon web application framework. Drogon is written in the C++ programming language, not known to be safe. But in truth, as a maintainer of Drogon, none of...

2023-10-22: Experimental RKNPU2 backend for GGML/llama.cpp

This weekend, after a night of partying with my friend and somehow ending up hanging out at a nearby McDonald's, I got back home and picked up my old work of running LLMs on the Rockchip RK3588's NPU. Last time I hacked around directly within GGML and ran the RWKV model. That was quite a failure, s...

2023-09-25: Using llama-cpp-python server with LangChain

Very quick and short one. I was trying to make something with LangChain. I already had a server running LLaMA 2 Chat using llama-cpp-python. It happens to provide an OpenAI-like API. There must be a way to abuse the code to make it work with LangChain. Yet half an hour of Googling turned up nothing...

2023-09-17: Hardware accelerated playback on PineTab 2 (RK3566)

Want to quickly document how I got my PineTab 2 to play 1080p videos smooth(-ish) with hardware acceleration. With the default DanctNIX image, it was quite easy. Simply install `mpv` and `ffmpeg-v4l2-request-git` from the AUR. Then, pass `--hwdec=drm` to `mpv` and you're good to go. This ...

2023-09-15: RE: On using Pinyin

I came across two articles on Gemini, discussing what would happen if Chinese switched to using pinyin instead of characters. I want to share my thoughts as a native speaker. First, I don't use pinyin to type Chinese. I'm from Taiwan and we use another system called Bopomofo (注音). I agree with...

2023-09-02: Benchmarking RK3588 NPU matrix multiplication performance EP2

Not long after my last benchmarking attempt, Rockchip released an SDK update that fixes the crashing matrix multiplication API. Now I'm no longer restricted to using ONNX. Now I can directly do matrix multiplication from C! And now I can do an apples-to-apples comparison with OpenBLAS. That's benchmar...

Author's profile. Photo taken in VRChat by my friend Tast+
Martin Chang
Systems software, HPC, GPGPU and AI. I mostly write stupid C++ code. Sometimes I do AI research. Chronic VRChat addict.

I run TLGS, a major search engine on Gemini. Used by Buran by default.


  • marty1885 \at protonmail.com
  • Matrix: @clehaxze:matrix.clehaxze.tw
  • Jami: a72b62ac04a958ca57739247aa1ed4fe0d11d2df