Martin Chang

HPC | Systems Engineer and AI Developer

About Me

Hi! I'm a software engineer based in Taiwan, specializing in building high-performance software and researching state-of-the-art AI systems (and occasionally writing Linux drivers). Here are a few technologies I use daily:

As you might have guessed, I don't excel at web development, so this site is as barebones as it gets. However, I'm a privacy advocate, so you can also reach this site via Tor at the onion address. This way, I can't know who is reading. I also tried my best to make sure this site is 90% functional even without JavaScript. I'm also an open source contributor who sends patches whenever I can. I'm a collaborator on the Drogon project and an editor of the Librem5 Wiki.


High Performance Software

I build high-performance software like ML libraries and search engines in C++. I've deployed a search engine written in highly optimized C++ commercially in a scalable infrastructure. I also developed TLGS, a search engine for the Gemini protocol (proxy).

Arch Linux

I'm a passionate Linux user, and being on the bleeding edge is important to me. I use Arch Linux on my laptop, PC, workstations, servers and embedded systems. I've been using Linux since 2013 and know a bit about Linux internals and management.


ROOT is a high-performance data analysis framework in C++. Using it, I have carried out Big Data analysis on a single server that would usually require a cluster. I also integrated ROOT into an IC development flow to analyze performance in real time.

Machine Learning/AI

Besides knowing how to use common ML libraries like sklearn and PyTorch, I'm an active member of the Numenta community, helping with the development and research of HTM theory.


Besides writing code for the CPU, I use OpenCL to run workloads faster on GPUs or even FPGAs. I also have experience using AMD's HIP platform to write portable C++ kernels.



Drogon is a production-ready, extremely fast C++ web application framework. I'm a maintainer: I wrote the coroutine subsystem, contributed various bug fixes, and take part in community development. I have developed commercial products using it. Drogon easily handles C10K on an embedded system and C100K on a standard server.


Etaler is a high-performance implementation of Hierarchical Temporal Memory, a biologically inspired machine learning model developed by Numenta. At the time of release, Etaler was more than 20x faster than the community-developed HTM.core on CPU and another 2x faster on a GPU.


embree-arm is the world's first functional port of Intel's Embree ray tracing kernels to ARM processors. Though it is no longer maintained, this project inspired the embree-aarch64 project (using the same porting approach) to provide up-to-date ray tracing kernels for ARM.

Conference Talks


I love games, but why do movies look so much better? An introduction to computer graphics and ray tracing.


Introduction to GPU computing with OpenCL - video


Building your own NumPy! Implementing NDArrays from scratch.


Hello! 你好! Saluton!
I speak a variety of both natural and constructed languages. Chinese is my native tongue, and I have been learning English since a very young age. Later I picked up Esperanto as a hobby.

In terms of programming languages, I use a variety of them daily, from simple scripting to serious programming to office work.


I develop and host the TLGS search engine, a scalable search engine for the Gemini protocol written in state-of-the-art C++20. It scales both vertically and horizontally, uses fairly few resources, and is highly concurrent. The crawler can easily handle 100 concurrent pages on 2 slow ARM A72 cores. I encourage you to check it out with a nice Gemini browser. However, it is also viewable through various Web proxies to Gemini.

It runs on the same technology I deployed commercially in a scalable infrastructure, though it's a complete re-implementation. It is capable of handling tens of thousands of requests per second with a single instance and enough bandwidth. It was also first developed on my private ARM server, for which I built my own BSP, before being transferred to a VPS.