Martin Chang

HPC | Systems Engineer and AI Developer

About Me

Hi! I'm a software engineer based in Taiwan. I specialize in building high-performance software and researching state-of-the-art AI systems (and occasionally write Linux drivers). Here are a few technologies I use daily:

As you might have guessed, I don't excel at web development, so this site is as barebones as it gets. However, I'm a privacy advocate, so you can also reach this site via Tor at its onion address. This way, I can't know who is reading. I also tried my best to make sure this site is 90% functional even without JavaScript. I'm also an open source contributor who sends patches whenever I can. I'm a collaborator on the Drogon project and an editor of the Librem5 Wiki.


High Performance Software

I build high-performance software like ML libraries and search engines in C++. I've commercially deployed a search engine written in highly optimized C++ on scalable infrastructure.

Arch Linux

I'm a passionate Linux user, and being on the bleeding edge is important to me. I use Arch Linux on my laptop, PC, workstations, servers and embedded systems. I've been using Linux since 2013 and know a bit about Linux internals and management.


ROOT is a high-performance data analysis framework in C++. I have used it to carry out Big Data analyses on a single server that would usually require a cluster. I also integrated ROOT into an IC development flow to analyze performance in real time.

Machine Learning/AI

Besides knowing how to use common ML libraries like sklearn and PyTorch, I'm an active member of the Numenta community, helping with the development and research of HTM theory.


Besides writing code for the CPU, I use OpenCL, which lets me run code faster on GPUs or even FPGAs. I also have experience using AMD's HIP platform to write portable C++ kernels.



Drogon is a production-ready, extremely fast C++ web application framework. I'm a maintainer: I wrote the coroutine subsystem, contributed various bug fixes, and take part in community development. I have developed commercial products using it. Drogon easily handles C10K on an embedded system and C100K on a standard server.


Etaler is a high-performance implementation of Hierarchical Temporal Memory, a biologically inspired machine learning model developed by Numenta. At the time of release, Etaler was more than 20x faster than the community-developed HTM.core on CPU, and another 2x faster on a GPU.


embree-arm is the world's first functional port of Intel's Embree ray tracing kernels to ARM processors. Though it is no longer maintained, this project inspired the embree-aarch64 project (which uses the same porting approach) to provide up-to-date ray tracing kernels for ARM.

Conference Talks


I love games but why movies look so much better - An introduction to computer graphics and ray tracing.


Introduction to GPU computing with OpenCL - video


Building your own NumPy! Implementing NDArrays from scratch.


Hello! 你好! Saluton!
I speak a variety of both natural and constructed languages. Chinese is my native tongue, and I have been learning English since a very young age. Later, I picked up Esperanto as a hobby.

In terms of programming languages, I use a variety of them daily, for everything from simple scripting to serious programming to office work.


You are looking at one! This website runs on the horrifying yet fantabulous technology of a fully customized C++ backend and VanillaJS! I wrote which also runs on the same framework.

Jokes aside, this is the same technology I deployed commercially on scalable infrastructure. It is capable of handling tens of thousands of requests per second with a single instance and enough bandwidth. Besides running on a customized backend, this site is also hosted on my private ARM server, for which I built my own BSP.