Martin Chang

HPC | Systems Engineer and AI Developer

About Me

Hi! I'm a software engineer located in Taiwan. I specialize in building high-performance software and researching state-of-the-art AI systems (and occasionally write Linux drivers). Here are a few technologies I use daily:

As you might have guessed, I don't excel at web development, and this site is as barebones as it gets. However, I'm a privacy advocate, so you can also reach this site via Tor at the onion address. This way, I can't know who is reading. I tried my best to make sure this site is 90% functional even without JavaScript. Everything is encrypted and nothing is loaded from a CDN.

I contribute to open source projects and send patches whenever I can. I'm a collaborator on Drogon, a C++ web framework, and a contributor to phosh, the GNOME phone shell. I am also the author of Etaler, an open, GPU-accelerated Hierarchical Temporal Memory library.


High Performance Software

I build high-performance software like ML libraries and search engines in C++. I've deployed a search engine written in highly optimized C++ commercially on a scalable infrastructure.

Arch Linux

I'm a passionate Linux user, and being on the bleeding edge is important to me. I use Arch Linux on my laptop, PC, workstations, servers, embedded systems and mobile phones. I've been using Linux since 2013 and know a bit about Linux internals and management.


ROOT is a high-performance data analysis framework in C++. Using it, I have carried out Big Data analyses on a single server that would usually require a cluster. I integrated ROOT into an IC development flow to analyze performance in real time. I'm also an early adopter of ROOT7, a new and improved ROOT.

Machine Learning/AI

Besides knowing how to use common ML libraries like sklearn and PyTorch, I'm an active member of the Numenta community, helping with the development and research of HTM theory.


Besides writing code for the CPU, I use OpenCL to run workloads faster on GPUs or even FPGAs. I also have experience using AMD's HIP platform to write portable C++ kernels.



Etaler is a high-performance implementation of Hierarchical Temporal Memory, a biologically inspired machine learning model developed by Numenta. At the time of release, Etaler was more than 20x faster than the community-developed HTM.core on CPU and another 2x faster on a GPU.


embree-arm is the world's first functional port of Intel's Embree ray tracing kernels to ARM processors. Though no longer maintained, this project inspired the embree-aarch64 project (which uses the same porting approach) to provide up-to-date ray tracing kernels for ARM.


MIOpen-everywhere is a day-1 port of AMD's MIOpen neural network acceleration library that runs on all OpenCL devices, including CPUs, Intel HD Graphics, Nvidia GPUs, etc. Several bugs were discovered during the porting process and subsequently reported to AMD.

Conference Talks


I love games but why do movies look so much better - An introduction to computer graphics and ray tracing.


Introduction to GPU computing with OpenCL - video


Building your own NumPy! Implementing NDArrays from scratch.


Hello! 你好! Saluton!
I speak a variety of both natural and constructed languages. Chinese is my native tongue, and I have been learning English since a very young age. Later, I picked up Esperanto as a hobby.

In terms of programming languages, I use a variety of them daily, for everything from simple scripting to serious programming to office work.


You are looking at one! This website runs on the horrifying yet fantabulous technology of a fully customized C++ backend and VanillaJS!

Jokes aside, this is the same technology I deployed commercially on a scalable infrastructure. It is capable of handling tens of thousands of requests per second with a single instance, given enough bandwidth. Besides running on a customized backend, this site is also hosted on my private ARM server, for which I built my own BSP.