Distrust of AI coding agents (Google Jules)

Oh man.. not this kind of post again. Boomer.

That's what I assume a lot of people's initial reaction would be. I just tried Google Jules, Google's AI coding agent, just because I happen to own a Pixel 9 and that comes with a 1 year free subscription to Google One. I figure I should give new things a shot. Maybe I will be wrong. Who knows.

Jules - An Asynchronous Coding Agent

So I started Jules and imported the source code of my blog. Tasked it with the simplest of all task

Read and understand the source code. Determine how performance can be improved by reducing client and server load.

Jules did some bold changes. Such as

Disabling HTML minimization
Remove ETag handling
Remove Content Security Policy
Increasing the threads available to the Gemini side of the server. Disregarding if that impacts the HTTP side under load

It's arguable if SHA256 as ETag is a performance problem. Sure SHA256 is not fast (that'd be the BLAKE2/3 family of algorithms). But deleting CSP along with it is more then problematic. I fear inexperienced developers would actually see some performance improvement in the cold path, did not see what happens to the hot path and let this through.

Image: Screenshot: Google Jules deleting CSP policy

In the output reasoning. Jules also mentioned converting all images to WebP or AVIF. Especially the PNGs that my blog services. No.. The PNGs existing on this blog are either small (early files) or are actually smaller then WebP (generated by ROOT, using 256 color mode). AVIF is also not a good idea. This blog also serves the Gemini protocol. Which, Lagrange does not support AVIF yet. I will not allow it.

Image: Screenshot: Google Jules's reasoning

Maybe Jules is just very web-brained. Let's try something different and more specific. "Please look into the C++ files, understand and suggest changes to reduce unnecessary server compute and IO". Jules decided it should delete the in memory caching for post entries. Citing "reduced server IO and workload".... Yes, but now every time someone loads the post archive or requests the Atom file, it walks the entire archive directory and reconstruct it from scratch. That is MUCH more expensive then just caching them in memory. No no no!

Image: Screenshot: Google Jules deleting the Atom cache

It also decided to add a gemlog cache warmup phase. Spawning a new thread just to invoke getGemlogEntries(). I am frowned with this idea. One, it should just use one of the threads via app().getIOLoop(index) to avoid tread creation or run it at construction time, even maybe adding a warmupAllCache() in main.cpp so warming the cache does not consume resource at serving time, at the cost of longer startup.

Image: Screenshot: Google Jules adding unnecessary warm up, in incorrect way

When prompted to further look into what getGemlogEntries() does and understand the change it made thoroughly. Jules decided to replace returning solid objects (that I am sure gets RVOed) with shared_ptr. That is an extra allocation for nothing. Plus more syntax for the same job.

At this point I am distrusting vibe coding agents as they are not anywhere near competent let along building production systems. If they can't even handle a personal website. I do not trust it reasoning about complex threading models and interaction between threads. I still use LLMs to get result quickly and the good autocomplete is great. But not LLMs acting autonomously. This is a disaster.

Martin Chang

Systems software, HPC, GPGPU and AI. I mostly write stupid C++ code. Sometimes does AI research. Chronic VRChat addict

I run TLGS, a major search engine on Gemini. Used by Buran by default.

martin \at clehaxze.tw
Matrix: @clehaxze:matrix.clehaxze.tw
Jami: a72b62ac04a958ca57739247aa1ed4fe0d11d2df