Reflecting on Art, AI and coherent arguments

Is AI generated art, art? This has been a question asked by many since revolutionary products like Midjourney and Stable Diffusion dropped. Now you can easily generate images by asking a chatbot like Gemini or ChatGPT to do so, by using the interface of one of the many consumer minded products, or via a quick API call to providers like DeepInfra. Artists think their work has been devalued and want to ban AI, while the AI artists think that "well, obviously it works". I want to be honest with myself and explore the topic. And this is the log of my thoughts.

Obviously, I am a computer scientist. I work on AI. So I both know how these things work internally and I have strong opinions based on that.

Let's be honest for a moment. Painting isn't the only art form that has been performed via automated means. Singing, for one, has been very doable by computers since 2007, when the very popular singing synthesizer VOCALOID2 and the still very popular voice bank Hatsune Miku were released. Later on new synthesizers also showed up, notably SynthesizerV and DiffSinger. SynthesizerV for being indistinguishable from real singing by default, with automatic multilingual and rap support. DiffSinger for having very good synthesis quality while remaining an open source project.

Youtube video: Hatsune Miku: Live Concert (2013) - Cat food

Youtube video: Kasane Teto: Romeo & Cinderella (Synthesizer V Cover)

Youtube video: DiffSinger (绿坝子Nature): 我多想说再见啊

We did instrument synthesis way before singing. With the help of MIDI and high quality physical models of different instruments, we can replicate most aspects of any instrument through a preprogrammed sequence or via a keyboard. Sure, you can get better performance by recording a skilled musician playing the instrument. But why bother when you don't need that level of quality and the last bit of quality doesn't matter to the intended use case?

Youtube video: GTA San Andreas Theme Cover - LMMS

And obviously photography and the camera overtook painting in the 19th century. So much so that we now all carry at least one camera with us and snap a quick photo whenever we want to. That used to be impossible unless you were insanely rich, could afford artists drawing every everyday item and had a place to store the thousands of paintings. I took tens of photos with my phone today alone!

Image: A UBike station during night, with a lot of eBikes

That begs the question - what makes this time different? We have had several examples of machines replacing human art in the past. And more of machines replacing human labor. Cows, wheels, airplanes, the steam engine, the combustion engine, computers... We adapted and we enjoy the convenience new technology brings. The debate on this subject has been... incoherent.

This time is different, some argue. AI (more accurately machine learning with generative models) can seemingly understand the request and generate content accordingly. There's very little space for human intervention. Furthermore, it is argued that image generation models are being trained on stolen (unauthorized) content and are thus not legal. Let's explore that question a bit with the full engineering hat on. All modern deep learning models are trained using a process called gradient descent. It works by sending some input into the model, seeing what it produces, comparing how far off the generated result is from the reference data, doing some calculus to figure out how to update the weights, and repeating. At no point do the images actually become a part of the model. They are used to calculate the gradient, and the gradient is what updates the weights.
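To make that loop concrete, here is a minimal sketch of gradient descent on a toy problem. Everything in it is an assumption for illustration: the model is a straight line with two weights, the reference data is four made-up points, and the learning rate is picked by hand. Real image models are vastly larger, but the loop has the same shape.

```python
# Toy gradient descent: fit y = w*x + b to reference data.
# The data points and the learning rate are made-up values for illustration.

def train(data, steps=2000, lr=0.05):
    w, b = 0.0, 0.0  # the entire "model" is these two weights
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            pred = w * x + b            # send the input through the model
            err = pred - y              # compare against the reference
            grad_w += 2 * err * x / n   # d(mean squared error)/dw
            grad_b += 2 * err / n       # d(mean squared error)/db
        w -= lr * grad_w                # update the weights, then repeat
        b -= lr * grad_b
    return w, b

reference = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # points on y = 2x + 1
w, b = train(reference)
print(round(w, 3), round(b, 3))
```

In this sketch the reference points only ever show up inside the gradient computation. What `train` hands back is two numbers, which end up close to 2 and 1.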

Also consider how humans learn to paint (or anything really). We look at how others draw and at the final painting. Go do it ourselves. Get horrible results. Try different things and figure out how to make it better. Repeat. It's a bit similar, ain't it? Why is it a problem when silicon does it but fine when it is a pile of meat? "Consciousness", "soul" some may say. Let's try a verificationist approach. I will grant you access to a garage full of all the most accurate scientific equipment. Say there's a random soul on the street. How do I know it is a soul and not something else? "It gives you a warm feeling" some say. So ginger tea is a soul? Obviously not. "You cannot detect it, it's beyond the physical world" others say. Great. So it doesn't interact with the physical world. By definition it does not exist. "Magic" someone in the back yells. Great, now let's repeat the problem again. How do I recognize magic when I see it randomly on the street?

Ok hear me out. I have a brilliant idea. Let's train a generative model using a genetic algorithm with humans as the fitness function. The model is fed some input. Generates a garbage image. Shows that to the humans. The humans evaluate how bad the generation is. The model randomly mutates a bit. Repeat. This way the model never touches any copyrighted content directly. It's impractical. And most likely there aren't enough people, computing power and time to make this work. Nor will the generation quality be good. But if through magic it checks all the boxes and has phenomenal quality, will people admit the model is now legal? Even after it has spread to every AI image generator?
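For the curious, the mutate-and-judge loop could be sketched roughly like this. It is heavily simplified and full of assumptions: the thing being evolved is a single 8-pixel "image" rather than a generative model, there is one candidate instead of a population (so strictly it is closer to random hill climbing than a full genetic algorithm), and the human judge is faked by a hypothetical `human_score` function with a hidden taste so the code can run on its own.

```python
import random

def human_score(image):
    # Hypothetical stand-in for a human judge: lower means "less bad".
    # A real judge would be a person looking at the image; here the taste
    # is faked with a hidden preference so the sketch can run unattended.
    taste = [0.1, 0.9, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4]
    return sum((p - t) ** 2 for p, t in zip(image, taste))

def mutate(image, rng, scale=0.1):
    # Randomly nudge every pixel; no gradient and no reference image involved.
    return [min(1.0, max(0.0, p + rng.gauss(0.0, scale))) for p in image]

def evolve(generations=3000, seed=0):
    rng = random.Random(seed)
    best = [rng.random() for _ in range(8)]  # starts out as garbage
    best_score = human_score(best)
    first_score = best_score
    for _ in range(generations):
        child = mutate(best, rng)
        score = human_score(child)           # "show it to the human"
        if score <= best_score:              # keep the mutation if judged no worse
            best, best_score = child, score
    return first_score, best_score

first, last = evolve()
print(first, last)
```

The loop only ever sees the judge's score, never an example image. The score can only stay the same or improve, which is also why it needs so many rounds of judging.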

Let's be really honest. It's unlikely such an "ethically" trained model will be deemed legal under the current climate. It's much less about technical accuracy, legality or art itself. A large part of it is about economy and profits. Otherwise we would have had this debate about where to draw the line years ago.

It gets worse. Say we scanned someone's head. Obtained his connectome. Built a supercomputer powerful enough to simulate the subject's brain activity in real time. And enough physics to give him a virtual body. We ask this computer to paint. Intuitively most agree it's art. WHAT'S DIFFERENT THIS TIME? Both are a computer running a program that generates paintings. We bump the supercomputer to run at 200000x realtime and generate a ton of paintings. Is it now still art? Or we simplify how neurons are simulated. The paintings become weirder the more we simplify. But each is definitely a painting. At which point is the neuron so simplified that we no longer consider the result art?

At some point we gotta reconcile and forgo anthropocentrism. It only worked so far because humans happen to have been the smartest thing on Earth for some time. And Earth is the only place humans know life exists. However, we are, in fact, not special. We are mere dust within the vast universe (cue the Total Perspective Vortex). The space of all possible minds is like a sea. Biologically possible minds are an island on the vast, vast sea. And all possible human minds are a mere grain of sand on the island.

Let's go through another thought experiment. Say a very talented human artist has dedicated his life to art and painting, while an AI has learned from all of humanity and the internet with a truly stupid amount of compute, rivaling what's needed to build a time machine. Both entities are in fact so good at their craft that they have converged to a sense of peak art. Given the exact same request, the artist and the AI will produce the exact same result. Down to the value of every pixel. In fact, if you remove the metadata and use the same encoder - the hash of both works is exactly the same. We have no problem saying the artist's work is art. But... is the AI's work art?

We show people both works on a screen, hiding their origin. Is there a way for people to reliably distinguish between the artist's work and the AI's? By definition there isn't. That's what it means for both to be exactly the same. Since we consider the human's creation art, we must then, logically, consider the AI's creation art.
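The "same hash" part of the thought experiment is easy to sketch. The snippet below is only an illustration under assumptions: the two works are made-up 2x2 grayscale images, `pixel_hash` is a hypothetical helper, and hashing the raw pixel values with SHA-256 stands in for "remove the metadata and use the same encoder".

```python
import hashlib

def pixel_hash(pixels, width, height):
    # Hash only the dimensions and raw pixel values, i.e. no metadata.
    header = f"{width}x{height}:".encode()
    return hashlib.sha256(header + bytes(pixels)).hexdigest()

# Hypothetical 2x2 grayscale images; the values are made up.
artist_work = [12, 200, 37, 255]
ai_work = [12, 200, 37, 255]
tweaked = [12, 200, 37, 254]  # one pixel off by one

same = pixel_hash(artist_work, 2, 2) == pixel_hash(ai_work, 2, 2)
differs = pixel_hash(artist_work, 2, 2) != pixel_hash(tweaked, 2, 2)
print(same, differs)  # prints: True True
```

Nothing in the bytes records who or what produced them. The hash only changes when a pixel does.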

Then let's consider the spectrum of the abstract sense of art-ness. The AI is not perfect, and the generated output slightly differs from the human's. It's a slider that someone can control. The more it diverges, the easier it is to distinguish the origin based on the final output alone. At what point do we consider the output of the AI art? 60% accurate classification by human viewers? Any number or method seems arbitrary as we do not have an agreed and coherent definition of art yet. Especially when humans are not the only thing that can create art-like-things.

Art is in the process of creation, people say. The fact that the human reflected their feelings into the work. And that the viewers can understand and share the experience through the work. I call that a non sequitur. The process does not truly matter. In fact I could have bullshitted some backstory for an AI generated work using more AI. And as long as you trust me, you would still feel the expected emotion. And it's not like the blue curtain isn't a common problem in art and literature. How do you know that the process of creation or the explanation is true? Most real artwork has a mundane creation process. Even a practical one like "Someone commissioned me" or "the publication is pressuring me to". The stories around the meaning of the work only get added later on.

"But the AI generated work does not have actual feelings put into it. It can be proven" another says. Remember we assumed the AI generated the exact same result as the artist. On filesystems with deduplication, they become the exact same underlying data block - the difference is superficial at best.

If it looks like a duck, swims like a duck, and quacks like a duck, then it is a duck. That's how verificationism works. The method by which you determine if something is X is your working definition of X. Art is not an exception. A pile of randomly generated pixels is definitely not art. But some arranged pixels are. We need to draw the line somewhere in between. These data points and thought experiments serve as witnesses to our collective definition of art. Artists, let's think the problem through. Yelling "AI art is not art" is a vast oversimplification and does not work.

Martin Chang

Systems software, HPC, GPGPU and AI. I mostly write stupid C++ code. Sometimes do AI research. Chronic VRChat addict

I run TLGS, a major search engine on Gemini. Used by Buran by default.

marty1885 \at protonmail.com
Matrix: @clehaxze:matrix.clehaxze.tw
Jami: a72b62ac04a958ca57739247aa1ed4fe0d11d2df