It started two weeks ago when ‘Spiral Town’ was uploaded to the Stable Diffusion Reddit forum. The image, which shows a pastel town under a blue sky amidst a series of concentric circles, blew up on the site formerly known as Twitter. Some of the reactions were expected: that it’s sad the art was AI generated, that the beauty of the picture doesn’t withstand close scrutiny, and that AI art simply isn’t art.
First, it goes without saying that AI generated art raises a number of difficult ethical challenges. Consent to use pieces of artwork within the training corpus, the economic impact of AI art generators on the livelihoods of artists, the prevalence of biased outputs, and the knock-on impact on human creativity are just some of the questions that remain open.
These were, though, issues that I expected the popularity of the image to raise (after all, you can pretty much guarantee that any sufficiently popular post will provoke instances of every common opinion you can think of on any given issue). But what felt to me to be a distinct vein of discussion was whether we ought to describe these types of images as a new style of art entirely.
I’ll introduce a few conceptual tools we can use for trying to determine whether or not you might describe the style as something new or distinct, but, before that, let’s take a step back to define precisely what we’re talking about. Spiral Town was one of the first popular examples of an approach to AI art generation using a model known as ControlNet that lets users add additional, well, controls to image generation using Stable Diffusion. If, for example, you provide a depth map, the ControlNet model generates an image that preserves the spatial information in that map. After Spiral Town paved the way for popular interest in the approach, users shared images incorporating other types of spirals, squares, and even memes. Some even built minigames in which you move uniform pieces around to form a final picture.
A quick word before we begin: this article is about whether we ought to view controllism (defined as images created using ControlNet or similar approaches designed to integrate words, shapes or symbols within an image in a way that substantially changes the overall artistic artefact) as a wholly new art style, not whether any AI generated image can be described as art.
Ctrl+C
For the remainder of this post, I’m going to run through some of the different factors that art historians use when classifying a particular field, movement, or style. For the avoidance of doubt, I take style to mean the specific techniques, methods, and aesthetic characteristics used by artists; a field refers to a broader category or discipline of art, such as sculpture or painting; and a movement is a collective trend or direction in art, often driven by a shared philosophy or goal.
It’s worth saying from the outset, though, that there are always disagreements between those who think that a type of art represents something new and those who do not. Perhaps, at least in that sense, controllism has something in common with Impressionism, Cubism or Surrealism.
Let's start with the formal elements that make up a piece of art. These are the visual tools that artists use such as lines, shapes, colours, textures, and spaces. Aside from the techniques and materials (more on that shortly) the most distinctive quality associated with controllism is the use of clever visual tricks to blend (or hide) shapes, images or words with the final composition. I admit that I have been impressed with some of the approaches taken towards this process, but what seems to me to be new is the ease with which this process can be done, rather than anything fundamentally novel about the approach itself. To see what I mean, consider Canadian artist Rob Gonsalves, who blends elements of ‘magic realism’ and optical illusion in his work. In one famous example, Gonsalves created a piece that blended a bridge with a ship sailing on the ocean. There are many other examples of hidden images within artwork, from artists like Oleg Shuplyak to the duck-rabbit image made famous by Ludwig Wittgenstein, who used it as a crutch for describing two different ways of viewing: "seeing that" versus "seeing as", to illustrate the difference between factual and interpretive seeing. Then there is of course Hans Holbein's 1533 ‘The Ambassadors,’ which to my knowledge is one of the earliest pieces of art including hidden images (a skull is hidden in the painting).
Generally speaking, new styles often introduce innovative methods or utilise materials in a unique way. Now, AI produces digital art, which has been around for some time. It deals in pixels, not paintbrushes. That being said, the techniques used to create the patterns described above do, I think, represent a distinct technological approach. So while we might say that the ultimate type of output and the substance of the style itself are not new, the process by which art is created is somewhat novel (though of course, there is a question as to how distinct the use of ControlNet is versus AI-generated art more broadly).
Another important element of new styles is the constellation of symbols, themes, and subjects depicted in artworks. One of the things I like about controllism is its generality: it’s a lot of fun to see the same techniques applied in a whole bunch of settings and in different contexts. The problem with that, though, is that generality is the antithesis of focus. Controllism might have made its name with landscape shots interwoven with spirals and squares, but it’s already being layered on top of different artistic styles that span the photographic to the surreal.
If you asked an art historian to tell you about the emergence of a style such as, for example, Pointillism, one of the areas they would likely talk about at great length is the context in which it emerged. Specifically, they might be interested in the style’s broader historical, social, and cultural environment (in this example, 19th century advancements in the understanding of colour theory and optics). Clearly, in this regard, controllism is deeply connected to the emergence of large neural networks as sophisticated image generating tools. That is partly what inspires some of the viscerally negative reactions about fundamental questions related to whether or not we can even describe any form of AI generated media as art.
Another important element is the philosophical or theoretical foundation of the style. Many art movements are grounded in specific philosophies that guide their approaches, which in some instances deeply influence the composition of the art in question. Surrealism, for example, was heavily influenced by the theories of Sigmund Freud, especially with respect to his ideas about the subconscious mind, dreams, and the role of irrationality in human behaviour. Surrealists believed that the subconscious was a source of truth and creativity that was suppressed by the rational mind and societal conventions. One issue I see for controllism is that it feels primarily like an expression of technology, rather than an expression of deeper sociocultural factors. That isn’t to say that the two are mutually exclusive (new tools have been a driving force behind a whole bunch of art styles and the emergence of new art forms like photography) but rather that technology appears to be the primary driving force behind its emergence.
There is also the question of intentionality. What, honestly, do we feel are the intentions of the artists associated with controllism? In one very clear example from the past, we might say that the emergence of socialist realism was deeply tied to ideas about promoting and glorifying the socialist cause, emphasising the role of the working class, and aligning art with the political and ideological goals of the state. Clearly, a new art style is not dependent on a particular ideology; the broader point is that all art is ideologically conditioned (though whether that conditioning is implicit or explicit is another matter). Of course, Spiral Town and images like it seem particularly innocuous, so perhaps we’re barking up the wrong tree altogether by looking at intentionality. What if, just for once, the intention is to have a bit of fun?
Finally, we have the impact of the art itself. That means studying the reception of the style by critics, scholars, and the public to help define both its characteristics and significance. While we don’t have a great deal to say about art critics and controllism, we do have commentary from the platform formerly known as Twitter. What I will say here is that, while controllism has emerged as a lightning rod for the challenges associated with AI generated art, lots of people do seem to be appreciating the aesthetics of these images. I am well aware that there are plenty of others who do not like the style—but for every person who doesn’t there is another who does. That, I think, is true for all art.
The benefit of the historical approach is that we have a long time to assess the impact of a particular style or movement. The problem with applying this lens to controllism, though, should be clear. Controllism is a very new thing, and trying to contemporaneously define a movement rarely sticks (though there are some examples, such as the cultural theorist Stuart Hall’s efforts to coin the term Thatcherism). That aside, the point is ultimately that we won’t truly know the impact of a particular movement or style until after the dust has settled.
Finishing touches
As you might have guessed, my own view is that controllism probably does not do enough to constitute a distinct movement or style. While it utilises some novel techniques made possible by Stable Diffusion and its ilk, the visual style itself does not appear to be fundamentally different from some pre-existing styles and artistic approaches. The lack of a clear intentionality or driving philosophy behind it makes controllism feel more like an emergent technical phenomenon than a deliberate artistic movement.
That being said, just as with other historically significant styles, only time will tell if controllism leaves a lasting impact on artistic practices and public reception. It's also worth considering that the meaning and assessment of controllism will likely shift as the ethical debates around AI art continue to unfold. For now, I think controllism is best described as an interesting aesthetic made possible by new technology, rather than a wholly original artistic style.
We should, however, reflect on the impact that tools like Stable Diffusion and ControlNet have on lowering barriers to entry and widening access to particular modes of artistic expression. That is clearly a good thing. I like controllism—though, as above, I do not think it does enough to constitute a new style—and I like AI art. Some of it is good and some of it is bad. That doesn’t mean that we should overlook the issues involved, but rather that I generally (though not always) believe it best to separate the art from the artist.
Perhaps AI art is the same.
Great post. You are really showing your range as a thinker and writer here. I love studying the rise and fall of different literary and visual styles/aesthetics. I have found Roman Jakobson's theory of the dominant to be quite useful over the years. He defines it as the central focus of a work or literary movement. As a formalist, Jakobson regards the feature as always in the text. As a rhetorical theorist, I tend to define focal points in terms of formal aspects and author-audience processes. Thinking of a dominant in terms of the dynamic interaction between intra- and extra-textual processes gives us greater precision --- and you are very much working in that tradition.
My mentor Jim Phelan defines literary change as a function of 4 interrelated dynamics: 1. extraliterary forces -- the stuff of history, 2. intra/interliterary dynamics -- debates between different schools and between different authors in the same school, 3. the innovation on change --- particularly since modernism, and 4. authorial experiences, preferences, choices, and purposes. To this list, we might need to add automated processes like AI. Or perhaps we can think of them as a function of history. I am not so sure how to deal with the intentionality issue either.
The additive nature of this new aesthetic reminds me of 1950s/1960s chance/probability-based aesthetics where randomized processes drove choice and creation in poetry, music, and cut-and-paste novelistic experimentation. Some artists in the 1970s and 1980s enlisted early computers to do some of this randomization, but late postmodernism of the 1990s replaced these completely random surfaces with highly ornate, more baroque, encyclopedic surfaces. Think about the difference between John Barth's Lost in the Funhouse vs. David Foster Wallace's Infinite Jest.
When I look at the images in your post, my gut says... AI-assisted Surrealism. Magritte fused with Escher. Magritte's emphasis on the substitution of expected forms with unexpected content. With Escher's orderly regressions embedded inside. In other words, I can clearly see the source material rising to the surface.
This is a really interesting moment in literary/visual history. In the case of unassisted human creation, the processes through which Phelan's levels 1 and 2 flow to the surface of a work are much more diffuse, but when AI is involved they are much more concrete and transparent. To me, this transparency seriously threatens any real claim to Phelan's level 3, innovation, despite level 4, the intention of the human working with AI. I think until we have AI products that are not simply re-organizing visual inputs in complicated but ultimately explainable ways, but offer something close to the spark of creativity that an actual artist experiences after 10,000 hours of lived human experience, practice, and cultivation, we will not arrive at a truly AI-inspired innovation in literary/visual aesthetics.
But that is just my two cents. Thanks for giving me the opportunity to think these things through...
Thank you for writing this article and explaining it with an open mind. I am working as a writer and sometimes feel a chill of fear at what AI can achieve, which is much better than what I can do. However, I don't use my fear to judge the evolution of the technology here. Most of what I was concerned about was easier to see, like when AI suggested something that I could identify as coming from other writers, which puts users in the position of a potential thief (sometimes users don't know because they didn't read that book/story), and it is unfair to the writer who created the work. But in this case, you tell a different story, which leaves me a lot to think about regarding how we were creative as writers/artists before and how we are now with AI.