Computational Archaeologies is an ongoing project that employs AI text-to-image generators to create images of unspecified ancient artifacts, which are then re-interpreted as physical objects. Written prompts are entered into synthetic image generators such as Midjourney and Stable Diffusion to probe how these models’ processing of linguistic concepts such as “ancient” and “artifact” correlates with their visual training datasets, which are assembled from billions of online images and their associated metadata. The resulting depictions, while superficially realistic in appearance, embody a context-annihilating conflation of radically different time periods and cultures: a kind of historical uncanny. In this way, they operate as computational analogues of the “contextual flattening” produced by the accelerated, recombinatory logic of market-driven ideologies, through which cultural and historical specificities are blurred and effaced, and the entirety of the past is reduced to a mere collection of source material available for reappropriation and remix. Manifesting these images as material objects concretizes the output of the model (along with whatever biases exist within its dataset). Projected into physicality and presented using standard museological vernaculars, these objects function as uneasy artifacts not of any actual time or place, but rather of a decontextualized, machine’s-eye view of the past: a past without any history at all.
Project Background
The current generation of text-to-image diffusion models can create highly realistic synthetic images using only a short text prompt as input. Diffusion models’ ability to process language and correctly associate semantic phrases with a corresponding visual space allows them to produce consistently coherent images of (for example) a black cat, when “black cat” is entered as a prompt.
The portion of the model that parses natural language (a text encoder, itself a neural network with a very large number of learned parameters) acts in concert with another part that works on images: a denoising network trained on hundreds of millions of images and their associated metadata. Together, these components guide the generation of an image out of an initial field of randomly generated noise, which is refined step by step. This gradual denoising is the reverse of the progressive noising process from which “diffusion” models take their name. The computational association of a specific semantic cue with a particular subset of visual data is sometimes described as a region of the model’s “latent space” that is activated by the prompt.
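The prompt-guided generation of an image out of noise can be illustrated with a deliberately toy sketch in Python. It is not how Stable Diffusion or Midjourney are implemented: the function names are hypothetical, the "image" is just a short list of numbers, the text encoder is a seeded random number generator rather than a learned network, and the stand-in denoiser simply knows its target instead of having learned it from data. The sketch shows only the overall shape of the process: a prompt selects a destination, and a field of random noise is nudged toward it step by step.

```python
import random


def toy_text_encoder(prompt, size=8):
    """Hypothetical stand-in for a text encoder.

    Maps a prompt to a fixed list of numbers. Seeding with the prompt string
    makes the mapping repeatable; a real encoder is a learned neural network.
    """
    rng = random.Random(prompt)
    return [rng.random() for _ in range(size)]


def generate(prompt, steps=50, seed=0):
    """Toy guided denoising: start from noise, move toward the prompt's target."""
    target = toy_text_encoder(prompt)

    # The initial field of randomly generated noise.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]

    for step in range(steps):
        remaining = steps - step
        # Remove an equal share of the remaining "noise" at each step.
        # On the final step (remaining == 1) x lands on the target.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
    return x


if __name__ == "__main__":
    result = generate("black cat")
    print([round(v, 3) for v in result])
```

Different seeds begin from different noise, but in this toy version they all arrive at the same result for a given prompt. A real diffusion model behaves differently: each seed yields a distinct image consistent with the prompt.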
If a diffusion model is a form of picture-making machine, then it shares an affinity with a camera: both are tools that automate the image-making process. But while a camera produces an image optically by being pointed at some aspect of the physical world, a diffusion model produces images computationally, by being “pointed” at a subset of a vast dataspace—a region we can’t apprehend directly. And unlike a camera, the mechanism for “pointing” a diffusion model at something to be visualized is language itself—specifically, the textual prompt that directs the model to synthesize an image using both the linguistic and visual data on which it was trained as a guide. The model can visualize anything, no matter how bizarre or unlikely, at least to the degree the desired content is captured within the model’s dataset.
In this way, all AI-generated images are ultimately images of relationships between data, regardless of whatever visual content may be depicted. The model produces representations not of real-world people, objects or places, but rather of how visual and semantic information about those things is represented within the massive datasets on which it was trained. These images are representations of representations: products of a nonhuman, machine’s-eye view into the informational space comprising the model’s dataset, which is itself ultimately a product of collective human activity.
Perhaps this is a key source of the immense societal fascination (and consternation) these tools have inspired. Consciously or not, we sense that diffusion models are more than just picture-making machines: they are a kind of computational funhouse mirror, producing endless reflections of our own representations of the world. Just as social media algorithms reflect us back to ourselves by translating our posts, comments, and likes into their recommendations and rankings, so too does a diffusion model produce reflections of the way we visually interpret and categorize the world, based on the patterns it uncovers in the dataset. But the way the model “sees” is not the same as the way we do; to say “it” even “sees” at all is already a profoundly distorting anthropomorphism that attributes a false agency to something that has none. The uncanny frisson of surprise, horror, and delight we tend to find in the images generated by diffusion models stems not only from their seemingly magical ability to produce a robust image of almost anything we might imagine out of nothing, but also from the subtle and not-so-subtle ways in which the models’ apparent “understanding” of images aligns and misaligns with our own sensory processing of the world and understanding of visual relationships.
The current project probes the latent space activated within Stable Diffusion and Midjourney, both highly popular diffusion models, when prompted with requests for photographic images related to past cultures. When asked to produce “a photograph of an ancient stone figure,” these models can generate any number of what appear to be photos of real physical artifacts. But while the images produced by the prompt can be highly convincing, they tend to embody what can be described as the historical uncanny—a context-annihilating conflation of radically different time periods and cultures. Although the artifacts depicted superficially resemble products of any number of actual past civilizations, upon closer inspection they reveal a strange hodgepodge of influences that belong to no specific time or context. Hauntingly familiar, but also profoundly estranged, they invoke a sense of the past as a set of stylistic attributes rather than a set of actual historical conditions.
The intentional lack of specificity in the initial prompt serves as a tool for examining the underlying data that shapes the model’s interpretation of general descriptors such as “ancient,” thereby providing insights into the model’s dataset and the biases, preferences, and blind spots encoded within it. Without providing any additional guidance to the model, we gain some sense of how it latently associates these general terms with each other. The model’s attempt at a visual interpretation of a generic description like “ancient stone figure” thus becomes a window into the dataset on which the model was trained: specifically, a window into what sort of images have been consistently associated with words like ancient, stone, figure, and photograph.
In the cultural logic produced by both current capitalist ideologies and technology, cultural and historical specificities become flattened and made equally available for consumption and recombination. The unreal artifacts visualized by the diffusion model embody a deep synthesis of this recombinatory logic, manifested at the computational level at which it operates. They are vessels of a denatured idea of the historical, one in which the entirety of the past exists only as stylistic source material available for remix. The model’s lack of any contextual understanding of different cultures or historical conditions can be viewed as both a product and a reflection of the contextual flattening that is an emergent property of our current age. The nonhuman, machine’s-eye conception of cultural difference manifested in the artifacts serves as an operative metaphor for the increasingly decontextualized worldview produced by the capitalist, neoliberal, and techno-positivist ideologies that represent the end-products of modernity—itself a paradigm arising from a Eurocentric, Enlightenment-era construction of the world.
By rendering the artifacts visualized by Stable Diffusion and Midjourney into physical form (using automated 3D printing technology that allows fabrication in actual stone), I aim to concretize these representations of this uncanny space. The resulting physical objects are artifacts of a dehistoricized and flattened worldview, objects haunted not by any actual past but by the absence of one: a past without any history at all.