AI-generated photo-based images

AI-generated photo-based images: their ontological status and interpretation

Workshop on “Inference: Critical Approaches to Visual Generative AI”, The University of Sheffield, Sheffield, UK, April 22, 2024.

Long abstract

In a recent paper in Science, Epstein, Hertzmann, et al. announce that “generative AI is not the harbinger of art’s demise, but rather is a new medium with its own distinct affordances” (Epstein, Hertzmann, et al., 2023, p. 1110). The authors recall that throughout the history of art, several stages emerged when new technologies seemed to threaten some, if not all, artistic practices. For instance, the invention of photography as a new technology for mechanically recording light values and producing depictions of scenes in front of the camera was perceived by many as bringing about the end of painting and drawing. Especially in Western art and culture, where realistic portrayal of scenes, relying on perspectival visual representation, was important at that time, the automatic and mind-independent nature of photography was easily viewed as “superior” to handmade, mind-dependent images in terms of realistic depiction. However, as Epstein, Hertzmann, et al. also remind us, photography did not by any means replace painting and drawing. Although portraiture did indeed become largely a photographic genre from that time onwards, photography also liberated painting from realism. Many view developments in painting, such as Impressionism, for instance, as the most welcome effect of this liberation.

Building on this analogy, Epstein, Hertzmann, et al. argue that artists and audiences need not perceive generative AI as a threat to current art practices. While it will inevitably bring about changes, artists will reformulate current practices by using generative AI as a new tool in their creative endeavours. Automatisms offered by generative AI should be seen as the rise of a new medium, providing new ways for creative artistic work done by humans. As Epstein, Hertzmann, et al. point out, one potentially misleading aspect of the perception and reception of works produced with generative AI lies in the term we use for it. The term “artificial intelligence” might misleadingly imply human-like intentions, agency, or even consciousness or self-awareness. Removing some misconceptions about generative AI may also foster its acceptance as a new technological tool in the hands of human artists.

The controversy surrounding the ontological status of AI-generated artworks often revolves around the question of authorship. Besides the human beings involved in the process, such as programmers and artists utilising the program, generative AI programs themselves have been suggested as possible candidates for being considered authors (see Elgammal, 2019, and Mazzone and Elgammal, 2019, for instance). However, others argue that, in the absence of consciousness and conscience, AI-generated artworks should be considered as artworks produced by human persons and mediated by a generative AI program, rather than being artworks produced by the AI program itself. In other words, the ontological status of artworks is derived from the connection between an artwork and consciousness and conscience (see Linson, 2016, for instance). Another sceptical argument about AI authorship is that, according to our current understanding, art is created by social agents. Therefore, until this understanding is changed, generative AI cannot be credited with the authorship of art (see Hertzmann, 2018).

In my paper I argue that the ontological status of AI-generated photo-based images, whether they are artworks or other photo-based images, is better understood in terms of their contextual interpretation rather than in terms of their connection to consciousness, conscience, or social agency. This also means that I do not consider artistic creativity and non-artistic image-making creativity to be fundamentally distinct from the point of view of ascribing authorship of such images to generative AI. (However, they are distinct in terms of interpreting them as artworks or non-artistic images.) For my arguments, I rely on the theory of pictorial illocutionary acts developed by Kjørup (1974, 1978) and Novitz (1975, 1977), as well as on the theory of photographic illocutionary acts proposed by Bátori (2015, 2018).

According to the theory of pictorial illocutionary acts (Kjørup, 1974, 1978; Novitz, 1975, 1977), the production and presentation of images themselves are to be understood and interpreted as pictorial locutionary acts, similar to verbal locutionary acts, such as uttering words and sentences. At the locutionary act level, only the literal semantic pictorial meaning of the image is interpreted. This meaning is based on our visual recognition abilities, such as object recognition, face recognition, recognising spatial relations, arrangements, and perspective. Currie (1995) refers to this pictorial semantic content as ‘natural’ pictorial meaning because it is not learned, unlike the learned symbolic semantic content of words and other morphological meaning units in natural languages. At the level of pictorial locutionary acts, contextual information is not utilised. It is only at the level of pictorial illocutionary acts that we interpret the image in the context of its presentation and use. For instance, at the pictorial locutionary act level we merely recognise the visual characteristics of the picture of a human head in the barber shop window, while at the pictorial illocutionary act level, we interpret it as a possible statement (pictorial proposition) about the skills of the barber or as a promise of getting a similarly skilful haircut in that barber shop. As Bátori (2015, 2018) further elaborates, photographic illocutionary acts constitute a specific type of pictorial illocutionary act in which the interpretation process at the illocutionary act level necessarily includes interpreting the images as indexical photographs, as opposed to non-indexical, hand-made images.

When interpreting photo-based artworks and other photo-based images produced using generative AI, the locutionary and illocutionary acts involve the following components. At the locutionary act level, audiences identify the literal semantic pictorial content of the images, utilising their visual recognition capacities. This process yields pictorial mental representations of the image content for the mental processing of the audiences. At the illocutionary act level, audiences utilise their contextual knowledge that the image they are considering is a generative AI rendering of a photo-based image or images. They also take into account that the rendering was created according to a) the algorithms of the programmer and b) the ideas of the person (artist, creative professional, etc.) instructing the generative AI program. This means that the image as a whole will not be interpreted as an indexical depiction of a scene captured by the camera at the time of exposure, as the interpreter knows that the image has been altered. The role the original indexicality plays in the interpretation depends on the specific modifications and the extent of the interpreter's knowledge about them. However, in terms of their ontological status, AI-generated photo-based images will not be treated and interpreted as indexical photographs.

However, it is not clear whether this implies the emergence of a distinct genre in the process, as suggested by Epstein, Hertzmann, et al. Alternatively, there might simply be new technological means of producing composite images.

With regard to authorship, the interpretation at the illocutionary act level attributes authorship to the person (artist, creative professional, etc.) using the program, not to the programmer or the generative AI program. This is because the person utilising the generative AI program is the one who produces and presents the image (locutionary act) with the assistance of the generative AI program as an image manipulation tool. In the production and presentation process of the image, the programmer is attributed a role similar to that of camera and darkroom equipment constructors, or image manipulation software engineers. Meanwhile, the generative AI is regarded as a complex technical tool for rearranging parts of one or more indexical photographic images into a new, non-indexical image as a whole. Attributing authorship to generative AI is no more a part of the illocutionary act than attributing authorship to image manipulation software used to adjust the contrast or saturation of an image or even to rearrange an indexical photograph or photographs into a new composite, non-indexical image.

Furthermore, the distinction between “traditional” (non-AI-generated) images and AI-generated ones parallels the contrast between handmade and mass-produced items (such as shoes or tableware). In “traditional” production, the object's creator retains complete control over every process involved, whereas the designer of a mass-produced item creates only the distinctive features of the product, without direct involvement in each stage of production.

Based on the interpretation process described at the illocutionary act level, it can be concluded that audiences come to have true beliefs about the nature of photo-based images produced using generative AI, as long as the image's nature is readable from it or deducible from the context. Audiences are not deceived in such cases. However, if the image's nature is neither deducible from the context nor readable from the image itself, they might be deceived into interpreting it as an indexical photograph of a scene captured by a camera. During my talk, I will present examples of both deceptive and non-deceptive photo-based images produced using generative AI.

References:

Bátori, Zsolt. ‘Photographic Manipulation and Photographic Deception.’ Aisthesis 11:2 (2018):35-47. doi: 10.13128/Aisthesis-23863

Bátori, Zsolt. ‘Photographic Deception.’ Proceedings of the European Society for Aesthetics 7 (2015):68-78.

Currie, Gregory. Image and mind: Film, philosophy and cognitive science. Cambridge: Cambridge University Press, 1995.

Elgammal, Ahmed. ‘AI Is Blurring the Definition of Artist: Advanced algorithms are using machine learning to create art autonomously.’ American Scientist 107:1 (2019):18-21. doi: 10.1511/2019.107.1.18

Epstein, Ziv, Hertzmann, Aaron, et al. ‘Art and the science of generative AI.’ Science 380 (2023):1110-1111. doi: 10.1126/science.adh4451

Hertzmann, Aaron. ‘Can Computers Create Art?’ Arts 7:2 (2018):18. doi: 10.3390/arts7020018

Kjørup, Søren. ‘George Inness and the Battle at Hastings, or Doing Things with Pictures.’ The Monist 58:2 (1974):216-235.

Kjørup, Søren. ‘Pictorial Speech Acts.’ Erkenntnis 12 (1978):55-71.

Linson, Adam. ‘Machine Art or Machine Artists?: Dennett, Danto, and the Expressive Stance.’ In V.C. Müller (ed.), Fundamental Issues of Artificial Intelligence. Switzerland: Springer International Publishing, 2016, pp. 443-458. doi: 10.1007/978-3-319-26485-1_26

Mazzone, Marian, Elgammal, Ahmed. ‘Art, Creativity, and the Potential of Artificial Intelligence.’ Arts 8:1 (2019):26. doi: 10.3390/arts8010026

Novitz, David. ‘Picturing.’ Journal of Aesthetics and Art Criticism 34:2 (1975):145-155.

Novitz, David. Pictures and their Use in Communication: A Philosophical Essay. The Hague: Martinus Nijhoff, 1977.