Abstracts Track 2024

Area 1 - Complexity in Social Sciences

Nr: 17

The Utility of Creative Simulation in Understanding Language Models


Morgan Green

Abstract: This paper presents a creative simulation, still in early development, as a case study in the utility of artistic programming for examining complexity. In particular, the piece offers a new perspective on large language models. A working version of the simulation is available at https://howshekilledit.github.io/word-is-the-bird/ Art can provide poignant insights into complexity theory: Jane Prophet's TechnoSphere and Ian Cheng's Emissaries, for example, offer novel opportunities to witness complex patterns emerging from simple rules. Both works are simulations that propose worlds slightly different from, but related to, our own. The novelty of these worlds makes their emergent properties stand out anew, unlike the complex systems we encounter so often that we no longer notice them. This paper analyzes a new simulation created by the author, which takes a similar tack by visualizing a world with alternative but human-inspired mechanics. The work focuses on the complexity of language and also sheds light on the structures behind popular language models. The simulation proposes a world in which words have agency, like birds in a flock. This granting of birdlike agency to words gives the simulation its title: "Word is the Bird." The words' velocity vectors mimic patterns in natural language, just as traditional boid simulations mimic the velocities of birds. The simulation examines the CMU Pronouncing Dictionary and NLTK's WordNet by using them to vectorize boid movement across sound and meaning. The boids' mechanics are based on the linguistic properties of English but manifest as flocking texts rather than sentences. The author speaks from the intersection of computer science and artistic practice: they are a computer science professor at a small liberal arts college and have a substantial exhibition history in visual art. At the time of submission, "Word is the Bird" runs client-side, so a new simulation starts each time the page loads.
The simulation will progress to server-side content generation, allowing it to run continuously and grow more complex over time. Patterns are expected to emerge that offer new insights into the large language models that drive it. The server-side version will also display the same feed to people connecting from different places, helping them share observations and insights.
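As an illustration of what vectorizing boid movement across sound and meaning might look like, the following is a minimal Python sketch, not the author's implementation. Everything in it is an assumption made for illustration: the hand-written `LEXICON` stands in for entries that would presumably come from the CMU Pronouncing Dictionary and WordNet, and the `affinity` and `step` functions, their parameters, and the choice of an affinity-weighted cohesion rule are hypothetical. Each word is pulled toward the other words in proportion to how closely they match in sound and meaning.

```python
import math

# Hypothetical toy lexicon: word -> (ARPAbet-style phonemes, coarse meaning tag).
# These hand-written entries are stand-ins for dictionary and WordNet lookups.
LEXICON = {
    "bird": (["B", "ER", "D"], "animal"),
    "word": (["W", "ER", "D"], "language"),
    "herd": (["HH", "ER", "D"], "animal"),
    "verb": (["V", "ER", "B"], "language"),
    "cat": (["K", "AE", "T"], "animal"),
}


def affinity(a, b, sound_weight=0.5):
    """Blend sound similarity and meaning match into a score in [0, 1]."""
    phones_a, meaning_a = LEXICON[a]
    phones_b, meaning_b = LEXICON[b]
    # Sound: fraction of aligned positions holding the same phoneme.
    shared = sum(1 for x, y in zip(phones_a, phones_b) if x == y)
    sound = shared / max(len(phones_a), len(phones_b))
    # Meaning: 1 if the coarse tags match, else 0 (a crude assumption).
    meaning = 1.0 if meaning_a == meaning_b else 0.0
    return sound_weight * sound + (1.0 - sound_weight) * meaning


def step(boids, dt=1.0, pull=0.1, max_speed=2.0):
    """Advance each word-boid one tick using affinity-weighted cohesion.

    `boids` maps word -> ((x, y), (vx, vy)); a new mapping is returned.
    """
    updated = {}
    for word, (pos, vel) in boids.items():
        ax = ay = total = 0.0
        for other, (other_pos, _) in boids.items():
            if other == word:
                continue
            w = affinity(word, other)
            ax += w * (other_pos[0] - pos[0])
            ay += w * (other_pos[1] - pos[1])
            total += w
        if total > 0:
            # Offset toward the affinity-weighted centre of the other words.
            ax /= total
            ay /= total
        vx = vel[0] + pull * ax
        vy = vel[1] + pull * ay
        # Clamp speed so no word outruns the flock.
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated[word] = ((pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy))
    return updated
```

Under these assumptions, "bird" scores a higher affinity with "herd" (shared sound and shared meaning tag) than with "word" (shared sound only), so repeated calls to `step` would draw it toward "herd" more strongly. A fuller boid model would add separation and alignment terms; they are omitted here to keep the sketch short.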