The Ethics of the Algorithm

Wed, 06/18/2014 - 5:00pm

By Todd Presner

Most of the time when we speak of the ethical dimensions of video testimony of the Holocaust, we refer to the viewer’s “duty to listen and to restore a dialogue.”[1] By this, it is understood that genocide victims who give testimony have been denied their humanity, and that isolation continues when their story is not heard. Video testimonies bind the testifier to the viewer in an ethical relationship that demands the viewer listen and understand what the survivor has shared. This is the ethical responsibility of the listener, who, in turn, becomes a secondary witness to the survivor’s story, and who is obliged to carry the message forward into the world.

Our research team wondered whether, in the face of the USC Shoah Foundation’s Visual History Archive, which contains nearly 52,000 testimonies, an ethical relationship could be formed between an individual and the Archive as a whole. To answer such a question, we had to investigate the technologies that made the Visual History Archive possible in the first place. In other words, how can an information system or a database be “ethical”? That’s the question, at once seemingly simple and deeply fraught, that my research team in the Digital Humanities program at UCLA has been struggling to answer during the past two years. Our research team brings together the methodological insights of computer science, particularly data analysis and data visualization, with the history of the Holocaust, eyewitness testimony, and the ethical imperatives of listening.

In fact, how does one listen to 52,000 videos and more than 100,000 hours of witness testimony? The Visual History Archive’s scope, its sheer scale measured in terms of hours of testimony, is not readily comprehensible. To make the testimonies accessible to users, the Visual History Archive requires a database and an information-management system to organize, categorize, and enable searches of the testimonies based on a series of parameters. This, in turn, allows individuals to engage with discrete pieces of the Visual History Archive rather than with its entirety. Our project seeks an ethical approach to looking at the Visual History Archive as a whole. We believe that by investigating the data and the systems that structure user interactions with the testimonies, we will be able to build ethical modes of computation in terms of digital interfaces, databases, metadata, and information systems.

While the media specificity of the first generation of Holocaust testimony has been discussed at great length, ranging from David Boder’s wire recordings in DP camps and cassette tape to audio-visual documentation, there is no literature on the digitization of the Holocaust archive and its transformation into an information system. With regard to the USC Shoah Foundation’s Visual History Archive, this is particularly noteworthy because the underlying information architecture makes the testimonies searchable and therefore accessible in a way that would be impossible without it. This architecture consists of several components. First, there is the interface itself, which runs in a web browser and allows a user to type in keywords, names, and other search terms. Behind that is a relational database queried with Structured Query Language (an SQL database, for short), in which metadata are organized into tables, records, and fields. All of these data were entered after the videos themselves were indexed with keywords, following standards and protocols developed for the construction, formatting, and management of a monolingual, controlled vocabulary (the thesaurus) to achieve consistency in the description of the content. Beyond this, there is the hardware, such as the archive servers and storage servers, where the videos are stored in digital formats for streaming in a video player. In fact, from 1996 to 2002, 11 patents were filed by inventor Samuel Gustman and the Survivors of the Shoah Visual History Foundation, the assignee, for the Visual History Archive information architecture, which includes the system for cataloguing multimedia data and the development of the digital library system itself.
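
To make the idea of metadata organized into tables, records, and fields more concrete, here is a minimal sketch of how keyword-indexed segments might be stored and searched in an SQL database. It is an illustration only, not the Visual History Archive's actual schema: the table names (testimony, segment, keyword, segment_keyword), the function names, and every sample row are assumptions invented for this example.

```python
import sqlite3


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical indexing schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE testimony (
            id    INTEGER PRIMARY KEY,
            label TEXT NOT NULL
        );
        CREATE TABLE segment (
            id           INTEGER PRIMARY KEY,
            testimony_id INTEGER NOT NULL REFERENCES testimony(id),
            minute       INTEGER NOT NULL  -- offset of the one-minute segment
        );
        CREATE TABLE keyword (
            id   INTEGER PRIMARY KEY,
            term TEXT NOT NULL UNIQUE      -- term from a controlled vocabulary
        );
        CREATE TABLE segment_keyword (
            segment_id INTEGER NOT NULL REFERENCES segment(id),
            keyword_id INTEGER NOT NULL REFERENCES keyword(id),
            PRIMARY KEY (segment_id, keyword_id)
        );
        """
    )
    # Placeholder rows, invented for the example (not real archive data).
    conn.executemany(
        "INSERT INTO testimony VALUES (?, ?)",
        [(1, "Testimony A"), (2, "Testimony B")],
    )
    conn.executemany(
        "INSERT INTO segment VALUES (?, ?, ?)",
        [(1, 1, 0), (2, 1, 1), (3, 2, 0)],
    )
    conn.executemany(
        "INSERT INTO keyword VALUES (?, ?)",
        [(1, "deportation"), (2, "family life")],
    )
    conn.executemany(
        "INSERT INTO segment_keyword VALUES (?, ?)",
        [(1, 2), (2, 1), (3, 1)],
    )
    return conn


def find_segments(conn: sqlite3.Connection, term: str) -> list[tuple[str, int]]:
    """Return (testimony label, minute) pairs indexed with the given term."""
    return conn.execute(
        """
        SELECT t.label, s.minute
        FROM keyword AS k
        JOIN segment_keyword AS sk ON sk.keyword_id = k.id
        JOIN segment AS s ON s.id = sk.segment_id
        JOIN testimony AS t ON t.id = s.testimony_id
        WHERE k.term = ?
        ORDER BY t.id, s.minute
        """,
        (term,),
    ).fetchall()
```

Calling find_segments(build_demo_index(), "deportation") returns the two placeholder segments tagged with that term. The point of the sketch is that a search of this kind retrieves discrete minutes of testimony, which is exactly the mode of access that the database both enables and constrains.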

To support our research, the USC Shoah Foundation graciously provided our research team with a copy of the entire database of metadata related to the testimonies, a database of more than 6 million records, primarily the indexing terms associated with one-minute segments of testimony. With regard to the thesaurus of indexing terms, we are interested in how the main relationships (inheritance, whole/part, and associative) represent the experiences described in the testimonies and whether there are limits to its effectiveness.
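
The three relationship types named above can be pictured with a small data structure. What follows is a rough sketch under our own assumptions, not the Foundation's implementation: the Thesaurus class, its method names, and all of the sample terms are hypothetical placeholders rather than entries from the actual thesaurus.

```python
from collections import defaultdict

# The three relationship types named in the text.
INHERITANCE = "inheritance"  # target is a narrower kind of the source term
WHOLE_PART = "whole/part"    # target is a part of the source term
ASSOCIATIVE = "associative"  # terms are related without any hierarchy


class Thesaurus:
    """A tiny controlled vocabulary with typed links between terms."""

    def __init__(self) -> None:
        # Maps (relation, source term) to a list of target terms.
        self._links: dict[tuple[str, str], list[str]] = defaultdict(list)

    def add(self, relation: str, source: str, target: str) -> None:
        self._links[(relation, source)].append(target)
        if relation == ASSOCIATIVE:
            # Associative links are symmetric, so store both directions.
            self._links[(relation, target)].append(source)

    def related(self, relation: str, term: str) -> list[str]:
        """Return the terms directly linked to `term` by `relation`."""
        return list(self._links.get((relation, term), []))

    def expand(self, term: str) -> list[str]:
        """Return `term` plus every narrower term reachable by inheritance."""
        seen: set[str] = set()
        stack = [term]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self._links.get((INHERITANCE, current), []))
        return sorted(seen)


# Placeholder terms, invented for the example.
demo = Thesaurus()
demo.add(INHERITANCE, "family members", "parents")
demo.add(INHERITANCE, "family members", "siblings")
demo.add(WHOLE_PART, "city", "neighborhood")
demo.add(ASSOCIATIVE, "family members", "migration")
```

Here expand("family members") gathers the term together with its narrower terms, so that a search on a broad category also reaches segments indexed more specifically. Note what such a structure cannot express: every link connects one noun to another, which is one of the limits of effectiveness the paragraph above raises.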

To that end, our research team is also developing ways to deepen the indexing system beyond manifest content (what concretely is said in the testimonies in terms of events, people, and places) to include ways to find more subjective experiences and memories that may be connected to the performative, memorial, and figural dimensions of testimony. Among other things, this means recognizing the importance of tone and voice, of questioning and doubt, of the expressiveness of the face, silences, emotional realities, and larger narrative structures, tropes, and communities of experience.

Yet, perhaps paradoxically, the goal of information architecture is to be objective, to disambiguate the testimonial narratives and to render them operational within the logic of computational processing, to produce an indexing system that is complete and a digital library system that is modular and extensible (to accommodate any kind of testimony or experience). As exciting as this is from an information studies perspective, as well as from a comparative genocide studies perspective, I wonder how we might, in the process, rethink the very genre of the database as a representational form vis-à-vis the specific experiences of bearing witness, testifying, surviving, and narrating. How might the database reflect the fragility of life, the uncertainty, ambiguity, and figuration of narrative? How might it preserve the “hauntedness” that informs so much of the testimony? In other words, how might a database be open to the haunt of the past, the trace of the unknown, the spectral quality of the indeterminate, and, simultaneously, be oriented to the uncertainty of the future, and the possibility of the unknown?[2]

To do so, we are imagining how fluid or differential data ontologies might work by allowing multiple thesauruses, which recognize a range of knowledge models and standards. For example, what if verbs and adjectives that connected action and agent, experience and context were given more weight than hierarchies of nouns primarily in associative relationships? How can we help listeners find and contemplate silences, gaps, stuttering, and emotional realities at the heart of the testimonies? And what if a more participatory architecture allowed for other listeners to create tags that could responsibly proliferate indexing categories and keywords associated with the segments of testimonies? 
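
As a thought experiment, a participatory layer of this kind might keep listener-contributed tags alongside the controlled vocabulary while recording where each tag came from. The sketch below is speculative and does not describe an existing system: the Tag and Segment classes, the vocabulary labels "controlled" and "listener", and the sample tags are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    term: str
    vocabulary: str   # which knowledge model the tag belongs to
    contributor: str  # who added it, so that provenance is never lost


@dataclass
class Segment:
    testimony_id: str
    minute: int
    tags: list[Tag] = field(default_factory=list)

    def add_tag(self, term: str, vocabulary: str, contributor: str) -> Tag:
        """Attach a tag, normalizing the term and skipping exact duplicates."""
        tag = Tag(term.strip().lower(), vocabulary, contributor)
        if tag not in self.tags:
            self.tags.append(tag)
        return tag

    def terms_by_vocabulary(self) -> dict[str, list[str]]:
        """Group the segment's terms by the vocabulary they come from."""
        grouped: dict[str, set[str]] = {}
        for tag in self.tags:
            grouped.setdefault(tag.vocabulary, set()).add(tag.term)
        return {name: sorted(terms) for name, terms in grouped.items()}


# Hypothetical usage: one indexer term and one listener tag on one segment.
segment = Segment(testimony_id="A", minute=12)
segment.add_tag("family life", "controlled", "indexer")
segment.add_tag("Silence ", "listener", "listener-1")
segment.add_tag("silence", "listener", "listener-1")  # duplicate, ignored
```

Keeping the vocabularies side by side, rather than merging listener tags into the thesaurus, is one possible design choice: the original indexing stays intact while other ways of hearing the same minute of testimony accumulate around it.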

Such a structure of saying and unsaying the database would constantly reinterpret and reinscribe the survivors’ stories in ways that not only place the listener into an active relationship of responsibility but also unleash the potential of meaning in every act of indexing. Narratives would be heard in their polyphony, with some listeners hearing some things and others hearing different things. We would never be done listening, watching, and processing the testimonies because there is always “more,” a surplus of meaning that is never absolutely captured in data or databases. In essence, the “ethics of the algorithm” is an attempt, through ever thicker relationships between data and narrative, telling and retelling, to bring together computation and information architecture with the ethical obligation of listening. In other words, it is a way to constantly transform the immensity of the archive and the “big data” in the database back into individual stories encountered through an ethic of active listening and participation.

To learn more about Presner’s research with the Visual History Archive, visit: http://www.toddpresner.com



[1] Geoffrey Hartman, The Longest Shadow: In the Aftermath of the Holocaust (Bloomington: Indiana University Press, 1996), 133.

[2] Jacques Derrida, Archive Fever: A Freudian Impression, trans. Eric Prenowitz (Chicago: University of Chicago Press, 1996), 36.