Jews by the Numbers: An Introduction to Applied Statistics and Data Science for Students of Jewish Studies
Faculty Name: Alexis Lerner
Institution Name: University of Toronto

This upper-level undergraduate course introduced students to data science and applied statistics, with a focus on learning to analyze data through quantitative methods, research design and ethics, and digital humanities tools. Students learned to build datasets from archival material and to form their own arguments based on data. The USC Visual History Archive was one of the datasets used for this purpose.


The course combined hands-on experience with quantitative methods and digital humanities tools, readings and discussions, and written assignments. Students completed research projects that involved analysis of metadata, coding, visual illustrations of data, and a written reflection on the value of quantitative methods, organizational systems, and data visualization for obtaining new insights about archival material.

The Visual History Archive was a principal dataset assigned for student research projects. Each student used text mining and content analysis skills to code one English-language Holocaust testimony from a pre-selected pool of Visual History Archive testimonies of survivors who lived in the Budapest Ghetto in Hungary. In addition to coding a testimony and creating a visual illustration of their data, students analyzed both the metadata for their testimony and the metadata for the archive as a whole. For the archive-wide analysis, students were required to investigate who the archive’s subjects are and which cases of genocide and mass violence it features. They were encouraged to think about the ethics of the archive and its use, as well as its strengths and weaknesses. For the analysis of a specific survivor testimony, students were asked to examine and analyze the types of metadata provided about the survivor.
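An archive-wide question such as "who are the archive's subjects?" can be approached by tallying metadata fields. The following is a minimal sketch only: the field names (`experience_group`, `language`) and all values are hypothetical placeholders, not the archive's actual schema or data.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are placeholders,
# not taken from the Visual History Archive.
records = [
    {"experience_group": "Group A", "language": "English"},
    {"experience_group": "Group A", "language": "Hungarian"},
    {"experience_group": "Group B", "language": "English"},
    {"experience_group": "Group A", "language": "English"},
]

def tally(records, field):
    """Count how many records carry each value of the given metadata field."""
    return Counter(record[field] for record in records)

group_counts = tally(records, "experience_group")
language_counts = tally(records, "language")

# Print each category with its share of all records.
for group, count in group_counts.most_common():
    print(f"{group}: {count} ({count / len(records):.0%})")
```

The same `tally` helper can be pointed at any other field to see how interviews are distributed across it.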

In coding their testimony, students rated the intensity of the survivor’s emotional responses in relation to another variable of their choice, such as age or gender. While each student coded one testimony, the class worked together to build collective datasets in a shared online spreadsheet and devised a collection of questions of interest, such as whether women hesitate more when talking about their experience and whether Hungarians talk more positively about fellow Hungarians. Each student chose to focus on a specific dependent (target) variable that they identified in the data. For example, students found that, in the testimonies they coded, women paused more than men when recounting their traumatic experience.
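A comparison like the one between gender and pausing can be sketched from a shared spreadsheet as a per-group average. This is an illustrative sketch, not the course's actual method: the column names (`gender`, `pauses`, `minutes`) are assumptions, and every value below is an invented placeholder rather than a result from the class dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, one per coded testimony, as they might appear in a
# shared class spreadsheet. All values are invented placeholders.
rows = [
    {"testimony": "T1", "gender": "F", "pauses": 30, "minutes": 60},
    {"testimony": "T2", "gender": "M", "pauses": 12, "minutes": 60},
    {"testimony": "T3", "gender": "F", "pauses": 45, "minutes": 90},
    {"testimony": "T4", "gender": "M", "pauses": 20, "minutes": 80},
]

def mean_rate_by_group(rows, group_key, count_key, duration_key):
    """Average the per-testimony rate (count per minute) within each group."""
    rates = defaultdict(list)
    for row in rows:
        rates[row[group_key]].append(row[count_key] / row[duration_key])
    return {group: mean(values) for group, values in rates.items()}

pause_rates = mean_rate_by_group(rows, "gender", "pauses", "minutes")
for group, rate in sorted(pause_rates.items()):
    print(f"{group}: {rate:.2f} pauses per minute")
```

Dividing by testimony length is a design choice made here so that longer interviews do not dominate the comparison. With one testimony per student, any difference between groups is descriptive and would need many more testimonies before supporting a statistical claim.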

As part of their research project, students wrote reflective essays about their experience of engaging with the Visual History Archive, including any difficulties they encountered. While they appreciated the possibilities offered by quantitative methods and digital humanities tools, a number of students expressed concerns about the ethics of “reducing” survivor testimonies to numbers. However, these tools and methods also allowed for an in-depth analysis of survivors’ narratives, which gave students a greater appreciation of the opportunity to focus on the survivor.

“To me, the most compelling aspect of the project was my intense focus on the survivor, their human emotions and expressive nature.”


