Nikolaos Gkizis Chatziantoniou describes his internship with the CDHU, where he worked on the projects SweMPer and Quantifying Culture.
During my internship at the Centre for Digital Humanities at Uppsala University, I worked on two projects: digitizing physical books for “Swedish Medical Periodicals (SweMPer)”, and pre-processing, cleaning, and labeling images for “Quantifying Culture”.
In the digitization project SweMPer, we created digital copies of medical periodicals in order to enrich medical history archives. The method used was destructive scanning, which is much faster and more cost-efficient than other methods; it sacrifices part of the book’s physical integrity but not its contents. The spine of the book is removed with a guillotine, and the loose pages are scanned and then transcribed using Optical Character Recognition (OCR). The remaining physical pages are archived in case we need to rescan them.
The “Quantifying Culture” project is an effort aimed at using Artificial Intelligence (AI) within the cultural context, as well as reviewing existing methods. The task here was to train an algorithm that automatically classifies gender in a collection of digital images. During the data collection stage, we used a web scraper to download images from the World Culture Museum to create the datasets. Afterward, I manually cleaned and grouped the images in order to create the training data for the algorithm. Additionally, I had to experiment with ideas on how to automate parts of the process and think about solutions to the problem of overlapping datasets.
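One possible way to approach the overlap problem, sketched here as an illustration rather than as the method used in the project, is to compare files by a hash of their contents. The sketch assumes that “overlapping” means the same image file appearing in more than one dataset folder; the function names `file_digest` and `find_overlap` and the folder layout are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_overlap(dir_a, dir_b):
    """Return (path_in_a, path_in_b) pairs of files with identical bytes.

    Assumption: each dataset is a folder of image files on disk.
    """
    # Index every file in the first dataset by its content hash.
    index = {}
    for p in sorted(Path(dir_a).rglob("*")):
        if p.is_file():
            index.setdefault(file_digest(p), []).append(p)

    # Look up every file of the second dataset in that index.
    overlap = []
    for q in sorted(Path(dir_b).rglob("*")):
        if q.is_file():
            for p in index.get(file_digest(q), []):
                overlap.append((p, q))
    return overlap
```

This only catches byte-identical copies. An image that was resized or re-encoded between downloads would not match, and would need a similarity-based comparison or a manual check instead.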
My involvement in these two projects gave me a clearer view of what a digital humanist is and of the work they do within the cultural sector. Solving practical problems, automating processes using code, and ultimately thinking about the implications of the methods used, in terms of costs and of the social or ethical issues that may arise, are all parts of the digital humanist’s identity.
In more detail, in the digitization project SweMPer, I had to think about the implications of fast digitization versus a slower, more detailed one, in relation to funding, available resources, and the project’s timespan. As for the Quantifying Culture project, I got a glimpse of the inner workings of creating an algorithm that classifies gender. I realized that the most important part of the process is not the algorithm itself but rather finding the right training data, removing biases, consulting experts regarding the classification of certain ethnic groups, and thinking about the implications of choosing a binary model to label the images.
The role of the digital humanist is not limited to working on the digitalization of culture as a coding humanist. The role becomes clearer when a project lacks social sensitivity, focuses on the data instead of the human factor, or removes diversity in favor of clear and uniform results. The digital humanist involved in such projects reintroduces those elements to the research and shifts the focus towards a more human and socially aware AI.
-Nikolaos Gkizis Chatziantoniou