By Professor Isto Huvila, Department of Archives, Libraries, and Museums, Uppsala University

Much of digital humanities research is highly data intensive. Especially in historical research, using and combining legacy data from a broad array of sources is key to getting enough data to work with. The importance of knowing what the data is has been acknowledged for some time, and a lot of work has gone into developing schemes for describing data, that is, for producing metadata. As the documentation of data has generally improved (albeit not always and not everywhere), it has become increasingly apparent that knowing what the data is is not enough. It is equally important to know how the data came about.

Imagine a data file with height measurements of walls ranging from 1.09 meters to 4.6 meters. Even if the file looks meticulously compiled, the data is not especially informative unless you know how the measurements were taken. Are the heights average heights or maximum heights? Were they measured with a tape or by some other means? Were they measured by one individual or by several? What was the purpose of the measurements? All of this has an impact on what the data is and how it can be used in the future.

A new research project based at the Department of ALM, CApturing Paradata for documenTing data creation and Use for the REsearch of the future (CAPTURE), will investigate what researchers need to know about the making and using of data, and how enough of that information can be captured to keep the data usable in the future. In contrast to metadata, which describes the data itself, data about the processes relating to data is usually called paradata. The major problem with capturing paradata lies in the practical impossibility of documenting and keeping everything, and in the difficulty of determining how to capture just enough.
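To make the distinction concrete, the wall-height example can be sketched as a small data structure in which each measurement optionally carries a record of how it was made. This is only an illustrative sketch, not a CAPTURE data model: the class names, the field names, the third height value, and the rule that two heights are comparable only when they share a documented statistic and instrument are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paradata:
    """How a measurement came about (all field names are hypothetical)."""
    statistic: str    # e.g. "maximum" or "average" height
    instrument: str   # e.g. "tape measure"
    measured_by: str  # who took the measurement
    purpose: str      # why the measurement was taken


@dataclass
class WallHeight:
    """One row of the imagined wall-height file."""
    wall_id: str
    height_m: float
    paradata: Optional[Paradata] = None  # None means the process is undocumented


def comparable(a: WallHeight, b: WallHeight) -> bool:
    """Assumed rule: two heights are comparable only if both document
    the same statistic and instrument; undocumented values never are."""
    if a.paradata is None or b.paradata is None:
        return False
    return (a.paradata.statistic == b.paradata.statistic
            and a.paradata.instrument == b.paradata.instrument)


# Illustrative values; 1.09 and 4.6 come from the example in the text.
documented = Paradata("maximum", "tape measure", "surveyor A", "site survey")
low = WallHeight("W1", 1.09, documented)
high = WallHeight("W2", 4.6, documented)
bare = WallHeight("W3", 2.5)  # made-up value with no paradata

print(comparable(low, high))  # both rows document the same process
print(comparable(low, bare))  # W3 says nothing about how it was measured
```

Even this toy version shows the trade-off: four fields already answer the questions raised above, but every additional field is something someone has to record in the field, which is the "just enough" problem in miniature.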

The empirical focus of CAPTURE is archaeological and cultural heritage data, which stands out for its extreme heterogeneity and its rapid accumulation, driven by the scale of ongoing development-led archaeological fieldwork. Within and beyond this specific context, CAPTURE develops an in-depth understanding of how paradata is being created and used today, elicits methods for capturing paradata, tests new methods in field trials, and synthesises the findings in a reference model. The model is intended to inform the capture of paradata and to enable data-intensive research that uses heterogeneous research data from diverse origins.

The principal investigator of CAPTURE is Professor Isto Huvila at the Department of ALM. The project is funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 818210).

Read more about CAPTURE here.

About the author

Isto Huvila is Professor in Information Studies at the Department of Archives, Libraries, and Museums at Uppsala University. Professor Huvila’s research interests include information and knowledge management, information work, knowledge organisation, documentation, and social and participatory information practices. The contexts of his research range from archaeology and cultural heritage, archives, libraries and museums to health information and e-health, social media, virtual worlds, and corporate and public organisations. For more info, please see here.