Bill Kretzschmar is Harry and Jane Willson Professor in Humanities at the University of Georgia and a visiting professor at the University of Oulu in Finland. For 34 years he edited the American Linguistic Atlas Project, the oldest national research project to survey how people speak differently across the country, which led to his preparation of American pronunciations for the online Oxford English Dictionary. He has been active in corpus linguistics, including work on tobacco industry documents. He has been influential in the development of digital methods for analysis and presentation of language variation, including applications of complexity science.


In May, Bill Kretzschmar visited Digital Humanities Uppsala in collaboration with the language GIS research network to deliver a speech for the DH Uppsala Seminar Series. Kretzschmar addressed the theme of sustainability in an institutional setting and proposed collaboration with the university library as the only realistic option for long-term sustainability – drawing upon his long experience at the University of Georgia (and the Digital Humanities Laboratory, or DigiLab, more specifically). A video recording from the seminar is available here (in progress).

We also used the occasion to conduct a short interview with Bill, in which he touches upon, among other things, the technological history of early DH (humanities computing) as experienced from his perspective, as well as the matter of sustainability. It is published below.


Could you talk to us about the transformation of The Linguistic Atlas Project from a printed publication to its early digital versions – especially in relation to the material and technological conditions that surrounded this process?

When I first started work on the Linguistic Atlas Project in 1977 (!), as a graduate assistant, the whole project that I could see was on paper. The team of graduate assistants was working on recopying some of the field records, and I was using a typewriter to prepare camera-ready copy of some of the field records for publication in the University of Chicago Press series of fascicles of the Atlas of the Middle and South Atlantic States. There were a few audio recordings on reel-to-reel tape, but we weren’t working with those at the time. Our very first step in making the paper records digital was a grant from the National Endowment for the Humanities to start a database of responses; at that point, in 1983, I made the decision to use PCs instead of the university mainframe, both as something we could control ourselves and because of the new availability of a Winchester hard drive (all 10 MB of its storage). After I took over the Atlas in 1984, we were still working with paper records, and my first task was to invent the digital technology necessary to prepare camera-ready copy for the fascicle series. So I learned about type founding and created phonetic fonts that we could see on the computer screen, using a special graphics board that let me design phonetic characters in “high ASCII” (codes 128-255) and print them out as dot designs on the newly available Hewlett Packard laser printer. About that time the University of Chicago Press cancelled our print publication contract, so these methods were just used to produce the camera-ready copy for our Middle and South Atlantic Handbook. I could also use the new publication system to make camera-ready copy for the Journal of English Linguistics, which I edited at the time and had printed privately until the mid-1990s.
Also in the late 1980s, I taught myself how to use the RBase database system because the programmer for our earlier NEH grant failed to get it to work, and I designed the database structure for the Atlas that we still use today. We got another NEH grant in the early 1990s to keyboard Atlas data – and found out that it was just too time-consuming and expensive to enter massive amounts of phonetic data – but we completed entry of about 15% of the data, and that was enough to launch the whole digital process. That digital data allowed all of our new developments with GIS: interactive GIS for that data, first on Macs and later on the Web, and then applications of technical geography like spatial autocorrelation, density estimation, and Kohonen self-organizing maps.

As your previous answer exemplifies, the particular needs and conditions for humanities infrastructures will always be in flux – and of course factors other than technology (institutional, political, financial) play an equally important role. What is your impression of current conditions within American and European academia?

When I started working with computers on humanities tasks in the 1980s, some people were using mainframes, but we decided to use separate PCs because we could control them ourselves and not have to wait for our low-priority programs to run in the middle of the night, and because we could get at least some mass storage. But my work with Linguistic Atlas data always pushed the limits of storage, and of processing once we started using statistics. We were noticed by the University of Georgia Computing and Networking Service when we tried to run statistics on their mainframe and used more computing time than they realized we needed. While personal computers will always be our choice for data entry and for writing and running small programs, we now have to have larger infrastructure for our large data sets and for Web distribution of our materials. This means cooperation with units in the university that manage bigger infrastructure than we can run in the office. It took a long time for us to realize that our natural partner was not the computing and networking unit at the university, but instead the university library. Many of our colleagues in the sciences need ultra-fast processing, and that is what our Georgia computing administration has provided. But what we need in the humanities is the ability to create interactive programs to store and present great masses of information, and the library is the unit of the university whose mission it is to do that. My grant resources have helped the library to create the infrastructure that we need for the Atlas, and the library has gone even further to make such infrastructure available to others in the humanities. In Europe the situation is very uneven. I have heard of impressive humanities computing networks in Germany and Norway. Not so much, yet, in the Nordic countries even though there are lots of great digital humanities projects in Finland, Norway, and Sweden.
Support for the digital humanities in England seems to have declined, for example with the demise of its digital humanities institutional organization and the removal of digital humanities from the Oxford Computing and Networking Service to the Bodleian Library. In Eastern and Southern Europe (perhaps with Italy as exception) the situation is much worse. The bottom line is that those of us in the humanities really have to have institutional partners for infrastructure. We cannot sit by ourselves with laptops in our ivory towers.

While cooperation is essential for building digital research infrastructures for the humanities, there is also always the difficult question of sustainability and maintenance. What happens after a project is finished – and what can we do to keep the digital resources sustained?

The end of projects is inevitable. All of our digital humanities developments begin with smart people who dream them up and find a way to implement them. But those smart people are usually not followed by people quite as smart, or at least not as interested in the aging projects as the original inventors. When projects lose their momentum there simply is no current way to keep them alive, even as not-working images instead of interactive programs. I have tried to plan for this on the Linguistic Atlas Project by creating a maintenance-free part of our site, the Data Download Center, a file structure from which users can download all of our data. The interactive elements of our Web site will eventually go dark when there is nobody to maintain them. Our partner the library cannot afford to pay people to maintain our site. That’s been my biggest job over the decades: not to invent the tools and sites, but to find money consistently to pay for their maintenance and development. While we do not have access to as much grant money as natural and physical scientists do, not to mention medical professionals, there has been enough money for me to keep the office open for decades. But when I retire, none of that money will be coming in. The best we can hope for is that maintenance-free portions of sites will still be available even after the fancier parts that need maintenance have failed. Maybe this is a gloomy prediction, but at the moment I do not have a better plan.