What Are the Digital Humanities?

Introduction

As new technology has revolutionized the ways all fields gather information, scholars have integrated digital software to enhance traditional models of research. While digital software may seem relevant only to scientific research, digital projects play a crucial role in disciplines not traditionally associated with computer science. One of the biggest digital initiatives actually takes place in fields such as English, History, and Philosophy, in what is known as the digital humanities. The digital humanities are an innovative way to incorporate digital data and computer science into humanities-based research. Although some aspects of the digital humanities are exclusive to specific fields, most digital humanities projects are interdisciplinary in nature. Below are three general ways in which digital humanities projects have enhanced approaches to humanities research for scholars in these fields.

Digital Access to Resources

Digital access means taking items necessary for humanities research and creating a system where users can easily reach them. This work involves digitizing physical items and formatting them for storage in a database that permits access to its contents. Since some of these databases may hold thousands or millions of items, digital humanists also work to find ways for users to locate specific items quickly and easily. Thus, digital access requires both digitizing physical items and storing them in a database, as well as creating a path for scholars to find them for research purposes.

Providing Tools to Enhance Interpretation of Data and Sources

The digital humanities can also change how we interpret sources and other research materials. Data visualization software, for example, helps simplify large, complex datasets and presents the data in more visually appealing ways. Likewise, text mining software uncovers trends by analyzing text, potentially saving digital humanists the hours or even days that the same analysis would take by analog methods. Finally, Geographic Information Systems (GIS) software allows users working on humanities projects to create special types of maps that can assist in both visualizing and analyzing data. These software programs and more have dramatically transformed the ways digital humanists interpret and visualize their research.
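As a rough illustration of the simplest kind of text mining, the sketch below counts the most frequent words in a passage. The stop-word list, the function name `top_terms`, and the sample sentence are all made up for this example; real projects typically rely on dedicated tools and much larger word lists.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list used only for this illustration.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "that", "for"}

def top_terms(text, n=2):
    """Return the n most frequent words in text, ignoring stop words and case."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample text standing in for a digitized source.
sample = (
    "The archive holds letters. The letters describe the harvest, "
    "and the harvest letters mention the river."
)
print(top_terms(sample))  # [('letters', 3), ('harvest', 2)]
```

Even a count this simple hints at what a collection is about; doing the same by hand across thousands of documents is where the hours or days would go.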

Digital Publishing

The digital humanities have opened new opportunities for scholars to publish their work. In some cases, digital publishing is simply digitizing an article or item in print to expand the reach of a given publication to readers who may not have direct access to the physical version. In other cases, digital publishing initiatives publish research that is only accessible in a digital format. One benefit of digital publishing is that it opens more opportunities for scholars to publish their research and expands the audience for that research beyond what print alone can reach. As a result, the digital humanities provide scholars more opportunities to publish their research while also expanding the reach of their publications.

How Can I Learn More About the Digital Humanities?

There are many ways to get involved, both at the University of Illinois and around the globe. Here are a few examples that can help you get started on your own digital humanities project:

  • HathiTrust is a partnership through the Big Ten Academic Alliance that holds over 17 million items in its collection.
  • Internet Archive is a public, multimedia database that allows for open access to a wide range of materials.
  • The Scholarly Commons page on the digital humanities offers many of the tools used for data visualization, text mining, GIS software, and other resources that enhance analysis within a humanities project. There are also a couple of upcoming Savvy Researcher workshops that will go over how to use software common in the digital humanities.
  • Sourcelab is an initiative through the History Department that works to publish and preserve digital history projects. Many other humanities fields have equivalents to Sourcelab that serve the specific needs of a given discipline.

Big Ten Academic Alliance Open Access Developments

Last month, the Big Ten Academic Alliance (BTAA) made a series of announcements regarding its support of Open Access (OA) initiatives across its member libraries. Open Access is the free, immediate, online availability of research articles coupled with the rights to use those articles fully in the digital environment. Put plainly, Open Access ensures that anyone, anywhere, can access and use information. By supporting these developments in OA, the BTAA aims to make information more accessible to the university community, to benefit scholars by eliminating paywalls to research, and to help researchers to publish their own work.


On July 19, the BTAA announced the finalization of a three-year collective agreement with the Open Library of Humanities (OLH), a charitable organization dedicated to publishing open access scholarship with no author-facing article processing charges. OLH publishes academic journals from across the humanities disciplines, as well as hosting its own multidisciplinary journal. This move was made possible thanks to the OLH Open Consortial Offer, an initiative that offers consortia, societies, networks and scholarly projects the opportunity to join the Open Library of Humanities Library Partnership Subsidy system as a bloc, enabling each institution to benefit from a discount. Through this agreement, the BTAA hopes to expand scholarly publishing opportunities available to its member libraries, including the University of Illinois.

Following the finalization of the OLH agreement, the BTAA announced on July 21 the finalization of a three-year collective action agreement with MIT Press that provides Direct to Open (D2O) access for all fifteen BTAA member libraries. Developed over two years with the support of the Arcadia Fund, D2O gives institutions the opportunity to harness collective action to support access to knowledge. As participating libraries, the Big Ten members will help open access to all new MIT Press scholarly monographs and edited collections from 2022. As a BTAA member, the University of Illinois will support shifting the publication of new MIT Press titles to open access. The agreement also gives the University of Illinois community access to MIT Press eBook backfiles that were not previously published open access.

By entering into these agreements, the BTAA aims to promote open access publishing across its member libraries. On how these initiatives will impact the University of Illinois scholarly community, Head of Scholarly Communication & Publishing Librarian Dan Tracy said:

“The Library’s support of OLH and MIT Press is a crucial investment in open access publishing infrastructure. The expansion of open access publishing is a great opportunity to increase the reach and impact of faculty research, but common models of funding open access through article processing charges makes it challenging for authors in the humanities and social sciences particularly to publish open access. The work of OLH to publish open access journals, and MIT Press to publish open access books, without any author fees while also providing high quality, peer reviewed scholarly publishing opportunities provides greater equity across disciplines.”

Since these announcements, the BTAA has continued to support open access initiatives among its member libraries. Most recently, the BTAA and the University of Michigan Press signed a three-year agreement on August 5 that provides multi-year support for the University of Michigan Press’ new open access model Fund to Mission. Based on principles of equity, justice, inclusion, and accessibility, Fund to Mission aims to transition upwards of 75% of the press’ monograph publications into open access resources by the end of 2023. This initiative demonstrates a move toward a more open, sustainable infrastructure for the humanities and social sciences, and is one of several programs that university presses are developing to expand the reach of their specialist publications. As part of this agreement, select BTAA members, University of Illinois included, will have greater access to significant portions of the University of Michigan’s backlist content.

The full release and more information about recent BTAA announcements can be found on the BTAA website. To learn more about Open Access efforts at the University of Illinois, visit our OA Guide.

Comparison: Human vs. Computer Transcription of an “It Takes a Campus” Episode

Providing transcripts of audio or video content is critical for making these experiences accessible to a wide variety of audiences, especially those who are deaf or hard of hearing. Even those with perfect hearing might sometimes prefer to skim a transcript rather than listen to audio. However, the slowest part of the audio and video publishing process is often the transcription step. This was certainly true with the recent interview I did with Ted Underwood, which I conducted on March 2 but did not release until March 31. The majority of that time was spent transcribing the interview; editing and quality control were significantly less time consuming.

Theoretically, one way we could speed up this process is to have computers do it for us. Over the years I’ve had many people ask me whether automatic speech-to-text transcription is a viable alternative to human transcription in dealing with oral history or podcast transcription. The short answer to that question is: “sort of, but not really.”

Speech-to-text, or speech recognition, technology has come a long way, particularly in recent years. Its performance has improved to the point where human users can give spoken commands to a virtual assistant such as Alexa, Siri, or Google Home, and the device usually gives an appropriate response to the person’s request. However, recognizing a simple command like “Remind me at 5 pm to transcribe the podcast” is not quite the same as correctly recognizing and transcribing a 30-minute interview, where the software has to handle differences between two speakers and lengthy blocks of speech.

To see how good of a job the best speech recognition tools do today, I decided to have one of these tools attempt to transcribe the Ted Underwood podcast interview and compare it to the actual transcript I did by hand. The specific tool I selected was Amazon Transcribe, which is part of the Amazon Web Services (AWS) suite of tools. This service is considered one of the best options available and uses cloud computing to convert audio data to textual data, presumably like how Amazon’s Alexa works.

It’s important to note that Amazon Transcribe is not free. However, it costs only $0.0004 per second of audio, so Ted Underwood’s interview cost me only 85 cents to transcribe. For more on Amazon Transcribe’s costs, see this page.
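For anyone budgeting a larger batch of recordings, the arithmetic behind that figure can be sketched as below. The rate is the one quoted in this post and may have changed since. The function names are made up for this example, and the calculation ignores any rounding or minimum charges that AWS may apply.

```python
# Per-second rate quoted in this post; current AWS pricing may differ.
RATE_PER_SECOND = 0.0004  # US dollars

def transcription_cost(duration_seconds, rate=RATE_PER_SECOND):
    """Estimate the cost in dollars of transcribing audio of the given length."""
    return duration_seconds * rate

def implied_duration(cost_dollars, rate=RATE_PER_SECOND):
    """Work backwards from a bill to the length of the audio in seconds."""
    return cost_dollars / rate

# The 85-cent bill implies roughly 35 minutes of audio.
seconds = implied_duration(0.85)
print(f"{seconds:.0f} seconds, about {seconds / 60:.1f} minutes")

# Estimated cost of a hypothetical 30-minute recording.
print(f"${transcription_cost(30 * 60):.2f} for a 30-minute recording")
```

At this rate, an hour of audio comes to about $1.44, so the cost only becomes significant for very large collections.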

In any case, here is a comparison between my manual transcript vs. Amazon Transcribe. To begin, here is the intro to the podcast as spoken and later transcribed by me:

Ben Ostermeier: Hello and welcome back to another episode of “It Takes
a Campus.” My name is Ben, and I am currently a graduate assistant at
the Scholarly Commons, and today I am joined with Dr. Ted Underwood,
who is a professor at the iSchool here at the University of Illinois.
Dr. Underwood, welcome to the podcast and thank you for taking time
to talk to me today.

And here is Amazon Transcribe’s interpretation of that same section of audio, with changes highlighted:

Hello and welcome back to another episode of it takes a campus. My
name is Ben, and I am currently a graduate assistant at Scali Commons.
And today I'm joined with Dr Ted Underwood, who is a professor 
at the high school here at the University of Illinois. 
Dr. Underwood, welcome to the podcast. Thank you for taking 
time to talk to me today.

As you can see, Amazon Transcribe did a pretty good job, but there are some mistakes and changes from the transcript I wrote by hand. It particularly had trouble with proper nouns like “Scholarly Commons” and “iSchool,” along with some minor issues like not putting a dot after “Dr” and missing an “and” conjunction in the last sentence.

Screenshot of text comparison between Amazon-generated (left) and human-generated (right) transcripts of the podcast episode.

You can see the complete changes between the two transcripts at this link.

Please note that the raw text I received from Amazon Transcribe was not separated into paragraphs initially. I had to do that myself in order to make the comparison easier to see.
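If you would like to run a similar comparison yourself, one possible approach is a word-level diff using Python’s built-in difflib module. This is only a sketch and not necessarily how the comparison above was produced. The helper name `word_changes` is made up for this example, and the two strings are excerpts from the intro quoted earlier.

```python
import difflib

# Excerpts of the intro: hand-made transcript vs. Amazon Transcribe output.
human = (
    "I am currently a graduate assistant at the Scholarly Commons, "
    "and today I am joined with Dr. Ted Underwood, who is a professor "
    "at the iSchool here at the University of Illinois."
)
machine = (
    "I am currently a graduate assistant at Scali Commons. "
    "And today I'm joined with Dr Ted Underwood, who is a professor "
    "at the high school here at the University of Illinois."
)

def word_changes(reference, candidate):
    """List (operation, reference words, candidate words) for each difference."""
    ref_words, cand_words = reference.split(), candidate.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=cand_words, autojunk=False)
    return [
        (tag, " ".join(ref_words[i1:i2]), " ".join(cand_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Print only the spans where the two transcripts disagree.
for tag, ref, cand in word_changes(human, machine):
    print(f"{tag:8} {ref!r} -> {cand!r}")
```

Because the comparison splits on whitespace, punctuation differences such as “Dr.” versus “Dr” show up as changed words, which is useful when checking a transcript closely.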

In general, Amazon Transcribe does a pretty good job in recognizing speech but makes a decent number of mistakes that require cleaning up afterwards. For me, I actually find it faster and less frustrating to transcribe by hand instead of correcting a ‘dirty’ transcript, but others may prefer the alternative. Additionally, in some cases an institution may have a very large number of untranscribed oral histories, for example, and if the choice is between having a dirty transcript vs. no transcript at all, a dirty transcript is naturally preferable.

Also, while I did not have time to do this, there are ways to train Amazon Transcribe to do a better job with your audio, particularly with proper nouns like “Scholarly Commons.” You can read more about it on the AWS blog.

That said, there is very much an art to transcription, and I’m not sure if computers will ever be able to totally replicate it. When transcribing, I often have to make judgement calls about whether to include aspects of speech like “um”s and “uh”s. People also tend to start a thought and then stop and say something else, so I have to decide whether to include “false starts” like these or not. All of these judgement calls can have a significant impact on how researchers interpret a text, and to me it is crucial that a human sensitive to their implications makes these decisions. This is especially critical when transcribing an oral history that involves a power imbalance between the interviewer and interviewee.

In any case, speech-to-text technology is becoming increasingly powerful, and there may come a day, perhaps very soon, when computers can do just as good of a job as humans. In the meantime, though, we will still need to rely on at least some human input to make sure transcripts are accurate.

Meet our Graduate Assistants: Ben Ostermeier

What is your background education and work experience?

I graduated from Southern Illinois University Edwardsville with a Bachelor of Arts in History, with a minor in Computer Science. I was also the first SIUE student to receive an additional minor in Digital Humanities and Social Sciences. In undergrad I worked on a variety of digital humanities projects with the IRIS Center for the digital humanities, and after graduating I was hired as the technician for the IRIS Center. In that role, I was responsible for supporting the technical needs of digital humanities projects affiliated with the IRIS Center and provided guidance to professors and students starting their own digital scholarship projects.

What led you to your field?

I have been drawn to applied humanities, particularly history, since high school, and I have long enjoyed tinkering with software and making information available online. When I was young this usually manifested in reading and writing information on fan wikis. More recently, I have particularly enjoyed working on digital archives that focus on local community history, such as the SIUE Madison Historical project at madison-historical.siue.edu.

What are your favorite projects you’ve worked on?

While working for the Scholarly Commons, I have had the opportunity to work with my fellow graduate assistant Mallory Untch to publish our new podcast, It Takes a Campus, on iTunes and other popular podcast libraries. Recently, I recorded and published an episode with Dr. Ted Underwood. Mallory and I also created an interactive timeline showcasing the history of the Scholarly Commons for the unit’s tenth anniversary last fall.

What are some of your favorite underutilized Scholarly Commons resources that you would recommend?

We offer consultations to patrons looking for in-depth assistance with their digital scholarship. You can request a consultation through our online form!

When you graduate, what would your ideal job position look like?

I would love to work as a Digital Archivist in some form, responsible for ensuring the long-term preservation of digital artifacts and for finding the best way to make these objects accessible to users. It is especially important to me that these digital spaces relate to and are accessible to the people and cultures represented in the items, so I hope I am able to make these sorts of community connections wherever I end up working.

Meet Our Graduate Assistants: Sarah Appedu

In this interview series we ask our graduate assistants questions for our readers to get to know them better. Our first interview this year is with Sarah Appedu!

What is your background education and work experience?

Before attending graduate school, I worked as the Scholarly Communications Assistant in the academic library of a small liberal arts college. My work included overseeing the institutional repository, working with undergraduate journal editors, and assisting in our efforts to address the high cost of course materials through the promotion of open educational resources. This work inspired me to get my M.S. LIS and sparked my interest in pedagogy, open access publishing, digital scholarship, and copyright. My undergraduate background is in Philosophy and Women, Gender, & Sexuality studies, and I enjoy utilizing my critical thinking skills and love of theory to inform and improve my library practice.

What led you to your field?

It was actually a complete accident! After graduating from undergrad, I found myself interviewing for a temporary Administrative Assistant position at the college library. I had never considered working in a library before, but I quickly realized that many of my skills and interests are compatible with library work. I especially enjoyed the service-oriented nature of libraries and the desire to improve communities. My interest in social justice was welcomed in my position and it wasn’t long before I realized that I may have found my career path!

What are your research interests?

I’m developing an interest in the ways in which technology impacts our ability to seek and evaluate information, particularly in the context of algorithmic bias and surveillance capitalism. I am currently involved in organizing a reading group about artificial intelligence and information seeking behavior, and it is helping expand my conception of how libraries can serve their communities. I think libraries can have an even more prevalent role in educating students and others about the ways in which platforms like Google manipulate what we see online, and I’m looking forward to continuing to investigate this topic.

What are some of your favorite underutilized Scholarly Commons resources that you would recommend?

Our Ask a Librarian chat service! The Scholarly Commons is on chat from 10am-2pm Monday-Friday every week and we are available to answer your questions. Feel free to write us about data analysis support, GIS needs, copyright, software, and more!

When you graduate, what would your ideal job position look like?

I’m starting to see the position of Student Success Librarian pop up, and I love the idea of having a job like that. Everything I do in the library always seems to come back to my interest in teaching students and working to make sure all students have the opportunity to succeed, particularly students who traditionally have been excluded from library support and services.