Spotlight: Unexpected Surprises in the Internet Archive

Image of data banks with the Internet Archive logo on them.

The Internet Archive.

For most of us, our introduction to the Internet Archive was the Wayback Machine, a search engine that can show you snapshots of websites from the past. It’s always fun to watch a popular webpage like Google evolve from November 1998 to July 2004 and beyond, but there is so much more that the Internet Archive has to offer. Today, I’m going to go through a few highlights from the Internet Archive’s treasure trove of material, just to get a glimpse at all of the things you can do and see on this amazing website.

Folksoundomy: A Library of Sound

Folksoundomy is the Internet Archive’s collection of sounds, music and speech. The collection is collaboratively created and tagged, with many participants from outside of the library sphere. There are more than a million items in the Folksoundomy collection, with items that date back to Thomas Edison’s invention of recorded sound in 1877. From Hip Hop Mixtapes to Russian audiobooks, sermons to stand-up comedy, music, podcasts, radio shows and more, Folksoundomy is an incredible resource for scholars looking at the history of recorded sound.

TV News Archive

With countless clips collected since 2009, the TV News Archive includes everything from this morning’s news to curated series of fact-checked clips. Special collections within the TV News Archive include Understanding 9/11, Political Ads, and the TV NSA Clip Library. With the ability to search closed captions from US TV news shows, the Internet Archive provides a unique research opportunity for those studying modern US media.

Software Library: MS-DOS Games

Ready to die of dysentery on The Oregon Trail again? Now is your chance! The Internet Archive’s MS-DOS Games Software Library uses an EM-DOSBOX in-browser emulator that lets you go through and play games that would otherwise seem lost to time. Relive your childhood memories or start researching trends in video games throughout the years with this incredible collection of playable games!

National Security Internet Archive (NSIA)

Created in March 2015, the NSIA collects files from muckraking and national security organizations, as well as historians and activists. With over 2 million files split into several collections, the NSIA helps collect everything from CIA Lessons Learned from Czechoslovakia to the UFO Files, a collection of declassified UFO files from around the world. Having these files accessible and together is incredibly helpful to researchers studying the history of national security, both in the US and elsewhere in the world.

University of Illinois at Urbana-Champaign

That’s right! We’re also on the Internet Archive. The U of I adds content in several areas: Illinois history, culture and natural resources; US railroad history; rural studies and agriculture; works in translation; as well as 19th century “triple-decker” novels and emblem books. Click on the above link to see what your alma mater is contributing to the Internet Archive today!

Of course, this is nowhere near everything! With Classic TV Commercials, Grateful Dead, Community Software and more, it’s definitely worth your time to see what on the Internet Archive will help you!

Public Domain and Creativity

This post was guest authored by Scholarly Communication & Publishing Graduate Assistant Nicole Moriah Rhodes.


The first American copyright law protected works for fourteen years after they were published and gave the copyright owner the opportunity to renew the copyright for another fourteen years. Few did, and works passed quickly into the public domain.

The copyright term is much longer now–it varies, but you, a human, will likely own many copyrights until 70 years after you die. Some people argue that a long copyright term increases the incentive to make creative work.

However, despite the longer term, statistical analysis of the number of copyright registrations through changes in population, economy, US law, and available technology doesn’t find that increasing copyright protection increases the number of copyrighted works. Raymond Shih Ray Ku, Jiayang Sun, & Yiying Fan (2009) find that the people advocating for broader copyright laws probably aren’t advocating for an increase in the amount of creative work: the best indicator of the number of new creative works among the variables in their study is population. Their data suggest that “Laws that reduce or otherwise limit copyright protection are actually more likely to increase the number of new works” (1673) than laws granting more protection.

Such a long period of copyright protection leaves a lot of content unusable to other creators. This comic about documentary filmmakers demonstrates how stringent copyright protections can prevent creative remixing and impede the accurate representation of the world. Work in the public domain can be shared freely, but our real lives are full of content protected by copyright, and people trying to make documentaries can be inhibited by copyright even on incidental work. When they want to use copyrighted material under the fair use doctrine, the threat of lawsuits can have a chilling effect.

Lawrence Lessig (2004) uses the phrase “Walt Disney creativity” to describe “a form of expression and genius that builds upon the culture around us and makes it something different” (24). Disney’s Cinderella, Disney’s live-action Cinderella, fanfiction, and The Lizzie Bennet Diaries could all be considered examples of Walt Disney creativity. But Disney had access to fairly recent work in his time. As Lessig writes:

“Thus, most of the content from the nineteenth century was free for Disney to use and build upon in 1928. It was free for anyone— whether connected or not, whether rich or not, whether approved or not—to use and build upon.

“From 1790 until 1978, the average copyright term was never more than thirty-two years, meaning that most culture just a generation and a half old was free for anyone to build upon without the permission of anyone else. Today’s equivalent would be for creative work from the 1960s and 1970s to now be free for the next Walt Disney to build upon without permission. Yet today, the public domain is presumptive only for content from before the Great Depression.” (24-25)

Michael Hart, the creator of Project Gutenberg and a longtime Urbana resident, viewed copyright law as impeding the abundance that technology could create, beginning with the very first copyright laws after the invention of the Gutenberg Press. While Ku, Sun, & Fan (2009) do find that copyright law helps create and protect both wealth and jobs and allows creators to be rewarded for their work rather than requiring sponsorship, they advocate for reducing copyright protection where it impedes distribution or creativity.

“Because copyright law works in the negative—effectively saying ‘do not use this work, do not copy this work, do not imitate this work’—we are not sending a message that society values the creation of new works. We are only sending the message that we should stay away from those works already created” (1722).

Creative Commons is one venture designed to allow creators to share their work for other creators’ use while preserving the level of protection they choose. However, the default is still a system that restricts access to cultural works past the time when the creator might care, and can even keep works from being preserved so they will be usable when they enter the public domain. Creators should be able to benefit from the work they create, but increasing protections does not necessarily increase those benefits. Excessive copyright terms keep us from being able to discuss and rethink our common culture.

Copyright as a Tool for Censorship

This post was guest authored by Scholarly Communication & Publishing Graduate Assistant Nicole Moriah Rhodes.


Copyright should be used to encourage speech and not to silence it. The stories below demonstrate that copyright can be used to limit the rights of technology users and censor criticism.

“In practical terms, the DMCA legalized technical controls on access to electronic works; it renders obsolete traditional rules for reading and sharing print materials and, simultaneously, enables content owners to implement a pay-per-use system that controls who has access, when, how much and from where. So, for instance, you can lend a paperback to friends, but you aren’t allowed to do the same thing with an electronic book.”

“The database shows that Ares Rights has filed at least 186 complaints since 2011, with 87 made on behalf of politicians, political parties, state media, and state agencies in the Americas.” (CPJ)

“They were received by political commentators who used images of Correa, transmitted on Ecuadoran public television, in videos uploaded to YouTube, in order to make visible the resistance of local communities to the onslaught of mining communities in the country’s inland provinces. The same thing happened with videos that used stock footage to illustrate the inconsistencies of the President’s statements together with videos of protests against the exploitation of Yasuní national park, and images of repression against students.” (Derechos Digitales)

  • Electronic Frontier Foundation: To be eligible under the DMCA’s safe harbor provisions, companies must comply with legitimate takedown notices. But many hosts end up taking down content that can be legally shared. Copyright takedown notices can be used to hassle critics. Punishing bogus claims is difficult, and the damages for failing to comply can be severe.

“According to the latest numbers, Twitter does not comply with nearly 1 in 4 takedown notices it receives; Wikimedia complies with less than half; and WordPress complies with less than two-thirds. Each organization explains in its report that the notices with which they don’t comply are either incomplete or abusive.”

Closed Doors or Open Access?: Envisioning the Future of the United States Copyright Office

Copyright Librarian Sara Benson

It’s Copyright Week! For today’s theme of “transparency”, Copyright Librarian Sara Benson discusses her thoughts on the Copyright Office activities to review Section 108.


In 2005, the Copyright Office, under the guidance of the Register of Copyrights at the time, Marybeth Peters, called for a Study Group to convene and review possible amendments to Section 108. A follow-up meeting was held in 2012. These meetings were not unusual, but what followed them was both strange and unsettling.

The procedures after the Study Group, which took place in the summer of 2016 under the guidance of Maria Pallante, were unusual in that they took place in face-to-face meetings between concerned citizens and members of the Copyright Office rather than in a call for online communications between citizens and the Office. On the one hand, this gave the members of the Office a chance to engage in a dialogue with the concerned citizens. On the other, it meant that generally only those with the resources to travel to Washington, D.C. were privileged with the ability to engage with the members of the Office. However, the Office did note that it would engage in telephone conversations, if necessary. In any event, none of these conversations were ever made public.

At that time, it seemed that the Copyright Office was making an intentional move away from a public debate about copyright to a cloistered room with a privileged few. In my view, that move was undemocratic and should be discouraged in the future. Indeed, although the Copyright Office did publish a list of individuals and organizations it met with to discuss Section 108, the actual subject and content of those discussions remain a mystery.

Notably, shortly after taking office as the new Librarian of Congress, Dr. Carla Hayden removed Maria Pallante from her position as Register of Copyrights. Does this signal a move away from the process that was undertaken to review Section 108? Likely it does, as Librarian of Congress Dr. Hayden has recently taken further steps towards listening to the views of the multitude by openly polling the public about what we would like to see in the next Register of Copyrights.

This is an exciting time to engage with the Copyright Office under Dr. Hayden’s leadership. I encourage everyone reading this essay to add your voice to the ongoing discussions about the changes to the Office, including the selection of the new Register of Copyrights and beyond.

Meet Dan Tracy, Information Sciences and Digital Humanities Librarian

This latest installment of our series of interviews with Scholarly Commons experts and affiliates features Dan Tracy, Information Sciences and Digital Humanities Librarian.


What is your background and work experience?

I originally come from a humanities background and completed a PhD in literature specializing in 20th century American literature, followed by teaching as a lecturer for two years. I had worked a lot with librarians during that time with my research and teaching. When you’re a PhD student in English, you teach a lot of rhetoric, and I also taught some literature classes. As a rhetoric instructor I worked closely with the Undergraduate Library’s instruction services, which exposed me to the work librarians do with instruction.

Then I did a Master’s in Library and Information Science here, knowing that I was interested in being an academic librarian, probably something in the area of being a subject librarian in the humanities. And then I began this job about five years ago. So I’ve been here about five years now in this role. And just began doing Digital Humanities over the summer. I had previously done some liaison work related to digital humanities, especially related to digital publishing, and I had been doing some research related to user experience and digital publishing as related to DH publishing tools.

What led you to this field?

A number of things. One was having known quite a number of people who went into librarianship who really liked it and talked about their work. Another was my experience working with librarians in terms of their instruction capacity. I was interested in working in an academic environment and I was interested in academic librarianship and teaching. And also, especially as things evolved, after I went back for the degree in library and information science, I also found a lot of other things to be interested in as well, including things like digital humanities and data issues.

What is your research agenda?

My research looks at user experience in digital publishing, primarily in the context of both ebook formats and newer experimental forms of publication such as web and multi-modal publishing with tools like Scalar, especially from the reader side, but also from the creator side of these platforms.

Do you have any favorite work-related duties?

As I mentioned before, instruction was an initial draw to librarianship. I like anytime I can teach and work with students, or faculty for that matter, and help them learn new things. That would probably be a top thing. And I think increasingly the chances I get to work with digital collections issues as well. I think there’s a lot of exciting work to do there in terms of delivering our digital collections to scholars to complete both traditional and new forms of research projects.

What are some of your favorite underutilized resources that you would recommend to researchers?

I think there’s a lot. I think researchers are already aware of digital primary sources in general, but I do think there’s a lot more for people to explore in terms of collections we’ve digitized and things we can do with those through our digital library, and through other digital library platforms, like DPLA (Digital Public Library of America).

I think that a lot of our digital image collections are especially underutilized. I think people are more aware that we have digitized text sources, but not aware of our digitized primary sources that are images that have value as research objects, including for computational analysis. We also have more and more access to the text data behind our various vendor platforms, which is a resource various researchers on campus increasingly need but don’t always know is available.

If you could recommend one book to beginning researchers in your field, what would you recommend?

If you’re just getting started, I think a good place to look is at the Debates in the Digital Humanities books, which are collections of essays that touch on a variety of critical issues in digital humanities research and teaching. This is a good place to start if you want to get a taste of the ongoing debates and issues. There are open access copies of them available online, so they are easy to get to.

Dan Tracy can be reached at dtracy@illinois.edu.