Lightning Review: Text Analysis with R for Students of Literature

Cover of Text Analysis with R book

My undergraduate degree is in Classical Humanities and French, and like many humanities and liberal arts students, I mostly used computers for accessing Oxford Reference Online and double-checking that “bonjour” meant “hello” before turning in term papers. Actual critical analysis of literature came from my mind and my research, and nothing else. Recently, scholars in the humanities began seeing the potential of computational methods for their study, and termed these methods “digital humanities.” Computational text analysis provides insights that, in many cases, a human reader could not realistically produce alone. When was the last time you read 100 books to count occurrences of a certain word, or looked at thousands of documents to group their contents by topic? In Text Analysis with R for Students of Literature, Matthew Jockers presents programming concepts specifically as they relate to literature study, with plenty of help to make even the most technophobic English student a digital humanist.

Jockers’ book caters to the beginning coder. You download practice text from his website that is already formatted for use in the tutorials, and he doesn’t dwell too much on pounding programming concepts into your head. I came into this text having already taken a course on Python, where we edited text and completed exercises similar to the ones in this book, but even a complete beginner would find Jockers’ explanations perfect for diving into computational text analysis. Some advanced statistical concepts are presented, which may turn away those less mathematically inclined, but these are mentioned only to further understanding of what R does in the background, and can be left to the computer scientists. Practice-based and easy to get through, Text Analysis with R for Students of Literature serves its primary purpose of bringing the possibilities of programming to those used to traditional literature research methods.

Ready to start using a computer to study literature? Visit the Scholarly Commons to view the physical book, or download the eBook through the Illinois library.

Exploring Data Visualization #5 – R edition

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

This month, I wanted to share some resources specifically for learning to visualize data using R.

1) R is a free, open source programming language that is heavily used for statistical analysis, but has also expanded to encompass nearly any kind of data analysis you would want to do. In the Scholarly Commons, we have R and RStudio (a user-friendly R development environment) installed on all of our lab computers. RStudio’s website provides links to a lot of ways for you to get started with R.

2) R guru Hadley Wickham gave a public lecture at the University of Notre Dame last August. (Note that his talk starts about 37 minutes into the video.) In the lecture, he walks through a simple example of the iterative process of data visualization in R, and gives additional related advice for doing data science. You can learn from his lecture without knowing any R, but you will find it easier to understand if you have basic experience with programming in general.

3) If you want a book to help you learn more in depth, Wickham and a colleague wrote R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. You can read R for Data Science online, or you can come in to the Scholarly Commons to read the physical book while practicing on one of our lab computers.

4) You can also find a number of specific R courses at Lynda.com, such as “Data Visualization in R with ggplot2.” Just make sure to log in with your U of I credentials so you can access the courses for free.

I hope you enjoyed this data visualization news! If you have any data visualization questions, please feel free to email me and set up an appointment at the Scholarly Commons.

New Digital Humanities Books in the Scholarly Commons!

Is there anything quite as satisfying as a new book? We just got a new shipment of books here in the Scholarly Commons that complement all our services, including digital humanities. Our books are non-circulating, so you cannot check them out, but these DH books are always available for your perusal in our space.

Stack of books in the Scholarly Commons

Two brand new and two mostly new DH books

Digital Humanities: Knowledge and Critique in a Digital Age by David M. Berry and Anders Fagerjord

Two media studies scholars examine the history and future of digital humanities. DH is a relatively new field, and one that is still not clearly defined. Berry and Fagerjord take a deep dive into the methods that digital humanists gravitate towards, and critique their use in relation to the broader cultural context. They are more critical of the “digital” than the “humanities,” meaning they consider how the use of digital tools affects society as a whole (there’s that media studies!) more than how scholars use digital methods in humanities work. They caution against using digital tools just because they are “better,” and instead encourage readers to examine their own role in the DH field and contribute to its ongoing growth. Berry has previously edited Understanding Digital Humanities (eBook available through Illinois library), which discusses similar issues. For a theoretical understanding of digital humanities, and to examine the issues in the field, read Digital Humanities.

Text Mining with R: A Tidy Approach by Julia Silge and David Robinson

Working with data can be messy, and text is even messier. It never behaves how you expect it to, so approaching text analysis in a “tidy” manner is crucial. In Text Mining with R, Silge and Robinson present their tidytext framework for R, and instruct the reader in applying this package to natural language processing (NLP). NLP can be applied to derive meaning from unstructured text by way of unsupervised machine learning (wherein you train the computer to organize or otherwise analyze your text and then you go get coffee while it does all the work). This book is most helpful for those with programming experience, but no knowledge of text mining or natural language processing is required. With practical examples and easy-to-follow, step-by-step guides, Text Mining with R serves as an excellent introduction to tidying text for use in sentiment analysis, topic modeling, and classification.
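If you are wondering what “tidy” text looks like in practice, here is a minimal sketch of the idea. It is written in Python rather than R, and it is not the tidytext API: the function names and the two sample documents are invented for illustration. The point is only the layout, with one row per token, which makes counting words per document a simple tally.

```python
import re
from collections import Counter


def tidy_tokens(docs):
    """Return one (doc_id, word) row per token (a one-token-per-row layout).

    `docs` is a hypothetical dict mapping a document id to its raw text.
    """
    rows = []
    for doc_id, text in docs.items():
        # Lowercase the text and keep runs of letters/apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            rows.append((doc_id, word))
    return rows


def word_counts(rows):
    """Tally how often each (doc_id, word) pair occurs."""
    return Counter(rows)


# Two made-up sample documents, purely for illustration.
docs = {
    "doc_a": "The whale. The sea and the whale.",
    "doc_b": "A quiet sea.",
}

rows = tidy_tokens(docs)
counts = word_counts(rows)

# Show the three most frequent (document, word) pairs.
for (doc_id, word), n in counts.most_common(3):
    print(doc_id, word, n)
```

Once text is in this shape, filtering out common words or comparing word frequencies between documents becomes a matter of ordinary table operations, which is the kind of workflow the book builds on.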

No programming or R experience? Try some of our other books, like R Cookbook for an in-depth introduction, or Text Analysis with R for Students of Literature for a step-by-step learning experience focused on humanities people.

Visit us in the Scholarly Commons, 306 Main Library, to read some of our new books. Summer hours are Monday through Friday, 10 AM-5 PM. Hope to see you soon!

Introducing Clay Alsup, Scholarly Commons Intern

This latest installment of our series of interviews with Scholarly Commons experts and affiliates features Clay Alsup, Scholarly Commons Intern. Clay started working at the Scholarly Commons in August 2017.


What is your background education and work experience?

I started out in community college, and got my BA at Salisbury University on the Eastern Shore of Maryland, majoring in Philosophy. After that, I received an MA in Philosophy from Louisiana State University. I did a year in AmeriCorps working for Hospitality Homes, a wonderful not-for-profit in Boston that provides free volunteer host housing for people who travel to Boston for medical care. I spent another year in Boston working at Eureka!, a puzzle and game store, which was an enormous amount of fun. I came to the University of Illinois in 2012 to enter the PhD program in Philosophy.

What led you to this field?

I’d say that I didn’t really have a choice; I was never all that good at anything else, and I was always quite good at philosophy. Neither of my parents were surprised with where I ended up. I love reading and teaching philosophy and just never really get tired of it.

What is your research agenda?

My project has to do with what conspiracy theories are and why we find them so compelling. In philosophy, there’s a small literature on what a conspiracy theory is and whether anything is wrong with them. In most of the work, however, very little (if any) empirical evidence is given for why a particular definition is settled upon, and little focus is given to what is so compelling about them. In psychological and sociological literature, a definition of “conspiracy theory” is normally imposed without much questioning, though lots of work is done on why people believe them. My hope is to use empirical evidence to support a particular definition of conspiracy theory (or distinguish between different types), so that a more satisfactory account can be given of their epistemic adequacy and psychological appeal. In order to do this, I am working through a comprehensive collection of conspiracy theories with a couple of undergraduates and determining what features tend to be present. In addition, I will be analyzing the text using various tools in R to uncover other, less obvious features of conspiracy theories and work out which are most typical.

Do you have any favorite work-related duties?

I just started, so I haven’t had an opportunity to do a whole lot yet. Working on my project is one of my duties, though, and I’m enjoying that very much!

What are some of your favorite underutilized resources that you would recommend to researchers?

I’m not sure I have much to say about this. Certainly, I think that philosophers often make unsupported empirical claims in support of their arguments, and instead of waiting for a study to be conducted on the matter, it’s not a bad idea for philosophers to learn about experimental design and pursue the question themselves.

If you could recommend only one book to beginning researchers in your field, what would you recommend?

Yikes! I don’t think there’s any hope of picking a single philosophy text. For an example of an endlessly dissatisfied intellect that is always looking for a deeper understanding of various phenomena, it’s hard to outdo Nietzsche, so perhaps his On the Genealogy of Morals.

Want to get in touch with Clay? Send him an email or come visit him at the Scholarly Commons!

Register for Spring 2017 Workshops at CITL!

Exciting news for anyone interested in learning the basics of statistical and qualitative analysis software! Registration is open for workshops to be held throughout spring semester at the Center for Innovation in Teaching and Learning! There will be workshops on ATLAS.ti, R, SAS, Stata, SPSS, and Questionnaire Design on Tuesdays and Wednesdays in February and March from 5:30-7:30 pm. To learn more and to register, go to the CITL workshops page. And if you need a place to use these statistical and qualitative software packages, for example to practice the skills you gained at the workshops, stop by the Scholarly Commons, Monday-Friday, 9 am-6 pm! And don’t forget, you can also schedule a consultation with our experts for specific questions about using statistical and qualitative analysis software for your research!

Register for Fall 2016 CITL Workshops

The University of Illinois Center for Innovation in Teaching & Learning (CITL) has registration open for its fall line-up of workshops. These are the same workshops that have been offered by ATLAS in the past. The workshops show participants how to use statistical and qualitative analysis software, as well as social science data. Registration is free of charge to UIUC faculty, instructors, staff, and students. All workshops run from 5:30 to 7:30 PM, and all but the ATLAS.ti workshops will take place in room G8a in the Foreign Languages Building (the location of the ATLAS.ti workshops will be announced soon). This semester’s schedule is as follows:

  • 9/20: R I: Getting Started with R
  • 9/21: SAS I: Getting Started with SAS
  • 9/27: R II: Inferential Statistics
  • 9/28: SAS II: Inferential Statistics with SAS
  • 10/4: Stata I: Getting Started with Stata
  • 10/5: SPSS I: Getting Started with SPSS
  • 10/6: ATLAS.ti I: Introduction – Qualitative Coding
  • 10/11: Stata II: Inferential Statistics with Stata
  • 10/12: SPSS II: Inferential Statistics with SPSS
  • 10/13: ATLAS.ti II: Data Exploration and Analysis
  • 10/18: Questionnaire Design

For more information about the individual workshops, or to get a look at the workshop tutorials, head to the Statistics, Data and Survey Wiki. To register for these workshops, head to the CITL Workshop Registration Form. If you have any questions or concerns, please e-mail atlas-training@illinois.edu.

ATLAS Workshops

This semester, ATLAS is conducting several short evening courses to show participants how to use statistical, GIS (geographic information systems), and qualitative software, as well as social science data. Statistical & GIS workshops are available to all University of Illinois faculty, instructors, staff, and students.

The Spring 2015 Workshop Schedule
02/24/2015 – ATLAS.ti 1: Introduction – Qualitative Coding
03/03/2015 – ATLAS.ti 2: Data Exploration and Analysis

02/11/2015 – ArcGIS 1: Introduction to ArcCatalog and ArcMap
02/18/2015 – ArcGIS 2: Introduction to ArcToolbox

02/25/2015 – SPSS 1: Getting Started with SPSS
03/04/2015 – SPSS 2: Inferential Statistics with SPSS

03/11/2015 – Stata 1: Getting Started with Stata
03/18/2015 – Stata 2: Inferential Statistics with Stata

03/10/2015 – SAS 1: Getting Started with SAS
03/17/2015 – SAS 2: Inferential Statistics with SAS

02/10/2015 – R 1: Getting Started with R
02/17/2015 – R 2: Inferential Statistics
04/01/2015 – R 3: RStudio

03/31/2015 – Questionnaire Design

Registration Details:
http://www.surveygizmo.com/s3/1957979/workshop-registration

ATLAS Offers Data Consulting, Free Online Tools, Workshops, and More

The College of LAS has recently purchased college-wide licenses for two online survey research tools, SurveyGizmo and Qualtrics. These accounts are free to faculty and students. To request an account, please use the following link: http://www.surveygizmo.com/s3/1781704/surveytool

ATLAS also offers:

– A free questionnaire design workshop and assistance with programming online surveys: http://www.atlas.illinois.edu/services/stats/consulting/

– An open computer lab with knowledgeable staff ready to answer your questions about quantitative and qualitative research and programs: 2043 Lincoln Hall (9-5 M-Th, 9-3 F)

– Free workshops: http://www.surveygizmo.com/s3/1708941/workshop-registration

– Classroom demonstrations for our supported programs: http://www.atlas.illinois.edu/services/stats/tutorials/


Here is the ATLAS Fall Workshop Schedule
http://www.surveygizmo.com/s3/1708941/workshop-registration

10/08/2014 – ATLAS.ti Introduction – Qualitative Coding

09/24/2014 – ArcGIS 1: Introduction to ArcCatalog and ArcMap
10/01/2014 – ArcGIS 2: Introduction to ArcToolbox

10/07/2014 – SPSS 1: Getting Started with SPSS
10/14/2014 – SPSS 2: Inferential Statistics with SPSS

10/22/2014 – Stata 1: Getting Started with Stata
10/29/2014 – Stata 2: Inferential Statistics with Stata

10/21/2014 – SAS 1: Getting Started with SAS
10/28/2014 – SAS 2: Inferential Statistics with SAS

09/23/2014 – R 1: Getting Started with R
09/30/2014 – R 2: Inferential Statistics

11/04/2014 – Survey Research


Do you need help locating data for a project or thesis? Do you need assistance preparing your data for analysis? ATLAS holds Data Service hours in the Library’s Scholarly Commons (306 Main Library). For more information, please visit: http://www.library.illinois.edu/datagis/


For more information about any of these services, please visit:
http://www.atlas.illinois.edu/services/stats/consulting/