Data Feminism and Data Justice

“Data” can seem like an abstract term. What counts as data? Who decides what is counted? How is data created? What is it used for?

Outline of a figure surrounded by a pie chart, speech bubble, book, bar chart, and Venn diagram to represent different types of data

“Data”. Olena Panasovska. Licensed under a CC BY license. https://thenounproject.com/search/?q=data&i=3819883

These questions are some of the ones you might ask when applying a Data Feminist framework to your research. Data Feminism goes beyond the mechanics and logistics of data collection and analysis to uncover the influences of structural power and erasure in the collection, analysis, and application of data.

Data Feminism was developed by Catherine D’Ignazio and Lauren Klein, authors of the book Data Feminism. Their ideas are grounded in the work of Kimberlé Crenshaw, the legal scholar credited with developing the concept of intersectionality. Using this lens, they seek to uncover the ways data science has caused harm to marginalized communities and the ways data justice can be used to remedy those harms in partnership with the communities we aim to help.

The Seven Principles of Data Feminism are:

  • Examine power
  • Challenge power
  • Rethink binaries and hierarchies
  • Elevate emotion and embodiment
  • Embrace pluralism
  • Consider context
  • Make labor visible

Applying data feminist principles to your research might involve working with local communities to co-create consent forms, using data collection to fill gaps in available data about marginalized groups, prioritizing the use of open source, community-created tools, and properly acknowledging and compensating people involved in all stages of the research process. At the heart of this work is the questioning of whose interests drive research and how we can reorient those interests around social justice, equity, and community.

The Feminist Data Manifest-No, authored in part by Anita Say Chan, Associate Professor in the School of Information Sciences and the College of Media, provides additional principles to commit to in data feminist research. These resources, and the scholars and communities engaged in this work, demonstrate how data and research can be used to advance justice, reject neutrality, and prioritize those who have historically experienced the greatest harm at the hands of researchers.

The Data + Feminism Lab at the Massachusetts Institute of Technology, directed by D’Ignazio, is a research organization that “uses data and computational methods to work towards gender and racial equity, particularly as they relate to space and place”. They are members of the Design Justice Network, which seeks to bring together people interested in research that centers marginalized people and aims to address the ways research and data are used to cause harm. These groups provide examples for how to engage in data feminist and data-justice inspired research and action.

Learning how to use tools like SPSS and NVivo is an important aspect of data-related research, but thinking about the seven principles of Data Feminism can inspire us to think critically about our work and engage more fully in our communities. For more information about data feminism, check out these resources:

Introductions: What is Digital Scholarship, anyways?

This is the beginning of a new series where we introduce you to the various topics that we cover in the Scholarly Commons. Maybe you’re new to the field, or maybe you’ve reached the point where you’re too afraid to ask… Fear not! We are here to take it back to the basics!

What is digital scholarship, anyways?

Digital scholarship is an all-encompassing term and it can be used very broadly. Digital scholarship refers to the use of digital tools, methods, evidence, or any other digital materials to complete a scholarly project. So, if you are using digital means to construct, analyze, or present your research, you’re doing digital scholarship!

It seems really basic to say that digital scholarship is any project that uses digital means, because nowadays, isn’t that every project? Yes and no. We use the term “digital” quite liberally. If you used Microsoft Word just to write an essay about a lab you did during class, that is not digital scholarship. However, if you used specialized software to analyze the results of a survey you used to gather data, and then wrote about it in an essay that you typed in Microsoft Word, then that is digital scholarship! If you then wanted to get this essay published and hosted in an online repository so that other researchers can find it, that is digital scholarship too!

Many higher education institutions have digital scholarship centers on their campuses that focus on providing specialized support for these types of projects. The Scholarly Commons is a digital scholarship space in the University Main Library! Digital scholarship centers often push for new and innovative means of discovery. They have access to specialized software and hardware and provide a space for collaboration and consultations with subject experts who can help you achieve your project goals.

At the Scholarly Commons, we support a wide array of digital and data-driven scholarship topics that this series will cover in the future. We have established partners throughout the library and across the wider University campus to support students, staff, and faculty in their digital scholarship endeavors.

Here is a list of the digital scholarship service points we support:

You can find a list of all the software the Scholarly Commons has to support digital scholarship here and a list of the Scholarly Commons hardware here. If you’re interested in learning more about the foundations of digital scholarship, follow along with our Introductions series as we go back to the basics.

As always, if you’re interested in learning more about digital scholarship and how it can support your own projects, you can fill out a consultation request form, attend a Savvy Researcher Workshop, Live Chat with us on Ask a Librarian, or send us an email. We are always happy to help!

Simple NetInt: A New Data Visualization Tool from Illinois Assistant Professor, Juan Salamanca

Juan Salamanca, Ph.D., Assistant Professor in the School of Art and Design at the University of Illinois Urbana-Champaign, recently created a new data visualization tool called Simple NetInt. Though developed from a tool he created a few years ago, this tool brings entirely new opportunities to digital scholarship! This week we had the chance to talk to Juan about this new tool in data visualization. Here’s what he said…

Simple NetInt is a JavaScript version of NetInt, a Java-based node-link visualization prototype designed to support the visual discovery of patterns across large datasets by displaying disjoint clusters of vertices that can be filtered, zoomed in on, or drilled down into interactively. The visualization strategy used in Simple NetInt is to place clustered nodes in independent 3D spaces and draw links between nodes across multiple spaces. The result is a simple graphical user interface that enables visual depth as an intuitive dimension for data exploration.

Simple NetInt interface

Check out the Simple NetInt tool here!

In collaboration with Professor Eric Benson, Salamanca tested a prototype of Simple NetInt with a dataset about academic publications, episodes, and story locations of the sci-fi TV series Firefly. The tool shows a network of research relationships between these three sets of entities, similar to a citation map but on a timeline following the episodes’ chronology.

What inspired you to create this new tool?

This tool is an extension of a prototype I built five years ago for the visualization of financial transactions between bank clients. It is software for visualizing networks based on the representation of entities and their relationships as nodes and edges. This new version is used for the visualization of a totally different dataset: scholarly work published in papers, episodes of a TV series, and the narrative of the series itself. So, the network representation portrays relationships between journal articles, episode scripts, and fictional characters. I am also using it to design a large mural for the Siebel Center for Design.

What are your hopes for the future use of this project?

The final goal of this project is to develop an augmented reality visualization of networks to be used in the field of digital humanities. This proof of concept shows that scholars in the humanities come across datasets with different dimensional systems that might not be compatible with one another. For instance, a timeline of scholarly publications may encompass 10 or 15 years, but the content of what is being discussed in that body of work may encompass centuries of history. Therefore, these two different temporal dimensions need to be represented in a way that helps scholars in their interpretations. I believe that an immersive visualization may drive new questions for researchers or convey new findings to the public.

What were the major challenges that came with creating this tool?

The major challenge was to find a way to represent three different systems of coordinates in the same space. The tool has a universal space that contains relative subspaces for each dataset loaded. So, the nodes instantiated from each dataset are positioned in their own coordinate system, which could be a timeline, a position relative to a map, or just clusters by proximities. But the edges that connect nodes jump from one coordinate system to the other. This creates the idea of a system of nested spaces that works well with few subspaces, but I am still figuring out what is the most intuitive way to navigate larger multidimensional spaces.
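To make the nested-spaces idea concrete, here is a minimal illustrative sketch (not NetInt’s actual code, which is written in JavaScript): each subspace keeps its own local coordinates, nodes are mapped into a shared universal space, and edges are then drawn between universal-space positions, letting a link “jump” between coordinate systems.

```python
# Illustrative sketch of nested coordinate spaces (all data hypothetical).

def to_universal(local_xy, subspace_origin):
    """Map a node's local (x, y) coordinates into the universal space
    by offsetting them with the origin of the subspace they live in."""
    lx, ly = local_xy
    ox, oy = subspace_origin
    return (ox + lx, oy + ly)

# Two subspaces, each with its own coordinate system:
timeline_origin = (0, 0)    # subspace A: a timeline of publications
cluster_origin = (500, 0)   # subspace B: clusters by proximity

node_a = to_universal((120, 40), timeline_origin)  # e.g. a journal article
node_b = to_universal((30, 80), cluster_origin)    # e.g. a character cluster

# An edge connects the two universal-space positions, crossing subspaces:
edge = (node_a, node_b)
print(edge)  # ((120, 40), (530, 80))
```

The design question Salamanca raises, how to navigate many such subspaces intuitively, starts exactly where this toy example ends: once every subspace is just an offset in a shared space, the hard part is the interaction, not the math.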

What are your own research interests and how does this project support those?

My research focuses on understanding how designed artifacts affect the viscosity of social action. What I do is investigate how the design of artifacts facilitates or hinders cooperation and collaboration between people. I use visual analytics methods to conduct my research, so the analysis of networks is an essential tool. I have built several custom-made tools for observing the interaction between people and things, and this is one of them.

If you would like to learn more about Simple NetInt, you can find contact information for Professor Juan Salamanca here, along with more information on his research!

If you’re interested in learning more about data visualizations for your own projects, check out our guide on visualizing your data, attend a Savvy Researcher Workshop, Live Chat with us on Ask a Librarian, or send us an email. We are always happy to help!

The Art Institute of Chicago Launches Public API

Application Programming Interfaces, or APIs, are a major feature of the web today. Almost every major website has one, including Google Maps, Facebook, Twitter, Spotify, Wikipedia, and Netflix. If you Google the name of your favorite website and API, chances are you will find an API for it.

Last week, another institution joined the ranks of organizations with public APIs: The Art Institute of Chicago. While they are not the first museum to release a public API, their blog article announcing the release states that it offers the largest amount of data ever released to the public through a museum API. It is also the first museum API to hold all of the institution’s public data in one location, including data about its art collection, every exhibition ever held by the Institute since 1879, blog articles, full publication texts, and more than 1,000 gift shop products.

But what exactly is an API, and why should we be excited that we can now interact with the Art Institute of Chicago in this way? An API is basically a particular way to interact with a software application, usually a website. Normally, when you visit a website in a browser, such as wikipedia.org, the browser requests an HTML document in order to render the images, fonts, text, and many other bits of data related to the appearance of the web page. This is a useful way to interact as a human consuming information, but if you wanted to perform some sort of data analysis, it would be much harder to do it this way. For example, answering even a simple question like “Which US president has the longest Wikipedia article?” would be time-consuming the traditional way, viewing web pages one at a time.

Instead, an API allows you or other programs to request just the data from a web server. Using a programming language, you could use the Wikipedia API to request the text of each US president’s Wikipedia page and then simply calculate which text is the longest. API responses usually come in the form of data objects with various attributes, and the format of these objects varies between websites.
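As a rough sketch of that workflow, the snippet below builds a Wikipedia API query URL (using the real `action=query` and `prop=extracts` parameters) and then compares article lengths. The live HTTP requests are omitted here, so the sample texts are hypothetical stand-ins for fetched extracts.

```python
# Sketch: asking "which US president has the longest Wikipedia article?"
# via the Wikipedia API. Sample data below is made up for illustration.
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

def extract_url(title):
    """Return an API URL requesting the plain-text extract of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "format": "json",
        "titles": title,
    }
    return API_ENDPOINT + "?" + urlencode(params)

def longest_article(extracts):
    """Given {title: article_text} fetched from the API, return the
    title whose text is longest."""
    return max(extracts, key=lambda t: len(extracts[t]))

# Hypothetical, truncated extracts standing in for real API responses:
sample = {
    "Abraham Lincoln": "x" * 120_000,
    "James K. Polk": "x" * 60_000,
}
print(longest_article(sample))  # Abraham Lincoln, in this toy data
```

In a real script you would loop over the presidents, fetch each `extract_url(...)` with an HTTP library, and feed the collected texts into `longest_article`.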

“A Sunday on La Grande Jatte” by Georges Seurat, the data for which is now publicly available from the Art Institute of Chicago’s API.

The same is now true for the vast collections of the Art Institute of Chicago. As a human user you can view the web page for the work “A Sunday on La Grande Jatte” by Georges Seurat at this URL:

 https://www.artic.edu/artworks/27992/a-sunday-on-la-grande-jatte-1884

If you wanted to get the data for this work through an API to do data analysis though, you could make an API request at this URL:

https://api.artic.edu/api/v1/artworks/27992

Notice how both URLs contain “27992”, which is the unique ID for that artwork.

If you open that link in a browser, you will get a bunch of formatted text (if you’re interested, it’s formatted as JSON, a format that is designed to be manipulated by a programming language). If you were to request this data in a program, you could then perform all sorts of analysis on it.
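For example, here is a minimal sketch of parsing such a response in Python. The JSON below is a trimmed, hypothetical stand-in for what the `/artworks/27992` endpoint returns; the exact field names are assumptions made for illustration, so check the API documentation for the real response shape.

```python
# Sketch: parsing a (simplified, hypothetical) Art Institute API response.
import json

response_text = """
{
  "data": {
    "id": 27992,
    "title": "A Sunday on La Grande Jatte, 1884",
    "artist_title": "Georges Seurat",
    "date_start": 1884,
    "date_end": 1886
  }
}
"""

# In a real program, response_text would come from an HTTP request to
# https://api.artic.edu/api/v1/artworks/27992
artwork = json.loads(response_text)["data"]
print(artwork["title"], "by", artwork["artist_title"])
```

Once the response is a plain Python dictionary like this, any analysis you can imagine is a few lines away: filtering by date, grouping by artist, computing averages, and so on.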

To get an idea of what’s possible with an art museum API, check out this FiveThirtyEight article about the collections of New York’s Metropolitan Museum of Art, which includes charts of which countries are most represented at the Met and which artistic mediums are most popular.

It is now possible to ask the same questions about the Art Institute of Chicago’s collections, along with many others, such as “What is the average size of an Impressionist painting?” or “In which years was Surrealist art most popular?” The possibilities are endless.

To get started with their API, check out their documentation. If you’re familiar with Python, and possibly Python’s data analysis library pandas, you could check out this article about using APIs in Python to perform data analysis to start playing with the Art Institute’s API. You may also want to look at our LibGuide about qualitative data analysis to see what you could do with the data once you have it.

Holiday Data Visualizations

The fall 2020 semester is almost over, which means that it is the holiday season again! We would especially like to wish everyone in the Jewish community a happy first night of Hanukkah tonight.

To celebrate the end of this semester, here are some fun Christmas and Hanukkah-related data visualizations to explore.

Popular Christmas Songs

First up, in 2018 data journalist Jon Keegan analyzed a dataset of 122 hours of airtime from a New York radio station in early December. He was particularly interested in discovering if there was a particular “golden age” of Christmas music, since nowadays it seems that most artists who release Christmas albums simply cover the same popular songs instead of writing a new song. This is a graph of what he discovered:

Based on this dataset, 65% of popular Christmas songs were originally released in the 1940s, 50s, and 60s. With the notable exception of Mariah Carey’s “All I Want for Christmas is You” from the 90s, most of the beloved “Holiday Hits” come from the mid-20th century.

As for why this is the case, the popular webcomic XKCD claims that every year American culture tries to “carefully recreate the Christmases of Baby Boomers’ childhoods.” Regardless of whether Christmas music reflects the enduring impact of the postwar generation on America, Keegan’s dataset is available online to download for further exploration.

Christmas Trees

Last year, Washington Post reporters Tim Meko and Lauren Tierney wrote an article about where Americans get their live Christmas trees from. The article includes this map:

The green areas are forests primarily composed of evergreen Christmas trees, and the purple dots represent choose-and-cut Christmas tree farms. 98% of Christmas trees in America are grown on farms, whether it’s a choose-and-cut farm where Americans come to select and cut their own trees or a farm that ships trees to stores and lots.

This next map shows which counties produce the most Christmas trees:

As you can see, the biggest Christmas tree producing areas are New England, the Appalachians, the Upper Midwest, and the Pacific Northwest, though there are farms throughout the country.

The First Night of Hanukkah

This year, Hanukkah starts tonight, December 10, but its start date on the Gregorian calendar varies every year. This is not the case on the primarily lunar-based Hebrew calendar, on which Hanukkah always starts on the 25th night of the month of Kislev. As a result, the days of Hanukkah vary year to year on other calendars, particularly the solar-based Gregorian calendar: the holiday can begin as early as November 28 and as late as December 26.

In 2016, Hanukkah began on December 24, Christmas Eve, so Vox author Zachary Crockett created this graphic to show the varying dates on which the first night of Hanukkah has taken place from 1900 to 2016:

The Spelling of Hanukkah

Hanukkah is a Hebrew word, so there is no definitive spelling of the word in the Latin alphabet I am using to write this blog post. In Hebrew it is written as חנוכה and pronounced /ˈhɑːnəkə/ in the International Phonetic Alphabet.

According to Encyclopædia Britannica, when transliterating the pronounced word into English writing, the first letter, ח, is pronounced like the ch in loch. As a result, 17th-century transliterations spell the holiday as Chanukah. However, ח does not sound the way ch does at the start of an English word, such as in chew, so in the 18th century the spelling Hanukkah became common. The H on its own is not quite correct either, and more than twenty other spelling variations have been recorded due to various other transliteration issues.

It’s become pretty common to use Google Trends to discover which spellings are most common, and various journalists have explored this in past years. Here is the most recent Google search data comparing the two most common spellings, Hanukkah and Chanukah, going back to 2004:

You can also click this link if you are reading this article after December 2020 and want even more recent data.

As you would expect, the terms are more common every December. It warrants further analysis, but it appears that Chanukah is becoming less common in favor of Hanukkah, possibly reflecting some standardization going on. At some point, the latter may be considered the standard term.
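One quick way to check that hunch is to export the chart data from Google Trends as a CSV and compare the two columns over time. The rows below are made up purely for illustration, not real Trends figures.

```python
# Sketch: is "Chanukah" losing ground to "Hanukkah" over time?
# The CSV content below is hypothetical stand-in data, mimicking
# the shape of a Google Trends export.
import csv
import io

csv_text = """\
Month,Hanukkah,Chanukah
2004-12,64,36
2010-12,70,30
2016-12,78,22
2020-12,83,17
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
hanukkah = [int(r["Hanukkah"]) for r in rows]
chanukah = [int(r["Chanukah"]) for r in rows]

# Compare the first and last December: is Chanukah's share shrinking?
print(chanukah[0] > chanukah[-1])  # True for this toy data
```

With a real export you would have a row per week rather than a handful of Decembers, but the comparison logic is the same.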

You can also use Google Trends to see what the data looks like for Google searches in Israel:

Again, here is a link to see the most recent version of this data.

In Israel, it also appears as though the Hanukkah spelling is becoming increasingly common, though early on there were years in which Chanukah was the more popular spelling.


I hope you’ve enjoyed seeing these brief explorations into data analysis related to Christmas and Hanukkah and the quick discoveries we made with them. But more importantly, I hope you have a happy and relaxing holiday season!

Mapping Native Land

Fall break is fast approaching, and with it, Thanksgiving! No matter what your traditions are, we all know that this year’s holiday season will look a little bit different. As we move into the Thanksgiving holiday, I wanted to share a mapping project that gives thanks and recognizes the native lands we live on.

Native Land is an open-source mapping project that shows indigenous territories across the world. This interactive map allows you to input your address, or click and explore, to determine what indigenous land you reside on. Not only that, but Native Land shares educational information about these nations, their languages, and treaties. They also include a Teacher’s Guide for a wide range of ages, from children to adults. Users are able to export images of their maps, too!

Native Land Map

NativeLand.ca Map Interface

Canadian-based and indigenous-led, Native Land Digital aims to educate and bring awareness to the complex histories of the land we inhabit. This platform strives to create conversations about indigenous communities between those with Native heritage and those without. Native Land Digital values the sacredness of land, and they use this platform to honor the history of where we reside. Learn more about their mission and impact on their “Why It Matters” page.

Native Land uses Mapbox and WordPress to generate its interactive map. Mapbox is an open-source mapping platform for custom-designed maps. Native Land is also available as an app for iOS and Android, and they have a texting service as well. You can find more information about how it works here.

If you’d like to learn more about mapping software, the Scholarly Commons has Geographic Information Systems (GIS) software, consultations, and workshops available. The Scholarly Commons webpage on GIS is a great place to get started.

 The University of Illinois is a land-grant institution and resides on Kickapoo territory. Where do you stand?

University of Illinois Urbana-Champaign Land Acknowledgement Statement

As a land-grant institution, the University of Illinois at Urbana-Champaign has a responsibility to acknowledge the historical context in which it exists. In order to remind ourselves and our community, we will begin this event with the following statement. We are currently on the lands of the Peoria, Kaskaskia, Piankashaw, Wea, Miami, Mascoutin, Odawa, Sauk, Mesquaki, Kickapoo, Potawatomi, Ojibwe, and Chickasaw Nations. It is necessary for us to acknowledge these Native Nations and for us to work with them as we move forward as an institution. Over the next 150 years, we will be a vibrant community inclusive of all our differences, with Native peoples at the core of our efforts.

Tomorrow! Big Ten Academic Alliance GIS Conference 2020

Save the date! Tomorrow is the Big Ten Academic Alliance (BTAA) GIS Conference 2020. This event is 100% virtual and free of charge to anyone who wants to engage with the community of GIS specialists and researchers from Big Ten institutions.

The conference kicks off tonight with a GIS Day Trivia Night event at 5:30 PM CST! There is a Map Gallery that is open to view from now until November 13th, 2020. The gallery features research that incorporates GIS from Big Ten institutions, so be sure to check it out! There will be lightning talks, presentations, social hours, and a keynote address from Dr. Orhun Aydin, Senior Researcher at Esri, so be sure to check out the full schedule of events and register here.

This event is a great way to network and learn more about applications of GIS for research. If you are interested in GIS but don’t know where to start, this event is a great place to get inspired. If you are an experienced GIS researcher, it is an opportunity to meet colleagues and learn from your peers. Overall, this is a great event for anyone interested in GIS and the perfect way to start Geography Awareness Week, which runs from November 15th-21st this year!

Statistical Analysis at the Scholarly Commons

The Scholarly Commons is a wonderful resource if you are working on a project that involves statistical analysis. In this post, I will highlight some of the great resources the Scholarly Commons has for our researchers. No matter what point you are at in your project, whether you need to find and analyze data or just need to figure out which software to use, the Scholarly Commons has what you need!

Continue reading

GIS Resources for Distance Learning and Working from Home

Planet Earth wearing a doctor's mask

The past couple of weeks have been a whirlwind for everyone as we’ve all sought to adjust to working, attending school, socializing, and just carrying out our daily lives online. Here at the Scholarly Commons, we’ve been working hard to ensure that this transition is as smooth as possible for those of you relying on specialized software to conduct your research or do your classwork. That’s why this week we wanted to highlight some resources essential to anyone using or teaching with GIS as we work through this period of social distancing.

Continue reading

Introducing the Illinois Open Publishing Network: Digital Publishing from the University of Illinois Library

The face of scholarly publishing is changing, and libraries are taking on the role of publisher for many scholarly publications, including those that don’t fit the mold of traditional presses. Initiatives at the University of Illinois at Urbana-Champaign are working to respond to strides in digital publishing, the increasing momentum for open access research, and the need for sustainable publishing models. This year alone, the Illinois Open Publishing Network (IOPN) has released five new open-access, multimodal scholarly publications. IOPN represents a network of publications and publishing initiatives hosted at the University Library, working towards high-quality open-access scholarship in digital media. IOPN assists authors with a host of publishing services, including copyright and peer review, and even provides assistance in learning the publishing tools themselves and strategizing their publications in what for many is a new mode of writing.

Continue reading