Of Maps and Memes: A Bit of Cartographic Fun

Co-Authored by Zhaneille Green

We use maps to communicate all the time. Historically, they have been used to navigate the world and to stand as visual, physical manifestations of defined spaces and places. What do you think of when we say “map”: a topographic map, a transportation map, or a city map?

You can use maps to represent just about anything you want to say, far beyond these typical examples. We wrote this blog to invite you to have a little cartographic fun of your own.

If you’re on any kind of social media, you’ve probably seen maps like the one below, highlighting anything from each state’s favorite kind of candy to what the continental US would look like if all of the states’ borders were drawn along rivers and mountain ranges. People definitely seem to enjoy sharing these maps, curious to see what grocery store most people shop at in their home state, or laughing about California’s lack of popularity with the states in the surrounding area.

Map of the most popular Halloween candy in each US state. View the interactive version on candystore.com.

Try your hand at creating your own silly map by using our programs in the Scholarly Commons. Start a war by creating a map that ranks the Southern states with the best barbecue using Adobe Photoshop or Illustrator, or explore a personal hobby like creating a map of all the creatures Sam & Dean Winchester met through the 15 seasons of Supernatural using ArcGIS.

If you’re feeling a bit more serious, don’t fret! Even if these meme-like maps aren’t portraying the most critical information, they do demonstrate how maps can be a great tool for data visualization. In many ways, location can make data feel more personal, because we all have personal connections to place. Admit it: the first thing you checked on the favorite candy map was your home state. Maps also tend to be more visually engaging than a simple table with, for example, states in one column and favorite animal in the other.

Each dot, placed using geotagging data, marks where a photo was taken: blue for locals, red for tourists, and yellow for unknown. Locals and Tourists #1 (GTWA #2): London. Eric Fischer, CC BY-SA 2.0 via Flickr.

Regardless of what you want to map, the Scholarly Commons has the tools to help bring your vision to life. Learn about software access on our website, and check out these LinkedIn Learning resources for an introduction to ArcGIS Online or Photoshop, which are available with University of Illinois login credentials. If you need more assistance, feel free to ask us questions. Go forth and meme!

You can’t analyze data if you ain’t cute: Data Visualization

Meme from Reno 911 with the original text stating "You can't fight crime if you ain't cute" but the "fight crime" is crossed out and above is written "analyze Data"

Humans are highly visual creatures, even more so in our hyper-graphic world of ultra-filtered images and short aesthetic videos. Great ideas are ignored into oblivion in favor of shiny graphics and slick illustrations, so even data analysts need to be aware of how they present their findings. A well-designed infographic will be much more impactful, widely shared, and remembered than columns and rows of numbers. Even a simple graph can help people better come to conclusions and absorb information than they ever would with just numbers alone. People who can not only crunch numbers but also create stunning communications about those numbers are a real asset on the job market, so it behooves any hopeful data analyst to at least learn the basics of visualization.

LinkedIn Learning 

  1. Learning Data Visualization 
    1. This course clocks in at just under two hours and aims to give learners the scaffolding for a strong understanding of data visualization. Geared towards true beginners, this course challenges learners to think about their data, audience, and goals to create visuals that maximize impact. Learners will also learn about visual perception and chart selection strategies, which in turn can set users up for a deep understanding of visualization. 
  2. Data Visualization: Best Practices 
    1. A poorly designed visualization can be criminally misleading, causing viewers to come to biased and inaccurate conclusions that can negatively affect everything from their investment choices to their health practices. This 98-minute course will give learners the tools to avoid common visualization missteps and the tricks to make their visualizations better fit their data, audience, and goals. This course uses Adobe Illustrator, so those who are unfamiliar with the program should first check out this quick start introduction to the program on LinkedIn Learning. Remember, UIUC students have free access to many Adobe products, including Adobe Illustrator!
  3. Excel Data Visualization: Mastering 20+ Charts and Graphs 
    1. Once again, we will focus on this data skillset within the context of a familiar software, Excel. While it is not the first software that comes to mind when thinking about visualization, Excel has surprisingly powerful visualization functions that will certainly come in handy when analyzing data. This course covers everything from the humble pie chart to complex geospatial heat maps and 3D power maps. In just two hours, learners will be able to quickly take their data from tables to graphics.

O’Reilly Books and Videos

Make sure you are logged into O’Reilly before clicking these links. The best way to log in is to go to the library catalog’s record for a book offered through O’Reilly (like this book on Python) and then follow the instructions on this LibGuide to log in.

  1. Fundamentals of Data Visualization 
    1. This handy book goes deep into the technical aspects of data visualizations. Learners will learn basic concepts like color theory alongside more complex practices like redundant coding. This eBook also provides a helpful directory of visualizations so users can quickly find visualizations that fit their needs.
  2. The Data Visualization Lifecycle 
    1. This 4-hour course covers the basics of data visualization but also looks at the actual process of professional data visualization, which the other resources on this list do not address. Learners will gain technical skills in building visualizations and a broader understanding of data visualization as a collaborative process based on external and internal stakeholders and audiences. This course teaches users how to interact with different data cultures, collaborate with colleagues, and treat visualization as a product.
  3. Interactive Data Visualization for the Web 
    1. Interactive data visualization is a trending skill in almost all fields that rely on data analysis and visualization of any kind. Allowing others to interact with your data and its visualization can make the data more accessible and memorable than ever before. This book gives users the skills to make interactive visuals with the fundamental concepts and methods of D3, the most powerful JavaScript library for expressing data visually in a web browser. Even those who are new to web programming will learn the basics of HTML, CSS, JavaScript, and SVG alongside the data visualization skills.

In the Catalog 

  1. #MakeoverMonday : improving how we visualize and analyze data, one chart at a time by Andrew Michael Kriebel and Eva Katharina Murray 
    1. Hashtags can be the start of beautiful movements, as those in the data analysis field learned as their #MakeoverMonday tag sparked a complete reimagining of how professionals approach data visualization. Readers will learn concepts of data visualization while viewing the real-life results of these concepts as shown by the hashtag-inspired graphics. #MakeoverMonday shows readers the “many ways to walk the line between simple reporting and design artistry to create exactly the visualization the situation requires”.  
  2. The functional art : an introduction to information graphics and visualization by Alberto Cairo 
    1. If there are data visualization celebrities, then Alberto Cairo is an A-lister. Known for his visualization journalism, he is a self-described information designer who has become famous for his gripping visualizations that stand as both formal art and excellent communication of data. This book allows users to learn the ins and outs of design all while strolling through a gallery of amazing visualization examples. This resource leans heavily on the theory of art and design, which makes it stand out from the other resources on this list. Alberto Cairo’s other works, The Truthful Art: data, charts, and maps for communication and How Charts Lie : getting smarter about visual information, are also worthwhile and insightful reads!
  3. Data visualisation : a handbook for data driven design by Andy Kirk 
    1. Pivoting back to the more practical side of things, this handbook offers clear and useful processes for data driven designing. Readers will learn more about the visualization workflow, formulating briefs, working with data in the context of visualization, representing data accurately, integrating interactivity, and visualization literacy. 

And that’s it, folks!

With these visualization resources, the Winter Break Data Analysis series is ending on a pretty note. Hopefully, you have been able to keep your mind sharp and develop a new skill over the last month, but even if the timing was off, these resources and many more are available to students all year long! Did you enjoy one of these resources or posts? Do you have questions about any of these topics or suggestions for future series? Please tell us about it at sc@library.illinois.edu or on Twitter at @ScholCommons. Thank you for joining this series and happy analyzing!

*hacker voice* “I’m in” – Coding and Software for Data Analysis

While data analysis has existed in one form or another for centuries, its modern concept is highly tied to a digital environment, which means that people who are looking to move into the data science field will undoubtedly need some technology skills. In the data field, the primary coding languages include Python, R, and SQL. Software is a bit more complicated, with numerous different programs and services used depending on the situation, including Power BI, Spark, SAS, and Excel, to name a few. While this is overwhelming, remember that it is not important to become an expert in all of the languages and software. Becoming skilled in one language and a few of the software options, depending on your interest or on the in-demand skills on job listings, will give you the transferable skills to quickly pick up the other languages and software as needed. If this still seems to be an overwhelming prospect, remember that the best way to eat an elephant is one bite at a time. Take your time, break up the task, and focus on one step at a time!

LinkedIn Learning

  1. Python for Data Science Essential Training Part 1 
    1. This 6-hour course guides users through an entire data science project: building web scrapers, cleaning and reformatting data, generating visualizations, performing simple data analysis, and creating interactive graphs. The project will have users coding in Python with confidence and give learners a foundation in the Plotly library. Once completed, learners will be able to design and run their own data science projects.
  2. R for Excel Users 
    1. With Excel being a familiar platform for many interested in data, it is an ideal bridge to more technical skills, like coding in the R language. This course is specifically designed for data analytics with its focus on statistical tasks and operations. It will take users’ Excel skills to another level while also laying a solid foundation for their new R skills. Users will be able to switch between Excel and the R DescTools package to complete tasks seamlessly, using the best of each software to calculate descriptive statistics, run bivariate analyses, and more. This course is for people who are truly proficient in Excel but new to R, so if you need to brush up your Excel skills, go back to the first post in this series and go over the Excel resources!
  3. SQL Essential Training 
    1. SQL is the language of relational databases, so it is of interest to anyone looking to expand their data handling skills. This training is designed to give data wranglers the tools they need to use SQL effectively using the SQLiteStudio software. Learners will soon be able to create tables, define relationships, manipulate strings, use triggers to automate actions, and use subselects and views. Real-world examples are used throughout the course, and learners will finish the course by building their own SQL application. If you want a gentler introduction to SQL, check out our earlier post on SQL Murder Mystery.

O’Reilly Books and Videos (Make sure to follow these instructions for logging in!) 

  1. Data Analysts Toolbox – Excel, Python, Power BI, Alteryx, Qlik Sense, R, Tableau 
    1. This 46 hour course is not for the faint of heart, but by the end, users will be a Swiss army knife data analyst. This isn’t for true beginners, but rather people who are already familiar with the basic data analysis concepts and have a good grasp of Excel. It is included in this list because it is a great source for learning the basics of the myriad of software and programming languages that data analysts are expected to know, all in one place. The course starts with teaching users about advanced pivot tables, so if users have already mastered the basic pivot table, they should be ready for this course.  
  2. Programming for Data Science: Beginner to Intermediate 
    1. This is an expert curated playlist of courses and book chapters that is designed to help people who are familiar with the math side of data analysis, but not the computer science side. This playlist gives users an introduction to NumPy, Pandas, Python, Spark and other technical data skills. Some previous experience with coding may be helpful in this course, but patience will make up for lack of experience.  

In the Catalog

  1. Python crash course : a hands-on, project-based introduction to programming 
    1. Python is often lauded as one of the most approachable coding languages to learn and its functionality makes it popular in the data science field. So it is no surprise that there are a lot of resources on and off campus for learning Python. This approachable guide is just one of the many resources available to UIUC students, but it stands out with its contents and overall outcomes. “Python Crash Course” covers general programming concepts, Python fundamentals, and problem solving. Unlike some other resources, this guide focuses on many of Python’s uses, not just its data analytics capabilities, which can be appealing to people who want to be more versatile with their skills. However, it is the three projects that make this resource stand out from the rest. Readers will be guided in how to create a simple video game, use data visualization techniques to make graphs and charts, and build an interactive web application.  
  2. The Book of R : a first course in programming and statistics 
    1. R is the most popular coding language for statistical analysis, so it’s clearly important for data analysts to learn. The Book of R is a comprehensive, beginner-friendly guide designed for readers who have no previous programming experience or a shaky mathematical foundation, as readers will learn both concurrently through the book’s lessons. Starting with simple programs and data handling skills, learners will then move on to producing statistical summaries of data, performing statistical tests and modeling, and creating visualizations with contributed packages like ggplot2 and ggvis. Along the way, the book covers how to write data frames, create functions, and use variables, statements, and loops; statistical concepts like exploratory data analysis, probabilities, hypothesis tests, and regression modeling, and how to execute them in R; how to access R’s thousands of functions, libraries, and data sets; how to draw valid and useful conclusions from your data; and how to create publication-quality graphics of your results.

Join us next week for our final installment of the Winter Break Data Analysis series: “You can’t analyze data if you ain’t cute: Data Visualization for Data Analysis”    

Learn Data Analysis: What’s Math Got to do With It?

What’s math got to do with data analysis? Unfortunately, for those of us who are chronic humanities people, math has a lot to do with it. This might seem like a daunting barrier, especially if the last time you looked at a math problem was in a high school algebra class. This is also true for learners who are already skilled with the technological aspect of data analysis but are not familiar with the mathematics side of things. However, there are so many resources available to help self-directed students learn the basics and get up to speed for the purposes of data analytics! Using the resource platforms described in last week’s blog post, these resources will have even chronic humanities people playing with numbers in no time!

LinkedIn Learning 

  • Learning Everyday Math 
    • Look, some of us did not absorb or retain the basic math lessons of our early education. That’s okay! This is a no-judgment zone, and this 2 hour course will help users learn how to calculate percentages for tips and taxes, compare prices while shopping, find the area and volume for home-improvement projects, and learn the basics of probability.  
  • Become a Data Scientist 
    • This 21 hour Learning Path is made up of 12 courses that focus more on the statistical side of data analysis than the technical steps of the process. This course is more geared toward users with experience in IT and computers, so it is not the best for people who do not have a strong technical background. However, for those who are familiar with computer science and want to pivot into data analytics, this is an ideal curriculum.   

O’Reilly Books and Videos (Make sure to follow these instructions for logging in!)

  • Essential Math for Data Science 
    • This eBook mixes basic coding skills with math lessons to cover the essential analytical skills needed for data science work. Relevant aspects of calculus, probability, linear algebra, and statistics and how they apply to techniques like linear regression, logistic regression, and neural networks are covered in plain English. The chapters include exercises with answers for self-assessment as well as career advice for budding data analysts.   
  • Statistics for Data Science using Python 
    • Besides books, O’Reilly also has expert curated playlists that consist of chapters of several different books, videos and more. This is a great way of getting the most out of several resources to focus on a single skill. This playlist covers the essential statistical concepts found in 11 different resources. Learn about the normal distribution, hypothesis tests, p-values, the central limit theorem and more without having to dig for the resources yourself!
  •   Data Science 101: Methodology, Python, and Essential Math 
    • On top of books and playlists, O’Reilly also has video-based courses. This course covers a lot of data analytics basics, but those who want to focus on the math aspect will benefit from Chapters 15-19. These chapters cover linear algebra, mathematical structures, probability, random variables and multiple variables, and statistical inference.  
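For a taste of how the statistics named above (the normal distribution, hypothesis tests, and p-values) look in code, here is a minimal sketch using only Python's standard library. The numbers are made up for illustration, and the function name `two_sided_z_test` is our own, not something taken from any of the resources listed.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, population_mean, population_sd, n):
    """Return the z-score and two-sided p-value for a one-sample z-test."""
    # Standard error: how much the sample mean is expected to vary.
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    # Probability of a result at least this extreme under a standard normal curve.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical example: 100 students average 52 on a quiz whose
# long-run average is assumed to be 50 with a standard deviation of 10.
z, p = two_sided_z_test(52.0, 50.0, 10.0, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the sample mean would be unusual if the assumed population mean were true; the courses above explain when a z-test is (and is not) the right tool.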

In the Catalog 

Be sure to come back next week for the thrilling continuation with “*hacker voice* I’m In: Coding and Software for Data Analysis”!

Learn Data Analysis Over the Winter Break!

In the last twenty years, humanity has become super proficient in collecting data. It is therefore no surprise that the skills to analyze that massive collection of data are in ever-increasing demand on the job market. For those of us who are worried about future job prospects, learning these in-demand data analysis skills seems like a logical next step, even if they do not fit into our current degree program. Fortunately, the university has a plethora of self-guided resources available for students looking to build their data skills. What better time to use these resources than during the long winter break?! Over the next few weeks, this blog will delve into the available resources that cover the three main skill areas of data analysis: math, coding and software, and visualization.

Before diving into those areas, it is wise to briefly look at the foundations of data analysis as well as the resources that will be showcased this month. Take this week to get acquainted with these different resource platforms and learn a few starting skills!

LinkedIn Learning

All UIUC students have access to LinkedIn Learning. Simply log in with your NetID credentials; just be sure you are logging into LinkedIn Learning, not the main LinkedIn site. You will have access to a whole trove of high-quality videos and courses designed to help you learn career-building skills. Not only are the videos professional grade, but they often have accompanying exercise files, learning groups, certificates, and exams. The collection ranges from short 5-15 minute videos that teach specific functions or skills to dozen-hour-long courses that are designed to give a comprehensive foundation. The best part of using LinkedIn Learning is that the courses and certificates completed here are then displayed on personal LinkedIn pages, showing potential employers that users have the skills they are looking for.

  • Data Analytics for Students
    • This course is for the true data analytics babies out there. This introduction gives users the basic understanding of what data analytics is, the skills users will need to be successful,  the software and tools common in the field and what careers in data analytics look like. This 1 hour course is well worth the time for those who aren’t sure where to start their data journey.
  • Career Essentials in Data Analysis by Microsoft and LinkedIn
    • Discover the skills needed for a career in data analysis. Learn foundational concepts used in data analysis and practice using software tools for data analytics and data visualization. This is a Learning Path made up of 3 different courses that has about 9 hours of content for students to work through on their own schedule. The courses have exams for self-evaluation as well as a final exam that earns users a professional certificate. 
  • Excel: Managing and Analyzing Data
    • We have all put “proficient in Excel” on a resume, but wouldn’t it be nice if that was actually true? Unlike other data analytics courses, this course focuses on one program that most modern users are already familiar with but do not truly harness the power of. This is ideal for baby data analysts as it doesn’t bombard learners with a whole new software ecosystem but still teaches the transferable skills all data analysts use. Running at just under 4 hours, this course efficiently and comprehensively teaches users impressive data analytics skills. 

O’Reilly Books and Videos

This is a lesser-known resource available at UIUC, but it has some great online books and videos that tend to focus on the scientific and technical fields. Logging in is not straightforward, unfortunately. The best way to get there is to go to the Library Catalog’s record for a book offered through O’Reilly (like this book on Python) and then follow the instructions on this LibGuide to log in. Once you are in, you will see a sizable collection of e-books and courses. The materials skew towards more experienced users, but there are a few resources that will help baby data folks really develop their skills.

Library Catalog

Learn data science the old-fashioned way, with books! There are a lot of books available at UIUC libraries for students who want to teach themselves a new skill. Here are a few choices for people looking for an easy introduction to data analysis. The Scholarly Commons collection is easily accessible and found just to the right of the main entrance to the stacks.

Be sure to check back here next week for our next installment, “What’s Math got to do with it?”!

Making Infographics in Canva: a Guide and Review

Introduction

If you’ve ever had to design a poster for class, you’re probably familiar with Canva. This online and app-based graphic design tool, with free and subscription-based versions, features a large selection of templates and stock graphics that make it pretty easy to create decent-looking infographics. While it is far from perfect, the ease of use makes Canva worth trying out if you want to add a bit of color and fun to your data presentation.

Getting Started

Starting with a blank document can be intimidating, especially for someone without any graphic design experience. Luckily, Canva has a bunch of templates to help you get started.

Canva infographic templates

I recommend picking a template based on the color scheme and general aesthetic. It’s unlikely you’ll find a template that looks exactly how you want, so you can think of a template as a selection of colors, fonts, and graphics to use in your design, rather than something to just copy and paste things into. For example, see the image below – I recently used the template on the left to create the infographic on the right.

An infographic template compared to the resulting infographic

General Design Principles

Before you get started on your infographic, it’s important to remember some general design guidelines:

  1. Contrast. High levels of contrast between your background and foreground help keep everything legible.
  2. Simplicity. Too many different colors and fonts can be an eyesore. Stick to no more than two fonts at a time.
  3. Space. Leave whitespace to keep things from looking cluttered.
  4. Alignment and balance. People generally enjoy looking at things that are lined up neatly and don’t have too much visual weight on one side or another.
An exaggerated example of a design that ignores the above advice.

Adding Graphs and Graphics

Now that you have a template in hand and graphic design principles in mind, you can start actually creating your infographic. Under “Elements,” Canva includes several types of basic charts. Once you’ve added a chart to your graphic, you can edit the data associated with the chart directly in the provided spreadsheet, by uploading a CSV file, or by linking to a Google Sheets spreadsheet.
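If your data lives in a script rather than a spreadsheet, a small CSV file is easy to generate yourself. The sketch below uses Python's built-in `csv` module and assumes a simple layout with one label column and one value column; the exact layout a given Canva chart type expects is an assumption here, so compare it against the chart's built-in spreadsheet before uploading. The survey data and the file name `chart_data.csv` are hypothetical.

```python
import csv
import io

# Hypothetical survey results; replace with your own data.
rows = [
    ("Category", "Responses"),
    ("Walk", 42),
    ("Bike", 27),
    ("Bus", 31),
]


def to_csv_text(table):
    """Return the rows as CSV text, header line first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(table)
    return buffer.getvalue()


csv_text = to_csv_text(rows)

# Save the text to a file that can then be uploaded to the chart.
with open("chart_data.csv", "w", newline="", encoding="utf-8") as handle:
    handle.write(csv_text)

print(csv_text)
```

Using the `csv` module instead of joining strings by hand means labels that contain commas or quotation marks are escaped correctly.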

Canva interface for creating charts

The settings tab allows you to decide whether you want the chart to include a legend or labels. The options bar at the top allows for further customization of colors and bar or dot appearance. Finally, adding a few simple graphics from Canva’s library such as shapes and icons can make your infographic more interesting. 

Examples of charts available in Canva, with a variety of customizations.

Limitations and Frustrations

The main downsides to Canva are the number of features locked behind a paywall and the inability to see only the free options. Elements cannot be filtered by price and it seems that more and more graphics are being claimed by Canva Pro, so searching for graphics can be frustrating. Templates can be filtered, but it will still bring up results where the template itself is free, but there are paid elements within the template. So, you might choose a template based on a graphic that you really like, only to find out that you need a Canva Pro subscription to include that graphic.

The charts in Canva also have limitations. Pie charts do not allow for the selection of colors for each individual slice; you have to pick one color, and Canva will generate the rest. However, if you want to have more control over your charts, or wish to include more complicated data representations, you can upload charts to Canva, which even supports transparency.

Conclusion

As mentioned above, Canva has its downsides. However, Canva’s templates, graphics, and charts still make it a super useful tool for creating infographics that are visually appealing. Try it out the next time you need to present some data!

There’s been a Murder in SQL City!

by Libby Cave
Detective faces board with files, a map and pictures connected with red string.

If you are interested in data or relational databases, then you have heard of SQL. SQL, or Structured Query Language, is designed to handle structured data in order to assist in data query, data manipulation, data definition and data access control. It is a very user-friendly language to learn with a simple code structure and minimal use of special characters. Because of this, SQL is the industry standard for database management, and this is reflected in the job market as there is a strong demand for employees with SQL skills.  

Enter SQL Murder Mystery

In an effort to promote the learning of this valuable language, Knight Lab, a specialized subsidiary of Northwestern University, created SQL Murder Mystery. Combining the known benefits of gamification and the popularity of whodunit detective work, SQL Murder Mystery aims to help SQL beginners become familiar with the language and have some fun with a normally dry subject. Players take on the role of a gumshoe detective tasked with solving a murder. The problem is you have misplaced the crime scene report and you now must dive into the police department’s database to find the clues. For true beginners with no experience, the website provides a walkthrough to help get players started. More experienced learners can jump right in and practice their skills.

I’m on the case!

I have no experience with SQL but I am interested in database design and information retrieval, so I knew it was high time that I learn the basics. As a fan of both games and detective stories, SQL Murder Mystery seemed like a great place to start. Since I am a true beginner, I started with the walkthrough. As promised on the website, this walkthrough did not give me a complete, exhaustive introduction to SQL as a language, but instead gave me the tools needed to get started on the case. SQL as a language, relational databases and Entity Relationship Diagrams (ERD) were briefly explained in an approachable manner. In the walkthrough, I was introduced to vital SQL functions like “Select”, “Where”, wildcards, and “Between”. My one issue with the game was in the joining tables section. I learned later that the reason I was having issues was that the tables each had columns with the same title, which is apparently a foundational SQL feature. The guide did not explain that this could be an issue and I had to do some digging on my own to find out how to fix it. It seems like the walkthrough should have anticipated this issue and mentioned it. That aside, by the end of the walkthrough, I could join tables, search for partial information matches, and search within ranges. With some common sense, the database’s ERD, and the new SQL coding skills, I was able to solve the crime! If users weren’t challenged enough with that task, there is an additional challenge that suggests users find the accomplice while using only 2 queries.
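For anyone who runs into the same join problem, here is a minimal sketch of what was going on, using Python's built-in `sqlite3` module. The `witness` and `testimony` tables and their contents are invented for illustration and are not the game's actual database. Both tables have a column called `id`, so SQLite cannot tell which one a bare `id` refers to; prefixing each column with its table name (or a short alias) resolves the ambiguity. The working query also uses a wildcard with `LIKE` and a range with `BETWEEN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE witness (id INTEGER PRIMARY KEY, name TEXT, house_number INTEGER);
    CREATE TABLE testimony (id INTEGER PRIMARY KEY, witness_id INTEGER, transcript TEXT);
    INSERT INTO witness VALUES
        (1, 'Ada Marlowe', 120), (2, 'Ben Archer', 455), (3, 'Cy Marsh', 300);
    INSERT INTO testimony VALUES
        (10, 1, 'I heard a noise.'), (11, 3, 'I saw nothing.');
""")

# A bare "id" is ambiguous because both tables have a column with that name.
ambiguous_message = None
try:
    conn.execute("SELECT id FROM witness JOIN testimony ON witness_id = witness.id")
except sqlite3.OperationalError as error:
    ambiguous_message = str(error)
print("Without table names:", ambiguous_message)

# Qualifying each column with a table alias (w, t) removes the ambiguity.
query = """
    SELECT w.id AS witness_id, w.name, t.id AS testimony_id, t.transcript
    FROM witness AS w
    JOIN testimony AS t ON t.witness_id = w.id
    WHERE w.name LIKE '%Mar%'                  -- wildcard: partial name match
      AND w.house_number BETWEEN 100 AND 350   -- range search
    ORDER BY w.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same habit of writing `table.column` carries over to any SQL database, not just SQLite.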

User interface of SQL Murder Mystery
Example of SQL Murder Mystery user interface

The Verdict is In

I really loved this game! It served as a great introduction to a language I had never used before but still managed to be really engaging. It reminded me of those escape room mystery boxes like Hunt a Killer that have users solve puzzles to get to a larger final solution. Anyone who loves logic puzzles or mysteries will enjoy this game, even if they have no experience with or even interest in coding or databases. If you have some free time and a desire to explore a new skill, you should absolutely give SQL Murder Mystery a try!

A Different Kind of Data Cleaning: Making Your Data Visualizations Accessible

Introduction: Why Does Accessibility Matter?

Data visualizations are a fast and effective way to communicate information and are becoming an increasingly popular way for researchers to share their data with a broad audience. Because of this rising importance, it is also necessary to ensure that data visualizations are accessible to everyone. Accessible data visualizations not only help audience members who may require a screen reader or other assistive tool to read a document, but also help the creators of the visualization, as they bring the data to a much wider audience than a non-accessible visualization would. This post will offer three tips on how you can make your visualization accessible!

TIP #1: Color Selection

One of the most important choices when making a data visualization is which colors to use in the chart. One suggestion is to use a color blindness simulator to check the colors in the data visualization and experiment to find the right amount of contrast between colors. Look at the example regarding the top ice cream flavors:

A data visualization about the top flavors of ice cream. Chocolate was the top flavor (40%) followed by Vanilla (30%), Strawberry (20%), and Other (10%).

At first glance, these colors may seem acceptable to use for this kind of data. But when run through the color blindness simulator, one of the results reveals an accessibility concern:

This is the same pie chart above, but placed under a tritanopia color blindness lens. The colors used for strawberry and vanilla now look the exact same and blend into one another because of this, making it harder to discern the amount of space they take in the pie chart.

Although the colors contrasted well enough in the normal view, the colors used for the strawberry and vanilla categories look the same to those with tritanopia color blindness. The result is that these sections blend into one another, making it more difficult to distinguish their values. Most color palettes built into current data visualization software are already designed to ensure the colors contrast, but it is still good practice to check that the colors do not blend in with one another!
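
If you would like a quick numeric check to go alongside a simulator, the sketch below computes the contrast ratio between each pair of colors in a palette using the relative luminance formula from the WCAG guidelines. This is not a color blindness simulation, just a simpler brightness-contrast check. The hex colors are made-up stand-ins for the ice cream chart, and using 3:1 (the WCAG 2.1 threshold for non-text contrast) as a cutoff between neighboring pie slices is an assumption.

```python
from itertools import combinations


def relative_luminance(hex_color):
    """Relative luminance of an sRGB color such as '#F3E5AB' (WCAG formula)."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each sRGB channel to linear light before weighting.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def low_contrast_pairs(palette, threshold=3.0):
    """Return (name_a, name_b, ratio) for every pair below the threshold."""
    flagged = []
    for (name_a, color_a), (name_b, color_b) in combinations(palette.items(), 2):
        ratio = contrast_ratio(color_a, color_b)
        if ratio < threshold:
            flagged.append((name_a, name_b, ratio))
    return flagged


# Hypothetical palette, not the colors from the chart above.
palette = {
    "Chocolate": "#5C3A21",
    "Vanilla": "#F3E5AB",
    "Strawberry": "#FC5A8D",
    "Other": "#8A8A8A",
}

for name_a, name_b, ratio in low_contrast_pairs(palette):
    print(f"{name_a} vs. {name_b}: contrast ratio {ratio:.2f}")
```

Any pair that gets printed is worth a second look in a simulator, since colors that differ mainly in hue rather than brightness are the ones most likely to blend together.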

TIP #2: Adding Alt Text

Since data visualizations often appear as images in published work or reports, alt text is crucial for accessibility. Take the visualization below. If no alt text were provided, the visualization would be meaningless to those who rely on alt text to read a given document. Alt text should be short and summarize the key takeaways from the data (there is no need to describe each individual point, but it should provide enough information to describe the trends occurring in the data).

This is a chart showing the population size of each town in a given county. Towns are labeled A-E and continue to grow in population size as they go down the alphabet (town A has 1,000 people while town E has 100,000 people).
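
If your charts end up on a web page, one way to make sure alt text never gets forgotten is to generate it from the same data that feeds the chart. The sketch below is only an illustration: the function names, the file name, and the populations for towns B through D are made up (only towns A and E come from the example above).

```python
from html import escape


def build_alt_text(chart_type, subject, values):
    """Summarize the key takeaway (smallest and largest value) in one sentence."""
    smallest = min(values, key=values.get)
    largest = max(values, key=values.get)
    return (
        f"{chart_type} of {subject}. "
        f"{smallest} is the smallest at {values[smallest]:,} "
        f"and {largest} is the largest at {values[largest]:,}."
    )


def img_tag(src, alt_text):
    """Build an HTML image tag, escaping quotes so the alt attribute stays valid."""
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt_text, quote=True)}">'


# Towns A and E match the example chart; B, C, and D are invented values.
populations = {
    "Town A": 1_000,
    "Town B": 5_000,
    "Town C": 20_000,
    "Town D": 50_000,
    "Town E": 100_000,
}

alt_text = build_alt_text("Chart", "population by town", populations)
tag = img_tag("town-populations.png", alt_text)
print(tag)
```

A generated sentence like this is a starting point; it is still worth reading it over and adding the trend you most want readers to notice.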

TIP #3: Clearly Labeling Your Data

A simple but crucial component of any visualization is having clear labels on your data. Let’s look at two examples to see what makes having labels a vital aspect of any data visualization:

This is a chart for how much money was earned/spent at a lemonade stand by month. There are no y-axis labels to describe how much money is earned/spent and no key to discern the two lines that represent the money made and the money spent.

There is nothing in this graph that provides any useful information regarding the money earned or spent at the lemonade stand. How much money was earned or spent each month? What do these two lines represent? Now, look at a more clearly labeled version of the same data:

This is a cleaned version of the previous visualization regarding how much money was earned/spent at a lemonade stand. The addition of a Y-axis and key now shows that more money was spent than earned in January/February, but this changes in March, with earnings peaking in July and then falling until December, when more money is spent than earned again.

By adding a labeled Y-axis, we can now quantify the distance between the two lines at any point and have a better idea of the money earned/spent in any given month. Furthermore, the addition of a key at the bottom of the visualization distinguishes the lines, telling the audience what each represents. Because the data is clearly labeled, audience members can now interpret and analyze it properly.

Conclusion: Can My Data Still be Visually Appealing?

While it may appear that some of these recommendations detract from the creative design of data visualizations, this is not the case at all. Visual appeal is another crucial aspect of a data visualization and should be carefully considered when creating one. Accessibility concerns, however, should take priority over visual appeal. That said, accessibility in many respects encourages creativity in the design, as it makes the creator carefully consider how to present their data in a way that is both accessible and visually appealing. Thus, accessibility makes for a more creative and communicative data visualization and will benefit everyone!

Meet Our Graduate Assistants: Ryan Yoakum

In this interview series, we ask our graduate assistants questions for our readers to get to know them better. Our first interview this year is with Ryan Yoakum!

This is a headshot of Ryan Yoakum.

What is your background education and work experience?

I came to graduate school directly after receiving my bachelor’s degree in History and Religion in May 2021 here at the University of Illinois. During my undergraduate years, I took a role working for the University of Illinois Residence Hall Libraries (which was super convenient as I lived in the same building I worked in!) and absolutely loved helping patrons find resources they were interested in. I eventually took a second position with them as a processing assistant, which gave me a taste for working on the back end, as I primarily prepared newly purchased materials to be shelved at each of the libraries within the system. I really loved my work with the Residence Hall Libraries and wanted to shift my career to working in a library of some form, which has led me here today!

What are your favorite projects you’ve worked on?

I have really enjoyed projects where I have gotten to work with data (both for patrons and internal data). Such projects have allowed me to explore my growing interest in data science (which is the last thing I would have expected when I began the master’s program in August 2021). I have also really enjoyed teaching some of the Savvy Researcher workshops, which have included ones on optical character recognition (OCR) and Creative Commons licensing!

What are some of your favorite underutilized Scholarly Commons resources that you would recommend?

The two that come to mind are the software on our lab computers as well as our consultation services. If I were still in history, using ABBYY FineReader for OCR would have been a tremendous help as well as supplementing that with qualitative data analysis tools such as ATLAS.ti. I also appreciate the expertise of the many talented people who work here in the library. Carissa Phillips and Sandi Caldrone, for example, have been very influential in helping me explore my interests in data. Likewise, Wenjie Wang, JP Goguen, and Jess Hagman (all of whom now have drop-in consultation hours) have all guided me in working with software related to their specific interests, and I have benefitted greatly by bringing my questions to each of them.

When you graduate, what would your ideal job position look like?

I currently have two competing job interests in mind. The first is that I would love to work in a theological library, whether in a seminary or in an academic library focusing on religious studies. Pursuing the MSLIS has also shifted my interests toward working with data, so I would also love to work a job where I can manage, analyze, and visualize data!

What is the one thing you would want people to know about your field?

Library and information science is not a field limited to the stereotypical picture society has of a librarian’s work (there was a good satirical article recently on this). It is also far from being a dead field (and one that will likely gain more relevance over time). As part of the program, I am steadily gaining skills that prepare me for working with data, which can apply in any field. There are so many job opportunities for MSLIS students that I strongly encourage people to join the field if they are interested in library and information science but have doubts about its career prospects!

Going Down the Jane Austen Rabbit Hole

This post is part of a series for Love Data Week, which takes place February 14-18, 2022.

Written by Heidi Imker, Director of the Library Research Data Service

When you think of data, your mind probably doesn’t jump right to Pride and Prejudice. That is, unless you’re Heidi Imker, Director of the Research Data Service and amateur Jane Austen internet sleuth. “In late 2020,” Heidi says, “I was in desperate need of a post-Outlander spiritual cleanse. Naturally, I turned to Pride and Prejudice. Over a year later, I’m still in the midst of a fantastic, out-of-control Jane Austen binge, and I’ve got oodles of related resources worthy of Love Data Week.”

Join Heidi on a virtual tour of some of her favorite data resources about Austen, her works, and historical England.

  1. janeaustenr: Jane Austen’s Complete Novels

In this fabulous R package, data scientist Julia Silge used text data for the Austen novels available from the also fabulous Project Gutenberg. The package offers cleaned data, documentation, and scripts to play with and analyze the novels.

  2. Word Frequencies in English-Language Literature, 1700-1922

Randomly, sifting through the janeaustenr dataset gave me a new level of appreciation for the word “ignore.” Austen didn’t use “ignore” once in any of her novels. It turns out that no one was really using it because it hadn’t caught on yet. In fact, according to Google’s ngram viewer, “ignore” didn’t start getting traction until circa 1845. And now you might be thinking word frequency data is fun, and it is! Like this word frequencies dataset available from the HathiTrust Research Center.

  3. Napoleon Series

One of the things I learned during this binge was that dating the events in Pride and Prejudice has been a subject of debate for some time (as in, about a century). I found it downright fascinating that scholars could map parts of the book to the 1811 calendar year and others to the year 1794. I had never really thought about the characters existing in a specific year, but now I wondered what else was happening in those years? I discovered the Waterloo Association, a community of military historians behind the Napoleon Series. This immense archive contains articles on military history, biographies, and documentation of thousands of officers and soldiers (such as Challis’s Peninsula Roll Call).

  4. London Lives

London Lives provides searchable access to more than 240,000 digitized pages of archival documents, with special focus on crime, poverty, and social policy. Not only is the source material available, but the people behind London Lives have made it a point to keep humanity at the forefront by constructing biographies of the individuals caught in the crime and poverty cycle in London between 1690 and 1800.

  5. Calendar of London Concerts 1750-1800

My favorite dataset of all time, it was thoughtfully and painstakingly created by Professor Simon McVeigh at Goldsmiths, University of London over many decades. It lists 4,001 concert events, as found through locating and documenting adverts in archival newspapers, all by hand. When Lady Catherine tells Elizabeth that “it will be in my power to take one of you as far as London, for I am going there early in June, for a week,” what could that self-professed music aficionado have heard in June 1794? Voila! Perhaps it was Handel’s Messiah at St Margaret’s Church in Westminster on Thursday, June 5th.

I appreciate the Calendar of London Concerts dataset for my odd little hobby, but I love it as an information professional. The sheer dedication it took to assemble the data, especially with such strict attention to detail, is incredible. Let me explicitly gush about the documentation for a moment. Context! References! Abbreviations! All explained! What’s “HM”? His Majesty’s something or other? No, it’s the Half-Moon Tavern in Cheapside. Currency conversions! Syntax for nearly impossible to standardize programme content! It’s forty-four glorious pages! Swoon!

Related resources on London concerts

What started out as a casual, online-friendly hobby ended up introducing me to a wealth of enlightening open data resources, and I’m in love with every one of them. Since my Austen binge is apparently nowhere near over, you may well get another link-laden post for next year’s Love Data Week. <3

Headshot of Heidi

Heidi Imker is the Director of the Research Data Service (RDS) and an Associate Professor at the University of Illinois at Urbana-Champaign. The RDS helps researchers across the Urbana-Champaign campus manage and share research data, and in her role as Director, she ensures the RDS takes a collaborative, user-oriented, and practical approach to research support. Heidi holds a Ph.D. in Biochemistry from the University of Illinois and did her postdoctoral research at Harvard Medical School.