Exploring Data Visualization #10

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

A collage of images of sticky notes in different configurations from the article "stickies!"

Sticky notes in all different shapes, sizes, and colors provide a perfect medium for project planning.

1. Sometimes when you want to visualize your thinking, digital tools just don’t cut it and you have to go back to cold, hard paper. At the beginning of November, Cole Nussbaumer Knaflic at Storytelling with Data made a #SWDchallenge for readers to use sticky notes to represent their thinking and plan out a data visualization the old-fashioned way! The images that resulted from that challenge, seen in the post stickies!, are an office-supply lover’s dream. I’ve taken inspiration from these posts in my own project planning for the past month. Here’s a sneak peek of my thoughts for a sign that will be displayed in a library study space:

A piece of paper that reads "Welcome to Room 220" at the top with sticky notes stuck to the page underneath.

2. In a feature from February of this year, ZEIT ONLINE, the digital branch of German newspaper Die Zeit, showed some interesting finds from their database of approximately 450,000 street names used across Germany. They call the project Streetscapes and use it to explore important parts of German history. The street names show the legacy of political division in Germany, and the feature also notes which street names are most common and how old different street names in Berlin are.

A map of Berlin with streets highlighted in different colors based on the age of the street name.

Older street names are clearly concentrated toward the center of Berlin.

3. Google Maps updated their display this year to zoom out to a globe instead of a flat Mercator projection, noting in a tweet on August 2nd that “With 3D Globe Mode…, Greenland’s projection is no longer the size of Africa.” Adapting the shape of countries from a globe to a flat map has always been a challenge and has resulted in some confusion as to how the Earth’s geography actually looks. In the third part of a series of Story Maps about “The World’s Troubled Lands & Geopolitical Curiosities,” John Nelson outlines some of those misconceptions. In a National Geographic write-up titled “Why your mental map of the world is (probably) wrong,” Betsy Mason goes deeper into why we hold these misconceptions and why they are so hard to let go of.

The title slide of a story map with text that reads "Misconceptions Some Common Geographic Mental Misplacements..."

The story map shows which three different regions people often misplace in their minds.

I hope you enjoyed this data visualization news! If you have any data visualization questions, please feel free to email the Scholarly Commons.

Cool Text Data – Music, Law, and News!

Computational text analysis can be done in virtually any field, from biology to literature. You may use topic modeling to determine which areas are the most heavily researched in your field, or attempt to determine the author of an orphan work. Where can you find text to analyze? So many places! Read on for sources to find unique text content.

Woman with microphone

Genius – the song lyrics database

Genius started as Rap Genius, a site where rap fans could gather to annotate and analyze rap lyrics. It expanded to include other genres in 2014, and now manages a massive database covering Ariana Grande to Fleetwood Mac, and includes both lyrics and fan-submitted annotations. All of this text can be downloaded and analyzed using the Genius API. Using Genius and a text mining method, you could see how themes present in popular music changed over recent years, or understand a particular artist’s creative process.

homepage of case.law, with Ohio highlighted, 147,692 unique cases. 31 reporters. 713,568 pages scanned.

Homepage of case.law

Case.law – the case law database

The Caselaw Access Project (CAP) is a fairly recent project that is still ongoing, and publishes machine-readable text digitized from over 40,000 bound volumes of case law from the Harvard Law School Library. The earliest case is from 1658, with the most recent cases from June 2018. An API and bulk data downloads make it easy to get this text data. What can you do with huge amounts of case law? Well, for starters, you can generate a unique case law limerick:

Wheeler, and Martin McCoy.
Plaintiff moved to Illinois.
A drug represents.
Pretrial events.
Rocky was just the decoy.

Check out the rest of their gallery for more project ideas.

Newspapers and More

There are many places you can get text from digitized newspapers, both recent and historical. Some newspapers are hundreds of years old, so there can be problems with the OCR (Optical Character Recognition) that will make it difficult to get accurate results from your text analysis. Making newspaper text machine-readable requires special attention, since newspapers are printed on thin paper and have possibly been stacked up in a dusty closet for 60 years! See OCR considerations here; the newspaper text described below, however, is already machine-readable and ready for text mining. Still, as with any text mining project, you must pay close attention to the quality of your text.

The Chronicling America project sponsored by the Library of Congress contains digital copies of newspapers with machine-readable text from all over the United States and its territories, from 1690 to today. Using newspaper text data, you can analyze how topics discussed in newspapers change over time, among other things.
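As a rough illustration of tracking how often terms appear over time, here is a minimal Python sketch. It simply counts chosen words per year, which is a much simpler stand-in for topic modeling. The sample documents and the function name term_counts_by_year are invented for this example and do not come from Chronicling America or any other collection; real newspaper text would also carry OCR errors that affect the counts.

```python
from collections import Counter, defaultdict
import re


def term_counts_by_year(documents, terms):
    """Count how often each tracked term appears in each year.

    documents: iterable of (year, text) pairs.
    terms: words to track (matched case-insensitively).
    """
    wanted = {t.lower() for t in terms}
    counts = defaultdict(Counter)
    for year, text in documents:
        # Split the lowercased text into simple word tokens.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in wanted:
                counts[year][word] += 1
    # Return plain dicts, ordered by year, for easy printing.
    return {year: dict(c) for year, c in sorted(counts.items())}


# Invented sample snippets, for illustration only.
SAMPLE = [
    (1900, "The railroad strike ended; the railroad reopened."),
    (1900, "Telegraph lines were repaired."),
    (1950, "Television sales rose while railroad traffic fell."),
]

result = term_counts_by_year(SAMPLE, ["railroad", "television"])
print(result)
```

With real data, you would replace SAMPLE with (year, text) pairs drawn from the newspaper pages you have downloaded.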

newspapers being printed quickly on a rolling press

Looking for newspapers from a different region? The library has contracts with several vendors to conduct text mining, including Gale and ProQuest. Both provide newspaper text suitable for text mining, from The Daily Mail of London (Gale), to the Chinese Newspapers Collection (ProQuest). The way you access the text data itself will differ between the two vendors, and the library will certainly help you navigate the collections. See the Finding Text Data library guide for more information.

The sources mentioned above are just highlights of our text data collection! The Illinois community has access to a huge amount of text, including newspapers and primary sources, but also research articles and books! Check out the Finding Text Data library guide for a more complete list of sources. And, when you’re ready to start your text mining project, contact the Scholarly Commons (sc@library.illinois.edu), and let us help you get started!

Wikidata and Wikidata Human Gender Indicators (WHGI)

Wikipedia is a central player in online knowledge production and sharing. Since its founding in 2001, Wikipedia has been committed to open access and open editing, which has made it the most popular reference work on the web. Though students are still warned away from using Wikipedia as a source in their scholarship, it presents well-researched information in an accessible and ostensibly democratic way.

Most people know Wikipedia from its high ranking in most internet searches and tend to use it for its encyclopedic value. The Wikimedia Foundation—which runs Wikipedia—has several other projects which seek to provide free access to knowledge. Among those are Wikimedia Commons, which offers free photos; Wikiversity, which offers free educational materials; and Wikidata, which provides structured data to support the other wikis.

The Wikidata logo

Wikidata provides structured data to support Wikipedia and other Wikimedia Foundation projects

Wikidata is a great tool to study how Wikipedia is structured and what information is available through the online encyclopedia. Since it is presented as structured data, it can be analyzed quantitatively more easily than Wikipedia articles. This has led to many projects that allow users to explore data through visualizations, queries, and other means. Wikidata offers a page of Tools that can be used to analyze Wikidata more quickly and efficiently, as well as Data Access instructions for how to use data from the site.

The webpage for the Wikidata Human Gender Indicators project

The home page for the Wikidata Human Gender Indicators project

An example of a project born out of Wikidata is the Wikidata Human Gender Indicators (WHGI) project. The project uses metadata from Wikidata entries about people to analyze trends in gender disparity over time and across cultures. The project presents the raw data for download, as well as charts and an article written about the discoveries the researchers made while compiling the data. Some of the visualizations they present are confusing (perhaps they could benefit from reading our Lightning Review of Data Visualization for Success), but they succeed in conveying important trends that reveal a bias toward articles about men, as well as an interesting phenomenon surrounding celebrities. Some regions will have a better ratio of women to men biographies due to many articles being written about actresses and female musicians, which reflects cultural differences surrounding fame and gender.

Of course, like many data sources, Wikidata is not perfect. The creators of the WHGI project frequently discovered that articles did not have complete metadata related to gender or nationality, which greatly influenced their ability to analyze the trends present on Wikipedia related to those areas. Since Wikipedia and Wikidata are open to editing by anyone and are governed by practices that the community has agreed upon, it is important for Wikipedians to consider including more metadata in their articles so that researchers can use that data in new and exciting ways.

An animated gif of the Wikipedia logo bouncing like a ball

New Uses for Old Technology at the Arctic World Archive

In this era of rapid technological change, it is easy to fall into the mindset that the “big new thing” is always an improvement on the technology that came before it. Certainly this is often true, and here in the Scholarly Commons we are always seeking innovative new tools to help you out with your research. However, every now and then it’s nice to just slow down and take the time to appreciate the strengths and benefits of older technology that has largely fallen out of use.

A photo of the arctic

There is perhaps no better example of this than the Arctic World Archive, a facility on the Norwegian archipelago of Svalbard. Opened in 2017, the Arctic World Archive seeks to preserve the world’s most important cultural, political, and literary works in a way that will ensure that no manner of catastrophe, man-made or otherwise, could destroy them.

If this is all sounding familiar to you, that’s because you’ve probably heard of the Arctic World Archive’s older sibling, the Svalbard Global Seed Vault. The Global Seed Vault, the older and much better known of the two, is an archive of seeds from around the world, meant to ensure that humanity will be able to continue growing crops and making food in the event of a catastrophe that wipes out plant life.

Indeed, the two archives have a lot in common. The World Archive is housed deep within a mountain in an abandoned coal mine that once served as the location of the seed vault, and was founded to be for cultural heritage what the seed vault is for crops. But the Arctic World Archive has made innovative use of old technology, which makes it an impressive site in its own right.

A photo of the arctic

Perhaps the coolest (pun intended) aspect of the Arctic World Archive is the fact that it does not require electricity to operate. Its extreme northern location (it is near the northernmost town of at least 1,000 people in the world) means that the temperature inside the facility is naturally very cold year-round. As any archivist or rare book librarian who brings a fleece jacket to work in the summer will happily tell you, colder temperatures are ideal for preserving documents, and the ability to store items in a very cold climate without the use of electricity makes the World Archive perfect for sustainable, long-term storage.

But that’s not all: in a real blast from the past, all information stored in this facility is kept on microfilm. Now, I know what you’re thinking: “it’s the 21st century, grandpa! No one uses microfilm anymore!”

It’s true that microfilm is used by a very small minority of people nowadays, but nevertheless it offers distinct advantages that newer digital media just can’t compete with. For example, microfilm is rated to last for at least 500 years without corruption, whereas digital files may not last anywhere near that long. Beyond that, the film format means that the archive is totally independent from the internet, and will outlast any major catastrophe that disrupts part or all of our society’s web capabilities.

A photo of a seal

The Archive is still growing, but it is already home to film versions of Edvard Munch’s The Scream, Dante’s The Divine Comedy, and an assortment of government documents from many countries including Norway, Brazil, and the United States.

As it continues to grow, its importance as a place of safekeeping for the world’s cultural heritage will hopefully serve as a reminder that sometimes, older technology has upsides that new tech just can’t compete with.

Exploring Data Visualization #9

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

Map of election districts colored red or blue based on predicted 2018 midterm election outcome

This map breaks down likely outcomes of the 2018 Midterm elections by district.

Seniors at Montgomery Blair High School in Silver Spring, Maryland, created the ORACLE of Blair 2018 House Election Forecast, a website that hosts visualizations that predict outcomes for the 2018 Midterm Elections. In addition to breakdowns of voting outcome by state and district, the students compiled descriptions of how each district has voted historically and which stances are important for current candidates. How well do these predictions match up with the results from Tuesday?

A chart showing price changes for 15 items from 1998 to 2018

This chart shows price changes over the last 20 years. It gives the impression that these price changes are always steady, but that isn’t the case for all products.

Lisa Rost at Datawrapper created a chart, building on the work of Olivier Ballou, that shows the change in the price of goods using the Consumer Price Index. She provides detailed coverage of how her chart is put together, and makes clear what is missing from both her chart and Ballou’s because of which products were chosen for the graph. This behind-the-scenes information provides useful advice for how to read and design charts that are clear and informative.

An image showing a scale of scientific visualizations from figurative on the left to abstract on the right.

There are a lot of ways to make scientific research accessible through data visualization.

Visualization isn’t just charts and graphs—it’s all manner of visual objects that contribute information to a piece. Jen Christiansen, the Senior Graphics Editor at Scientific American, knows this well, and her blog post “Visualizing Science: Illustration and Beyond” on Scientific American covers some key elements of what it takes to make engaging and clear scientific graphics and visualizations. She shares lessons learned at all levels of creating visualizations, as well as covering a few ways to visualize uncertainty and the unknown.

I hope you enjoyed this data visualization news! If you have any data visualization questions, please feel free to email the Scholarly Commons.

Election Forecasts and the Importance of Good Data Visualization

In the wake of the 2016 presidential election, many people, on the left and right alike, came together on the internet to express a united sentiment: that the media had called the election wrong. In particular, one man may have received the brunt of this negative attention. Nate Silver and his website FiveThirtyEight have taken nearly endless flak from disgruntled Twitter users over the past two years for their forecast which gave Hillary Clinton a 71.4% chance of winning.

However, as Nate Silver has argued in many articles and tweets, he did not call the race “wrong” at all; everyone else just misinterpreted his forecast. So what really happened? How could Nate Silver say that he wasn’t wrong when so many believe to this day that he was? As believers in good data visualization practice, we here in the Scholarly Commons can tell you that if everyone interprets your data to mean one thing when you really meant it to convey something else entirely, your visualization may be the problem.

Today is Election Day, and once again, FiveThirtyEight has new models out forecasting the various House, Senate, and Governors races on the ballot. However, these models look quite a bit different from 2016’s, and in those differences lie some important data viz lessons. Let’s dive in and see what we can see!

The image above is a screenshot taken from the very top of the page for FiveThirtyEight’s 2016 Presidential Election Forecast, which was last updated on the morning of Election Day 2016. The image shows a bar across the top, filled in blue 71.4% of the way, to represent Clinton’s chance of winning, and red the rest of the 28.6% to represent Trump’s chance of winning. Below this bar is a map of the fifty states, colored from dark red to light red to light blue to dark blue, representative of the percentage chance that each state goes for one of the two candidates.

The model also allows you to get a sense of where exactly each state stands, by hovering your cursor over a particular state. In the above example, we can see a bar similar to the one at the top of the national forecast which shows Clinton’s 55.1% chance of winning Florida.

The top line of FiveThirtyEight’s 2018 predictions looks quite a bit different. When you open the House or Senate forecasts, the first thing you see is a bell curve, not a map, as exemplified by the image of the House forecast below.

At first glance, this image may be more difficult to take in than a simple map, but it actually contains a lot of information that is essential to anyone hoping to get a sense of where the election stands. First, the top-line likelihood of each party taking control is expressed as a fraction, rather than as a percent. The reasoning behind this is that some feel that the percent bar from the 2016 model improperly gave the sense that Clinton’s win was a sure thing. The editors at FiveThirtyEight hope that fractions will do a better job than percentages at conveying that the forecasted outcome is not a sure thing.

Beyond this, the bell curve shows forecasted percentage chances for every possible outcome (for example, at the time of writing, there is a 2.8% chance that Democrats gain 37 seats, a 1.6% chance that Democrats gain 20 seats, a <0.1% chance that Democrats gain 97 seats, and a <0.1% chance that Republicans gain 12 seats). This visualization shows the inner workings of how the model makes its prediction. Importantly, it drives home the idea that any result could happen even if one end result is considered more likely. What’s more, the model features a gray rectangle, centered around the average result, that highlights the middle 80% of the forecast: there is an 80% chance that the result will be between a Democratic gain of 20 seats (meaning Republicans would hold the House) and a Democratic gain of 54 (a so-called “blue wave”).
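To make the idea of a “middle 80%” band concrete, here is a minimal Python sketch of how such an interval can be read off a set of simulated outcomes. This is not FiveThirtyEight’s model: the simulated seat gains are invented (drawn from a bell curve with a hypothetical center of 37 and spread of 13), and the function names are made up for illustration.

```python
import random


def middle_interval(outcomes, coverage=0.80):
    """Return (low, high) bounds enclosing the central `coverage` share of outcomes."""
    ordered = sorted(outcomes)
    tail = (1.0 - coverage) / 2.0  # share left out on each side
    last = len(ordered) - 1
    low_index = round(tail * last)
    high_index = round((1.0 - tail) * last)
    return ordered[low_index], ordered[high_index]


def outcome_share(outcomes, value):
    """Share of simulations that landed exactly on `value`."""
    return sum(1 for o in outcomes if o == value) / len(outcomes)


# Invented simulation: 10,000 draws of a seat gain. The center and spread
# are hypothetical and chosen only to produce a bell-shaped distribution.
random.seed(0)
simulated_gains = [round(random.gauss(37, 13)) for _ in range(10_000)]

low, high = middle_interval(simulated_gains)
print(f"Middle 80% of simulations: a gain of {low} to {high} seats")
print(f"Share of simulations with a gain of exactly 37: {outcome_share(simulated_gains, 37):.1%}")
```

The gray rectangle on a forecast plays the same role as the (low, high) pair here, while each bar of the bell curve corresponds to the share of simulations that landed on one exact outcome.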

The 2018 models do feature maps as well, such as the above map for the Governors forecast. But some distinct changes have been made. First, you have to scroll down to get to the map, hopefully absorbing some important information from the graphs at the top in the meantime. Most prominently, FiveThirtyEight has re-thought the color palette they are using. Whereas the 2016 forecast only featured shades of red and blue, this year the models use gray (House) and white (Senate and Governors) to represent toss-ups and races that only slightly lean one way or the other. If this color scheme had been used in 2016, North Carolina and Florida, both states that ended up going for Trump but were colored blue on the map, would have been much more accurately depicted not as “blue states” but as toss-ups.

Once again, hovering over a state or district gives you a detail of the forecast for that place in particular, but FiveThirtyEight has improved that as well.

Here we can see much more information than was provided in the hover-over function for the 2016 map. Perhaps most importantly, this screen shows us the forecasted vote share for each candidate, including the average, high, and low ends of the prediction. So for example, from the above screenshot for Illinois’ 13th Congressional District (home to the University of Illinois!) we can see that Rodney Davis is projected to win, but there is a very real scenario in which Betsy Dirksen Londrigan ends up beating him.

FiveThirtyEight did not significantly change how their models make predictions between 2016 and this year. The data itself is treated in roughly the same way. But as we can see from these comparisons, the way that this data is presented can make a big difference in terms of how we interpret it. 

Will these efforts at better data visualization be enough to deter angry reactions to how the model correlates with actual election results? We’ll just have to tune in to the replies on Nate Silver’s Twitter account tomorrow morning to find out… In the meantime, check out their House, Senate, and Governors forecasts for yourself!

All screenshots taken from fivethirtyeight.com. Images of the 2016 models reflect the “Polls-only” forecast. Images of the 2018 models reflect the “Classic” forecasts as of the end of the day on November 5th, 2018.

Lightning Review: Data Visualization for Success

Data visualization is where the humanities and sciences meet: viewers are dazzled by the presentation yet informed by research. Lovingly referred to as “the poster child of interdisciplinarity” by Steven Braun, data visualization brings these two fields closer together than ever to help provide insights that may have been impossible without the other. In his book Data Visualization for Success, Braun sits down with forty designers with experience in the field to discuss their approaches to data visualization, common techniques in their work, and tips for beginners.

Braun’s collection of interviews provides an accessible introduction into data visualization. Not only is the book filled with rich images, but each interview is short and meant to offer an individual’s perspective on their own work and the field at large. Each interview begins with a general question about data visualization to contribute to the perpetual debate of what data visualization is and can be moving forward.

Picture of Braun's "Data Visualization for Success"

Antonio Farach, one of the designers interviewed in the book, calls data visualization “the future of storytelling.” And when you see his work – or really any of the work in this book – you can see why. Each new image has an immediate draw, but it is impossible to move past without exploring a rich narrative. Visualizations in this book cover topics ranging from soccer matches to classic literature, economic disparities, selfie culture, and beyond.

Each interview ends by asking the designer for their advice to beginners, which not only invites new scholars and designers to participate in the field but also dispels any doubt of the hard work put in by these designers or the science at the root of it all. However, Barbara Hahn and Christine Zimmermann of Hahn+Zimmermann may have put it best: “Data visualization is not making boring data look fancy and interesting. Data visualization is about communicating specific content and giving equal weight to information and aesthetics.”

A leisurely, stunning, yet informative read, Data Visualization for Success offers anyone interested in this explosive field an insider’s look from voices around the world. Drop by the Scholarly Commons during our regular hours to flip through this wonderful read.

And finally, if you have any further interest in data visualization, make sure you stay up to date on our Exploring Data Visualization series or take a look at what services the Scholarly Commons provides!

An Introduction to Google MyMaps

Geographic information systems (GIS) are a fantastic way to visualize spatial data. As any student of geography will happily explain, a well-designed map can tell compelling stories with data which could not be expressed through any other format. Unfortunately, traditional GIS programs such as ArcGIS and QGIS are incredibly inaccessible to people who aren’t willing or able to take a class on the software or at least dedicate significant time to self-guided learning.

Luckily, there’s a lower-key option for some simple geospatial visualizations that’s free to use for anybody with a Google account. Google MyMaps cannot do most of the things that ArcMap can, but it’s really good at the small number of things it does set out to do. Best of all, it’s easy!

How easy, you ask? Well, just about as easy as filling out a spreadsheet! In fact, that’s exactly where you should start. After logging into your Google Drive account, open a new spreadsheet in Sheets. In order to have a functioning end product you’ll want at least two columns. One of these columns will be the name of the place you are identifying on the map, and the other will be its location. Column order doesn’t matter here; you’ll get the chance later to tell MyMaps which column is supposed to do what. Locations can be as specific or as broad as you’d like. For example, you could input a location like “Canada” or “India,” or you could choose to input “1408 W. Gregory Drive, Urbana, IL 61801.” The catch is that each location is only represented by a marker indicating a single point. So if you choose a specific address, like the one above, the marker will indicate the location of that address. But if you choose a country or a state, you will end up with a marker located somewhere over the center of that area.

So, let’s say you want to make a map showing the locations of all of the libraries on the University of Illinois’ campus. Your spreadsheet would look something like this:

Sample spreadsheet
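If you would rather generate the two-column table with a script than type it by hand, something like the following Python sketch would do it. This is an assumption-laden alternative to the Sheets workflow described here, not part of it: the marker names are invented placeholders, the locations are the examples mentioned earlier in this post, and the resulting CSV text would still need to be saved to a file and opened in Sheets before importing.

```python
import csv
import io

# Hypothetical rows: the marker names are invented for illustration;
# the locations are the examples mentioned earlier in the post.
ROWS = [
    ("Broad location example", "Canada"),
    ("Another broad location", "India"),
    ("Street address example", "1408 W. Gregory Drive, Urbana, IL 61801"),
]


def build_csv(rows, header=("Name", "Location")):
    """Return CSV text with one header row and one row per place."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_csv(ROWS)
print(csv_text)
```

Note that the csv module wraps the street address in quotation marks because it contains commas; this keeps the whole address in a single Location cell rather than splitting it across several columns.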

Once you’ve finished compiling your spreadsheet, it’s time to actually make your map. You can access the Google MyMaps page by going to www.google.com/mymaps. From here, simply select “Create a New Map” and you’ll be taken to a page that looks suspiciously similar to Google Maps. In the top left corner, where you might be used to typing in directions to the nearest Starbucks, there’s a window that allows you to name your map and import a spreadsheet. Click on “Import,” and navigate through Google Drive to wherever you saved your spreadsheet.

When you are asked to “Choose columns to position your placemarks,” select whatever column you used for your locations. Then select the other column when you’re prompted to “Choose a column to title your markers.” Voila! You have a map. Mine looks like this:  

Michael's GoogleMyMap

At this point you may be thinking to yourself, “that’s great, but how useful can a bunch of points on a map really be?” That’s a great question! This ultra-simple geospatial visualization may not seem like much. But it actually has a range of uses. For one, this type of visualization is excellent at giving viewers a sense of how geographically concentrated a certain type of place is. As an example, say you were wondering whether it’s true that most of the best universities in the U.S. are located in the Northeast. Google MyMaps can help with that!

Map of best universities in the United States

This map, made using the same instructions detailed above, is based on the U.S. News and World Report’s 2019 Best Universities Ranking. Based on the map, it does in fact appear that more of the nation’s top 25 universities are located in the northeastern part of the country than anywhere else, while the West (with the notable exception of California) is almost entirely absent.

This is only the beginning of what Google MyMaps can do: play around with the options and you’ll soon learn how to color-code the points on your map, add labels, and even totally change the appearance of the underlying base map. Check back in a few weeks for another tutorial on some more advanced things you can do with Google MyMaps!

Try it yourself!

Exploring Data Visualization #8

Note from Megan Ozeran, Data Analytics & Visualization Librarian: It’s been a pleasure sharing data visualization news with you over the last seven months. Now, I am excited to announce that one of our awesome Graduate Assistants, Xena Becker, will oversee the Exploring Data Visualization series. Take it away, Xena!

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

Bar graph of energy consumption

From Chartable

1) The amount of energy required to make electronic devices and information centers run on a daily basis is significant, but just how much energy is used worldwide? Lisa Charlotte Rost used the principles from Alberto Cairo’s The Truthful Art to design and explain the choices behind a chart showing worldwide IT energy consumption.

Line graph showing income inequality over time

Junk Charts breaks down what works and what doesn’t in this graphic

2) Crazy Rich Asians was a box office hit this summer, gaining attention for its opulent set design and for being the first film to feature Asians and Asian Americans in most of the leading, directing, and other production roles since the 1990s. The New York Times used the opening of the film to write a report on Asian immigration and wealth disparity in the United States. Junk Charts wrote up a breakdown of the data visualizations used in the report, noting what the NYTimes did well and what areas could be improved in their representations.

A portion of a graphic that uses colored bars to indicate whether Brett Kavanaugh and Dr. Christine Blasey Ford answered the questions they were asked during the Senate confirmation hearing for Brett Kavanaugh.

The graphic incorporates the transcript to indicate what questions were answered or left unanswered.

3) The news cycle has been dominated by the confirmation hearing of Supreme Court nominee Brett Kavanaugh, who has been accused of sexual assault by Dr. Christine Blasey Ford. Vox created a simple but impactful chart that shows every time Ford or Kavanaugh answered (or did not answer) the question they had been asked.

I hope you enjoyed this data visualization news! If you have any data visualization questions, please feel free to email the Scholarly Commons.

Analyze and Visualize Your Humanities Data with Palladio

How do you make sense of hundreds of years of handwritten scholarly correspondence? Humanists at Stanford University had the same question, and developed the project Mapping the Republic of Letters to answer it. The project maps scholarly social networks in a time when exchanging ideas meant waiting months for a letter to arrive from across the Atlantic, not mere seconds for a tweet to show up in your feed. The tools used in this project inspired the Humanities + Design lab at Stanford University to create a set of free tools specifically designed for historical data, which can be multi-dimensional and not suitable for analysis with statistical software. Enter Palladio!

To start mapping connections in Palladio, you first need some structured, tabular data. A spreadsheet saved in CSV format, with data that is categorized and sorted, is sufficient. Once you have your data, just upload it and get analyzing. Palladio likes data about two types of things: people and places. The sample data Palladio provides is information about influential people who visited or were otherwise connected with the itty bitty country of Monaco. Read on for some cool things you can do with historical data.
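If your data lives in code or notes rather than a spreadsheet, here is a minimal sketch of producing that kind of CSV with Python's standard library. The column names and records are hypothetical placeholders, not the columns from Palladio's sample data, so adapt them to whatever your own dataset contains.

```python
import csv
import io

# Hypothetical records: the column names and values are illustrative only
# and are not taken from Palladio's sample data.
people = [
    {"name": "Person A", "birthplace": "City X", "birth_date": "1850-03-01"},
    {"name": "Person B", "birthplace": "City Y", "birth_date": "1862-11-15"},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(people)
print(csv_text)
```

One row per person and one column per attribute is the "structured, tabular" shape described above.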

Mapping

Use the Map feature to mark coordinates and connections between them. Using the sample data that HD Lab provided, I created the map below, which shows birthplaces and arrival points. Hovering over the connection shows you the direction of the move. You can also switch the base map to a standard style such as satellite or terrain, or even to bare land masses with no human-created geography, like roads or place names.

Map of Mediterranean sea and surrounding lands of Europe, red lines across map show movement, all end in Monaco

One person in our dataset was born in Galicia, and later arrived in Monaco.

But what if you want to combine this new-fangled spatial analysis with something actually historic? You’re in luck! Palladio allows you to use other maps as bases, provided that the map has been georeferenced (assigned coordinates based on locations represented on the image). The New York Public Library’s Map Warper offers a collection of georeferenced maps. Now you can show movement on a map that’s actually from the time period you’re studying!

Same red lines across map as above, but image of map itself is a historical map

The same birthplace to arrival point data, but now with an older map!

Network Graphs

Perhaps the connections you want to see, like those between people, don’t make sense on a map. This is where the Graph feature comes in. Graph allows you to create network visualizations based on different facets of your data. In general, network graphs display relationships between entities, and work best if all your nodes (dots) are the same type of information. They are especially useful to show connections between people, but our sample data doesn’t have that information. Instead, we can visualize our people’s occupations by gender.

network graph shows connections between peoples' occupations and their gender

Most occupations have both males and females, but only males are Monegasque, Author, Gambler, or Journalist, and only females are Aristocracy or Spouse.
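To see what a graph like this is built from, here is a rough sketch in Python that turns rows of people into occupation-to-gender links and counts them. The records and field names are made up for illustration, and this is only one way to think about the underlying data, not a description of how Palladio works internally.

```python
from collections import Counter, defaultdict

# Hypothetical rows; the field names and values are assumptions for illustration.
people = [
    {"occupation": "Author", "gender": "Male"},
    {"occupation": "Aristocrat", "gender": "Female"},
    {"occupation": "Author", "gender": "Male"},
    {"occupation": "Artist", "gender": "Female"},
    {"occupation": "Artist", "gender": "Male"},
]


def build_edges(rows, source, target):
    """Count how many rows connect each (source value, target value) pair."""
    return Counter((row[source], row[target]) for row in rows)


edges = build_edges(people, "occupation", "gender")

# Group the genders seen for each occupation to spot single-gender occupations.
genders_by_occupation = defaultdict(set)
for occupation, gender in edges:
    genders_by_occupation[occupation].add(gender)

male_only = sorted(
    occupation
    for occupation, genders in genders_by_occupation.items()
    if genders == {"Male"}
)
print(edges)
print(male_only)
```

Each distinct pair becomes a line in the graph, and each distinct value becomes a node.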

The network graph makes it especially visible that there are some slight inconsistencies in the data; at least one person has “Aristocracy” as an occupation, while others have “Aristocrat.” Cleaning and standardizing your data is key! That sounds like a job for…OpenRefine!
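OpenRefine handles this kind of cleanup interactively, but the idea fits in a few lines of Python too. The sketch below assumes, purely for illustration, that "Aristocracy" and "Aristocrat" should be merged into one label; the list of values and the mapping are hypothetical and you would build your own from your data.

```python
from collections import Counter

# Hypothetical occupation values with the kind of inconsistency described above.
occupations = ["Aristocrat", "Aristocracy", "Author", " aristocrat ", "Gambler"]

# Assumed mapping from lowercase variants to one canonical label.
CANONICAL = {
    "aristocracy": "Aristocrat",
    "aristocrat": "Aristocrat",
}


def standardize(label):
    """Trim whitespace and replace known variants with the canonical label."""
    cleaned = label.strip()
    return CANONICAL.get(cleaned.lower(), cleaned)


cleaned_occupations = [standardize(label) for label in occupations]
counts = Counter(cleaned_occupations)
print(counts)
```

Counting the labels before and after cleaning is a quick way to confirm that the variants really collapsed into one node.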

Timelines

All of the tools in Palladio have the same Timeline functionality. This basically allows you to filter the data used in your visualization by a date, whether that’s birthdate, date of death, publication date, or whatever timey wimey stuff you have in your dataset. Other types of data can be filtered using the Facet function, right next to the Timeline. Play around with filtering, and watch your visualization change.
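Conceptually, a timeline filter just keeps the rows whose date falls inside a chosen range. Here is a small sketch of that idea in Python; the records, the birth_date field, and the filter_by_date helper are all hypothetical, and the dates are assumed to be in ISO format (YYYY-MM-DD).

```python
from datetime import date

# Hypothetical records; names, field name, and dates are illustrative only.
records = [
    {"name": "Person A", "birth_date": "1850-03-01"},
    {"name": "Person B", "birth_date": "1862-11-15"},
    {"name": "Person C", "birth_date": "1901-06-30"},
]


def filter_by_date(rows, field, start, end):
    """Keep rows whose ISO-formatted date in `field` lies within [start, end]."""
    return [
        row for row in rows
        if start <= date.fromisoformat(row[field]) <= end
    ]


nineteenth_century = filter_by_date(
    records, "birth_date", date(1801, 1, 1), date(1900, 12, 31)
)
print([row["name"] for row in nineteenth_century])
```

Changing the start and end dates and rerunning mirrors dragging the ends of a timeline selection.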

Try Palladio today! If you need more direction, check out this step-by-step tutorial by Miriam Posner. The tutorial is a few years old and the interface has changed slightly, so don’t panic if the buttons look different!

Did you create something cool in Palladio? Post a comment below, or tell us about it on Twitter!