Introducing the Illinois Open Publishing Network: Digital Publishing from the University of Illinois Library

The face of scholarly publishing is changing, and libraries are taking on the role of publisher for many scholarly publications, including those that don’t fit the mold of traditional presses. Initiatives at the University of Illinois at Urbana-Champaign are responding to advances in digital publishing, growing momentum for open access research, and the need for sustainable publishing models. This year alone, the Illinois Open Publishing Network (IOPN) has released five new open-access multi-modal scholarly publications. IOPN is a network of publications and publishing initiatives hosted at the University Library that works towards high-quality open-access scholarship in digital media. IOPN assists authors with a host of publishing services, including copyright and peer review, and helps them learn the publishing tools and plan their publications in what for many is a new mode of writing.


Exploring Data Visualization #18

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

Painting the World with Water

Creating weather predictions is a complex task that requires global collaboration and advanced scientific technologies. Most people know very little about how a weather prediction is put together and what is required to make it possible. NASA gives us a glimpse into the complexities of finding out whether it’s going to rain or snow anywhere in the world.


Lightning Review: The GIS Guide to Public Domain Data

One of the first challenges encountered by anyone seeking to start a new GIS project is where to find good, high-quality geospatial data. The field of geographic information science has a bit of a problem: there are simultaneously too many possible data sources for any one researcher to be familiar with all of them and too few resources available to help you navigate them. Luckily, The GIS Guide to Public Domain Data is here to help!

The front cover of the book "The GIS Guide to Public Domain Data" by Joseph J. Kerski and Jill Clark.

Exploring Data Visualization #17

In this monthly series, I share a combination of cool data visualizations, useful tools and resources, and other visualization miscellany. The field of data visualization is full of experts who publish insights in books and on blogs, and I’ll be using this series to introduce you to a few of them. You can find previous posts by looking at the Exploring Data Visualization tag.

The unspoken rules of visualization

Title header of essay "The unspoken rules of data visualization" by Kaiser Fung. White text on a black background with green and red patches.

Using Article Citations to Find Data for Social Science

Whether we like it or not, using quantitative measures in social science research has become increasingly important for getting your work published and recognized. If you’ve never used data before and don’t even know where to start, this can seem a little daunting. The good news is that you most likely won’t have to collect your own data. There is so much data already out there; the hard part can be finding it. In this post I will explain one strategy for finding social science data: using article citations.

Looney Tunes' Wile E. Coyote searching a landscape with binoculars

You don’t have to look too far to find the right data


Stata vs. R vs. SPSS for Data Analysis

As you do research with larger amounts of data, it becomes necessary to graduate from doing your data analysis in Excel and find more powerful software. This can seem like a really daunting task, especially if you have never attempted to analyze big data before. There are a number of data analysis software packages out there, but it is not always clear which one will work best for your research. The nature of your research data, your technological expertise, and your own personal preferences are all going to play a role in which software will work best for you. In this post I will explain the pros and cons of Stata, R, and SPSS with regard to quantitative data analysis and provide links to additional resources. Every data analysis program I talk about in this post is available to University of Illinois students, faculty, and staff through the Scholarly Commons computers, and you can schedule a consultation with CITL if you have specific questions.

Short video loop of a kid sitting at a computer and putting on sun glasses

Rock your research with the right tools!


Stata

Stata logo. Blue block lettering spelling out Stata.

Among researchers, Stata is often credited as the most user-friendly data analysis software. Stata is popular in the social sciences, particularly economics and political science. It is a complete, integrated statistical software package, meaning it can accomplish pretty much any statistical task you need it to, including visualizations. It has both a point-and-click user interface and a command line with easy-to-learn command syntax. Furthermore, it has a system for version control in place, so you can save syntax from certain jobs into a “do-file” to refer to later. However, Stata is not free to have on your personal computer. Unlike an open-source program, you cannot program your own functions into Stata, so you are limited to the functions it already supports. Finally, its functions are limited to numeric or categorical data; it cannot analyze spatial data and certain other types.

 

Pros

  • User friendly and easy to learn
  • Version control
  • Many free online resources for learning

Cons

  • An individual license can cost between $125 and $425 annually
  • Limited to certain types of data
  • You cannot program new functions into Stata

Additional resources:


R

R logo. Blue capital letter R wrapped with a gray oval.

R and its graphical user interface companion RStudio are incredibly popular for a number of reasons. The first and probably most important is that R is free, open-source software that is compatible with any operating system. As such, there is a strong and loyal community of users who share their work and advice online. It has the same features as Stata, such as a point-and-click user interface, a command line, savable files, and strong data analysis and visualization capabilities. It also has some capabilities Stata does not, because users with more technical expertise can program new functions in R to use it for different types of data and projects. The problem a lot of people run into with R is that it is not easy to learn. The programming language it operates on is not intuitive, and it is prone to errors. Despite this steep learning curve, there is an abundance of free online resources for learning R.

Pros

  • Free open-source software
  • Strong online user community
  • Programmable with more functions for data analysis

Cons

  • Steep learning curve
  • Can be slow

Additional Resources:

  • Introduction to R Library Guide: Find valuable overviews and tutorials on this guide published by the University of Illinois Library.
  • Quick-R by DataCamp: This website offers tutorials and examples of syntax for a whole host of data analysis functions in R, covering everything from installing the package to advanced data visualizations.
  • Learn R on Codecademy: A free self-paced online class for learning to use R for data science and beyond.
  • Nabble forum: A forum where individuals can ask specific questions about using R and get answers from the user community.

SPSS

SPSS logo. Red background with white block lettering spelling SPSS.

SPSS is an IBM product that is used for quantitative data analysis. It does not have a command line feature but rather has a user interface that is entirely point-and-click and somewhat resembles Microsoft Excel. Although it looks a lot like Excel, it can handle larger data sets faster and with more ease. One of the main complaints about SPSS is that it is prohibitively expensive to use, with individual packages ranging from $1,290 to $8,540 a year. To make up for how expensive it is, it is incredibly easy to learn. As a non-technical person, I learned how to use it in under an hour by following an online tutorial from the University of Illinois Library. However, my take on this software is that unless you really need a more powerful tool, you should just stick to Excel. The two are too similar to justify seeking out this specialized software.

Pros

  • Quick and easy to learn
  • Can handle large amounts of data
  • Great user interface

Cons

  • By far the most expensive
  • Limited functionality
  • Very similar to Excel

Additional Resources:

Gif of Kermit the frog dancing and flailing his arms with the words "Yay Statistics" in block letters above

Thanks for reading! Let us know in the comments if you have any thoughts or questions about any of these data analysis software programs. We love hearing from our readers!

 

Featured Resource: QGIS, a Free, Open Source Mapping Platform

This week, geographers around the globe took some time to celebrate the software that allows them to analyze, well, that very same globe. November 13th marked the 20th annual GIS Day, an “international celebration of geographic information systems,” as the official GIS Day website puts it.

the words "GIS day" in a stylized font appear below a graphic of a globe with features including buildings, trees, and water

But while GIS technology has revolutionized the way we analyze and visualize maps over the past two decades, the high cost of ArcGIS products, long recognized as the gold standard for cartographic analysis tools, is enough to deter many people from using them. At the University of Illinois and other colleges and universities, access to ArcGIS can be taken for granted, but many of us will not remain in the academic world forever. Luckily, there’s a high-quality alternative to ArcGIS for those who want the benefits of mapping software without the price tag!

the QGIS logo

QGIS is free, open source mapping software that has most of the same functionality as ArcGIS. While some more advanced features included in ArcGIS do not have analogues in QGIS, developers are continually updating the software and new features are always being added. As it stands now, though, QGIS includes everything that the casual GIS practitioner could want, along with almost everything more advanced users need.

As is often the case with open source software alternatives, QGIS has a large, vibrant community of supporters, and its developers have put together tons of documentation on how to use the program, such as this user guide. Generally speaking, if you have any experience with ArcGIS it’s very easy to learn QGIS—for a picture of the learning curve, think somewhere along the lines of switching from Microsoft Word to Google Docs. And if you don’t have experience, the community is there to help! There are many guides to getting started, including the one listed in the above link, and more forum posts of users working through questions together than anyone could read in a lifetime. 

For more help, stop by to take a look at one of the QGIS guidebooks in our reference collection, or send us an email at sc@library.illinois.edu!

Have you made an interesting map in QGIS? Send us pictures of your creations on Twitter @ScholCommons!

 

Exploring Data Visualization #16

Daylight Saving Time Gripe Assistant Tool

Clocks fell back this weekend, which means the internet returns once again to the debate of whether or not we still need Daylight Saving Time. Andy Woodruff, a cartographer for Axis Maps, created a handy tool for determining how much you can complain about the time change. You input your ideal sunset and sunrise times, select whether the sunset or sunrise time you chose is more important, and the tool generates a map that shows whether DST should be gotten rid of, used year-round, or if no changes need to be made based on where you live. The difference a half hour makes is surprising for some of the maps, making this a fun data viz to play around with and examine your own gripes with DST.

A map of the United States with different regions shaded in different colors to represent if they should keep (gray) or get rid of (gold) changing the clocks for Daylight Saving Time. Blue represents areas that should always use Daylight Saving Time.

This shows an ideal sunrise of 7:00 am and an ideal sunset of 6:00 pm.
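For readers curious how a comparison like this might work for a single location, here is a minimal Python sketch. It is not Woodruff's code, and the scoring rule is an assumption: a day counts as "good" when the sun is up by your ideal sunrise time and still up at your ideal sunset time, and ties are broken by whichever of the two you said matters more. The function name `best_policy` and the toy day lengths are made up for illustration.

```python
def best_policy(days, ideal_sunrise, ideal_sunset, prefer="sunset"):
    """Compare three clock policies for one location (hypothetical scoring).

    days: list of (sunrise, sunset, dst_active) tuples, with times given as
          minutes after midnight in standard time.
    ideal_sunrise / ideal_sunset: minutes after midnight on the wall clock.
    prefer: "sunset" or "sunrise", used only to break ties.
    """
    # Each policy maps "is DST normally active today?" to a clock shift in minutes.
    policies = {
        "abolish DST": lambda dst: 0,
        "year-round DST": lambda dst: 60,
        "keep switching": lambda dst: 60 if dst else 0,
    }
    scores = {}
    for name, shift_for in policies.items():
        both = preferred = 0
        for sunrise, sunset, dst in days:
            shift = shift_for(dst)
            sunrise_ok = sunrise + shift <= ideal_sunrise  # sun is up in time
            sunset_ok = sunset + shift >= ideal_sunset     # sun is still up
            both += sunrise_ok and sunset_ok
            preferred += sunset_ok if prefer == "sunset" else sunrise_ok
        scores[name] = (both, preferred)
    # Tuples compare element by element, so "both" wins first, then "preferred".
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy year: 100 short winter days and 100 long summer days (invented values).
    toy_days = [(390, 1035, False)] * 100 + [(270, 1170, True)] * 100
    winner, scores = best_policy(toy_days, ideal_sunrise=7 * 60, ideal_sunset=18 * 60)
    print(winner, scores)
```

The real tool repeats a judgment like this across the whole map, which is why shifting your ideal times by half an hour can recolor entire regions.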

Laughing Online

Conveying tone through text can be stressful: finding the right balance of friendly and assertive in a text is a delicate operation that involves word choice and punctuation equally. Often, we make our text more friendly through exclamation points! Or by adding a quick laugh, haha. The Pudding took note of how varied our use of text-based laughs can be and put together a visual essay on how often we use different laughs and whether all of them actually mean we are “laughing out loud.” The most common laugh on Reddit is “lol,” while “hehe,” “jaja,” and “i’m laughing” are much less popular expressions of mirth.

A proportional area chart showing which text laughs are most used on Reddit.

“ha” is the expression most likely to be used to indicate fake laughter or hostility
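If you want to try a small version of this kind of count on your own text, the Python sketch below tallies a few laugh styles in a list of comments. It is only an illustration under stated assumptions: the patterns in `LAUGH_PATTERNS` and the function name `count_laughs` are hypothetical, and The Pudding's actual matching rules are not described here and are likely more thorough.

```python
import re
from collections import Counter

# Hypothetical patterns for a handful of laugh styles; stretched forms such
# as "lool" or "hahaha" are folded into their base label.
LAUGH_PATTERNS = {
    "lol": re.compile(r"\blo+l+\b"),
    "haha": re.compile(r"\b(?:ha){2,}h?\b"),
    "hehe": re.compile(r"\b(?:he){2,}h?\b"),
    "jaja": re.compile(r"\b(?:ja){2,}\b"),
    "ha": re.compile(r"\bha\b"),  # a lone "ha", not part of "haha"
}


def count_laughs(comments):
    """Return a Counter mapping each laugh label to its number of occurrences."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        for label, pattern in LAUGH_PATTERNS.items():
            counts[label] += len(pattern.findall(text))
    return counts


if __name__ == "__main__":
    sample = ["lol that was great", "Hahaha no", "ha. sure.", "jajaja lool"]
    for label, n in count_laughs(sample).most_common():
        print(label, n)
```

Word boundaries (`\b`) keep the patterns from firing inside ordinary words such as "that" or "shehe".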

how to do it in Excel: a shaded range

Here’s a quick tip for making more complex graphs using Excel! Storytelling with Data’s Elizabeth Ricks put together a great how-to article on making Excel show a shaded range on a graph. This method involves some “brute force” to make Excel’s functions work in your favor, but results in a clean chart that shows a shaded range rather than a cluster of multiple lines.

A shaded area chart in Excel
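One common way to get a shaded band in a spreadsheet, which may or may not match the linked article step for step, is to stack two area series: an invisible "floor" equal to the bottom of the range, and a visible "band" whose height is the top minus the bottom. The Python sketch below only prepares those two helper columns; the function name `shaded_range_columns` and the sample numbers are invented for illustration.

```python
def shaded_range_columns(lows, highs):
    """Build helper columns for a stacked-area shaded range.

    Returns (floor, band): plot both as stacked areas, give `floor` no fill,
    and the filled `band` then spans from each low value to each high value.
    """
    if len(lows) != len(highs):
        raise ValueError("lows and highs must have the same length")
    floor = list(lows)
    band = [high - low for low, high in zip(lows, highs)]
    if any(height < 0 for height in band):
        raise ValueError("every high value must be at least its low value")
    return floor, band


if __name__ == "__main__":
    # Invented monthly minimum and maximum values.
    floor, band = shaded_range_columns([10, 12, 11], [15, 18, 14])
    for f, b in zip(floor, band):
        print(f, b)
```

The same subtraction can be done with a single formula column in Excel; the point is that the chart stacks the band on top of the floor rather than plotting the high values directly.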

Pixelation to represent endangered species counts

On Imgur, user JJSmooth44 created a photo series that uses pixelation to demonstrate the current status of endangered species. The number of squares in each image represents the approximate number of individuals of that species remaining in the world. The more pixelated the image, the fewer there are left.

A pixelated image of an African Wild Dog. The pixelation represents approximately how many of this endangered species remain in the wild (estimated between 3000 and 5500). The Wild Dog is still distinguishable, but is not clearly visible due to the pixelation.

The African Wild Dog is one of the images in which the animal is still mostly recognizable.
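The idea of one square per remaining animal can be sketched in a few lines of Python. This is an assumption about how such an image could be produced, not JJSmooth44's actual process: `grid_for_population` picks a grid with roughly one cell per animal while keeping the photo's proportions, and `pixelate` averages a grayscale image into that grid. The 1200 by 800 image size and the population of 4,250 (the midpoint of the 3,000 to 5,500 estimate above) are illustrative values.

```python
import math


def grid_for_population(width, height, population):
    """Choose (cols, rows) so cols * rows is close to `population`
    and the grid roughly matches the image's aspect ratio."""
    cols = max(1, round(math.sqrt(population * width / height)))
    rows = max(1, round(population / cols))
    return cols, rows


def pixelate(image, cols, rows):
    """Average a grayscale image (a list of rows of numbers) into rows x cols blocks."""
    height, width = len(image), len(image[0])
    if cols > width or rows > height:
        raise ValueError("grid cannot be finer than the image itself")
    result = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        out_row = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


if __name__ == "__main__":
    cols, rows = grid_for_population(1200, 800, 4250)
    print(cols, rows, cols * rows)
```

A species with only a few hundred individuals left would get a grid so coarse that the animal is unrecognizable, which is exactly the effect the series relies on.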

Scary Research to Share in the Dark: A Halloween-Themed Roundup

If you’re anything like us here in the Scholarly Commons, the day you’ve been waiting for is finally here. It’s time to put on a costume, eat too much candy, and celebrate all things spooky. That’s right, folks. It’s Halloween and we couldn’t be happier!

Man in all black with a jack o' lantern mask dancing in front of a green screen cemetery

If you’ve been keeping up with our Twitter (@ScholCommons) this month, you’ve noticed we’ve been sharing some ghoulish graphs and other scary scholarship. To keep the holiday spirit(s) high, I wanted to use this week’s blog post to gather up all our favorites.

First up, check out the most haunted cities in the US on The Next Web, which includes some graphs but also a heat map of the most haunted areas in the country. Which region do you think has the most ghosts?

If you’re more interested in what’s happening across the pond, we’ve got you covered. Click on this project to see just how scary ArcGIS story maps can be.

https://twitter.com/ScholCommons/status/1187058855282462721

And while ghosts may be cool, we all know the best Halloween characters are all witches. Check out this fascinating project from The University of Edinburgh that explores real, historic witch hunts in Scotland.

The next project we want to show you might be one of the scariest. I was absolutely horrified to find out that Illinois’ most popular Halloween candy is Jolly Ranchers. If you’re expecting trick-or-treaters tonight, please think of the children and reconsider your candy offerings.

Now that we’ve shared the most macabre maps around, let’s shift our focus to the future. Nathan Yau uses data to predict when your death will occur. And if this isn’t enough to terrify you, try his tool to predict how you’ll die.

Finally, if you’re looking for some cooking help from an AI or a Great Old One, check out this neural network dubbed “Cooking with Cthulhu.”

Do you have any favorite Halloween-themed research projects? If so, please share them with us here or on Twitter. And if you’re interested in doing your own deadly digital scholarship, feel free to reach out to the Scholarly Commons to learn how to get started or get help on your current work. Remember, in the words of everyone’s favorite two-faced mayor…

A clip of the Mayor from Nightmare Before Christmas saying There's only 365 days left until next Halloween