Using Voyant Tools for Basic Text Analysis

Voyant Tools is an open-source, web-based application that lets users perform basic text mining on their own texts or on existing text collections. These functions make it possible to quickly extract characteristics from a corpus and discover themes. Voyant Tools is available for free at http://voyant-tools.org/. From the home page, users can input the text to be analyzed in several ways. Follow the steps below to get started.

Loading Texts

For a basic single text: paste the text into the text box.

For text from webpages: enter the URLs of the webpages into the text box, listing each URL on a separate line.

For plain text, HTML, XML, PDF, RTF, or MS Word files: select the “Upload” button beneath the text box. Click “Add” for each new document and “Upload” when all documents have been added.

To use Voyant’s pre-existing text collections: select “Open” and choose from the drop-down list. Currently available are the Humanist Listserv Archives and Shakespeare’s Plays.

After the text is in place, select “Reveal.”

Basic Analysis Tools

After “revealing” the text, three tools will automatically appear: Cirrus, Summary, and Corpus Reader.

Cirrus displays a word cloud of the highest-frequency terms. Hovering over a word reveals its frequency. Clicking on a word reveals more information, including a word trends graph. To remove common function words such as “the” and “and” from the word cloud, select the cog tool above the Cirrus feature. Select the language of the text from the drop-down list and then click “Ok.” This removes stop words from the word cloud, leaving a more meaningful representation.
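To make the effect of stop word removal concrete, the following Python sketch counts word frequencies with and without a stop word filter. It is only an illustration of the general technique, not Voyant's own code: the function name term_frequencies, the tokenizing pattern, and the very short STOP_WORDS list are all assumptions made for this example.

```python
import re
from collections import Counter

# A tiny illustrative stop word list (an assumption for this example;
# real stop word lists are much longer and language-specific).
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "is", "it"}

def term_frequencies(text, stop_words=frozenset()):
    """Count lowercase word tokens, skipping any word in stop_words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

sample = "The cat and the dog chased the cat in the garden."

raw = term_frequencies(sample)                  # no filtering
filtered = term_frequencies(sample, STOP_WORDS)  # stop words removed

print("Most frequent word, unfiltered:", raw.most_common(1))
print("Most frequent word, filtered:  ", filtered.most_common(1))
```

Without the filter, “the” dominates the counts; with it, the content word “cat” rises to the top, which is the same shift a word cloud shows once stop words are removed.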

The Summary tool provides information about the text or group of texts, including the total number of documents and words, the length of each document, and the distinctive words in each text. It also highlights notable peaks in word frequency and in vocabulary density.
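Vocabulary density is commonly computed as the number of unique words divided by the total number of words. The Python sketch below assumes that definition; the function name vocabulary_density and the simple tokenizer are hypothetical and may not match how Voyant counts words.

```python
import re

def vocabulary_density(text):
    """Return unique words / total words, or 0.0 for empty text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Six words in total, five of them unique ("the" appears twice).
density = vocabulary_density("the cat sat on the mat")
print(f"Vocabulary density: {density:.3f}")
```

A value close to 1 means few words are repeated; longer texts usually score lower because common words recur.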

The Corpus Reader displays the texts in the corpus. Hovering over a word within the text reveals its frequency and more information.

To see the additional tools (Word Trends, Keywords in Context, and Words in Documents), click on the double-arrow icon in the upper right corner. Click on the single-arrow icons to open all of the windows, and use the toolbars at the bottom of their panes to generate results.
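For readers unfamiliar with a keywords-in-context display, the Python sketch below shows the general idea: each occurrence of a search term is listed with a few words on either side. The function name keyword_in_context and the window parameter are assumptions for illustration and do not describe how Voyant builds its own view.

```python
import re

def keyword_in_context(text, keyword, window=3):
    """Return (left context, match, right context) for each occurrence."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, token, right))
    return rows

sample = "To be or not to be that is the question"
kwic_rows = keyword_in_context(sample, "be", window=2)

# Print one line per occurrence, with the keyword in the middle column.
for left, match, right in kwic_rows:
    print(f"{left:>10} | {match} | {right}")
```

Reading the occurrences side by side makes it easier to see how a term is used across a text than reading frequency counts alone.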

More tools are available here.

Exporting Data

Above each tool, there is a disk icon that can be selected to export data from that tool. Users have the option to save the data as an image or as a URL that returns to that data. Exporting a URL removes the need to upload the same texts each time they are required.

For more information on using Voyant Tools, see this guide and additional documentation.

To see Voyant Tools in use, explore these examples.
