This is a guest blog by the amazing Zachary Maiorana, a GA in Scholarly Communication and Publishing
Homepage for Google Scholar
Scholars and other users have a vested interest in understanding the relative authority of the publications they have written or wish to cite as the basis of their research. Although the literature search, a common topic in library instruction and research seminars, can take place on a huge variety of discovery tools, researchers often rely on Google Scholar as a supporting or central platform.
The massive popularity of Google Scholar is likely due to its simple interface, which carries the longtime prestige of Google’s search engine; its enormous breadth, with a simple search yielding millions of results; its compatibility and parallels with other Google products such as Chrome and Google Books; and its citation metrics mechanism.
This last aspect of Google Scholar, which collects and reports data on the number of citations a given publication receives, represents the platform’s apparent ability to precisely calculate the research community’s interest in that publication. But, in the University Library’s work on the Illinois Experts (experts.illinois.edu) research and scholarship portal, we have encountered a number of circumstances in which Google Scholar has misrepresented U of I faculty members’ research.
Recent studies reveal that Google Scholar, despite its popularity and its massive reach, is not only often inaccurate in its reporting of citation metrics and title attribution, but also susceptible to deliberate manipulation. In 2010, Labbé described an experiment using Ike Antkare (AKA “I can’t care”), a fictitious researcher whose bibliography was manufactured with a mountain of self-referencing citations. After the purposely falsified publications went public, Google’s bots didn’t differentiate Antkare’s research from that of his real-life peers while crawling his 100 generated articles. As a result, Google Scholar reported Antkare as one of the most cited researchers in the world, with a higher h-index* than Einstein.
Ike Antkare “standing on the shoulders of giants” in Indiana University’s Scholarometer. Credit: Adapted from a screencap in Labbé (2010)
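For readers unfamiliar with the metric, the h-index mentioned above has a simple definition: a researcher has index h if h of their publications have each received at least h citations. As an illustrative sketch (not any platform’s actual implementation), the calculation can be written in a few lines of Python:

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Sort citation counts from highest to lowest
    ranked = sorted(citations, reverse=True)
    h = 0
    # The h-index is the largest rank i such that the i-th paper
    # (1-indexed) has at least i citations
    for i, count in enumerate(ranked, start=1):
        if count >= i:
            h = i
        else:
            break
    return h

# Example: five papers cited 10, 8, 5, 4, and 3 times
print(h_index([10, 8, 5, 4, 3]))  # prints 4: four papers have >= 4 citations each
```

This also shows why the metric is so easy to game: because every citation counts equally, a flood of machine-generated, self-referencing papers (as in the Antkare experiment) inflates the index just as effectively as genuine scholarly attention.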
In 2014, Spanish researchers conducted an experiment in which they created a fake scholar with several papers making hundreds of references to works written by the experimenters. After the papers were made public on a personal site, Google Scholar scraped the data and the real-life researchers’ profiles increased by 774 citations in total. In the hands of more nefarious users seeking to aggrandize their own careers or alter scientific opinion, such practices could result in large-scale academic fraud.
For libraries, Google’s kitchen-sink data collection methods further result in confusing and inaccurate attributions. In our work to supplement the automated collection of publication data for faculty profiles on Illinois Experts using CVs, publishers’ sites, journal sites, databases, and Google Scholar, we frequently encounter researchers’ names and works mischaracterized by Google’s clumsy aggregation mechanisms. For example, Google Scholar’s bots often read a scholar’s name somewhere within a work that the scholar hasn’t written—perhaps in the acknowledgements or a citation—and simply attribute the work to them as author.
When it comes to people’s careers and the sway of scientific opinion, such snowballing mistakes can be a recipe for large-scale misdirection. Though research shows that, in general, Google Scholar currently represents highly cited research well, weaknesses persist. Blind trust in any dominant proprietary platform is unwise, and using Google Scholar requires particularly careful judgment.
Read more on Google Scholar’s quality and reliability:
Brown, Christopher C. 2017. “Google Scholar.” The Charleston Advisor 19 (2): 31–34. https://doi.org/10.5260/chara.19.2.31.
Halevi, Gali, Henk Moed, and Judit Bar-Ilan. 2017. “Suitability of Google Scholar as a Source of Scientific Information and as a Source of Data for Scientific Evaluation—Review of the Literature.” Journal of Informetrics 11 (3): 823–34. https://doi.org/10.1016/j.joi.2017.06.005.
Labbé, Cyril. 2016. “L’histoire d’Ike Antkare et de Ses Amis Fouille de Textes et Systèmes d’information Scientifique.” Document Numérique 19 (1): 9–37. https://doi.org/10.3166/dn.19.1.9-37.
Lopez-Cozar, Emilio Delgado, Nicolas Robinson-Garcia, and Daniel Torres-Salinas. 2012. “Manipulating Google Scholar Citations and Google Scholar Metrics: Simple, Easy and Tempting.” ArXiv:1212.0638 [Cs], December. http://arxiv.org/abs/1212.0638.
Walker, Lizzy A., and Michelle Armstrong. 2014. “‘I Cannot Tell What the Dickens His Name Is’: Name Disambiguation in Institutional Repositories.” Journal of Librarianship and Scholarly Communication 2 (2). https://doi.org/10.7710/2162-3309.1095.
*Read the library’s LibGuide on bibliometrics for an explanation of the h-index and other standard research metrics: https://guides.library.illinois.edu/c.php?g=621441&p=4328607