Can’t find a term in the Glossary? Visit the Ethics in Research Use of Library Patron Data: Glossary and Explainer for more data privacy terms and concepts.
A method of de-identification that reduces the granularity of personal data by grouping data into categories or ranges (such as reporting age ranges instead of user birthdates or exact ages). Aggregation carries some risk of re-identifying an individual if there are outliers in the data set, if the data set is small overall, or if categories or ranges are granular enough to identify an individual in the data set.
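As an illustration of grouping ages into ranges, here is a minimal Python sketch. The function names, the ten-year range width, and the minimum group size of three are assumptions chosen for the example, not recommended values. Folding small groups into a single "suppressed" count is one common way to reduce the outlier risk described above; it does not eliminate it.

```python
from collections import Counter


def age_range(age, width=10):
    """Map an exact age to a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate_ages(ages, width=10, min_count=3):
    """Count people per age range, folding small groups into 'suppressed'.

    min_count is a hypothetical threshold: ranges with fewer people than
    this are not reported on their own, because a group of one or two
    is easier to tie back to a specific person.
    """
    counts = Counter(age_range(a, width) for a in ages)
    report = {}
    suppressed = 0
    for label, n in sorted(counts.items()):
        if n < min_count:
            suppressed += n
        else:
            report[label] = n
    if suppressed:
        report["suppressed"] = suppressed
    return report


# Made-up sample data: the single 87-year-old is an outlier.
ages = [21, 24, 25, 33, 35, 38, 39, 87]
report = aggregate_ages(ages)
print(report)
```

In this sample the lone person in the 80-89 range is not reported under their own range, although the suppressed count still shows that an outlier exists.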
The process of transforming data so that the connection between the data and the individual behind the data is completely broken. Research shows that true anonymization is nearly impossible, with most anonymization methods retaining a risk of re-identification of individuals in the data set.
The practice of surveilling users’ actions over a period of time.
The organizational or operational rationale for a particular process.
The act of giving or denying permission for the use or disclosure of an individual’s data. Explicit consent requires an affirmative action from the individual, while implicit consent is inferred from the individual’s behavior, such as continued use of a service or website.
A small data file sent from a website and stored on the user’s computer or in the user’s browser. Web cookies can be used for user authorization and session management, as well as to track users through web analytics software and other tracking software. Some cookies last until a web session ends (“session” cookies) while other cookies (“persistent” cookies) can remain on the user’s computer after the end of a session. A website can have web cookies from the site itself (“first-party” cookies) as well as cookies from external sites (“third-party” cookies).
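The difference between session and persistent cookies can be sketched with Python's standard `http.cookies` module. The cookie names and values below are made up for illustration, and the 30-day lifetime is an arbitrary assumption. In this sketch, a cookie counts as persistent if it carries a `Max-Age` or `Expires` attribute; without either, browsers generally discard it when the session ends.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# Session cookie: no Max-Age or Expires attribute is set.
cookies["session_id"] = "abc123"          # hypothetical name and value
cookies["session_id"]["path"] = "/"
cookies["session_id"]["httponly"] = True

# Persistent cookie: Max-Age keeps it on the device after the session.
cookies["display_prefs"] = "large-text"   # hypothetical name and value
cookies["display_prefs"]["path"] = "/"
cookies["display_prefs"]["max-age"] = 60 * 60 * 24 * 30  # 30 days, in seconds


def is_persistent(morsel):
    """Return True if the cookie declares a lifetime beyond the session."""
    return bool(morsel["max-age"] or morsel["expires"])


for name, morsel in cookies.items():
    kind = "persistent" if is_persistent(morsel) else "session"
    # OutputString() shows the value a server would send in Set-Cookie.
    print(f"{kind}: {morsel.OutputString()}")
```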
The unauthorized access of personal data by an individual, organization, or system process. A data breach is an intentional act of gaining unauthorized access, such as an attacker gaining access to personal data for the purpose of identity theft, while a data leak is unintentional, such as an employee losing a laptop or mobile device containing personal data.
Entities that sell personal data collected from private and public data sets.
The flow of data from collection to deletion, including processing, disclosure, and retention.
The practice of only collecting and retaining personal data that is necessary to fulfil the purpose for which it is needed.
The process of transforming personal data to remove identifiable aspects of the data. Methods include aggregation, stripping or truncating personal data, and removal or pseudonymization of personal data. De-identified data carries a risk of re-identification, meaning that an individual person can be identified from the data set by linking parts of the data set back to the individual.
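Three of the methods named above (removal, truncation, and pseudonymization) can be sketched in a few lines of Python. The record fields, the secret key, and the choice of a keyed HMAC-SHA256 hash for pseudonyms are assumptions made for this example; other approaches exist, and none of them removes the re-identification risk on its own.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical key: in practice it would be stored separately from the data,
# since anyone holding it can recompute the pseudonyms.
SECRET_KEY = b"replace-with-a-secret-key"


def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash (pseudonymization)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def truncate_ipv4(address):
    """Zero the last octet of an IPv4 address (truncation)."""
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)


def deidentify(record):
    """Keep only the fields needed, transforming the identifying ones."""
    return {
        "patron": pseudonymize(record["barcode"]),
        "ip": truncate_ipv4(record["ip"]),
        "item": record["item"],
        # The email field is dropped entirely (removal).
    }


# Made-up sample record.
record = {
    "barcode": "21234000123456",
    "email": "patron@example.org",
    "ip": "192.0.2.57",
    "item": "QA76.9",
}
clean = deidentify(record)
print(clean)
```

The same barcode always maps to the same pseudonym, which keeps records linkable for analysis but also means the remaining fields can still single out a person when combined.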
The act of monitoring and capturing a person’s activities through various technologies, including web analytics, cookies, trackers, and other data observation and capture techniques.
The act of encoding plain data into an unreadable format through the use of algorithms or other cryptographic methods. Encrypted data can only be read by someone who can decrypt the data with a decryption key.
The process of responding to and managing a data breach or leak. This can include:
- Identification and detection of a breach or leak
- Containing and eliminating the cause of the breach or leak
- Communications with affected parties
Data that cannot be linked to or associated with an identifiable individual. In some data privacy regulations, this can include data that has gone through rigorous de-identification.
Opt in is a choice made by a user involving active affirmation, such as checking a box or toggle to turn on specific data sharing or collection settings.
Opt out is a choice that applies by default through the user’s inaction and stays in effect until the user takes action to choose otherwise. An example is a product collecting user data until the user unchecks the box that controls the data collection setting.
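In software terms, the difference between opt in and opt out often comes down to the default value of a setting. The sketch below uses a hypothetical `PrivacySettings` class with two made-up settings to show the contrast; it is an illustration, not a description of any particular product.

```python
from dataclasses import dataclass


@dataclass
class PrivacySettings:
    # Opt in: off by default, stays off until the user turns it on.
    save_reading_history: bool = False
    # Opt out: on by default, stays on until the user turns it off.
    usage_analytics: bool = True


# A user who never opens the settings page gets the defaults.
settings = PrivacySettings()
print(settings)

# Each change below requires an action from the user.
settings.save_reading_history = True   # user opts in
settings.usage_analytics = False       # user opts out
print(settings)
```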
Data relating to an identifiable individual. This includes single points of data that can identify a person (direct identifiers); data that, when combined, can identify a person (indirect identifiers); and data about a person’s behaviors (behavioral data).
Direct identifiers can include:
- Email or physical address
- Government or organization issued identification numbers
- Account username and password
- Biometric information
- IP address
- Device information (operating system, browser, device unique ID, etc.)
Indirect identifiers can include:
- Age or date of birth
- Gender identity
- Education level
- Major or minor field of study
- Disability status
- Veteran status
- Geographical information, such as region or ZIP code
Behavioral data can include:
- Search history
- Electronic content access histories
- Circulation histories
- Website activity
- Geolocation history
Personally identifiable information (PII)
Information that can identify an individual person or link to an individual person’s identity. See also: Personal data.
The collection and analysis of website data. Web analytics applications track and capture data from a variety of sources, including user data. Depending on the application, user data can range from search terms, page hits, and landing/exit pages to user demographic data, behavioral data, and even data about users’ visits to sites outside of the original website.