New approach could help protect consumer data exposed in purchase transactions


LAWRENCE — Whether they are shopping at Costco or watching Netflix, consumers are consistently exposing personal data. Even though companies may be taking reasonable precautions to protect customers (including those provisions required by law), the distinctiveness of purchasing patterns creates a privacy vulnerability.

“Your data is basically everywhere,” said Shaobo Li, assistant professor of business analytics.

His new article titled “Reidentification Risk in Panel Data: Protecting for k-Anonymity” shows how the commonly used consumer panel data in marketing research is subject to a high threat of reidentification, which can be exploited by intruders. He proposes a new approach to protect such data so that a certain privacy level is guaranteed while the information loss is minimal. It appears in Information Systems Research.

“Many people don’t realize your purchases can be linked to your identity,” Li said.

“Most understand that a combination of your demographic information — such as age, gender and ZIP code — can be linked. But nowadays, if you open your app store in iPhone, you can see there is a privacy notice — and the first one is your purchase history — and that’s going to be linked to your identity. Purchases are definitely something we need to protect.”

Co-written by Matthew Schneider of Drexel University, Yan Yu of the University of Cincinnati and Sachin Gupta of Cornell University, the article studied consumer panel data, which is frequently used in marketing. Regardless of whether you’re buying candy bars or over-the-counter medicine, a business usually stores this information. Li found that as many as 94% of the consumers in the panel data the team studied can be reidentified based on purchases in a single product category (e.g., carbonated beverages).

This reidentification is accomplished through a potential data linkage based on the uniqueness of the purchase. For example, if a consumer buys Fig Newtons, a potted plant and a can of Lysol, that combination of goods has a unique element compared to others. (Supposedly anonymous Netflix customers were identified by cross-referencing their viewings with ratings on the Internet Movie Database, giving intruders access to email addresses and, ultimately, credit card data.)
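As a rough illustration of why uniqueness matters, the short Python sketch below counts how many customers in a toy dataset have a basket that no one else shares. The customer IDs, the baskets and the function name `unique_share` are invented for this example; it is not the measurement procedure used in the article.

```python
from collections import Counter

def unique_share(baskets):
    """Return the fraction of customers whose basket matches no other customer's."""
    # Treat each basket as an unordered set of items (an assumption of this sketch).
    patterns = {cust: frozenset(items) for cust, items in baskets.items()}
    # Count how many customers share each distinct basket.
    counts = Counter(patterns.values())
    unique = [cust for cust, pattern in patterns.items() if counts[pattern] == 1]
    return len(unique) / len(patterns)

# Hypothetical toy data: c2 and c3 bought the same things, c1 and c4 stand alone.
baskets = {
    "c1": ["Fig Newtons", "potted plant", "Lysol"],
    "c2": ["cola", "chips"],
    "c3": ["chips", "cola"],
    "c4": ["cola"],
}
share = unique_share(baskets)
print(share)
```

In this toy panel, half of the customers have a basket that singles them out, so anyone who learns that basket elsewhere could link it back to a single record.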

While legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) offers some safeguarding against identity theft, Li and his team propose a new solution: graph-based minimum movement k-anonymization. This method artificially yet minimally alters certain purchases so that, in the protected data, any purchase pattern appears for at least k different customers in the panel.

Li said that k-anonymity “is a very well-established privacy model, which means essentially any individual is not standing out based on linkable information.

“Data privacy protection is challenging in both industry and academia. Although there are many existing data protection approaches out there, companies should understand users' needs before picking up a method because many approaches can drastically destroy the data in order to achieve privacy. In other words, data utility is a very important aspect in data privacy protection.”

In Li’s work, the proposed method optimizes (maximizes) the data utility while guaranteeing k-anonymity.
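The k-anonymity guarantee itself is simple to check, even though finding the smallest set of changes that achieves it is the hard part. The Python sketch below only performs the check: it reports the size of the smallest group of identical records, which is the largest k for which a dataset is k-anonymous. The records are hypothetical purchase counts, and the "protected" version was altered by hand for illustration; this is not the graph-based minimum movement method from the article.

```python
from collections import Counter

def anonymity_level(records):
    """Size of the smallest group of identical records (assumes a non-empty list)."""
    counts = Counter(tuple(record) for record in records)
    return min(counts.values())

def is_k_anonymous(records, k):
    """True if every record is shared by at least k customers."""
    return anonymity_level(records) >= k

# Hypothetical records: (units bought in week 1, units bought in week 2).
original = [(3, 0), (3, 0), (2, 1), (5, 1)]
# Hand-altered copy: one value changed (5 -> 2) so the last customer no longer stands out.
protected = [(3, 0), (3, 0), (2, 1), (2, 1)]

print(anonymity_level(original))   # smallest group in the original data
print(anonymity_level(protected))  # smallest group after the alteration
```

Changing a single value is enough here to move the toy data from a level of 1 to a level of 2, which hints at the trade-off Li describes: the fewer and smaller the alterations, the more useful the protected data remains.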

Now in his fifth year at KU, Li was trained as a statistician. In addition to statistical research, he has written extensively on data privacy issues in marketing, including “A Flexible Method for Protecting Marketing Data: An Application to Point-of-Sale Data” for Marketing Science and “Protecting Customer Data: Marketing with Second-Party Data” for the International Journal of Research in Marketing.

He said, “The number one lesson for the type of company that collects consumer data and conducts marketing research is — even though they operate under government regulations, and they remove the consumer’s name, address and email address — privacy issues still remain.”

Read this article on the KU News website.