While other companies are trying to figure out how to protect their massive reserves of data, PayPal is taking the “less is more” approach.
The company filed a patent application for tech that reduces the “overcollection of unstructured data.” When PayPal receives a piece of unstructured data, it uses machine learning to determine which parts of it are personally identifiable information, such as an address, Social Security number or photo.
The system then performs a process called “masking,” where it essentially blurs out the pieces of information that it deems unnecessary for whatever task it’s handling. It then presents the masked file to the user so they’re aware of what’s being protected, and saves this file in place of the original that the user uploaded.
Basically, PayPal’s tech auto-redacts your personal information. For example, if you send PayPal a photo of your driver’s license, but it only needs your address and name, it would blur out your photo, license number and other personal information.
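To make the idea concrete, here is a minimal sketch of task-aware redaction applied to text. It is not PayPal's method: simple regular expressions stand in for the machine learning model described in the filing, and the names `PII_PATTERNS` and `mask_unneeded` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical detectors. Simple regexes stand in for the machine learning
# model described in the filing; a real system would be far more robust.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def mask_unneeded(text, needed):
    """Mask every detected PII category that the current task does not need."""
    masked = text
    for category, pattern in PII_PATTERNS.items():
        if category in needed:
            continue  # the task requires this field, so leave it readable
        masked = pattern.sub(f"[{category.upper()} REDACTED]", masked)
    return masked


message = "Reach me at jane@example.com or 555-123-4567. SSN: 123-45-6789."

# Suppose the task only needs an email address: everything else is masked.
print(mask_unneeded(message, needed={"email"}))
# Reach me at jane@example.com or [PHONE REDACTED]. SSN: [SSN REDACTED].
```

In the spirit of the filing, the masked string is what would be shown back to the user and stored in place of the original.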
FYI, unstructured data stretches far beyond just photos. It can be any media, PayPal said, with “videos, text conversations, voice conversations” all falling into this bucket. This kind of data doesn’t follow “simple rules,” the company said, and often doesn’t align with organizational data privacy policies.
“With such large volumes of data, it is increasingly difficult to analyze the received data and take actions to align with data privacy and computer security policies, especially when the data is in an unstructured data format,” the company said in its filing. “Unstructured data frequently contains data that is unnecessary for the purpose for which the unstructured data was collected.”
Just from a data storage perspective, PayPal’s patent filing makes sense. Storing unstructured data is tricky, expensive and resource-intensive because it doesn’t follow simple rules and tends to be complex whatever form it takes. PayPal also noted that storing this data carries major security risks, including data breaches that come with “heavy regulatory fines, loss of customer trust, and use of computing resources.”
“You have to have a balance,” Raman noted. “Data is your biggest asset. It’s also your biggest liability. Now, there’s a strong movement to figure out what is necessary and what is unnecessary.”
Taking a minimalist approach to data collection is, in itself, a form of data security. Several years ago, the standard was to collect as much data as possible and figure out how to manage it later, Allage said. But that unstable foundation has since given way to a focus on security, amid a recent regulatory push both in the U.S. and globally to protect consumer data privacy.
“Everyone was so focused on having all this data, but I think they’re beginning to understand that having it all is not necessarily a good thing,” said Allage. “They should now focus on what it is that they really need, versus what they think they want.”
Like Amazon, Coinbase and many of its fintech peers, PayPal is no stranger to a data breach. The company suffered a hack that impacted nearly 35,000 users in December, leaving personal data like birthdays, Social Security numbers and addresses vulnerable. The victims sued PayPal in March in a class action lawsuit that is still pending, alleging the company’s treatment of consumer data was negligent.
If this patent filing is any indication, PayPal has learned that the less it knows, the better.