Amazon Gets Up Close and Personal with Your Data

Amazon is practicing data security by hounding its AWS clients to be more careful.

Photo courtesy of krblokhin on iStock.


Amazon wants to handle your data with care… or at least nag its AWS customers to do so. 

The company is seeking to patent “statistical techniques” for detecting sensitive data. Essentially, this tech uses machine learning to estimate the probability that a dataset includes sensitive information, such as addresses, IP addresses or phone numbers.

If the machine learning model predicts that a dataset may contain sensitive data, the system will inform the owner of the dataset and suggest that a sensitive data handling policy be enforced, such as “an encryption policy, or a policy that results in the isolation or deletion of the sensitive data.” If the owner of the dataset claims that the system falsely detected sensitive data, that information is used in turn to train the model to be better at detection.  
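To make that loop concrete, here is a minimal Python sketch of the general idea: score a dataset, flag it if the score crosses a threshold, suggest a handling policy, and learn from an owner's "false positive" report. It is not Amazon's method. The regular expressions, starting weights, threshold, class name and policy labels are all assumptions made up for illustration, and a simple logistic model stands in for whatever the filing actually describes.

```python
import math
import re

# Hypothetical patterns standing in for learned features; not from the filing.
PATTERNS = {
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "phone_number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "street_address": re.compile(r"\b\d+\s+\w+\s+(?:St|Ave|Rd|Blvd)\b", re.IGNORECASE),
}

# Policy labels loosely echo the filing's examples (encryption, isolation, deletion).
SUGGESTED_POLICIES = ["encrypt", "isolate", "delete"]


class SensitiveDataDetector:
    """Toy logistic model over pattern-match rates (an illustrative assumption)."""

    def __init__(self, threshold=0.5, learning_rate=0.5):
        self.threshold = threshold
        self.learning_rate = learning_rate
        self.weights = {name: 2.0 for name in PATTERNS}  # arbitrary starting weights
        self.bias = -1.0

    def features(self, records):
        # Fraction of records that match each pattern.
        total = max(len(records), 1)
        return {
            name: sum(1 for record in records if pattern.search(record)) / total
            for name, pattern in PATTERNS.items()
        }

    def probability(self, records):
        # Logistic function of the weighted match rates.
        feats = self.features(records)
        score = self.bias + sum(self.weights[n] * v for n, v in feats.items())
        return 1.0 / (1.0 + math.exp(-score))

    def review(self, records):
        # Flag the dataset and suggest policies when the estimate crosses the threshold.
        p = self.probability(records)
        flagged = p >= self.threshold
        return {
            "probability": p,
            "flagged": flagged,
            "suggested_policies": SUGGESTED_POLICIES if flagged else [],
        }

    def report_false_positive(self, records):
        # One gradient step of logistic regression with the label "not sensitive" (0).
        feats = self.features(records)
        p = self.probability(records)
        for name, value in feats.items():
            self.weights[name] -= self.learning_rate * p * value
        self.bias -= self.learning_rate * p


detector = SensitiveDataDetector()
dataset = ["alice 192.168.0.1", "call 555-123-4567", "hello world"]

before = detector.review(dataset)
detector.report_false_positive(dataset)  # the owner disputes the flag
after = detector.review(dataset)

print(before["flagged"], round(before["probability"], 3))
print(after["flagged"], round(after["probability"], 3))
```

The owner's dispute nudges the weights down, so the same dataset scores lower the next time it is reviewed. A production system would presumably rely on far richer features and many more labeled examples than this toy does.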

Given the rapid rate at which data is being collected, Amazon said, detecting the presence of sensitive data to keep it safe from security breaches and ensure compliance with privacy regulations remains a “non-trivial technical problem.” 

“The sophistication of network-based attack mechanisms has increased,” Amazon said in its filing. “Hardware and software engineers keep enhancing the built-in safety mechanisms of computer systems, while attackers keep discovering new avenues for penetrating the defenses designed by the engineers.”

Photo courtesy of the U.S. Patent and Trademark Office.

Millions of people have been affected by breaches of AWS clients in recent years. In one notable instance in 2019, a former Amazon employee stole the personal information of 100 million Capital One customers by exploiting the credit card company’s systems.

After the incident, AWS and other cloud infrastructure companies invested heavily in security tools and support, Ari Weil, VP of marketing at cloud data security firm Cyera, told me. But they also doubled down on the idea that companies are responsible for their own data, he noted.

“Today, the definition of ‘good enough’ tooling, education, and support is changing, and so it is no surprise that these vendors are investing in technology that can effectively stop customers from running with scissors when it comes to data security,” Weil said. 

Data control like this matters even more in the age of AI given the sheer amount of data needed to train models, said Arti Raman, CEO and founder of cybersecurity firm Titaniam.io. And with Amazon’s plans to add a generative AI chatbot to its e-commerce marketplace for a “conversational experience,” proper data storage is even more pressing.

“When you use AI, you’re kind of like feeding the beast when you’re feeding the model,” Raman said. “And if you put sensitive things in there, it’s not just for your use. It’s for everybody’s use.” 

One other consideration: The tech in Amazon’s patent would likely be applied as another feature within the AWS environment, potentially to hold on to customers’ trust and draw in interest. But, if patented, Amazon could license this tech as an individual product, opening up another cloud-related revenue stream for the company, said Patrick Juola, Ph.D., professor of computer science and cybersecurity studies coordinator at Duquesne University.

“The reason that any company patents anything is generally to gain a competitive edge over other companies in that space,” Juola said in an email. “There are a lot of cloud computing and B2B companies out there that want to be able to process client/customer data (which inevitably involves collecting and storing it), but do not want to expose ‘sensitive’ data.” 

Correction: The original piece stated that the 2019 Capital One breach was due to AWS systems. This has since been updated. Patent Drop regrets this error.