Pinterest’s Spam Scraping Patent Could Fight Misinformation

Pinterest wants to kick spam to the curb with a patent that uses AI to detect malicious links.

Photo of a Pinterest patent
Photo via U.S. Patent and Trademark Office

Pinterest wants to keep up the quality of your boards. 

The social media platform filed a patent application for tech that determines “linked spam content.” Pinterest’s system essentially uses machine learning to analyze several factors of a linked webpage to determine if it’s spam, malicious or “otherwise undesirable content.” 

“Although online services that host and maintain content may have a desire to identify and protect their users and subscribers from such spamming, malicious, and/or otherwise undesirable content pages, detection of such … content pages can be difficult,” the filing notes.

Pinterest’s system uses multiple machine learning models, including a deep neural network and a natural language processing system, to “crawl, scrape, and/or parse” a page into a number of different features, including textual, media and structural features. Text is analyzed for certain keywords, which are scored based on their relation to spam content.
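To make the keyword step concrete, here is a rough Python sketch of what scoring text against spam-related keywords could look like. The filing does not publish a keyword list or a scoring formula, so the `SPAM_KEYWORD_WEIGHTS` table and the averaging in `text_spam_score` are illustrative assumptions rather than Pinterest's method, which relies on a trained language model.

```python
import re

# Hypothetical keyword weights for illustration only; the filing does not
# disclose which keywords are used or how they are weighted.
SPAM_KEYWORD_WEIGHTS = {"free": 0.4, "winner": 0.7, "click": 0.3}


def text_spam_score(text: str) -> float:
    """Return the average keyword weight across all words in the text.

    Words that are not in the table contribute 0.0, so a page with only
    a few spam keywords in a lot of ordinary text scores low.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = sum(SPAM_KEYWORD_WEIGHTS.get(word, 0.0) for word in words)
    return total / len(words)


if __name__ == "__main__":
    print(text_spam_score("Click here, you are a winner"))
    print(text_spam_score("A recipe for sourdough bread"))
```

Averaging over all words is one simple way to keep long, mostly ordinary pages from being flagged over a single keyword; a learned model would capture context that a lookup table cannot.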

Media items such as photos and videos are compared with those on known spam pages, which Pinterest stores in a database to help its system further identify spam. For example, if a photo in a linked page appears on a relatively high number of other content pages stored in the database, that “may indicate a greater likelihood that the media item in question may be associated with spamming.”
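The media check boils down to counting how many stored pages share a given photo or video. The Python sketch below illustrates that idea under some loud assumptions: `KnownSpamMediaIndex`, `media_spam_score` and the `saturation` parameter are hypothetical names, and an exact SHA-256 hash stands in for whatever image-similarity technique the real system might use.

```python
import hashlib
from collections import Counter


def fingerprint(media_bytes: bytes) -> str:
    # Assumption: an exact-match hash stands in for real image similarity,
    # so a resized or re-encoded copy would not match here.
    return hashlib.sha256(media_bytes).hexdigest()


class KnownSpamMediaIndex:
    """Hypothetical store of media seen on previously collected pages."""

    def __init__(self) -> None:
        self._page_counts = Counter()

    def add_page(self, media_items: list[bytes]) -> None:
        # Count each distinct media item once per page.
        for fp in {fingerprint(item) for item in media_items}:
            self._page_counts[fp] += 1

    def pages_containing(self, media_bytes: bytes) -> int:
        return self._page_counts[fingerprint(media_bytes)]


def media_spam_score(index: KnownSpamMediaIndex,
                     media_items: list[bytes],
                     saturation: int = 5) -> float:
    """Score in [0, 1] that grows with how widely the most-shared item appears.

    `saturation` is an assumed cutoff: an item found on that many stored
    pages (or more) gets the maximum score.
    """
    if not media_items:
        return 0.0
    most_shared = max(index.pages_containing(item) for item in media_items)
    return min(most_shared / saturation, 1.0)
```

A page whose photos appear nowhere else in the index scores 0.0, while a page reusing a photo found on many stored pages approaches 1.0, mirroring the filing's "greater likelihood" language.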

Finally, the structure of the page is analyzed to determine “tag paths,” or the frequency with which certain tags appear on the website. The scores for these three feature types are then combined to determine whether the page crosses a threshold for undesirable content.
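As a rough illustration of the structural step and the final decision, the Python sketch below counts how often each nested tag path occurs in a page and then folds three scores into a single yes-or-no call. The filing describes learned models, so the fixed `weights` and `threshold` values, along with the names `TagPathCounter`, `combined_score` and `is_undesirable`, are assumptions made purely for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagPathCounter(HTMLParser):
    """Counts occurrences of each tag path, e.g. 'html/body/p'."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: list[str] = []
        self.paths: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        self.paths["/".join(self._stack + [tag])] += 1
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray closing tags; otherwise unwind to the matching open tag.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass


def tag_path_counts(html: str) -> Counter:
    parser = TagPathCounter()
    parser.feed(html)
    parser.close()
    return parser.paths


def combined_score(text: float, media: float, structure: float,
                   weights=(0.4, 0.3, 0.3)) -> float:
    # Assumption: a fixed weighted sum stands in for the learned model.
    return weights[0] * text + weights[1] * media + weights[2] * structure


def is_undesirable(text: float, media: float, structure: float,
                   threshold: float = 0.5) -> bool:
    return combined_score(text, media, structure) >= threshold


if __name__ == "__main__":
    page = "<html><body><p>a</p><p>b<br></p><img src='x'></body></html>"
    print(tag_path_counts(page))
    print(is_undesirable(0.9, 0.8, 0.5))
```

Turning the tag path counts into a structural score (for example, by comparing them with counts from known spam pages) is left out, since the article does not say how that comparison is done.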

Pinterest has an obvious motivation in wanting to fight spam on its platform. As a social media firm, its main revenue source is digital advertising. In its most recent earnings report, the company beat Wall Street’s revenue expectations as the ad market started to recover. “Our users are engaging deeply and we’re delivering better results for advertisers through improved measurement and innovation across the full funnel,” CEO Bill Ready said of the company’s earnings. 

Keeping that momentum going will likely require the company to keep a close eye on content it deems undesirable. Plus, along with its ad ambitions, the company’s concerted effort to build out shopping on its platform could signal that it wants to make sure its users don’t get scammed.

This isn’t the first time we’ve seen Pinterest take an interest in innovating its ad tech. The company’s patent activity includes context-based ad placement methods and ways to generate personalized content by (consensually) digging through your emails.  

While this tech could certainly help Pinterest’s bottom line, the benefits of a multi-model AI scraping tool like this could be twofold: Depending on how it defines undesirable or malicious content, this could help fight disinformation or hate speech. According to its recent transparency report, the company already employs “hybrid tools” to catch bad content, deactivating hundreds of thousands of pins total for things like harassment, threats of violence, conspiracy theories and misinformation. 

And though Pinterest may not be the platform that comes to mind when people think about where they get news or political analysis, social media firms are likely to be put under a microscope for what is allowed to go viral in the run-up to the 2024 presidential election.