
Reddit locks its public data in new content policy, says its use now requires a contract

Reddit is rolling out a new policy on Thursday aimed at balancing its desire to license its content to larger tech companies, like Google, with protecting user privacy. The recently announced “Public Content Policy” will now join Reddit’s existing Privacy Policy and Content Policy to guide how Reddit data is accessed and used by commercial entities and other partners. Along the same lines, the company also announced a subreddit dedicated to researchers working with Reddit data.

The announcement comes shortly after Reddit’s IPO debut, which sees the company positioning itself to grow revenue not only through ads served on its platform and developer use of the API, but also from its corpus of data. The company, in its IPO prospectus, said it has already earned $203 million from data licensing deals and expects that figure to rise over time.

Although Reddit had not historically blocked access to its data for AI training purposes, it changed course last year. Reddit CEO Steve Huffman told the New York Times that it makes no sense for Reddit to continue giving “all this value to some of the biggest companies in the world for free,” signaling the company’s plan to get into the data licensing business.

With these efforts now well underway, the new public content policy will block access to Reddit’s data for anyone without an agreement. (Reddit says it’s not adding new restrictions, but simply publishing the policy that has been in place internally for some time.)

“Unfortunately, we are seeing more and more commercial entities using unauthorized access or abusing authorized access to collect public data in bulk, including Reddit’s public content,” Reddit writes on its blog. “Even worse, these entities believe they have no limitations on their use of this data, and they do so without regard for user rights or privacy, ignoring reasonable legal, security, and user deletion requests. While we continue our efforts to block known bad actors, we need to do more to restrict access to public Reddit content at scale to trusted actors who have agreed to follow our policies. But we must also continue to ensure that users, mods, researchers, and other bona fide, non-commercial actors have access to it.”

In other words, access to Reddit data for research and other non-commercial efforts will continue, but entities wishing to use Reddit data for other purposes, including AI training, will have to pay. In a graphic shared on the blog, Reddit makes this clear, saying that companies interested in using Reddit data to “power, augment, or improve your product for commercial purposes” require a contract.

Image credits: Reddit

Advertisers, in turn, are directed to an advertising API to manage campaigns and track their performance.

Because Reddit is essentially just a large website, indexable by search engines, the new policy aims to lock down its content against unauthorized collection while respecting users’ rights.

For example, Reddit says its partners will have to uphold users’ decisions to remove their content. So, if users do not want their personal posts to serve as the basis for future AI engines, they should be able to opt out. The new policy also restricts partners from using Reddit content to identify individuals or their personal information, including for ad targeting purposes. Partners also may not use Reddit content to spam or harass its users or to perform “background checks, facial recognition, government surveillance, or to assist law enforcement in carrying out any of the operations above.”

The policy further restricts access to adult media and specifies that Reddit will not sell its users’ personal information. The company also notes that it will never license non-public content such as private messages or non-public account information, such as users’ emails or browsing history, among others.

To help researchers who want to use Reddit data for non-commercial purposes, the company has created a new subreddit, r/reddit4researchers. The company says it is partnering with OpenMined to also develop a program to guide and grow researcher collaboration with Reddit.

techcrunch
