
This week in AI: generative AI and the problem of remuneration for creators

Keeping pace with a rapidly evolving industry like AI is a daunting challenge. Until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

By the way, TechCrunch plans to launch an AI newsletter soon. Stay tuned.

In the AI space this week, eight leading US newspapers owned by investment giant Alden Global Capital, including the New York Daily News, the Chicago Tribune and the Orlando Sentinel, sued OpenAI and Microsoft for copyright infringement relating to the companies’ use of generative AI technologies. Like the New York Times in its ongoing lawsuit against OpenAI, the papers accuse OpenAI and Microsoft of scraping their intellectual property without permission or compensation to build and commercialize generative AI models such as GPT-4.

“We’ve spent billions of dollars gathering information and reporting news at our publications, and we can’t allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense,” Frank Pine, the executive editor overseeing Alden’s newspapers, said in a statement.

The lawsuit seems likely to end in a settlement and licensing agreement, given OpenAI’s existing partnerships with publishers and its reluctance to hinge its entire business model on the fair use argument. But what about the rest of the content creators whose works were swept up in model training without payment?

It seems that OpenAI is thinking about it.

A recently published research paper, co-authored by Boaz Barak, a scientist on OpenAI’s Superalignment team, proposes a framework for compensating copyright owners “proportionally to their contributions to the creation of AI-generated content.” How? Through cooperative game theory.

The framework assesses the degree to which the contents of a training dataset – for example text, images or other data – influence what a model generates, using a game theory concept known as the Shapley value. Then, based on that assessment, it determines the content owners’ “rightful share” (i.e. compensation).

Let’s say you have an image-generating model trained on the works of four artists: John, Jacob, Jack and Jebediah. You ask it to draw a flower in Jack’s style. With the framework, you can determine the influence each artist’s works had on the art the model generates and, thus, the compensation each should receive.
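To make the idea concrete, here’s a minimal sketch in Python of exact Shapley-value attribution for that four-artist example. Everything here is illustrative rather than OpenAI’s implementation: in particular, `utility` is a hypothetical stand-in for the expensive step of training a model on only a given coalition’s works and scoring how closely it reproduces the generated image.

```python
from itertools import combinations
from math import factorial

ARTISTS = ["John", "Jacob", "Jack", "Jebediah"]


def utility(coalition: frozenset) -> float:
    """Hypothetical stand-in for the expensive step: train a model only on
    this coalition's works and score how closely it matches the generated
    image. In this toy version, Jack's style dominates the flower prompt."""
    base = 0.7 if "Jack" in coalition else 0.0
    return base + 0.1 * len(coalition - {"Jack"})


def shapley_value(player: str, players: list[str]) -> float:
    """Exact Shapley value: the player's marginal contribution to utility,
    averaged over all coalitions of the other players (weighted so every
    ordering of players counts equally)."""
    n = len(players)
    others = [p for p in players if p != player]
    total = 0.0
    for k in range(len(others) + 1):
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for subset in combinations(others, k):
            s = frozenset(subset)
            total += weight * (utility(s | {player}) - utility(s))
    return total


values = {artist: shapley_value(artist, ARTISTS) for artist in ARTISTS}
pool = sum(values.values())
for artist, v in values.items():
    print(f"{artist}: value {v:.3f}, share of compensation {v / pool:.0%}")
```

Evaluating `utility` for every coalition is fine with four artists; the number of coalitions doubles with each contributor added, so the trouble starts at web scale.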

There is one drawback to the framework, however: it’s computationally expensive. The researchers’ workarounds rely on estimates of compensation rather than exact calculations. Would that satisfy content creators? I’m not so sure. If OpenAI ever puts it into practice, we’ll certainly find out.

Here are some other interesting AI stories from recent days:

  • Microsoft reaffirms ban on facial recognition: Language added to the terms of service for Azure OpenAI Service, Microsoft’s fully managed wrapper around OpenAI technology, more clearly prohibits integrations from being used “by or for” law enforcement for facial recognition in the United States.
  • The nature of native AI startups: AI startups face a different set of challenges than a typical software-as-a-service company. That was the message from Rudina Seseri, founder and managing partner of Glasswing Ventures, last week at the TechCrunch Early Stage event in Boston; Ron has the whole story.
  • Anthropic launches a business plan: AI startup Anthropic is launching a new paid plan for enterprises, along with a new iOS app. Team – the enterprise plan – gives customers priority access to Anthropic’s Claude 3 family of generative AI models, plus additional admin and user management controls.
  • CodeWhisperer is no longer: Amazon CodeWhisperer is now Q Developer, part of Amazon’s Q family of business-oriented generative AI chatbots. Available through AWS, Q Developer helps developers with tasks in the course of their daily work, like debugging and upgrading apps, much as CodeWhisperer did.
  • Exit Sam’s Club: Walmart-owned Sam’s Club says it’s turning to AI to speed up its “exit technology.” Instead of requiring store staff to check members’ purchases against their receipts on the way out, Sam’s Club customers who pay either at a register or through the Scan & Go mobile app can now leave certain stores without having their purchases double-checked.
  • Automated fish harvesting: Harvesting fish is an inherently complicated activity. Shinkei is working to improve it with an automated system that portions fish more humanely and reliably, resulting in what could be an entirely different seafood economy, Devin reports.
  • Yelp’s AI assistant: Yelp this week announced a new AI-powered chatbot for consumers – powered by OpenAI models, the company says – that helps them connect with relevant businesses for their tasks (like installing light fixtures or upgrading outdoor spaces). The company is rolling out the AI assistant in its iOS app under the “Projects” tab, with plans to expand it to Android later this year.

More machine learning

Image credits: US Department of Energy

It looks like there was quite a party at Argonne National Lab this winter when they brought together around 100 AI and energy industry experts to discuss how the rapidly evolving technology could be useful to the country’s infrastructure and R&D in this area. The resulting report is more or less what one would expect from this group: a lot of pie in the sky, but informative nonetheless.

Across nuclear power, the grid, carbon management, energy storage and materials, the themes that emerged from the gathering were: first, that researchers need access to high-powered compute tools and resources; second, that they need to learn to spot the weak points of simulations and predictions (including those enabled by that first thing); and third, that AI tools are needed that can integrate and make accessible data from multiple sources and in many formats. We’ve seen all these things happening across the industry in various ways, so it’s no big surprise, but nothing gets done at the federal level without a few boffins putting out a paper, so it’s good to have it on the record.

Georgia Tech and Meta are working on part of that with a big new database called OpenDAC, a stack of reactions, materials and calculations intended to help scientists designing carbon capture processes do so more easily. It focuses on metal-organic frameworks, a promising and popular material type for carbon capture that comes in thousands of variations, which haven’t been exhaustively tested.

The Georgia Tech team partnered with Oak Ridge National Lab and Meta’s FAIR to simulate quantum chemistry interactions on these materials, using some 400 million compute hours – far more than a university can easily muster. Hopefully it will be helpful to climate researchers working in the field. Everything is documented here.

We hear a lot about AI applications in the medical field, though most play what you might call an advisory role, helping experts notice things they might not otherwise have seen, or spot patterns that would have taken a technician hours to find. That’s partly because these machine learning models just find connections between statistics without understanding what caused what. Researchers at Cambridge and Ludwig-Maximilians-Universität München are working on that, since moving past basic correlative relationships could be hugely helpful in creating treatment plans.

The work, led by Professor Stefan Feuerriegel of LMU, aims to create models that can identify causal mechanisms, not just correlations: “We give the machine rules for recognizing the causal structure and correctly formalizing the problem. Then the machine has to learn to recognize the effects of interventions and understand, so to speak, how real-life consequences are mirrored in the data that has been fed into the computers,” he said. The work is still in its early stages, and the researchers are aware of that, but they believe it’s part of an important decade-scale development period.
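The gap between correlation and intervention is easy to see in a simulation. Below is a toy sketch of my own (not the LMU/Cambridge method): because sicker patients are more likely to receive a treatment, a naive comparison of outcomes makes the treatment look harmful, while adjusting for severity – a crude stand-in for knowing the causal structure – roughly recovers its true benefit.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000

# Hidden confounder: disease severity drives both treatment and outcome.
severity = rng.normal(size=n)
treated = severity + rng.normal(size=n) > 0  # sicker patients get treated more
outcome = 1.0 * treated - 2.0 * severity + rng.normal(size=n)  # true effect: +1.0

# Naive correlational estimate: treated patients simply look worse off.
naive = outcome[treated].mean() - outcome[~treated].mean()

# Adjust for the confounder by stratifying on severity, then averaging the
# within-stratum treated-vs-untreated differences. A crude adjustment; finer
# strata or a regression would land closer to the true +1.0.
edges = np.quantile(severity, np.linspace(0, 1, 21)[1:-1])
strata = np.digitize(severity, edges)
diffs, weights = [], []
for s in np.unique(strata):
    m = strata == s
    if treated[m].any() and (~treated[m]).any():
        diffs.append(outcome[m & treated].mean() - outcome[m & ~treated].mean())
        weights.append(m.sum())
adjusted = np.average(diffs, weights=weights)

print(f"naive estimate:    {naive:+.2f}  (treatment looks harmful)")
print(f"adjusted estimate: {adjusted:+.2f}  (true causal effect is +1.00)")
```

A model that only learns the first number would steer doctors away from a treatment that actually works; learning what an intervention does, rather than what co-occurs with it, is exactly the step the LMU and Cambridge researchers are after.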

At the University of Pennsylvania, grad student Ro Encarnación is working on a new angle in the “algorithmic justice” field we’ve seen pioneered (primarily by women and people of color) over the past seven or eight years. Her work is more focused on the users than the platforms, documenting what she calls “emergent auditing.”

When TikTok or Instagram puts out a filter that’s slightly racist, or an image generator that does something alarming, what do users do? Complain, sure, but they also keep using it, and learn to work around or even exacerbate the problems encoded in it. That may not be a “solution” the way we think of it, but it demonstrates the diversity and resilience of the user side of the equation: they’re not as fragile or passive as you might think.
