
This Week in AI: Let us not forget the humble data annotator

Keeping pace with a rapidly evolving industry like AI is a daunting challenge. Until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, as well as notable research and experiments that we haven’t covered on our own.

This week in AI, I’d like to shine a spotlight on labeling and annotation startups – startups like Scale AI, which is reportedly in talks to raise new funding at a $13 billion valuation. Labeling and annotation platforms may not attract the attention that flashy new generative AI models like OpenAI’s Sora do. But they are essential. Without them, modern AI models would likely not exist.

The data that many models train on must be labeled. Why? Labels, or tags, help models understand and interpret data during the training process. For example, labels used to train an image recognition model may take the form of markings around objects (“bounding boxes”) or captions describing each person, place, or object depicted in an image.
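For the curious, here’s a rough, purely illustrative Python sketch of what a single labeled training example for object detection might look like – the field names are mine, not any particular platform’s schema:

```python
# Hypothetical example of an annotated image record for training an object
# detection model. Real schemas (COCO and others) differ in the details.
labeled_example = {
    "image_id": "street_scene_0042.jpg",
    "annotations": [
        {
            "label": "pedestrian",
            # Bounding box as [x_min, y_min, width, height] in pixels.
            "bbox": [312, 140, 58, 176],
        },
        {
            "label": "bicycle",
            "bbox": [120, 210, 140, 95],
        },
    ],
    # A free-text caption, another common form of annotation.
    "caption": "A pedestrian waits at a crosswalk next to a parked bicycle.",
}
```

Every one of those boxes and captions was, at some point, drawn or typed by a person.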

Label accuracy and quality have a significant impact on the performance (and reliability) of trained models. And annotation is a vast undertaking, requiring thousands or even millions of labels for the largest and most sophisticated datasets used.

So you would think that data annotators would be treated well, paid a decent salary, and would enjoy the same benefits that engineers who build the models themselves enjoy. But often, the opposite is true – due to the brutal working conditions favored by many annotation and labeling startups.

Companies with billions in the bank, like OpenAI, rely on annotators in developing countries who are paid just a few dollars an hour. Some of these annotators are exposed to highly disturbing content, such as graphic images, yet receive no time off (they are usually contractors) or access to mental health resources.

An excellent recent piece in NY Mag profiles Scale AI, which recruits annotators in places as far away as Nairobi, Kenya. Some of Scale AI’s tasks require labelers to work multiple eight-hour days – with no breaks – for as little as $10. And these workers are subject to the whims of the platform. Annotators sometimes go long stretches without receiving any work, or are unceremoniously booted off Scale AI – as happened recently to contractors in Thailand, Vietnam, Poland and Pakistan.

Some annotation and tagging platforms claim to offer “fair trade” work. In fact, they have made it a central part of their branding. But as Kate Kaye of the MIT Tech Review notes, there are no regulations, only weak industry standards for what ethical labeling work means — and company definitions vary widely.

So what to do? Unless there is a massive technological breakthrough, the need to annotate and label data for AI training will not go away. We can hope that platforms self-regulate, but the more realistic solution seems to be policymaking. That is itself a tricky prospect – but it’s the best chance we have, I would argue, to change things for the better. Or at least, it’s a start.

Here are some other interesting AI stories from recent days:

    • OpenAI builds a voice cloner: OpenAI is previewing a new AI-based tool it developed, Voice Engine, which allows users to clone a voice from a 15-second recording of a person speaking. But the company is choosing not to release it widely yet, citing risks of misuse and abuse.
    • Amazon doubles down on Anthropic: Amazon has invested another $2.75 billion in the growing AI powerhouse Anthropic, following through on the option it left open last September.
    • Google.org launches an accelerator: Google.org, the charitable arm of Google, is launching a new six-month, $20 million program to help fund nonprofits developing technologies that leverage generative AI.
    • A new model architecture: AI startup AI21 Labs has released a generative AI model, Jamba, that uses a new(ish) model architecture – state space models, or SSM – to improve efficiency.
    • Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model similar to OpenAI’s GPT series and Google’s Gemini. The company claims it achieves industry-leading results on a number of popular AI benchmarks, including several that measure reasoning.
    • Uber Eats and UK AI regulations: Natasha writes about how an Uber Eats courier’s fight against AI bias shows that justice under UK AI regulations is hard-won.
    • EU guidance on election security: The European Union published draft election security guidelines on Tuesday aimed at the roughly two dozen platforms regulated under the Digital Services Act, including guidelines to prevent content recommendation algorithms from spreading generative AI disinformation (aka political deepfakes).
    • Grok is upgraded: X’s Grok chatbot will soon receive an improved underlying model, Grok-1.5 — at the same time, all Premium subscribers on X will have access to Grok. (Grok was previously exclusive to X Premium+ customers.)
    • Adobe extends Firefly: This week, Adobe unveiled Firefly services, a collection of more than 20 new generative and creative APIs, tools and services. It also launched Custom Models, which allow businesses to fine-tune Firefly models based on their assets, as part of Adobe’s new GenStudio suite.

More machine learning

How’s the weather? AI is increasingly capable of telling you. I noted some efforts in hourly, weekly, and century-scale forecasting a few months ago, but as with all things AI, the field is evolving rapidly. The teams behind MetNet-3 and GraphCast have published a paper describing a new system called SEEDS, for Scalable Ensemble Envelope Diffusion Sampler.

Image: animation showing how more forecasts produce a smoother distribution of weather outcomes.

SEEDS uses diffusion to generate “ensembles” of plausible weather outcomes for an area based on input data (radar readings or orbital imagery, perhaps), much faster than physics-based models can. With a larger number of ensemble members, the system can cover more edge cases (like an event that occurs in only 1 out of 100 possible scenarios) and be more confident about the most likely outcomes.
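To get a feel for why ensemble size matters, here’s a toy back-of-the-envelope sketch in Python – not SEEDS itself, just a made-up stand-in sampler – showing that a roughly 1-in-100 event rarely shows up at all in a 10-member ensemble, but gets a usable probability estimate from a larger one:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_forecasts(n_members):
    """Stand-in for a generative ensemble sampler like SEEDS.
    Each 'forecast' here is just a rainfall total drawn from a heavy-tailed
    toy distribution; the real system conditions on radar/satellite inputs
    and produces full weather fields."""
    return rng.lognormal(mean=1.0, sigma=1.0, size=n_members)

EXTREME_RAINFALL = 30.0  # toy threshold for a roughly 1-in-100 event

for n in (10, 100, 10_000):
    ensemble = sample_forecasts(n)
    p_extreme = np.mean(ensemble > EXTREME_RAINFALL)
    print(f"{n:>6} members -> estimated P(extreme) = {p_extreme:.4f}")
```

The cheaper each sample is to generate, the more members you can afford, and the further out into the tails you can see.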

Fujitsu also hopes to better understand the natural world by applying AI image processing techniques to underwater images and lidar data collected by autonomous underwater vehicles. Improving imaging quality will allow other, less sophisticated processes (like 3D conversion) to work better on the target data.

Image credits: Fujitsu

The idea is to build a “digital twin” of the waters that can help simulate and predict new developments. We are far from it, but we have to start somewhere.

Over in the world of LLMs, researchers have found that these models imitate intelligence via an even simpler-than-expected mechanism: linear functions. Frankly, the math is beyond me (vector stuff in many dimensions), but this MIT paper makes it quite clear that the recall mechanism of these models is pretty… basic.

“Even though these models are very complex nonlinear functions, trained on a lot of data and very hard to understand, there are sometimes very simple mechanisms at work inside them. This is one example of that,” said co-lead author Evan Hernandez. If you’re more technically minded, check out the paper here.
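For a rough intuition of the “linear functions” idea, here’s my own toy illustration with synthetic vectors – not the paper’s code or real model activations: if retrieving an attribute really is approximately linear, a single affine map fit by least squares should predict held-out attribute vectors almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64          # pretend hidden-state dimensionality
n_facts = 200   # pretend "subject -> attribute" pairs

# Synthetic stand-ins for hidden states: subject representations S and the
# corresponding attribute representations O, generated by a hidden affine map
# plus noise. Real experiments would use activations pulled from an LLM.
true_W = rng.normal(size=(d, d)) / np.sqrt(d)
true_b = rng.normal(size=d)
S = rng.normal(size=(n_facts, d))
O = S @ true_W.T + true_b + 0.05 * rng.normal(size=(n_facts, d))

# Fit a single affine map O ~ S @ W.T + b by least squares.
S_aug = np.hstack([S, np.ones((n_facts, 1))])   # append a bias column
coef, *_ = np.linalg.lstsq(S_aug, O, rcond=None)
W_hat, b_hat = coef[:-1].T, coef[-1]

# If the mapping really is (near-)linear, held-out error is tiny.
S_test = rng.normal(size=(20, d))
O_test = S_test @ true_W.T + true_b
pred = S_test @ W_hat.T + b_hat
print("mean abs error:", np.abs(pred - O_test).mean())
```

The surprising finding is that something this simple does a decent job of describing how real transformers pull up stored facts.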

One way these models can fall short is in not understanding context or feedback. Even a highly capable LLM might not “get it” if you tell it your name is pronounced a certain way, since it doesn’t really know or understand anything. In cases where that might matter, like human-robot interactions, it could put people off if the robot acts that way.

Disney Research has long studied automated character interactions, and this paper on name pronunciation and reuse just appeared. It seems obvious, but extracting the phonemes when someone introduces themselves, and encoding those rather than just the written name, is the smart approach.
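A minimal sketch of that idea in Python – the grapheme-to-phoneme helper here is hypothetical and hard-coded, standing in for whatever speech processing Disney’s pipeline actually uses:

```python
from dataclasses import dataclass, field

@dataclass
class KnownPerson:
    """Store how a name sounds, not just how it is written, so the agent
    can reuse the pronunciation the person actually gave."""
    written_name: str
    phonemes: list[str] = field(default_factory=list)

def extract_phonemes(spoken_intro_audio) -> list[str]:
    """Hypothetical helper: a real system would run speech recognition /
    phoneme extraction on the audio of the introduction. Hard-coded here
    purely for illustration (ARPAbet-style symbols for 'Siobhan')."""
    return ["SH", "IH", "V", "AO", "N"]

# When someone says "Hi, I'm Siobhan", keep the phonemes alongside the text.
person = KnownPerson(written_name="Siobhan",
                     phonemes=extract_phonemes(spoken_intro_audio=None))

# Later, text-to-speech can be fed the stored phoneme sequence instead of
# guessing the pronunciation from the spelling alone.
print(person.written_name, "->", "-".join(person.phonemes))
```

The point is simply that the pronunciation a person gives you is data worth keeping, not something to re-derive from spelling every time.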

Image credits: Disney Research

Finally, as AI and search increasingly overlap, it’s worth reassessing how these tools are used and whether this unnatural union presents new risks. Safiya Umoja Noble has been a leading voice in AI and search ethics for years, and her opinion is always enlightening. She gave a great interview with the UCLA news team about how her work has evolved and why we need to stay vigilant when it comes to bias and bad habits in search.

Source: TechCrunch
