
Why code-testing startup Nova AI uses open source LLMs more than OpenAI

It is a universal truth of software development that the developers who write the code should not be the ones who test it. For one thing, most of them hate the task. For another, as with any good audit, the people who do the work should not be the ones who verify it.

It’s no surprise, then, that code testing in all its forms – usability testing, language- or task-specific testing, end-to-end testing – is the focus of a growing crop of generative AI startups. Every week, TechCrunch covers another one, like Antithesis (raised $47 million), CodiumAI (raised $11 million) and QA Wolf (raised $20 million). And new ones emerge all the time, like recent Y Combinator graduate Momentic.

Another company is year-old startup Nova AI, which graduated from the Unusual Academy accelerator and raised $1 million in pre-seed. It’s trying to beat competitors with its end-to-end testing tools by breaking many of Silicon Valley’s rules about how startups work, founding CEO Zach Smith told TechCrunch.

While Y Combinator’s standard advice is to start small, Nova AI is aimed at mid-size to large companies with complex codebases and an urgent need. Smith declined to name any customers using or testing its product, except to describe them as primarily late-stage (Series C or beyond) venture-backed startups in e-commerce, fintech or consumer products, with “intensive user experiences.” Downtime for these features is costly.

Nova AI’s technology sifts through its customers’ code to automatically create tests using GenAI. It is particularly suited to continuous integration and continuous delivery/deployment (CI/CD) environments where engineers are constantly pushing things into their production code.
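To make the idea concrete, here is a toy sketch of the kind of edge-case test a GenAI tool might generate and run on every push in a CI/CD pipeline. This is not Nova AI’s actual output; the `apply_discount` function and its behavior are hypothetical.

```python
# Hypothetical application code under test: apply a percentage discount.
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


# The kind of tests a GenAI tool might generate automatically:
def test_apply_discount_typical():
    assert apply_discount(100.0, 20) == 80.0

def test_apply_discount_zero_percent():
    assert apply_discount(59.99, 0) == 59.99

def test_apply_discount_rejects_invalid_percent():
    try:
        apply_discount(10.0, 150)
    except ValueError:
        pass  # expected: out-of-range discount is rejected
    else:
        raise AssertionError("expected ValueError")


if __name__ == "__main__":
    test_apply_discount_typical()
    test_apply_discount_zero_percent()
    test_apply_discount_rejects_invalid_percent()
    print("all generated tests passed")
```

In a CI/CD setting, tests like these would run automatically on each commit, catching regressions before they reach production.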

The idea for Nova AI grew out of Smith and co-founder Jeffrey Shih’s experiences as engineers at large tech companies. Smith is a former Googler who worked on cloud teams that helped clients use many automation technologies. Shih previously worked at Meta (and at Unity and Microsoft before that), with a rare specialty in AI involving synthetic data. They have since added a third co-founder, AI data scientist Henry Li.

Another rule Nova AI doesn’t follow: while many AI startups rely on OpenAI’s industry-leading GPT models, Nova AI uses OpenAI’s GPT-4 as little as possible, only to help generate some code and perform some labeling tasks. No customer data is transmitted to OpenAI.

Even though OpenAI promises that data from people on a paid business plan isn’t used to train its models, businesses still don’t trust OpenAI, Smith tells us. “When we talk to big companies, they tell us, ‘We don’t want our data going into OpenAI,’” Smith said.

Engineering teams at large companies aren’t the only ones who feel this way. OpenAI is defending itself against a number of lawsuits from those who do not want their work used to train its models, or who believe their work ends up, without authorization or compensation, in its output.

Nova AI instead relies heavily on open source models like Llama, developed by Meta, and StarCoder (from the BigCode community, developed by ServiceNow and Hugging Face), as well as building its own models. They aren’t yet using Google’s Gemma with clients, but have tested it and “seen good results,” Smith says.

For example, he explains that a common use of OpenAI’s models is to “produce vector embeddings” of data so that LLMs can use the vectors for semantic search. Vector embeddings translate chunks of text into numbers so that an LLM can perform various operations with them, such as clustering them with other chunks of similar text. Nova AI needs embeddings of its clients’ source code, but strives not to send that data to OpenAI.

“In this case, instead of using OpenAI’s embedding models, we deploy our own open source embedding models, so that when we need to go through every file, we aren’t just sending it over to OpenAI,” Smith explained.
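The embedding idea can be illustrated with a deliberately tiny, self-contained sketch. Here a bag-of-words count vector and cosine similarity stand in for a real open-source embedding model, which would produce dense learned vectors instead; the point is that everything runs locally, with no text sent to a third-party API.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector.
    A real embedding model would return a dense learned vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Semantic search: rank documents by similarity to a query, all locally.
docs = [
    "validate user login credentials",
    "render the checkout page",
    "check password and username on sign in",
]
query = embed("user sign in password check")
ranked = sorted(docs, key=lambda d: cosine(embed(d), query), reverse=True)
print(ranked[0])  # the sign-in document ranks first
```

A production setup would swap `embed` for a locally deployed open source embedding model, but the search step, comparing vectors by similarity, works the same way.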

While not sending customer data to OpenAI eases companies’ jitters, open source AI models are also cheaper and more than sufficient for specific, targeted tasks, Smith has found. In this case, they work well for writing tests.

“The open source LLM industry is really proving that it can beat GPT-4 and those big providers when you go really narrow,” he said. “We don’t need to provide some massive model that can tell you what your grandma wants for her birthday. Right? We need to write a test. And that’s it. So our models are fine-tuned specifically for that.”

Open source models are also advancing rapidly. For example, Meta recently introduced a new version of Llama that is gaining praise in tech circles and could convince more AI startups to consider OpenAI alternatives.

Source: TechCrunch
