Get Ready to Meet the ChatGPT Clones

ChatGPT is built on text-generation technology that has been available for several years. The underlying model learns to mirror human text by picking up on patterns in enormous quantities of writing, much of it scraped from the web. OpenAI found that adding a chat interface and an additional layer of machine learning, in which humans provided feedback on the bot’s responses, made the technology more capable and articulate.
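The feedback layer described above can be pictured with a toy sketch: humans rate candidate responses, and those ratings are used to prefer better outputs. This is a minimal illustration of the idea, not OpenAI’s actual pipeline; the function names and the stand-in rater here are hypothetical.

```python
# Toy illustration of human-feedback scoring (hypothetical names,
# not OpenAI's real system): rate candidate responses, keep the best.

def collect_feedback(prompt, candidates, rater):
    """Pair each candidate response with a human-style rating."""
    return [(response, rater(prompt, response)) for response in candidates]

def pick_preferred(scored):
    """Select the highest-rated response, as a reward model would."""
    return max(scored, key=lambda pair: pair[1])[0]

# A stand-in "rater" that rewards on-topic, fuller answers.
def toy_rater(prompt, response):
    on_topic = 1 if prompt.split()[0].lower() in response.lower() else 0
    return on_topic * 10 + len(response.split())

candidates = ["Paris.", "The capital of France is Paris."]
scored = collect_feedback("capital of France?", candidates, toy_rater)
best = pick_preferred(scored)
```

In the real process, thousands of such human judgments train a reward model, which then steers the language model toward responses people prefer.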

The data provided by users interacting with ChatGPT, or services built on it such as Microsoft’s new Bing search interface, may provide OpenAI a key advantage. But other companies are working on replicating the fine-tuning that created ChatGPT.

Stability AI is currently funding a project called Carper AI that is investigating how to train similar chatbots. Alexandr Wang, CEO of Scale AI, a startup that carries out data labeling and machine-learning training for many technology companies, says many customers are asking for help doing fine-tuning similar to what OpenAI did to create ChatGPT. “We’re pretty overwhelmed with demand,” he says.

Wang believes that the efforts already underway will naturally mean many more capable language models and chatbots emerging. “I think there will be a vibrant ecosystem,” he says.

Sean Gourley, CEO of Primer, a startup that sells AI tools for intelligence analysts, including those in the US government, and an adviser to Stability AI, also expects to soon see many projects make systems like ChatGPT. “The watercooler talk is that this took about 20,000 hours of training,” he says of the human feedback process that honed OpenAI’s bot.

Gourley estimates that even a project that involved several times as much training would cost a few million dollars—affordable to a well-funded startup or large technology company. “It's a magical breakthrough,” Gourley says of the fine-tuning that OpenAI did with ChatGPT. “But it's not something that isn't going to be replicated.”

What happened after OpenAI announced DALL-E 2, a tool for generating complex, aesthetically pleasing images from a text prompt, in April 2022 may foreshadow the path ahead for ChatGPT-like bots.

OpenAI implemented safeguards on its image generator to prevent users from making sexually explicit or violent images, or ones featuring recognizable faces, and only made the tool available to a limited number of artists and researchers for fear that it might be abused. Yet because the techniques behind DALL-E were well known among AI researchers, similar AI art tools soon appeared. Four months after DALL-E 2 was released, Stability AI released an open-source image generator called Stable Diffusion that has been folded into numerous products but also adapted to generate images prohibited by OpenAI.

Clement Delangue, CEO of Hugging Face, a company that hosts open-source AI projects, including some developed by Stability AI, believes it will be possible to replicate ChatGPT, but he doesn’t want to predict when. 

“Nobody knows, and we’re still at the learning phase,” he says. “You never really know that you have a good model before you have a good model. Could be next week, could be next year.” Neither is very far off.
