DeepSeek’s Secret Sauce? A Dash of Google Gemini


Earlier this year, DeepSeek burst onto the scene seemingly out of nowhere with an AI model that appeared to hold its own against some of the best. The company recently announced an update to that model, but the new release may have been trained, at least in part, on output from Google Gemini.

DeepSeek trains itself with Google’s Gemini

According to a post on X by Sam Paech, one reason the latest DeepSeek model “sounds” different from its previous iteration is that it may have been trained on Google Gemini's output. Paech isn’t alone in thinking this. The developer of SpeechMap notes that DeepSeek’s traces read a lot like Gemini’s. For those unfamiliar, traces are the step-by-step reasoning an AI model produces before reaching a conclusion.

This isn’t the first time DeepSeek’s developers have been accused of using other AIs to train their own model. When DeepSeek first arrived, OpenAI suspected that DeepSeek had used ChatGPT outputs in its training. Relying on other models’ outputs is also one of the reasons DeepSeek claimed its training process cost far less than the competition’s.

Unlike models trained purely on raw data, DeepSeek is believed to rely heavily on a process called distillation, in which the outputs of a larger, stronger model are used to train a smaller one. It’s similar to a student-teacher relationship: the teacher distills knowledge learned from books into a form the student can absorb more efficiently.
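To make the idea concrete, here is a minimal NumPy sketch of the core distillation loss: the student is trained to match the teacher's temperature-softened probability distribution rather than raw labels. The function names, logits, and temperature value are illustrative assumptions, not DeepSeek's actual pipeline.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the
    student's. Minimizing this pulls the student toward the teacher."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return float(np.sum(t * (np.log(t) - np.log(s))))

# Toy example: the teacher's logits over three possible next tokens.
teacher = np.array([4.0, 1.0, 0.5])
close_student = np.array([3.8, 1.1, 0.4])  # nearly mimics the teacher
far_student = np.array([0.5, 4.0, 1.0])    # disagrees with the teacher
```

A student whose logits track the teacher's incurs a much smaller loss than one that disagrees, which is exactly the signal gradient descent would follow during distillation. The higher temperature exposes the teacher's relative preferences among wrong answers, which is where much of the "teaching" value lies.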

It is admittedly a more efficient method, but it raises ethical questions. OpenAI’s terms of service explicitly prohibit customers from using the company’s model outputs to build competing AI. If the earlier suspicion is correct, DeepSeek would have violated those terms.

Ethically questionable, but efficient

While DeepSeek’s actions are ethically questionable, some think they make sense. For instance, Nathan Lambert, a researcher at the nonprofit AI research institute AI2, says it would be unsurprising if DeepSeek trained on Google Gemini’s output.

According to Lambert, “If I was DeepSeek I would definitely create a ton of synthetic data from the best API model out there. They’re short on GPUs and flush with cash. It’s literally effectively more compute for them. yes on the Gemini distill question.”

Let’s not forget that the US-China trade war is hindering China’s technological advancement, including export controls that block access to advanced semiconductors. So it’s not surprising that Chinese companies like DeepSeek are finding alternative, compute-efficient ways to train their models.


