OpenAI says it has found indications that the Chinese AI start-up DeepSeek used its proprietary models to develop an open-source competitor, raising concerns about potential intellectual property violations. The San Francisco-based company said there is evidence of “distillation,” a technique in which developers boost a smaller model’s performance by training it on the outputs of a larger, more capable model. The approach can achieve similar results on targeted tasks at a fraction of the cost.
OpenAI did not elaborate on its findings. Its terms of service prohibit users from copying any of its services or from using its output to build competing models.
The launch of DeepSeek’s R1 reasoning model has caught the attention of investors and Silicon Valley tech firms for its strong performance on reasoning tasks, where it ranks comparably to top US models. An insider remarked that while distillation is common in the industry, and OpenAI gives developers a way to do it through its platform, problems arise when it is used to build a separate model for independent use.
DeepSeek has not yet responded to requests for comment regarding these claims. Previously, David Sacks, an AI and crypto advisor, suggested the possibility of intellectual property infringement.
Sacks explained that the distillation technique involves one model learning from another, effectively extracting knowledge from the parent model. He expressed the belief that DeepSeek likely distilled knowledge from OpenAI’s models, although he did not present specific evidence supporting this claim.
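The technique Sacks describes can be illustrated with a minimal sketch of classic logit-based knowledge distillation, in which a student model is trained against the teacher’s temperature-softened output distribution. This is an illustrative example only: distilling a model through a public API, as alleged here, would rely on sampled text outputs rather than raw logits, and the function names and temperature value below are assumptions for demonstration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened
    distributions, scaled by T^2 (the standard soft-target loss)."""
    p = softmax(teacher_logits, temperature)  # soft targets from teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = np.sum(p * (np.log(p) - np.log(q)))
    return temperature ** 2 * kl

# The loss is zero when the student reproduces the teacher exactly,
# and positive otherwise; minimizing it transfers the teacher's
# "knowledge" (its relative confidence across outputs) to the student.
```

In API-based distillation, by contrast, the smaller model is simply fine-tuned on prompt–response pairs generated by the larger model, which is why such use can be hard to detect and to prevent.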
DeepSeek reported that it utilized 2,048 Nvidia H800 graphics cards and invested $5.6 million to train its V3 model, comprising 671 billion parameters, which is considerably less than what OpenAI and Google have spent on similarly scaled models. Some analysts observed that the model produced responses indicative of being trained on OpenAI’s GPT-4 outputs, which would contravene OpenAI’s terms of service.
Industry experts say it is standard for AI labs, in both China and the US, to draw on outputs from leading companies. Major players such as OpenAI have spent heavily on hiring people to refine their models toward more human-like responses, a process that is both costly and slow. Smaller entities often piggyback on that work rather than repeat it.
A PhD candidate in AI noted that training new models on outputs from human-aligned commercial large language models (LLMs) is a prevalent practice, and that if DeepSeek has done so, curbing the practice may prove difficult.
This situation underscores a growing dilemma for pioneering companies facing competition from those that benefit from their work without incurring the same expenses. Chinese firms have rapidly assimilated knowledge from their US counterparts while developing strategies to optimize their limited hardware for cost-effective model training and execution.
OpenAI has acknowledged that organizations, including those based in China, are consistently attempting to distill models from leading US AI companies. The company stated it is actively working to shield its intellectual property, employing careful measures regarding what advanced capabilities get included in released models, and collaborating with the US government to safeguard its most sophisticated technologies from being appropriated by competitors.
OpenAI is concurrently addressing its own copyright infringement claims from various media outlets and authors, who allege that its models were trained using their works without proper authorization.