Meta works overtime copying DeepSeek's homework
"Managers and engineers from Meta’s generative AI group and infrastructure team have started four war rooms to learn how DeepSeek works. Two of the mobilized groups are trying to understand how High-Flyer lowered the cost of training and running DeepSeek. Meta wants to apply those techniques, a number of which a technical paper from High-Flyer outlined, to Llama, one of the employees said. ...

A third Meta research group is trying to figure out what data High-Flyer might have used to train its models, according to one of the employees with direct knowledge.

The fourth war room is considering new techniques for restructuring Meta’s models based on attributes of the DeepSeek models, they said. Meta is considering launching a version of Llama that, like DeepSeek, would include numerous AI models, each trained to handle different tasks. That way, when a customer asks Llama to handle a certain task, only some parts of the model would need to work on it. That could make the overall model faster and require less computing power to operate."