DeepSeek says its R1 update rivals top AI models in math, coding and logic
Chinese AI startup DeepSeek has released an upgraded version of its R1 model, named DeepSeek-R1-0528, which significantly enhances its capabilities in mathematics, programming, and logical reasoning. This update positions DeepSeek as a formidable competitor to leading AI models from OpenAI and Google.
DeepSeek-R1-0528
- Mathematics: The model achieves a 97.3% score on the MATH-500 benchmark, surpassing OpenAI's o1-1217, which scores 96.4%.
- Programming: On the Codeforces platform, DeepSeek-R1-0528 attains an Elo rating of 2,029, placing it in the top 3.7% of human coders. Additionally, it scores 65.9% on the LiveCodeBench benchmark, outperforming OpenAI's o1-mini model.
- Logical Reasoning: The model demonstrates strong performance on the AIME 2024 benchmark with a 79.8% score, slightly edging out OpenAI's o1-1217 at 79.2%.
These enhancements are attributed to DeepSeek's innovative use of reinforcement learning techniques and a multi-stage training process that emphasizes reasoning and problem-solving skills.
Cost Efficiency
One of DeepSeek-R1-0528's standout features is its cost-effectiveness. The model was reportedly trained for around $12 million, significantly lower than the estimated $40 million for OpenAI's o1-1217. This affordability extends to its usage: output costs are $2.19 per million tokens, compared to $60.00 for OpenAI's o1.
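To put those per-token prices in perspective, here is a minimal sketch of the cost arithmetic, using the article's quoted figures ($2.19 vs. $60.00 per million output tokens) rather than any official pricing page:

```python
def output_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for generating `tokens` output tokens at a given
    per-million-token price."""
    return tokens / 1_000_000 * price_per_million

# Per-million-token output prices as quoted in this article (assumptions,
# not verified against the providers' current pricing pages).
DEEPSEEK_R1 = 2.19
OPENAI_O1 = 60.00

tokens = 5_000_000  # e.g. five million generated tokens
print(f"DeepSeek-R1: ${output_cost(tokens, DEEPSEEK_R1):.2f}")
print(f"OpenAI o1:   ${output_cost(tokens, OPENAI_O1):.2f}")
```

At that volume the gap is roughly $11 versus $300, which is why the per-token price difference matters more than it might first appear.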
Open-Source Accessibility
DeepSeek-R1-0528 is available on the Hugging Face platform, making it accessible to developers and researchers worldwide. Its open-source nature encourages collaboration and innovation within the AI community.