
There’s a new AI player in town, and you might want to pay attention.
On Monday, Chinese artificial intelligence company DeepSeek launched a new, open-source large language model called DeepSeek R1.
According to DeepSeek, R1 beats other popular LLMs (large language models), including OpenAI's, on many important benchmarks, and it is especially strong at mathematical, coding, and logic tasks.
DeepSeek R1 is actually a refinement of DeepSeek R1 Zero, an LLM that was trained without a traditionally used method called supervised fine-tuning. This made it very capable at certain tasks, but, as DeepSeek itself says, Zero suffered from “poor readability and language mixing.” Enter R1, which fixes these issues by incorporating “multi-stage training and cold-start data” before the reinforcement learning stage.
Arcane technical language aside (the details are online if you’re interested), there are several important things you should know about DeepSeek R1. First, it’s open source, meaning the model is up for scrutiny by experts, which should ease concerns about privacy and security. Second, it’s free to use as a web app, and API access is far cheaper than the competition’s: $0.14 per million input tokens, compared to $7.50 for OpenAI’s most powerful reasoning model, o1.
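If you’re curious what that API access looks like in practice, here’s a minimal sketch in Python. It assumes DeepSeek exposes an OpenAI-compatible chat endpoint at https://api.deepseek.com and serves R1 under a model name like "deepseek-reasoner"; both of those details are my assumptions rather than anything confirmed above, so check DeepSeek’s own documentation before relying on them.

# A minimal sketch of calling DeepSeek R1 over its API (assumed to be OpenAI-compatible).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # hypothetical placeholder key
    base_url="https://api.deepseek.com",  # assumed endpoint; verify in DeepSeek's docs
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed model name for R1
    messages=[
        {"role": "user", "content": "Prove how smart you are in exactly three sentences."},
    ],
)

print(response.choices[0].message.content)

# At the quoted $0.14 per million input tokens, even a 2,000-token prompt
# costs roughly $0.00028, which is the pricing point made above.

The web app, of course, requires no setup at all; the API route only matters if you want to plug R1 into your own code.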
Most importantly, this thing is very capable. To test it, I immediately threw it into the deep end and asked it to code a fairly complex web app that had to parse publicly available data and build a dynamic site with travel and weather information for tourists. Amazingly, DeepSeek produced perfectly acceptable HTML code in no time, and it was able to further refine the site based on my input, automatically improving and optimizing the code along the way.
I will do all that tomorrow…
Credit: Stan Schroeder / Mashable / DeepSeek
I also asked it to improve my chess skills in five minutes, and it replied with several well-organized and genuinely useful tips (my chess skills did not improve, but only because I was too lazy to actually act on DeepSeek’s advice).
Then I asked DeepSeek to prove how smart it is in exactly three sentences. In hindsight, this was a bad move on my part, because I, the human, am not smart enough to verify or fully understand any of those three sentences. Note that in the screenshot below, you can see DeepSeek’s “thought process” as it figures out the answer, which is probably even more fascinating than the answer itself.
We get it, you’re smart.
Credit: Stan Schroeder / Mashable / DeepSeek
Using it is impressive enough on its own. But as ZDNET notes, the backdrop to all of this is that R1 was trained at a significantly lower cost than some competing models, and on chips that aren’t as powerful as those available to US AI companies. DeepSeek thus shows that training and running extremely capable AI with reasoning abilities doesn’t have to be terribly expensive.