DeepSeek, a Chinese artificial intelligence (AI) startup, quickly became a key player in the global AI industry. Founded in 2023 by entrepreneur Liang Wenfeng, the company introduced a series of large language models (LLMs) that challenged established AI developers such as OpenAI and Anthropic and reshaped expectations for AI development.
Founding and Vision
In July 2023, Liang, co-founder of the hedge fund High-Flyer, established DeepSeek in Hangzhou, Zhejiang Province, China. With a background in electronic information engineering from Zhejiang University, he aimed to advance AI research by developing efficient, openly accessible models. Funded entirely by High-Flyer, DeepSeek could focus on research without external financial pressure.
Early Developments
DeepSeek launched its first AI model, DeepSeek-Coder, in November 2023. This open-source model was designed to assist developers with code generation and comprehension. Later that month, the company introduced the DeepSeek-LLM series, available in 7 billion (7B) and 67 billion (67B) parameter configurations. Trained on a dataset of 2 trillion tokens, these models supported both English and Chinese, positioning DeepSeek as a significant player in the AI field.
Innovations in Model Architecture
In January 2024, DeepSeek introduced DeepSeek-MoE, which used a Mixture-of-Experts (MoE) architecture. This design lets the model activate only a subset of its parameters for each input, improving computational efficiency without compromising performance. As a result, DeepSeek-MoE achieved results comparable to larger dense models at lower computational cost.
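To make the idea of activating only a subset of parameters concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is an illustration under stated assumptions, not DeepSeek's actual implementation: the names `moe_forward`, `make_expert`, and `gate_weights`, the toy experts that simply scale their input, and the choice of `top_k=2` are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by `scale`."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts only."""
    # Router score per expert: dot product of x with that expert's gate vector.
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)

    # Select the top_k experts; the remaining experts are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Renormalize the selected probabilities so the mixing weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        weight = probs[i] / norm
        y = experts[i](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out, chosen


# Four toy experts, but only two run for this input.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, active = moe_forward([2.0, 1.0], gate_weights, experts, top_k=2)
print("active experts:", active)
print("output:", output)
```

With four experts and `top_k=2`, only half of the expert parameters are used for this input, which is the source of the computational savings described above. Production MoE layers add details this sketch omits, such as load balancing across experts.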
In February 2024, DeepSeek released DeepSeek-Math, a model tailored for advanced mathematical problem-solving. Trained on specialized mathematical texts and problem sets, this 7B parameter model excelled in tasks requiring complex mathematical reasoning and theorem proving, further diversifying DeepSeek’s AI offerings.
Advancements and Global Impact
In May 2024, DeepSeek launched DeepSeek-V2, integrating Multi-head Latent Attention (MLA) and an improved MoE framework. With a 128,000-token context length, the model handled long documents efficiently. Its API pricing of 2 RMB per million output tokens significantly undercut competitors in China and abroad.
In December 2024, DeepSeek introduced DeepSeek-V3, a 671B parameter model trained on 14.8 trillion tokens. Optimized for mathematics, coding, and multilingual tasks, it achieved high performance using fewer resources. Training required about 2,000 Nvidia H800 graphics processing units (GPUs) over 55 days at a reported cost of $5.58 million, challenging the belief that significant AI advancements demand massive computational resources and budgets.
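As a rough sanity check on these figures, the short calculation below derives the implied GPU-hours and cost per GPU-hour. It uses only the rounded numbers quoted above (about 2,000 GPUs, 55 days, $5.58 million) and assumes every GPU ran continuously for the whole period, so the result is an approximation rather than an official rate.

```python
# Rounded figures quoted in the article; continuous utilization is an assumption.
GPUS = 2_000
DAYS = 55
TOTAL_COST_USD = 5_580_000

gpu_hours = GPUS * DAYS * 24
implied_rate = TOTAL_COST_USD / gpu_hours

print(f"{gpu_hours:,} GPU-hours")            # 2,640,000 GPU-hours
print(f"${implied_rate:.2f} per GPU-hour")   # $2.11 per GPU-hour
```

Under these assumptions, the quoted budget works out to roughly $2 per GPU-hour, which indicates the figure covers compute for the training run itself rather than broader research or staffing costs.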
Pioneering Reasoning Capabilities
In January 2025, DeepSeek released DeepSeek-R1, designed to enhance reasoning. Built from DeepSeek-V3-Base, it leveraged reinforcement learning to improve logical inference, mathematical reasoning, and real-time problem-solving. Its performance matched leading models like OpenAI’s o1, positioning DeepSeek at the forefront of AI reasoning research.
Market Disruption and Global Recognition
By early 2025, DeepSeek’s AI assistant had become the most downloaded free app on Apple’s App Store. This rapid adoption disrupted global tech markets and highlighted shifting AI leadership dynamics. DeepSeek’s combination of open-source accessibility and cost-effective, high-performance models set new standards, challenging major tech firms.
Commitment to Innovation
Reflecting on DeepSeek’s growth, Liang emphasized the company’s guiding principle: “Our goal is not to lose money or seek huge profits… We are here to lead in technology and contribute to the ecosystem’s development.” This philosophy shows DeepSeek’s focus on advancing AI for broader societal benefit rather than purely commercial gain.
Conclusion
In less than two years, DeepSeek evolved from a startup to a key contender in AI. Through a series of strategic model launches and a commitment to innovation and accessibility, it redefined efficiency and performance in AI development. As the field continues to evolve, DeepSeek’s journey demonstrates how visionary leadership and research-driven strategies can drive groundbreaking advancements and shape AI’s future.