What CIOs should learn now that DeepSeek is here

Chinese AI startup DeepSeek made a big splash last week when it unveiled an open-source version of its reasoning model, DeepSeek-R1, claiming performance superior to OpenAI’s o1 generative pre-trained transformer (GPT).

The news caused NVIDIA, the leading maker of GPUs used to power AI in data centers, to shed nearly $600 billion of its market cap on Monday. According to Gartner, DeepSeek’s innovations appear to use significantly less advanced hardware and fewer computing resources, while still offering performance comparable to other leading LLMs at a fraction of the cost.

CIOs are now reassessing their strategies to transform their organizations with gen AI, but it’s not exactly time to throw out the work that’s already been done.

“DeepSeek’s advancements could lead to more accessible and affordable AI solutions, but they also require careful consideration of strategic, competitive, quality, and security factors,” says Ritu Jyoti, group VP and GM, worldwide AI, automation, data, and analytics research with IDC’s software market research and advisory practice.

That echoes a statement issued by NVIDIA on Monday: “DeepSeek is a perfect example of test time scaling. DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export control compliant. Inference requires significant numbers of NVIDIA GPUs and high-performance networking.”
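Test-time scaling means spending more compute at inference time rather than at training time, for example by sampling several candidate answers and keeping the one a verifier scores highest. A minimal best-of-n sketch of the idea (the toy arithmetic task and deterministic candidate generator are illustrative stand-ins, not DeepSeek’s or NVIDIA’s actual method):

```python
def generate_candidates(prompt, n):
    # Deterministic stand-in for n stochastic LLM samples: guesses
    # spread around the true sum of the two numbers in the prompt.
    a, b = prompt
    return [a + b + offset for offset in range(-(n // 2), n - n // 2)]

def verify(prompt, answer):
    # Easy-to-check reward: exact match against the arithmetic result.
    a, b = prompt
    return 1.0 if answer == a + b else 0.0

def best_of_n(prompt, n):
    # Test-time scaling: spend more inference compute (larger n) to
    # raise the chance that some candidate passes the verifier.
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=lambda ans: verify(prompt, ans))

print(best_of_n((17, 25), n=8))  # → 42
```

The point NVIDIA is making follows from the structure of the loop: every extra candidate is another inference pass, so verification-heavy techniques like this consume GPUs at serving time, not just during training.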

Open to interpretation

Chirag Dekate, a VP analyst at Gartner who specializes in quantum technologies, AI, digital R&D, and emerging tech, believes the market is overreacting to both technical details of what was required to train DeepSeek, and the source of the innovation itself.

“It feeds into this perception of us versus some unknown them, and also into a narrative of jingoism or nationalism,” he says. “These narratives are taking hold because they capture the imagination faster than anybody actually double-clicking into the technical report, because when they see the details, they’re less glamorous than the headlines made them out to be.”

That’s not to disregard DeepSeek’s innovations, however. In a research note, Gartner said DeepSeek challenges the prevailing gen AI cost structures and methodologies, underscoring the inefficiencies in current leading vendor pricing models that can lead to negative ROI for high-value use cases deployed at scale.

“DeepSeek’s R1 model thus represents a pivotal shift, suggesting that the future of gen AI lies in innovative, cost-efficient approaches rather than the traditional paradigm of scaling through sheer computational force,” Gartner researchers, including Haritha Khandabattu, Jeremy D’Hoinne, Rita Sallam, Leinar Ramos, and Arun Chandrasekaran, wrote in a research note Wednesday.

Peter Rutten, research VP, performance intensive computing, and worldwide infrastructure research at IDC, says the key takeaway from the DeepSeek results is that the current approach to AI training — the assumption that AI can only improve with bigger, more, and faster architecture — is not justified.

“New approaches to algorithm, framework, and software for AI development deliver comparable or even better results than, for example, the latest version of ChatGPT, with the same accuracy and at a fraction of the infrastructure cost,” says Rutten. “What this means is that AI training doesn’t need to be the sole domain of hyperscalers who can afford to invest billions of dollars into large infrastructure buildouts.”

Instead, he adds, the approach DeepSeek developed shows that large AI development is within reach for enterprises from a cost and footprint perspective.

“Medium-sized or small AI initiatives also become significantly more affordable, including customizing or finetuning a model, as well as inferencing on a model,” he says. “I believe AI will become affordable — perhaps, over time, as affordable as any other workload, thanks to the type of technologies that DeepSeek developed.”

Deep interest for CIOs

Dekate believes the DeepSeek news is yet another reminder of the speed at which AI innovation is accelerating, and that CIOs need to engage with gen AI now, if they haven’t already, or risk becoming obsolete.

“CIOs have a choice to either jump in, start experimenting, start creating gen AI strategies, implementation, and deployment strategies today, or fall so far behind that catching up isn’t even an option,” he says.

Even if the market is overreacting to the degree to which DeepSeek disrupts the current gen AI landscape, Dekate says it’s a clear sign CIOs can’t afford to wait any longer.

“DeepSeek is showcasing that the cost vectors of gen AI will eventually become more effective and more approachable,” he says.

IDC’s Jyoti notes that Kai-Fu Lee, chairman and CEO of Sinovation Ventures, who was the founding director of Microsoft Research Asia and is former president of Google China, predicted last year that Chinese AI startups would focus on creating efficiencies.

“Digging through their secret sauce, it’s evident it’s all about RL [reinforcement learning] and how they used it,” Jyoti adds. “Most language models use a combination of pre-training, supervised fine-tuning, and then some RL to polish things up. DeepSeek’s approach has shown that LLMs are capable of reasoning with RL alone.”

Making the distinction

DeepSeek-R1 is a new open-weight LLM based on the DeepSeek-V3 base model. DeepSeek-R1-Zero is an interim model trained solely via RL. Gartner says it demonstrates that model providers can use pure RL to increase capabilities in certain domains, like math and coding, where answers are hard to generate but easy to verify.
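The “hard to generate but easy to verify” property is what makes pure RL viable in these domains: the training loop needs only a programmatic reward signal, not human-labeled demonstrations. A hedged sketch of such a reward function for math-style outputs (the `<answer>` tag convention is an assumption for illustration, not DeepSeek’s documented format):

```python
import re

def extract_answer(completion):
    # Assumed convention: the model wraps its final answer in
    # <answer>...</answer> tags (illustrative, not DeepSeek's exact format).
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return match.group(1).strip() if match else None

def math_reward(completion, ground_truth):
    # Producing a correct derivation is hard; checking the final answer
    # is a cheap string comparison -- exactly the asymmetry that makes
    # RL-only training work for math and coding tasks.
    answer = extract_answer(completion)
    return 1.0 if answer == ground_truth else 0.0

print(math_reward("Step 1: 17 + 25. <answer>42</answer>", "42"))  # 1.0
print(math_reward("I think it's 41. <answer>41</answer>", "42"))  # 0.0
```

In coding domains the same role is played by running the model’s output against unit tests; in both cases the reward is automatic, which is why these are the domains Gartner singles out.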

But Gartner researchers said the DeepSeek model doesn’t represent a new model paradigm. Rather, it builds on the existing LLM training architecture, layering on technical and architectural optimizations to make training and inference more efficient. Nor does DeepSeek set a new state of the art for model performance: the Gartner researchers added that it often matches, but doesn’t surpass, existing state-of-the-art models. They also said DeepSeek isn’t proof that scaling models via additional compute and data doesn’t matter. Instead, it shows it pays off to scale a more efficient model.

“DeepSeek’s R1 launch and its dramatically lower inference pricing compared to OpenAI’s o1-preview model go hand in hand with the broader commoditization of the LLM model layer,” they wrote. “That means efficiency isn’t about cost per token anymore,” the researchers added. “It’s about which model can reason the cheapest, without impacting accuracy and latency. So the focus will soon turn to efficient scaling of AI versus how much compute you can assemble to build it.”
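The distinction between cost per token and cost to reason can be made concrete with a bit of arithmetic: a model that is cheaper per token but needs more reasoning tokens, or is less accurate, can still cost more per correct answer. All prices, token counts, and accuracies below are hypothetical, chosen only to illustrate the calculation:

```python
def cost_per_correct_answer(price_per_m_tokens, tokens_per_attempt, accuracy):
    # Expected spend to obtain one correct answer: the cost of a single
    # attempt divided by the probability that the attempt succeeds.
    attempt_cost = price_per_m_tokens * tokens_per_attempt / 1_000_000
    return attempt_cost / accuracy

# Hypothetical models (numbers are illustrative, not vendor pricing):
frontier = cost_per_correct_answer(60.00, 2_000, 0.90)   # pricey, concise, accurate
efficient = cost_per_correct_answer(2.50, 6_000, 0.85)   # cheap per token, more verbose

print(f"frontier:  ${frontier:.4f} per correct answer")
print(f"efficient: ${efficient:.4f} per correct answer")
```

Under these assumed numbers the cheaper-per-token model still wins, but the gap is far narrower than the raw per-token prices suggest, which is the researchers’ point: the metric that matters is the cheapest path to an acceptable answer, not the sticker price per token.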

The researchers, agreeing with their colleague Dekate, note that in the wake of the DeepSeek announcement, other model builders like Meta are in their war rooms devising plans to follow. CIOs, therefore, should expect a rapid short- to mid-term reduction in the cost and price of LLMs, but only to a degree.

“These software and algorithmic-driven innovations also allow model vendors to do more with more powerful hardware,” they wrote. “The most advanced new models will still have high R&D and compute costs that’ll be passed on to early adopters.”

IDC’s Jyoti offers five key takeaways for CIOs:

  • Cost efficiency: DeepSeek’s AI models claim to achieve high performance at a fraction of the cost compared to traditional models. This could mean that companies might not need to invest as heavily in infrastructure and hardware, potentially lowering the barriers to entry for advanced AI capabilities.
  • Competitive landscape: DeepSeek’s emergence as a strong competitor to established AI giants like OpenAI and Meta suggests the AI landscape is becoming more competitive. This could drive innovation and force existing players to improve their offerings and reduce costs.
  • Open-weight models: DeepSeek’s decision to release its models as “open-weight” allows developers and researchers to access and build upon its technology. This openness could foster a more collaborative environment in the AI community, accelerating advancements and applications.
  • Strategic re-evaluation: With DeepSeek demonstrating that high-performance AI can be achieved with less data and lower costs, CIOs might need to reassess their AI strategies. This includes evaluating current investments in AI infrastructure and considering more cost-effective alternatives.
  • Data privacy and security: Given that DeepSeek is based in China, there may be concerns about data privacy and security. CIOs should carefully consider the implications of integrating technology from companies that operate under different regulatory environments.

Forrester principal analysts Carlos Casanova, Michele Pelino, and Michele Goetz further note that CIOs should expect DeepSeek to impact edge computing technologies, AIOps, and IT operations. In particular, DeepSeek has the ability to explain its answers by default, delivering transparency that’s crucial to building trust and understanding in AI-driven decisions in AIOps solutions.

“With LLMs running on edge devices, AIOps and observability can achieve new levels of real-time insight and automation,” they wrote. “The integration of smaller-footprint LLMs that can run at the edge — such as DeepSeek R1 — with AIOps can also lead to more proactive and predictive maintenance of devices and infrastructure, or injection of risk-mitigating actions with no human intervention.”


January 30, 2025