Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

Case in point: taking stock of the CrowdStrike outages

Last summer, a faulty CrowdStrike software update took down millions of computers, caused billions in damages, and underscored that companies are still not able to manage third-party risks, or respond quickly and efficiently to disruptions.

“It was an interesting case study of global cyber impact,” says Charles Clancy, CTO at Mitre.

In response to the outage, 84% of companies are either considering diversifying their software and service providers, or are already doing so, according to a survey by Adaptavist released in late January.

For companies that had been using CrowdStrike, switching vendors might seem like an obvious solution.

“But then what endpoint detection and response platform should you use instead?” Clancy asks. “Ditching them isn’t the answer if they’re the best product on the market.”

What happened

According to CrowdStrike’s own root cause analysis, the cybersecurity company’s Falcon system deploys a sensor to user machines to monitor for potential dangers. On July 19, 2024, CrowdStrike released an update, and it crashed user machines.

The company released a fix 78 minutes later, but applying it required users to manually access the affected devices, reboot in safe mode, and delete a bad file. An automated fix wasn’t released until three days later.

A total of 8.5 million computers were affected. As a result of the outage, thousands of flights were canceled and tens of thousands delayed worldwide. Several hospitals canceled surgeries as well, and banks, airports, public transit systems, 911 centers, and multiple government agencies — including the Department of Homeland Security — also suffered outages.

The overall cost was estimated at $5.4 billion for Fortune 500 firms alone, according to an analysis by Parametrix, and total economic damages could run into tens of billions, Nir Perry, CEO of cyber insurance risk platform Cyberwrite, told Reuters. By comparison, the previous record-holder for most expensive downtime was the 2017 AWS outage, which cost customers an estimated $150 million.

Delta alone had more than $500 million in losses as a result of crippled operations and thousands of flight cancellations and delays. In a lawsuit the airline filed in October, Delta claimed the faulty update was pushed out in an unsafe manner and CrowdStrike should pay for the losses. In a countersuit, CrowdStrike blamed Delta for the airline’s problems, saying that other airlines were able to recover much faster, and that the contract between the two companies meant Delta wasn’t allowed to sue for damages.

In total, CrowdStrike’s stock price fell from $343 the day before the outage to a low of $218 on August 2. That’s a loss of over $30 billion or more than a third of its total market capitalization.
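
The "more than a third" figure can be checked against the two share prices quoted above. The snippet below is only a back-of-the-envelope check; the function name `percent_drop` is an illustrative choice, not something from the source.

```python
def percent_drop(before: float, after: float) -> float:
    """Return the decline from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100


# Share prices quoted in the article: the day before the outage and the August 2 low.
drop = percent_drop(343.0, 218.0)
print(f"Decline: {drop:.1f}%")  # prints "Decline: 36.4%"
```

A decline of about 36.4% is consistent with the article's description of a loss of more than a third.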

But as of January 28, the company’s stock price was over $400, an all-time high, helped by a perfect score on an industry test for ransomware detection and by improvements to its quality control processes. After the outage, CrowdStrike added a check for that particular problem, as well as other tests, deployment layers, and checks. Customers also got additional controls over how updates are deployed.

In addition, CrowdStrike hired two independent software security vendors to review the Falcon sensor code, its quality control, and release processes, and also changed how its updates are released: more gradually, to “increasing rings of deployment,” says Adam Meyers, CrowdStrike’s SVP for counter adversary operations. “This allows us to monitor for issues in a controlled environment and proactively roll back changes if problems are detected before affecting a wider population,” he told a Congressional subcommittee in September.
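
The "rings of deployment" idea Meyers describes can be sketched in a few lines. This is a minimal illustration, not CrowdStrike's actual process: the `Ring` class, the `staged_rollout` function, the host names, and the health check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Ring:
    name: str
    hosts: List[str]


def staged_rollout(
    rings: List[Ring],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Optional[str]:
    """Deploy ring by ring. If any host in a ring is unhealthy after the
    update, roll back every host updated so far and return that ring's name.
    Return None if all rings were updated successfully."""
    updated: List[str] = []
    for ring in rings:
        for host in ring.hosts:
            deploy(host)
            updated.append(host)
        if not all(healthy(host) for host in ring.hosts):
            for host in reversed(updated):
                rollback(host)
            return ring.name
    return None


# Toy simulation with made-up hosts: every host starts on version "v1".
versions = {h: "v1" for h in ["int-1", "ea-1", "ea-2", "ga-1", "ga-2"]}
touched: Set[str] = set()  # hosts that ever received the update

rings = [
    Ring("internal", ["int-1"]),
    Ring("early-adopters", ["ea-1", "ea-2"]),
    Ring("general", ["ga-1", "ga-2"]),
]


def deploy(host: str) -> None:
    versions[host] = "v2"
    touched.add(host)


def rollback(host: str) -> None:
    versions[host] = "v1"


def healthy(host: str) -> bool:
    # Assumption for the demo: the update breaks exactly one early-adopter host.
    return host != "ea-2"


failed_ring = staged_rollout(rings, deploy, healthy, rollback)
print(failed_ring)           # prints "early-adopters"
print("ga-1" in touched)     # prints "False": the widest ring never got the update
```

In this toy run the problem surfaces in the second ring, so the update is rolled back before it reaches the general population.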

But while CrowdStrike made changes, companies around the world re-evaluated how much trust they placed in their vendors, reviewed their software security processes, and refocused their attention on resilience.

Trust, but verify. On second thought, don’t trust…

The outage was a rude awakening for Akamai, a content delivery company, says CIO and SVP Kate Prouty. “It was a reminder of how incredibly interconnected the world is,” she says.

Akamai was not itself a CrowdStrike customer, but does use similar services from outside vendors to help protect its systems.

“The first thing we did was audit all the solutions we have that have an agent that sits on a machine and has access to an operating system to make sure none of them have auto update,” she says. “When you have a third-party vendor that pushes updates to a system automatically, that takes control out of your hands.”
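
An audit like the one Prouty describes amounts to filtering an inventory for agents that combine operating system access with automatic updates. The sketch below assumes a hypothetical inventory format; the `Agent` fields and the sample entries are invented for illustration and do not reflect Akamai's tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Agent:
    name: str
    os_level_access: bool  # does the agent run with access to the operating system?
    auto_update: bool      # can the vendor push updates without local approval?


def needs_review(agents: List[Agent]) -> List[Agent]:
    """Return agents that have OS-level access and also update automatically."""
    return [a for a in agents if a.os_level_access and a.auto_update]


# Hypothetical inventory for the demo.
inventory = [
    Agent("edr-sensor", os_level_access=True, auto_update=True),
    Agent("log-forwarder", os_level_access=False, auto_update=True),
    Agent("disk-encryption", os_level_access=True, auto_update=False),
]

flagged = needs_review(inventory)
for agent in flagged:
    print(agent.name)  # prints "edr-sensor"
```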

But turning off automatic updates can be a problem for some companies. What if there’s an urgent security fix? It can take time to test each update to make sure it works before rolling it out — time bad actors can take advantage of.

If there’s a security threat and potential exposure, you have to go through the testing process as quickly as you can, Prouty says. “There’s no point in patching even a security issue without knowing if it’s going to cause harm in your environment,” she adds.

Akamai has a structure in place that allows it to do the testing quickly, and involves both automation and human intervention. “It’s worth doing that extra step of diligence because it can save you problems down the road,” she says. After the testing is complete, the update is then rolled out in stages. “It doesn’t completely eliminate the risk, but it certainly reduces the risk of having a large-scale impact,” she adds.

When possible, Akamai avoids using tools that require agents, though there are areas, including cybersecurity, where they’re necessary and the benefits outweigh the risks. “But we didn’t have a lot of them to audit, and we didn’t find anything that was misconfigured,” says Prouty.

Akamai also has other measures in place to reduce the risk of problems third-party software causes, including microsegmentation and identity-based authentication and access controls.

Contracts, audits, and SBOMs

Beyond protecting enterprise architecture from dangerous updates, and dangerous software in general, there are other steps companies can take to safeguard their software supply chain, starting with selecting the vendor and signing the contract. “I’m a CIO in an enviable position in that we sell security solutions that work very well,” Prouty says. “Our legal team knows exactly what to ask for when negotiating contracts. If a company isn’t willing to provide us with what we require to keep our company safe, then we don’t do business with them.”

According to the Cybersecurity and Infrastructure Security Agency, it’s hard for vendors to invest money in security if customers aren’t asking for it. That means, in addition to creating a secure by design philosophy within software companies, the industry also needs a secure by demand philosophy on the buyer side.

As part of this effort, CISA released a software acquisition guide in August for government enterprise customers that could serve as a model for enterprises in general.

The guide addresses four phases of software ownership: software supply chains, development practices, deployment, and vulnerability management. It says these help organizations that buy software better understand their software manufacturers’ approach to cybersecurity and ensure that secure by design is a core consideration.

After the CrowdStrike incident, Akamai began reviewing all its vendor agreements to make sure the contracts had all the necessary protections in place. “We’re still in the process of looking at everything,” Prouty says.

And, again, it’s not enough to take the vendor’s word for it that they’re safe. Akamai, for example, uses tools that audit the configuration of cloud software solutions, as well as run other security checks. “They’re not going to eliminate risk but they’ll significantly reduce it,” she says.

Another approach that enterprises are increasingly using is asking vendors to provide a software bill of materials (SBOM). In an Anchore survey released in November, 78% of organizations plan to increase their use of SBOMs in the next 18 months.
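
To make the idea concrete: an SBOM is a machine-readable inventory of the components inside a piece of software, so a buyer can check it when a vulnerable library is announced. The sketch below reads a small CycloneDX-style JSON document. The sample data and component names are made up, and real SBOMs carry many more fields than the few used here.

```python
import json
from typing import List, Optional, Tuple

# A tiny, made-up SBOM in the general shape of CycloneDX JSON.
SBOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "example-parser", "version": "2.1.0"},
    {"type": "library", "name": "example-crypto", "version": "0.9.3"}
  ]
}
"""


def list_components(sbom_text: str) -> List[Tuple[str, str]]:
    """Return (name, version) pairs for every component listed in the SBOM."""
    doc = json.loads(sbom_text)
    return [
        (c["name"], c.get("version", "unknown"))
        for c in doc.get("components", [])
    ]


def find_version(sbom_text: str, name: str) -> Optional[str]:
    """Return the version of the named component, or None if it is not listed."""
    for comp_name, version in list_components(sbom_text):
        if comp_name == name:
            return version
    return None


components = list_components(SBOM_JSON)
print(components)
print(find_version(SBOM_JSON, "example-crypto"))  # prints "0.9.3"
print(find_version(SBOM_JSON, "not-present"))     # prints "None"
```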

Building resilience

Unfortunately, all the precautions in the world can only reduce risk, not eliminate it. That’s why Akamai also plans for worst-case scenarios and runs drills to gauge its ability to respond quickly and to look for areas that need improvement. Immediately after the CrowdStrike outage happened, for example, Akamai ran a tabletop exercise.

“If this had happened to us, what would it look like?” Prouty asks. The exercise even involved running through CrowdStrike’s remediation process. The exercise worked, she says, and Akamai would’ve been able to recover if the bad update had slipped through the checks.

More companies should be doing these kinds of preparedness drills, says Mitre’s Clancy. “You need to understand your incident response plan, your communication plan, and not just have it written down, but practice it so those skills are fresh,” he says.

In addition, it’s important to involve more than just the security team in these exercises. “When you have an incident, the entire business is impacted,” he adds. “CIOs need to bring the other business executives in on these exercises and disaster response plans. In the real world, they’re the ones calling the shots, not some incident response manager three levels down.”

Resiliency is particularly important since enterprises can’t always test all third-party software. “Independently auditing every software update isn’t practical,” Clancy says. “The best thing to do is have playbooks in place to respond and recover if something like this does happen.”

But 84% of organizations didn’t have an adequate incident response plan in place before the CrowdStrike outage, the Adaptavist survey shows. And of those that did have a plan, only 16% found it effective during the crisis. Fortunately, that might now be changing.

After the outage, 54% of organizations say they’re implementing an incident response plan, or investing more into the one they have. Plus, about half are introducing or increasing investment into a variety of testing measures, and monitoring and observing technologies over the next 12 months.

Next steps

Guy Moskowitz, CEO and co-founder at Coro Cybersecurity, says the big problem is when vendors prioritize speed and profits over best practices. “CrowdStrike pushes out around a dozen updates every day,” he says. That’s a lot of opportunities for things to go wrong. “I hope we’ll see a push for legislation that recommends or even requires that all cybersecurity companies immediately implement staging environment safeguards to their software upgrade rollout process,” he adds. “This way, they’ll catch any mishaps in a secure environment before rolling out the update broadly to customers.”

He’s not the only one who wants to see government action. In the Adaptavist survey, 47% of respondents say they’re now more supportive of regulations around cybersecurity and resilience than they were before, and 48% are more supportive of regulations around software quality assurance. In addition, 49% endorse mandatory incident reporting requirements.

In August, the US Technology Policy Committee of the Association for Computing Machinery released a statement calling for a thorough investigation of the incident so both private enterprises and regulators can learn how to better strengthen cyberinfrastructure, improve incident response programs and remediation processes, improve international coordination and cooperation, and develop claims processes for these incidents.

“When mistakes happen, it can be serious — and this was a very serious incident,” says Jody Westby, vice-chair of ACM’s US Technology Policy Committee. “Companies had to go through and reset systems, and it took weeks to recover from this.”

But there’s only so much individual customers can do, she says.

“The big vendors aren’t going to have 5,000 different contracts with 5,000 different customers,” she says. “In some cases we can push contract clauses and say, ‘You’ll send us a SOC 2 report every year and you’ll attest you have all these controls.’ And they might sign and say yes, but you won’t really know. There’s only so far you can go with due diligence.”

What the CrowdStrike incident has done is highlight the need for better government assistance, she says.

The Association for Computing Machinery says there’s already an organization that seems uniquely positioned to investigate the incident and publish the results: CISA’s Cyber Safety Review Board. In its statement, the ACM urged the US government to give the CSRB the resources it needs to take on this investigation. That would have been nice, but instead the Department of Homeland Security disbanded it, citing “misuse of resources.”

The AI Safety and Security Board was also disbanded. That’s a particular problem because, just as with CrowdStrike, there’s a growing dependence on a small number of vendors. OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini, and Meta’s Llama are the foundation of nearly all enterprise AI applications, says Chuck Herrin, field CISO at security firm F5.

“Our rush to adopt AI without corresponding investment in security and resilience suggests we’re setting ourselves up for potentially catastrophic failures that could make the CrowdStrike incident appear minor in retrospect,” he says. “The CrowdStrike incident required physical access to affected systems for recovery, yet organizations are now creating AI dependencies so deep that manual intervention may become impossible.”



Category: News
April 2, 2025
