The widespread disruption caused by the recent CrowdStrike software glitch, which led to a global outage of Windows systems, has sent shockwaves through the IT community. For CIOs, the event serves as a stark reminder of the inherent risks associated with over-reliance on a single vendor, particularly in the cloud.
The incident, which saw IT systems crash and display the infamous “blue screen of death” (BSOD), exposed the vulnerabilities of heavily cloud-dependent infrastructures.
While the issue is being resolved, it has highlighted the potential for catastrophic consequences when a critical security component fails. This has forced CIOs to question the resilience of their cloud environments and explore alternative strategies.
Reevaluating cloud dependencies
“When an issue of such magnitude happens and causes such a big disruption, it is important and necessary to revisit your existing beliefs, decisions, and tradeoffs that went into arriving at the current architecture,” said Abhishek Gupta, CIO at DishTV, one of India’s largest cable TV providers. “The outcome of the review may still be the same decision but necessary to review,” Gupta said, adding that DishTV is already re-evaluating its cloud strategy in a phased manner after the CrowdStrike incident.
Shashank Jain, CIO at financial services firm Shree Financials, suggested a strategic shift. “Organizations and CISOs must review their cloud strategies, and the automatic updating of patches should be discouraged. All patches should first be tested on a test server,” Jain said, adding that despite CrowdStrike’s reputation, the incident revealed a failure of trust, with untested patches causing a cascading effect.
Saurabh Gugnani, Director and Head of CyberDefence, IAM, and Application Security at Netherlands-headquartered TMF Group, added that a diversified approach to cloud strategies could mitigate such risks. “Yes, they [enterprises] should revisit cloud strategies. It has to be a mix of all the available solutions.”
A few organizations have already started to act.
“In response to recent disruptions affecting our critical operations, we have proactively updated our Business Continuity Plan to address unexpected downtimes and minimize the impact on productivity and service delivery,” said Shivkumar Borade, founder and CMD of Mytek Innovations, a victim of the BSOD effect. “Our revised plan includes enhanced communication management, featuring multiple layers to ensure all employees are well-informed about potential issues and their resolution.”
The company’s internal communication was significantly disrupted as its entire network, including Outlook, Teams, and SharePoint, is hosted on Microsoft 365.
“However, our in-house developed application remained unaffected due to GoDaddy’s use of its own hosting infrastructure,” said Borade. “We did experience issues with a few API integrations linked to the Azure platform, which were non-functional for the entire day. This disruption led to interrupted services for both our clients and users.”
A wake-up call for CIOs
A primary concern for CIOs is vendor lock-in. The reliance on a single cloud provider, as demonstrated by the CrowdStrike incident, creates a single point of failure. If a critical service from that provider is disrupted, it can have far-reaching implications for an organization. To mitigate this risk, CIOs are likely to explore multicloud or hybrid cloud architectures, distributing workloads across multiple platforms.
Allie Mellen, a principal analyst at Forrester, emphasized the critical nature of reliable tools and services in the face of cyber threats.
“Reliability of the tools and services cybersecurity teams use is critical in the face of cyberattacks,” Mellen stated. “An incident like this questions that reliability. This will undoubtedly raise questions and concerns from executives about how to ensure the reliability of enterprise systems, especially with technology as integrated into day-to-day operations as cybersecurity software.”
The incident exposed the fragility of cloud-dependent systems where a single point of failure can have cascading effects across an organization. Sunil Varkey, senior security professional and advisor at Beagle Security, noted, “Trust between cloud and security vendors is now questioned. This breach of confidence is likely to drive a higher emphasis on agentless solutions, which can offer enhanced security without the vulnerabilities associated with traditional agents.”
Given the magnitude of its impact, the outage has been described as one of the worst cybersecurity events to date. The CrowdStrike incident affected computers running Microsoft Windows across various sectors, including airlines, banks, retailers, brokerage houses, media companies, and railways. The travel sector was notably impacted, with airlines and airports in Germany, France, the Netherlands, the UK, the US, Australia, China, Japan, India, Singapore, and Taiwan facing significant issues with check-in and ticketing systems, leading to flight delays and airport chaos.
Microsoft said around 8.5 million Windows computers were affected.
The impact was severe enough that SpaceX and Tesla CEO Elon Musk said CrowdStrike had been deleted from all of his companies’ systems.
Enhanced risk management practices
The incident has highlighted the need for improved risk management practices. Enhanced due diligence, rigorous testing of updates, and phased rollouts are now critical.
“This incident serves as a wake-up call, emphasizing the need for continuous adaptation and improvement in cybersecurity practices across the industry,” said Gaurav Ranade, CTO at RAH Infotech.
D.R. Goyal, senior architect at Rakuten Symphony, advocated for a mechanism to test updates with select users before a full release: “It should have a mechanism to test with certain organizations with a set of users before releasing to the entire community and user base to reduce the impact.”
As the digital landscape evolves, ensuring the resilience of cloud-based systems is paramount. Ashis Guha, founder of An Idea Global Innovations, highlighted broader implications: “The incident has broader implications for the global economy; longer downtimes and recovery times will impact productivity and economics.”
Industry experts recommend several strategies for future preparedness, including phased rollouts, comprehensive testing, and robust backup systems.
Siddharth Ugrankar, co-founder of blockchain firm Qila, suggested that a phased deployment and thorough testing of updates could have mitigated the impact: “If CrowdStrike had deployed the update in a phased manner, the impact would have been far less.”
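For readers who want to see what a phased rollout looks like in practice, the following is a minimal Python sketch of the idea, not a description of how CrowdStrike or any other vendor actually ships updates. The `Ring` structure, the `phased_rollout` function, the host names, and the 1% failure threshold are all hypothetical and chosen only for illustration. The update goes to a small test group first, and each larger group receives it only if the previous group stayed within the failure budget.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ring:
    """One rollout stage: a named group of hosts (hypothetical structure)."""
    name: str
    hosts: List[str]


def phased_rollout(
    rings: List[Ring],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.01,  # assumed threshold, for illustration only
) -> List[str]:
    """Apply an update ring by ring and halt when a ring exceeds the failure budget.

    Returns the names of the rings that completed within the budget.
    """
    completed = []
    for ring in rings:
        if not ring.hosts:
            # An empty ring has nothing to validate; skip it.
            continue
        # apply_update returns True on success, False on failure.
        failures = sum(1 for host in ring.hosts if not apply_update(host))
        if failures / len(ring.hosts) > max_failure_rate:
            # Halt here: later, larger rings never receive the faulty update.
            break
        completed.append(ring.name)
    return completed


# Usage: a simulated update that fails on every pilot machine.
rings = [
    Ring("test-lab", ["lab-1", "lab-2"]),
    Ring("pilot-users", ["pilot-1", "pilot-2", "pilot-3"]),
    Ring("all-users", [f"host-{i}" for i in range(100)]),
]


def faulty_update(host: str) -> bool:
    return not host.startswith("pilot")


print(phased_rollout(rings, faulty_update))  # ['test-lab']
```

In this simulation the failure surfaces in the three-machine pilot group, so the 100 machines in the final ring are never touched. That containment is the benefit the experts quoted here are pointing to.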
Enterprises aiming to prevent issues akin to the CrowdStrike update incident should bolster their update management by enhancing testing protocols across diverse environments, implementing rigorous risk assessments, and fortifying change management processes with robust governance frameworks, said Moyukh Goswami, CTO at Nuvepro.
“Strengthening monitoring capabilities, refining incident response plans tailored to update failures, and fostering proactive vendor relationships are crucial,” Goswami added.
The CrowdStrike incident underscores the need for CIOs to revisit and fortify their cloud strategies. By implementing robust risk management practices, enhancing security measures, and diversifying cloud solutions, organizations can better protect themselves against future disruptions.
As the industry grapples with the implications of this event, the focus must shift towards building resilient, adaptable, and well-tested cloud strategies to navigate an increasingly complex digital landscape.