How to prevent your software update from being the next CrowdStrike

Times Square billboards displaying Windows blue screen of death after CrowdStrike outage on July 19, 2024.
Image Credits: Selcuk Acar/Anadolu / Getty Images

CrowdStrike released a relatively minor patch on Friday, and somehow it wreaked havoc on large swaths of the IT world running Microsoft Windows, bringing down airports, healthcare facilities and 911 call centers. While we know a faulty update caused the problem, we don’t know how it got released in the first place. A company like CrowdStrike very likely has a sophisticated DevOps pipeline with release policies in place, but even with that, the buggy code somehow slipped through.

In this case it was perhaps the mother of all buggy code. The company has suffered a steep hit to its reputation, and the stock price plunged from $345.10 on Thursday evening to $263.10 by Monday afternoon. It has since recovered slightly.

In a statement on Friday, the company acknowledged the consequences of the faulty update: “All of CrowdStrike understands the gravity and impact of the situation. We quickly identified the issue and deployed a fix, allowing us to focus diligently on restoring customer systems as our highest priority.”

Further, it explained the root cause of the outage, although not how it happened. That’s a postmortem process that will likely go on inside the company for some time as it looks to prevent such a thing from happening again.

Dan Rogers, CEO at LaunchDarkly, a firm that uses a concept called feature flags to deploy software in a highly controlled way, couldn’t speak directly to the CrowdStrike deployment problem, but he could speak to software deployment issues more broadly.

“Software bugs happen, but most of the software experience issues that someone would experience are actually not because of infrastructure issues,” he told TechCrunch. “They’re because someone rolled out a piece of software that doesn’t work, and those in general are very controllable.” With feature flags, you can control how quickly a new feature is deployed and, if things go wrong, turn the feature off to prevent the problem from spreading widely.
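To make the idea concrete, here is a minimal sketch of a feature flag in Python. It is not LaunchDarkly’s API; the `FeatureFlag` class, its `rollout_percent` and `enabled` fields, and the flag name are all hypothetical, chosen only to illustrate a percentage rollout paired with a kill switch.

```python
import hashlib


class FeatureFlag:
    """Hypothetical in-memory feature flag (illustrative only, not a vendor API)."""

    def __init__(self, name: str, rollout_percent: int = 0) -> None:
        self.name = name
        self.rollout_percent = rollout_percent  # share of users who get the feature
        self.enabled = True  # kill switch: set to False to turn the feature off everywhere

    def _bucket(self, user_id: str) -> int:
        # Hash the flag name and user ID so each user lands in a stable bucket (0-99).
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_on(self, user_id: str) -> bool:
        if not self.enabled:
            return False
        return self._bucket(user_id) < self.rollout_percent


flag = FeatureFlag("new-parser", rollout_percent=10)
users = [f"user-{i}" for i in range(1000)]
exposed = sum(flag.is_on(u) for u in users)
print(f"{exposed} of {len(users)} users see the new code path")

flag.enabled = False  # something went wrong: switch the feature off for everyone
print(any(flag.is_on(u) for u in users))
```

Because the bucket is derived from a hash rather than a random draw, a user who sees the feature at 10% keeps seeing it as the percentage grows, which keeps the experience consistent during a rollout.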

It is important to note, however, that in this case the problem was at the operating system kernel level, and once that has run amok, it’s harder to fix than, say, a web application. Still, a slower deployment could have alerted the company to the problem a lot sooner.

What happened at CrowdStrike could potentially happen to any software company, even one with good software release practices in place, said Jyoti Bansal, founder and CEO at Harness, a maker of DevOps pipeline developer tools. While he also couldn’t say precisely what happened at CrowdStrike, he talked generally about how buggy code can slip through the cracks.

Typically, there is a process in place where code gets tested thoroughly before it gets deployed, but sometimes an engineering team, especially in a large engineering group, may cut corners. “It’s possible for something like this to happen when you skip the DevOps testing pipeline, which is pretty common with minor updates,” Bansal told TechCrunch.

He says this often happens at larger organizations where there isn’t a single approach to software releases. “Let’s say you have 5,000 engineers, which probably will be divided into 100 teams of 50 or so different developers. These teams adopt different practices,” he said. And without standardization, it’s easier for bad code to slip through the cracks.

How to prevent bugs from slipping through

Both CEOs acknowledge that bugs get through sometimes, but there are ways to minimize the risk, including perhaps the most obvious one: practicing standard software release hygiene. That involves testing before deploying and then deploying in a controlled way.

Rogers points to his company’s software and notes that progressive rollouts are the place to start. Rather than delivering the change to every user all at once, you release it to a small subset and see what happens before expanding the rollout. Along the same lines, if you have controlled rollouts and something goes wrong, you can roll back. “This idea of feature management or feature control lets you roll back features that aren’t working and get people back to the prior version if things are not working,” he said.
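A progressive rollout with automatic rollback can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor’s implementation: the stage percentages, the 1% error threshold and the `error_rate_at` callback (a stand-in for whatever monitoring a team actually uses) are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def progressive_rollout(
    stages: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> Tuple[int, List[Tuple[int, float]]]:
    """Expand a release stage by stage; roll back to 0% at the first unhealthy stage.

    Returns the final rollout percentage and a history of (percent, error_rate).
    """
    history: List[Tuple[int, float]] = []
    current = 0
    for percent in stages:
        rate = error_rate_at(percent)  # hypothetical health check at this exposure level
        history.append((percent, rate))
        if rate > max_error_rate:
            return 0, history  # roll back: everyone returns to the prior version
        current = percent
    return current, history


# A healthy release reaches every stage.
final, _ = progressive_rollout([1, 5, 25, 100], lambda percent: 0.001)
print("healthy release ends at", final, "percent")

# A release that starts failing at 25% exposure is rolled back before reaching 100%.
final, history = progressive_rollout(
    [1, 5, 25, 100], lambda percent: 0.2 if percent >= 25 else 0.001
)
print("faulty release ends at", final, "percent after", len(history), "stages")
```

The point of the design is that the faulty release never reaches the final stage: the damage is limited to the users exposed when the health check first failed.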

Bansal, whose company just bought feature flag startup Split.io in May, also recommends what he calls “canary deployments,” which are small controlled test deployments. The name harks back to canaries being sent into coal mines to test for carbon monoxide leakage. Once the test rollout looks good, you can move to the progressive rollout that Rogers alluded to.
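One way to picture a canary check is as a gate that compares the small test deployment against the version already in production. The Python sketch below is a simplified illustration, not Harness’s product: the `Metrics` type, the `canary_passes` function, the tolerance and the minimum request count are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Hypothetical request and error counts collected from one deployment."""

    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def canary_passes(
    baseline: Metrics,
    canary: Metrics,
    tolerance: float = 0.005,
    min_requests: int = 100,
) -> bool:
    """Return True only if the canary has enough traffic and is not meaningfully worse."""
    if canary.requests < min_requests:
        return False  # too little evidence to judge the canary either way
    return canary.error_rate <= baseline.error_rate + tolerance


baseline = Metrics(requests=10_000, errors=10)
good_canary = Metrics(requests=500, errors=1)
bad_canary = Metrics(requests=500, errors=50)

print("good canary promoted:", canary_passes(baseline, good_canary))
print("bad canary promoted:", canary_passes(baseline, bad_canary))
```

Only a canary that passes the gate would move on to the wider progressive rollout; one that fails is pulled while it is still affecting a small number of users.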

As Bansal says, code can look fine in testing, but a lab test doesn’t always catch everything. That’s why you have to combine good DevOps testing with controlled deployment to catch the things that lab tests miss.

Rogers suggests that when you analyze your software testing regimen, you look at three key areas: platform, people and processes. In his view, all three work together. “It’s not sufficient to just have a great software platform. It’s not sufficient to have highly enabled developers. It’s also not sufficient to just have predefined workflows and governance. All three of those have to come together,” he said.

One way to prevent individual engineers or teams from circumventing the pipeline is to require the same approach for everyone, but in a way that doesn’t slow the teams down. “If you build a pipeline that slows down developers, they will at some point find ways to get their job done outside of it because they will think that the process is going to add another two weeks or a month before we can ship the code that we wrote,” Bansal said.

Rogers agrees that it’s important not to put rigid systems in place in response to one bad incident. “What you don’t want to have happen now is that you’re so worried about making software changes that you have a very long and protracted testing cycle and you end up stifling software innovation,” he said.

Bansal says a thoughtful automated approach can actually be helpful, especially with larger engineering groups. But there is always going to be some tension between security and governance and the need for release velocity, and it’s hard to find the right balance.

We might not know what happened at CrowdStrike for some time, but we do know that certain approaches help minimize the risks around software deployment. Bad code is going to slip through from time to time, but if you follow best practices, it probably won’t be as catastrophic as what happened last week.
