How to prevent your software update from being the next CrowdStrike

Times Square billboards displaying Windows blue screen of death after CrowdStrike outage on July 19, 2024.
Image Credits: Selcuk Acar/Anadolu / Getty Images

CrowdStrike released a relatively minor patch on Friday, and somehow it wreaked havoc on large swaths of the IT world running Microsoft Windows, bringing down airports, healthcare facilities and 911 call centers. While we know a faulty update caused the problem, we don’t know how it got released in the first place. A company like CrowdStrike very likely has a sophisticated DevOps pipeline with release policies in place, but even with that, the buggy code somehow slipped through.

In this case it was perhaps the mother of all buggy code. The company has suffered a steep hit to its reputation, and the stock price plunged from $345.10 on Thursday evening to $263.10 by Monday afternoon. It has since recovered slightly.

In a statement on Friday, the company acknowledged the consequences of the faulty update: “All of CrowdStrike understands the gravity and impact of the situation. We quickly identified the issue and deployed a fix, allowing us to focus diligently on restoring customer systems as our highest priority.”

Further, it explained the root cause of the outage, although not how it happened. That’s a postmortem process that will likely go on inside the company for some time as it looks to prevent such a thing from happening again.

Dan Rogers, CEO at LaunchDarkly, a firm that uses a concept called feature flags to deploy software in a highly controlled way, couldn’t speak directly to the CrowdStrike deployment problem, but he could speak to software deployment issues more broadly.

“Software bugs happen, but most of the software experience issues that someone would experience are actually not because of infrastructure issues,” he told TechCrunch. “They’re because someone rolled out a piece of software that doesn’t work, and those in general are very controllable.” With feature flags, you can control how quickly new features are deployed and turn a feature off if things go wrong, preventing the problem from spreading widely.
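To make the kill-switch idea concrete, here is a minimal sketch in Python. It is not LaunchDarkly's API: the `FeatureFlags` class, the `new_checkout` flag name and the `render_checkout` function are all hypothetical, and a real system would read flags from a remote service rather than an in-memory dictionary.

```python
class FeatureFlags:
    """Hypothetical in-memory flag store, for illustration only."""

    def __init__(self):
        self._flags = {}

    def set(self, name, enabled):
        # Flipping a flag changes behavior at runtime, with no redeploy.
        self._flags[name] = bool(enabled)

    def is_enabled(self, name):
        # Unknown flags default to off, so new code stays dark until enabled.
        return self._flags.get(name, False)


def render_checkout(flags):
    # The new code path ships alongside the old one, guarded by the flag.
    if flags.is_enabled("new_checkout"):
        return "new checkout"
    return "old checkout"


flags = FeatureFlags()
flags.set("new_checkout", True)   # turn the feature on
flags.set("new_checkout", False)  # kill switch: turn it off again
```

The point of the design is that the old path stays in place until the new one has proven itself, so turning a feature off is a configuration change rather than a new release.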

It is important to note, however, that in this case the problem was at the operating system kernel level, and once that has run amok, it’s harder to fix than, say, a web application. Still, a slower deployment could have alerted the company to the problem a lot sooner.

What happened at CrowdStrike could potentially happen to any software company, even one with good software release practices in place, said Jyoti Bansal, founder and CEO at Harness, a maker of DevOps pipeline developer tools. While he also couldn’t say precisely what happened at CrowdStrike, he talked generally about how buggy code can slip through the cracks.

Typically, there is a process in place where code gets tested thoroughly before it gets deployed, but sometimes an engineering team, especially in a large engineering group, may cut corners. “It’s possible for something like this to happen when you skip the DevOps testing pipeline, which is pretty common with minor updates,” Bansal told TechCrunch.

He says this often happens at larger organizations where there isn’t a single approach to software releases. “Let’s say you have 5,000 engineers, which probably will be divided into 100 teams of 50 or so different developers. These teams adopt different practices,” he said. And without standardization, it’s easier for bad code to slip through the cracks.

How to prevent bugs from slipping through

Both CEOs acknowledge that bugs get through sometimes, but there are ways to minimize the risk, including perhaps the most obvious one: practicing standard software release hygiene. That involves testing before deploying and then deploying in a controlled way.

Rogers points to his company’s software and notes that progressive rollouts are the place to start. Instead of delivering the change to every user all at once, you instead release it to a small subset and see what happens before expanding the rollout. Along the same lines, if you have controlled rollouts and something goes wrong, you can roll back. “This idea of feature management or feature control lets you roll back features that aren’t working and get people back to the prior version if things are not working.”
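One common way to build a percentage rollout, sketched here under assumptions, is to hash each user into a stable bucket from 0 to 99 and enable the change only for buckets below the current percentage. The function names and the `new_sensor_config` flag name are hypothetical, and this is not taken from any vendor's product.

```python
import hashlib


def bucket(user_id: str, flag: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    # Hashing flag and user together gives each flag its own ordering of
    # users, so the same people are not always first in line.
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id: str, flag: str, percent: int) -> bool:
    """Return True if this user should receive the change at this stage."""
    return bucket(user_id, flag) < percent


# Hypothetical staged plan: widen the audience step by step.
# Rolling back means setting the percentage back to 0.
stages = [1, 5, 25, 100]
users = [f"user-{i}" for i in range(10_000)]
for percent in stages:
    exposed = sum(in_rollout(u, "new_sensor_config", percent) for u in users)
    print(f"{percent:>3}% stage -> {exposed} of {len(users)} users exposed")
```

Because the bucket is derived from the user ID rather than chosen at random on each request, a user who sees the change at 5% still sees it at 25%, and dropping the percentage to 0 removes it for everyone.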

Bansal, whose company just bought feature flag startup Split.io in May, also recommends what he calls “canary deployments,” which are small controlled test deployments. The name harks back to the canaries sent into coal mines to test for carbon monoxide leakage. Once the test rollout looks good, you can move to the progressive rollout that Rogers alluded to.
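A canary check can be as simple as comparing the canary's error rate with the existing fleet's before promoting the release. The sketch below is an illustration only: the `canary_passes` function, the 1.5x tolerance, the 0.1% floor and the 100-request minimum are assumed values, not figures from Harness or from either CEO.

```python
def canary_passes(baseline_errors, baseline_total,
                  canary_errors, canary_total,
                  max_ratio=1.5, floor=0.001, min_requests=100):
    """Decide whether a canary looks healthy enough to promote.

    All thresholds are illustrative assumptions, not recommendations.
    """
    # Too little canary traffic: refuse to promote rather than guess.
    if canary_total < min_requests:
        return False
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    # Allow some noise above the baseline; the floor keeps a near-zero
    # baseline from making the gate impossible to pass.
    allowed = max(baseline_rate * max_ratio, floor)
    return canary_rate <= allowed


# Healthy canary: same error rate as the existing fleet.
print(canary_passes(10, 10_000, 1, 1_000))
# Unhealthy canary: 5% errors against a 0.1% baseline.
print(canary_passes(10, 10_000, 50, 1_000))
```

If the check fails, the release stops at the canary group. If it passes, the wider progressive rollout can begin.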

As Bansal says, code can look fine in testing, but a lab test doesn’t always catch everything. That’s why you have to combine good DevOps testing with controlled deployment to catch what lab tests miss.

Rogers suggests that when analyzing your software testing regimen, you look at three key areas: platform, people and processes. In his view, all three work together. “It’s not sufficient to just have a great software platform. It’s not sufficient to have highly enabled developers. It’s also not sufficient to just have predefined workflows and governance. All three of those have to come together,” he said.

One way to prevent individual engineers or teams from circumventing the pipeline is to require the same approach for everyone, but in a way that doesn’t slow the teams down. “If you build a pipeline that slows down developers, they will at some point find ways to get their job done outside of it because they will think that the process is going to add another two weeks or a month before we can ship the code that we wrote,” Bansal said.

Rogers agrees that it’s important not to put rigid systems in place in response to one bad incident. “What you don’t want to have happen now is that you’re so worried about making software changes that you have a very long and protracted testing cycle and you end up stifling software innovation,” he said.

Bansal says a thoughtful automated approach can actually be helpful, especially with larger engineering groups. But there is always going to be some tension between security and governance and the need for release velocity, and it’s hard to find the right balance.

We might not know what happened at CrowdStrike for some time, but we do know that certain approaches help minimize the risks around software deployment. Bad code is going to slip through from time to time, but if you follow best practices, it probably won’t be as catastrophic as what happened last week.
