Delta CEO says CrowdStrike-Microsoft outage cost the airline $500 million

  • Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
  • The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
  • Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”
hydrashok ,

Pretty sure their software’s legal agreement, and the corresponding enterprise legal agreement, already cover this.

The update was the first domino, but the real issue was the disarray of Delta’s IT Operations and their inability to adequately recover in a timely fashion. Sounds like a customer skimping on their lifecycle and capacity planning so that Ed can get just a bit bigger bonus for meeting his budget numbers.

Brkdncr ,

Negligence can make contracts a little less permanent.

modeler ,

Couldn’t agree more.

And now that this occurred, and cost $500m, perhaps finally some enterprise companies may actually resource IT departments better and allow them to do their work. But who am I kidding, that’s never going to happen if it hits bonuses and dividends :(

Semi_Hemi_Demigod ,

I wasn’t affected by this at all and only followed it on the news and through memes, but I thought this was something that needed hands-on-keyboard to fix, which I could see not being the fault of IT because they stopped planning for issues that couldn’t be handled remotely.

Was there some kind of automated way to fix all the machines remotely? Is there a way Delta could have gotten things working faster? I’m genuinely curious because this is one of those Windows things that I’m too Macintosh to understand.

Shadow ,

All the servers and infrastructure should have “lights out management”. I can turn on a server, reconfigure the bios and install windows from scratch on the other side of the world.

Potentially all the workstations / end point devices would need to be repaired though.

The initial day or two I’ll happily blame on crowdstrike. After that, it’s on their IT department for not having good DR plans.

curbstickle ,

Hell, I just did that with what’s effectively a black box this morning - if it’s critical, it gets done the right way or I don’t bother doing it at all.

PriorityMotif ,
riskable ,

Yeah… Maybe don’t put all your IT eggs in one basket next time.

Delta is the one that chose to use Crowdstrike on so many critical systems therefore the fault still lies with Delta.

Every big company thinks that when they outsource a solution or buy software they’re getting out of some responsibility. They’re not. When that 3rd party causes a critical failure the proverbial finger still points at the company that chose to use the 3rd party.

The shareholders of Delta should hold this guy responsible for this failure. They shouldn’t let him get away with blaming Crowdstrike.

clstrfck ,

So you think Delta should’ve had a different antivirus/EDR running on every computer?

Th4tGuyII ,

I think what @riskable was saying is you shouldn't have multiple mission critical systems all using the same 3rd party services. Have a mix of at least two, so if one 3rd party service goes down not everything goes down with it
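
As a rough illustration of the "mix of at least two" idea, here is a minimal Python sketch. The host names and vendor labels are hypothetical, and it only models which machines stay up when one vendor fails; it says nothing about the cost of running two products side by side.

```python
from itertools import cycle


def assign_vendors(hosts, vendors):
    """Assign each host to a vendor in round-robin order."""
    return dict(zip(hosts, cycle(vendors)))


def surviving_hosts(assignment, failed_vendor):
    """Return the hosts whose vendor is not the one that failed."""
    return [host for host, vendor in assignment.items() if vendor != failed_vendor]


# Hypothetical fleet of six mission-critical hosts split across two vendors.
hosts = [f"crew-sched-{i}" for i in range(6)]
assignment = assign_vendors(hosts, ["vendor_a", "vendor_b"])

# If vendor_a ships a bad update, only the hosts assigned to vendor_b remain.
survivors = surviving_hosts(assignment, "vendor_a")
print(survivors)
```

With two vendors, half of the hosts survive a single-vendor failure instead of none, which is the whole point of the suggestion.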

partial_accumen ,

That sounds easy to say, but in execution it would be massively complicated. Modern enterprises are littered with 3rd party services all over the place. The alternative is writing and maintaining your own solution in house, which is an incredibly heavy lift to cover the entirety of all services needed in the enterprise. Most large enterprises are resource-starved as is, and this suggestion of having redundancy for any 3rd party service that touches mission critical workloads would probably increase burden and costs by at least 50%. I don’t see that happening in commercial companies.

Th4tGuyII ,

As far as the companies go, their lack of resources is an entirely self-inflicted problem, because they won't invest in increasing those resources, like more IT infrastructure and staff.
It's the same as many companies that keep terrible backups of their data (if any) when they're not bound to by the law, because they simply don't want to pay for it, even though it could very well save them from ruin.

The crowdstrike incident was as bad as it was exactly because loads of companies had their eggs in one basket. Those that didn't recovered much quicker. Redundancy is the lesson to take from this that none of them will learn.

partial_accumen ,

As far as the companies go, their lack of resources is an entirely self-inflicted problem, because they won’t invest in increasing those resources, like more IT infrastructure and staff.

Play that out to its logical conclusion.

  • Our example airline suddenly doubles or triples its IT budget.
  • The increased costs don’t actually increase profit; they merely increase resiliency.
  • Other airlines don’t do this.
  • Our example airline has to increase ticket prices or fees to cover the increased IT spending.
  • Other airlines don’t do this.
  • Customers start predominantly flying the other airlines with their cheaper fares.
  • Our example airline goes out of business, or gets acquired by one of the other airlines

The end result is all operating airlines are back to the prior stance.

brianary ,

Two big assumptions here.

First, multiple business systems are already being supported, and the OS only incidentally. Assuming double or triple IT costs is very unlikely, but feel free to post evidence to the contrary.

Second, a tight coupling between costs and prices. Anyone that’s been paying attention to gouging and shrinkflation of the past few years of record profits, or the doomsaying virtually anywhere the minimum wage has increased and businesses haven’t been annihilated, would know this is nonsense.

partial_accumen ,

First, multiple business systems are already being supported, and the OS only incidentally. Assuming double or triple IT costs is very unlikely, but feel free to post evidence to the contrary.

The suggestion the poster made was that ALL 3rd party services need to have an additional counterpart for redundancy. So we’re not just talking about a second AV vendor. We have to duplicate ALL 3rd party services running on or supporting critical workloads to meet what that poster is suggesting.

  • inventory agents
  • OS patching
  • security vulnerability scanning
  • file and DB level backup
  • monitoring and alerting
  • remote access management
  • PAM management
  • secrets management
  • config management

…the list goes on.

Anyone that’s been paying attention to gouging and shrinkflation of the past few years of record profits, or the doomsaying virtually anywhere the minimum wage has increased and businesses haven’t been annihilated, would know this is nonsense.

You’re suggesting the companies simply take less profits? Those companies’ boards of directors will get annihilated by shareholders. The board would be voted out with their IT improvement plans, and replaced with those that would return to profitability.

bomibantai ,

customers start predominantly flying the other airlines with cheaper fares

I was with you till this part, except with the way flying is set up in this country, there’s very little competition between airlines. They’ve essentially set themselves up with airports/hubs so if an airline is down for a day, that’s kinda it unless you want to switch to a different airport.

partial_accumen ,

In the USA, besides very small cities, this isn’t my experience. My flights out of my home airport are spread across 5 or 6 airlines. My city doesn’t even break into the top ten largest in the nation. As for domestic destinations, there are usually 3 to 5 airlines available as choices.

cm0002 ,

Our example airline has to increase ticket prices or fees to cover the increased IT spending.

Or they could just cut already excessive executive bonuses…

ricecake ,

In this case, it’s a local third party tool and they thought they could control the cadence of updates. There was no reason to think there was anything particularly unstable about the situation.

This is closer to saying that half of your servers should be Linux and half should be windows in case one has a bug.

Crowdstrike bypassed user controls on updates.
The normal responsible course of action is to deploy an update to a small test environment, test to make sure it doesn’t break anything, and then slowly deploy it to more places while watching for unexpected errors.
Crowdstrike shotgunned it to every system at once without monitoring, with grossly inadequate testing, and entirely bypassed any user configurable setting to avoid or opt out of the update.
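
To make the "small test environment first, then slowly wider" idea concrete, here is a minimal Python sketch of a staged rollout. It is a toy simulation under stated assumptions: `apply_update` and `is_healthy` are hypothetical callbacks, the stage fractions are made up, and nothing here reflects how Crowdstrike or any real deployment tool works.

```python
def staged_rollout(hosts, apply_update, is_healthy, stages=(0.01, 0.10, 0.50, 1.0)):
    """Deploy to growing fractions of the fleet, halting after the first unhealthy batch.

    Returns a tuple (updated_hosts, halted).
    """
    updated = []
    for fraction in stages:
        # Number of hosts that should be updated once this stage completes.
        target = max(1, int(len(hosts) * fraction))
        batch = hosts[len(updated):target]
        for host in batch:
            apply_update(host)
            updated.append(host)
        # Check the batch before widening the rollout any further.
        if not all(is_healthy(host) for host in batch):
            return updated, True
    return updated, False


# Simulate an update that breaks every host it touches.
hosts = [f"host-{i}" for i in range(200)]
broken = set()
updated, halted = staged_rollout(hosts, broken.add, lambda host: host not in broken)
print(len(updated), halted)
```

In this simulation the bad update stops at the first small batch instead of reaching all 200 hosts, which is the safeguard that pushing to every system at once gives up.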

I was much more willing to put the blame on the organizations that had the outages for failing to follow best practices before I learned that the way the update was pushed would have entirely bypassed any of those safeguards.

It’s unreasonable to say that an organization needs to run multiple copies of every service with different fundamental infrastructure choices for each in case one magics itself broken.

kbin_space_program ,

Crowdstrike also bypassed Microsoft's driver signing as part of their update process, just to make the updates release faster.

That MS is getting any flak for this is just shit journalism.

riskable ,

If I were in charge I wouldn’t put anything critical on Windows. Not only because it’s total garbage from a security standpoint but it’s also garbage from a stability standpoint. It’s always had these sorts of problems and it always will because Microsoft absolutely refuses to break backwards compatibility and that’s precisely what they’d have to do in order to move forward into the realm of, “modern OS”. Things like NTFS and the way file locking works would need to go. Everything being executable by default would need to end and so, so much more low-level stuff that would break like everything.

Aside about stability: You just cannot keep Windows up and running for long before you have to reboot due to the way file locking works (nearly all updates can’t apply until the process owning them “lets go”, as it were and that process usually involves kernel stuff… due to security hacks they’ve added on since WinNT 3.5 LOL). You can’t make it immutable. You can’t lock it down in any effective way without disabling your ability to monitor it properly (e.g. with EDR tools). It just wasn’t made for that… It’s a desktop operating system. Meant for ONE user using it at a time (and one main application/service, really). Trying to turn it into a server that runs many processes simultaneously under different security contexts is just not what it was meant to do. The only reason why that kinda sort of works is because of hacks upon hacks upon hacks and very careful engineering around a seemingly endless array of stupid limitations that are a core part of the OS.

kbin_space_program , (edited )

Please go read up on how this error happened.

This is not a backwards compatibility thing, or on Microsoft at all, despite the flaws you accurately point out. For that matter the entire architecture of modern PCs is a weird hodgepodge of new systems tacked onto older ones.

  1. Crowdstrike's signed driver was set to load at boot, edit: by Crowdstrike.
  2. Crowdstrike's signed driver was running unsigned code at the kernel level and it crashed. It crashed because the code was trying to read a pointer from the corrupt file data, and it had no protection at all against a bad file.

Just to reiterate: It loaded up a file and read from it at the kernel level without any checks that the file was valid.

  3. As it should, Windows treats any crash at the kernel level as a critical issue and bluescreens the system to protect it.

The entire fix is to boot into safe mode and delete the corrupt update file crowdstrike sent.
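
For anyone wondering what "checks that the file was valid" might look like, here is a minimal user-space Python sketch. The file layout (a `CHNL` magic value, an entry count, then a table of offsets) is entirely hypothetical and is not Crowdstrike's real format; the point is only that every length and offset is checked before it is used, so a corrupt file produces an error instead of an out-of-bounds read.

```python
import struct
from typing import List

MAGIC = b"CHNL"                 # hypothetical magic bytes, not a real format
HEADER = struct.Struct("<4sI")  # magic, then number of offset entries


def parse_offsets(data: bytes) -> List[int]:
    """Return the offset table, raising ValueError on any malformed input."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for header")
    magic, count = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    table_end = HEADER.size + 4 * count
    if table_end > len(data):
        raise ValueError("offset table runs past end of file")
    offsets = struct.unpack_from(f"<{count}I", data, HEADER.size)
    for offset in offsets:
        # Every offset must land inside the payload that follows the table.
        if not table_end <= offset < len(data):
            raise ValueError(f"offset {offset} points outside payload")
    return list(offsets)


# A well-formed file: one entry pointing at the 4-byte payload after the table.
good = HEADER.pack(MAGIC, 1) + struct.pack("<I", 12) + b"\x00" * 4
print(parse_offsets(good))

# A corrupt file is rejected instead of being dereferenced.
try:
    parse_offsets(bytes(16))
except ValueError as err:
    print("rejected:", err)
```

A kernel driver would do the equivalent checks in C, but the principle is the same: treat the file as untrusted input and refuse it on the first inconsistency.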

clstrfck ,

I enjoy hating on Windows as much as the next guy who installed Linux on their laptop once, but the bottom line is 90 percent of businesses use it because it does work.

Blaming the people who made the decision to purchase arguably the most popular EDR solution on the planet and use it (those bastards!) does nothing but show a lack of understanding of how any business-related IT decisions work.

riskable ,

Adding another reply since I went on a bit of a rant in my other one… You’re actually missing the point I was trying to make: No matter what solution you choose it’s still your fault for choosing it. There are a zillion mitigations and “back up plans” that can be used when you feel like you have no choice but to use a dangerous 3rd party tool (e.g. one that installs kernel modules). Delta obviously didn’t do any of that due diligence.

catloaf ,

Sounds like they executed their plans just fine.

And due diligence is “the investigation or exercise of care that a reasonable business or person is normally expected to take before entering into an agreement or contract with another party or an act with a certain standard of care”. Having BC/DR plans isn’t part of due diligence.

Poem_for_your_sprog ,

Why do news outlets keep calling it a Microsoft outage? It’s only a crowdstrike issue right? Microsoft doesn’t have anything to do with it?

Rekhyt ,

It was a Crowdstrike-triggered issue that only affected Microsoft Windows machines. Crowdstrike on Linux didn’t have issues and Windows without Crowdstrike didn’t have issues. It’s appropriate to refer to it as a Microsoft-Crowdstrike outage.

Poem_for_your_sprog ,

I guess microsoft-crowdstrike is fair, since the OS doesn’t have any kind of protection against a shitty antivirus destroying it.

I keep seeing articles that just say “Microsoft outage”, even on major outlets like CNN.

hperrin ,

Sure, but they did send a $10 Uber Eats gift card, so you gotta take that into account.

TheAuthor_13 ,

Good. They’ve been stealing from their customers for decades; this is fuckin’ karmic.

themeatbridge ,

Also, maybe don’t put all your eggs into one single basket, from an infrastructure perspective.

stoy ,

Yeah, I say as I migrate another service to Azure…

ASDraptor ,

499.999.990

Remember that you got your $10 gift card for Uber eats.

FlyingSquid ,

Which didn’t work.

JohnnyCanuck ,

It worked but there was a $10 convenience fee.

Evotech ,

No, only the partners did

Railcar8095 ,

Technically, it was a $10 gift card for each IT technician, so that could have been a whole $100!

Not so bad after all

JJROKCZ ,

I can’t wait to see crowdstrike get liquidated from all of this, MSOFT is getting so much flak when this straight up wasn’t their fault

kevindqc ,

Their stock is at +44% since July 2023, they might be fine

Evotech ,

Pure gambling

kubica ,

The “reboot 15 times” solution, etc., is a bit on their side. But in general I agree, CrowdStrike and the industries that need that kind of service should know better.

catloaf ,

Why would they be liquidated?

PlasticExistence ,

Crowdstrike wouldn’t have a business model if the security of Microsoft Windows wasn’t so awful. Microsoft isn’t directly to blame for this, but they’re not blameless either.

werefreeatlast ,

But at least a door didn’t just fall off a plane. That’s already fixed in windows 11 the shitwagon release.

dhork ,

Bastian said the figure includes not just lost revenue but “the tens of millions of dollars per day in compensation and hotels” over a period of five days. The amount is roughly in line with analysts’ estimates. Delta didn’t disclose how many customers were affected or how many canceled their flights.

It’s important to note that the DOT recently clarified a rule that reinforced that if an airline cancels a flight, they have to compensate the customer. So that’s the real reason why Delta had to spend so much, they couldn’t ignore their customers and had to pay out for their inconvenience.

kxan.com/…/can-you-get-compensation-if-your-fligh…

So think about how much worse it might have been for fliers if a more industry-friendly Transportation Secretary were in charge. The airlines might not have had to pay out nearly as much to stranded customers, and we’d be hearing about how stranded fliers got nothing at all.

solsangraal ,

womp.

womp.

MediaBiasFactChecker Bot ,

CNBC Media Bias Fact Check Credibility: [High]

CNBC is rated with High Credibility by Media Bias Fact Check.

> Bias: Left-Center
> Factual Reporting: Mostly Factual
> Country: United States of America
> Full Report: mediabiasfactcheck.com/cnbc/
