Every affected company should be extremely thankful that this was an accidental bug, because if CrowdStrike ever gets hacked, the bad actors could basically ransom I don’t know how many millions of computers overnight.
Not to mention that CrowdStrike will now be a massive target for hackers trying to do exactly this.
And it’s not just operational costs. What happens when an outage lasts 3+ days and affects all communication and travel? That’s another massive shock to the system.
The CEO made a statement to the effect of “It’s not an attack, it’s just me and my company being shockingly incompetent.” He didn’t use exactly those words but that was the gist.
If I had to bet my money, a bad machine with corrupted memory pushed the file at a very final stage of the release.
The astonishing thing is that for security software I would expect every file to be verified against a signature (that would have prevented this issue and some kinds of attacks).
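For anyone wondering what a check like that looks like, here is a rough sketch. Big assumptions: real code signing uses asymmetric signatures (vendor private key, public key shipped with the client), while this toy uses an HMAC from the Python standard library as a stand-in. The key, the file contents and the function names are all made up for illustration.

```python
import hashlib
import hmac

# Hypothetical shared key, for illustration only. A real updater would
# verify an asymmetric signature against the vendor's public key instead.
SIGNING_KEY = b"example-key-not-a-real-secret"

def sign(payload: bytes) -> bytes:
    """Compute the tag that ships alongside the file."""
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()

def verify(payload: bytes, signature: bytes) -> bool:
    """Return True only if the payload still matches its tag."""
    return hmac.compare_digest(sign(payload), signature)

good_file = b"\x01\x02\x03 channel data"
signature = sign(good_file)

# Same length as the good file, but every byte is zero.
corrupted_file = bytes(len(good_file))

print(verify(good_file, signature))       # good file passes
print(verify(corrupted_file, signature))  # zeroed file is rejected
```

Worth noting that this only catches corruption that happens after signing. If the file was already bad when it got signed, the check passes happily.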
So here’s my uneducated question: Don’t huge software companies like this usually do updates in “rollouts” to a small portion of users (companies) at a time?
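Roughly, yes. A minimal sketch of how a percentage rollout can be decided per machine, assuming each host has a stable ID. The function names and the bucketing scheme are hypothetical, not how any particular vendor does it.

```python
import hashlib

def rollout_bucket(host_id: str) -> int:
    """Map a host ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100

def should_receive_update(host_id: str, rollout_percent: int) -> bool:
    """A host gets the update once the rollout reaches its bucket."""
    return rollout_bucket(host_id) < rollout_percent

# Made-up fleet of 1000 hosts, widened stage by stage.
hosts = [f"host-{n}" for n in range(1000)]
for percent in (1, 10, 50, 100):
    count = sum(should_receive_update(h, percent) for h in hosts)
    print(f"{percent}% stage: {count} of {len(hosts)} hosts")
```

Because the bucket comes from a hash of the ID, a host that got the update at 10% still has it at 50%, and a bad update at the 1% stage only takes out a small slice of the fleet before someone can hit stop.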
That’s what the BSOD is. It tries to bring the system back to a nice safe freshly-booted state where e.g. the fans are running and the GPU is not happily drawing several kilowatts and trying to catch fire.
I’m gonna take from this that we should have AI doing disaster recovery on all deployments. Tech CEOs have been hyping AI up so much, what could possibly go wrong?
The problem is that software cannot deal with unexpected situations like a human brain can. Computers do exactly what a programmer tells them to do, nothing more, nothing less. So if a situation arises that the programmer hasn’t written code for, then there will be a crash.
The file is used to store values that serve as denominators in some divisions further down the process. Being all zeros caused a division by zero error. Pretty rookie mistake, you should do IFERROR(;0) when using divisions to avoid that.
I disagree. I’d rather things crash than silently succeed or change the computation. They should have done better input and output validation, and gracefully fail into a recoverable state that sends a message to an admin to correct. A divide by zero doesn’t crash a system, it’s a recoverable error they should 100% detect and handle, not sweep under the rug.
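Something like this is what I mean by validating and failing into a recoverable state. It is only a sketch built on the thread's premise that the file holds denominators. The exception class, the function names and the "last known good" fallback are all hypothetical.

```python
class InvalidConfigError(Exception):
    """Raised when an update file fails validation."""

def validate_denominators(values):
    """Reject the file up front instead of dividing by zero later."""
    if not values:
        raise InvalidConfigError("file is empty")
    zeros = [i for i, v in enumerate(values) if v == 0]
    if zeros:
        raise InvalidConfigError(f"zero denominator at positions {zeros}")
    return values

def load_update(new_values, last_known_good, notify_admin):
    """Use the new file if it validates, otherwise keep the old one and alert."""
    try:
        return validate_denominators(new_values)
    except InvalidConfigError as err:
        # The error is reported, not hidden, and the system keeps running.
        notify_admin(f"update rejected: {err}")
        return last_known_good

alerts = []  # stand-in for a real alerting channel
active = load_update([0, 0, 0], [2, 4, 8], alerts.append)
print(active)  # [2, 4, 8]
print(alerts)  # ['update rejected: zero denominator at positions [0, 1, 2]']
```

The point is that the bad file is rejected loudly with a specific error, the previous config stays in place, and a human gets told about it.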
Life pro tip: if you’re a Python programmer you should use try: func() except: continue every time you run a function, that way you would never have errors in your code.
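In case anyone takes that literally, here is a small sketch of what blanket exception swallowing actually does. Note that continue is only valid inside a loop, so the call is wrapped in one here. The function and the inputs are made up.

```python
def parse_denominator(raw):
    """Made-up helper: fails on "0" and on non-numeric input."""
    return 100 / int(raw)

results = []
for raw in ["4", "0", "oops", "5"]:
    try:
        results.append(parse_denominator(raw))
    except:  # bare except: swallows ZeroDivisionError, ValueError, everything
        continue

print(results)  # [25.0, 20.0]
```

No errors, as promised. Two of the four inputs just vanished and nothing in the output tells you why.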