Thank you, I am fucking sick of people passing this comic around in relation to the CrowdStrike failure. CrowdStrike is a $90bn corporation, they're not some little guy doing a thankless task. They had all the resources and expertise required to avoid this happening, they just didn't give a shit. They want to move fast and break things, and that's exactly what they did.
Off topic, but that "move fast and break things" line from Zuck irks me quite a bit. Probably because it's such a bratty corporate billionaire thing to say.
It works in most software because failure is cheap. It's especially cheap if you can make that failure happen early in the development process. If anything, I think the industry should be leaning into this even harder. Iterate quickly and cause failures in the staging environment.
This does not work out so well for things like cars, rockets, and medicine. And, yes, software that runs goddamn everything.
That's an easy thing to say when you haven't laid off a ton of your workforce. I'd be careful operating like that, the way tech has been cutting jobs lately.
You're right that people should have high expectations of CrowdStrike since it's a well-funded company, and that they should provide better support to the random project with a single maintainer.
That said, is there any indication CrowdStrike is a "move fast and break things" company? Sometimes people just fuck up, even if they don't have a crazy ideology.
You want proof they move fast and break things? They pushed an untested software update through auto-update, with no phased rollout. How’s that for move fast? As for break things, well, do I need to explain?
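For anyone wondering what "rollout phases" would even look like, here's a rough Python sketch. To be clear, this is not how CrowdStrike or any specific vendor does it: the stage fractions, the crash-rate threshold, and the function names are all made up for illustration. The idea is just that each host hashes into a stable bucket, the update goes to a small slice first, and it only widens while that slice stays healthy.

```python
import hashlib

# Hypothetical stages: fraction of the fleet that receives the update at each phase.
STAGES = [0.01, 0.10, 0.50, 1.00]


def bucket(host_id: str) -> float:
    """Map a host ID to a stable value in [0, 1) by hashing it."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    # 48 bits fit exactly in a float, so the result is always strictly below 1.0.
    return int.from_bytes(digest[:6], "big") / 2**48


def should_update(host_id: str, stage: int) -> bool:
    """True if this host falls inside the cohort for the given stage index."""
    return bucket(host_id) < STAGES[stage]


def next_stage(stage: int, crash_rate: float, max_crash_rate: float = 0.001) -> int:
    """Advance one stage while the current cohort looks healthy; return -1 to halt."""
    if crash_rate > max_crash_rate:
        return -1  # stop the rollout and investigate
    return min(stage + 1, len(STAGES) - 1)
```

Because the bucket is stable per host, any machine in the 1% cohort is also in the 10% cohort, so the rollout only ever grows. If the first slice starts crashing, `next_stage` returns -1 and 99% of the fleet never sees the update.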
Q: We really appreciate everything you’ve shared. To finish up, what is one question you wish I’d asked and how would you have answered?
A: I’ll give you the fun one, which is, we know racing as part of CrowdStrike. Why is that? What does all that mean? It’s a couple of things. One, it’s part of CrowdStrike. Many have probably seen us. If they’ve watched Formula One or Netflix, we’re big sponsors there and we’re pretty active in the US as well. And I think it’s been a great platform for us to gather like-minded customers together to spend some time talking about security in the industry and also understanding that, to your original comment, speed is critical for security. Speed is critical in racing as well. And if you could combine great technology like Formula One and CrowdStrike and speed together, that’s a winning proposition and the details matter, right? If you take care of the details, the little stuff takes care of the big stuff. And that’s just part of our DNA. I think it’s [speed] has served us really well.
Yep, it's not the cave gremlin who codes cleanly and efficiently, using a tenth of the lines of code, that fucks it up. It's the bloated commercial software vendors that break their software every week.
...or it's the gremlin who tries to get by, but only has like 30min a week for his project, since he has a day job and two gremlettes to feed.
See the xz debacle.
The underlying problem is that there's no monetary value being assigned to good software. As long as it's good enough to sell and to buy insurance for, that's fine.
I doubt it. Few people are volunteering their time reading pull requests of random repos. It probably went fast from pull request to deployment, so there would be no time for anyone external to read.
The only thing open source would’ve done is to give us a faster explanation of why it happened after the fact.
Considering this is a cybersecurity product that requires installing a kernel mode driver on mission-critical hardware, I guarantee at least a few people would have been interested in looking at the source if they had the opportunity. Or tried to convince their employers purchasing the software to pay for a third-party audit.
The update that broke everything only pushed data, not code. The bug was extant in the software before the update, likely for years. Can I say for sure that a few extra eyes on the code would have found the problem ahead of time? No, of course not. But it couldn't have hurt.
Or during. With open source, independent fixes could have been created as people figured it out through trial and error. Additionally, something like this would have cost CrowdStrike a ton of trust, and we would see forks of their code to prevent this from happening again, so we'd now have multiple options. As it stands, we have nothing but promises that something like this won’t happen again, and no control over it without abandoning the entire product.
From what I understand, the driver was attempting to process update files without verifying their content first (in this case, a file containing all zeroes), so this issue would likely have been visible long before the catastrophe actually happened.
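To make "verifying the content first" concrete, here's a minimal Python sketch of the kind of sanity check I mean. Everything in it is an assumption for illustration: the `MAGIC` header, the `InvalidContentFile` exception, and the idea that an expected SHA-256 ships alongside the file are all hypothetical, not the real file format or the real driver logic.

```python
import hashlib

MAGIC = b"CHNL"  # hypothetical 4-byte header, not the real file format


class InvalidContentFile(ValueError):
    """Raised when an update file fails a sanity check."""


def validate_content(blob: bytes, expected_sha256: str) -> bytes:
    """Return the payload if the blob passes basic checks, otherwise raise."""
    if len(blob) <= len(MAGIC):
        raise InvalidContentFile("file is empty or truncated")
    if not any(blob):
        raise InvalidContentFile("file contains only zero bytes")
    if not blob.startswith(MAGIC):
        raise InvalidContentFile("bad magic header")
    if hashlib.sha256(blob).hexdigest() != expected_sha256:
        raise InvalidContentFile("checksum mismatch")
    return blob[len(MAGIC):]
```

The point is that a bad file gets rejected with an error the caller can handle (keep using the previous file, report it upstream) instead of being handed straight to the parser.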
Yes. Security through obscurity is an illusion. ClamAV is a well-known, high-performance open source AV solution.
Edit: Here is the CWE entry on the topic in case anybody wants to read some more details as to how and why obscurity is not a valid approach to security.
Strictly speaking, it's not anti-virus software. It's not designed to prevent malicious software from running or remove it. It's just monitoring for behavior that looks malicious so it can notify the system administrator and they can take manual action.
Most of the actual proprietary value, ironically enough, is in data files like the one that broke it. Those specify the patterns of behavior that the software is looking for. The software itself just reads those files and looks at the things they tell it to. But that's where the bug was: in the code that reads the files.
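As a toy illustration of "code that reads the files", here's a Python sketch of a defensive reader for a made-up pattern file. The layout (a 2-byte big-endian length followed by that many pattern bytes, repeated) is purely hypothetical and has nothing to do with the real format. It just shows the reader checking every length against the data it actually has before touching it.

```python
import struct


def parse_rules(payload: bytes) -> list[bytes]:
    """Parse records of the form [u16 big-endian length][pattern bytes]."""
    rules = []
    offset = 0
    while offset < len(payload):
        # Make sure the length prefix itself is fully present.
        if offset + 2 > len(payload):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">H", payload, offset)
        offset += 2
        # Reject empty patterns and lengths that run past the end of the data.
        if length == 0 or offset + length > len(payload):
            raise ValueError("pattern length out of bounds")
        rules.append(payload[offset:offset + length])
        offset += length
    return rules
```

In Python a bad index would raise an exception anyway. In kernel-mode C, the same missing check is an out-of-bounds read that takes the whole machine down, which is why the reader has to be paranoid about its input.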
If the security of your algorithm depends on the algorithm itself being secret, then it's not safe to distribute the software only in binary form either.
Anti-virus companies--when they do it right--have tightly controlled air-gapped systems that they use to load viruses and test countermeasures. It takes a lot of staff to keep those systems maintained before we even talk about the programming involved, plus making sure some idiot doesn't inadvertently connect those machines to the main building WiFi.