Git Blame exists for a reason, and that’s to find the engineer who pushed the bad commit so everyone can work together to fix it.
Blame the Project manager/Middle manager/C-Level exec/Unaware CEO/Greedy Shareholders who allowed for a CI/CD process that doesn’t allow ample time to test and validate changes.
Software needs a union. This shit is getting out of control.
Licensed professional engineers are expected to push back on requests that endanger the public and face legal liability if they don’t. Software has hit the point where failure is causing the economic damage of a bridge collapsing.
That’s not how any of this worked. Also not how working in a large team that develops for thousands of clients works. It wasn’t just one dev that fucked up here.
CrowdStrike Falcon uses a signed boot driver. They don’t want to wait for MS to get around to signing a driver if there’s a zero-day they’re trying to patch. So they have an empty driver with null pointers to the meat of a real boot driver. If you fat-finger a reg key, that file, containing only the 9C character, points to another null pointer in a different file, and you end up with a non-bootable system because the whole driver is now empty.
If you don’t understand what I just said here’s some folk that spent good time and effort to explain it.
Exactly. All of our code requires two reviews (one from a lead if it’s to a shared environment), and deploying to production also requires approval of 3 people:
project manager
product owner
quality assurance
And it gets jointly verified immediately after deploy by QA and customer support/product owner. If we want an exception to our deploy rules (low QA pass rate, deploy within business hours, someone important is on leave, etc), we need the director to sign off.
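For anyone who thinks better in code, here is a rough sketch of those rules as a deploy gate. It is only an illustration of the process described above, not our actual tooling: the `Change` class, its field names, and `can_deploy` are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# The three roles that must approve a production deploy (from the list above).
REQUIRED_APPROVERS = {"project manager", "product owner", "quality assurance"}


@dataclass
class Change:
    """Hypothetical record of a change waiting to go to production."""
    reviewers: Set[str]                 # people who reviewed the code
    lead_reviewed: bool                 # True if one of the reviews came from a lead
    shared_environment: bool            # True if the change targets a shared environment
    approvals: Set[str] = field(default_factory=set)  # roles that approved the deploy
    needs_exception: bool = False       # e.g. low QA pass rate, deploy within business hours
    director_signed_off: bool = False   # director approval for the exception


def can_deploy(change: Change) -> Tuple[bool, List[str]]:
    """Return (ok, problems); ok is True only when no rule is violated."""
    problems = []
    if len(change.reviewers) < 2:
        problems.append("needs two code reviews")
    if change.shared_environment and not change.lead_reviewed:
        problems.append("needs a review from a lead")
    missing = REQUIRED_APPROVERS - change.approvals
    if missing:
        problems.append("missing approval from: " + ", ".join(sorted(missing)))
    if change.needs_exception and not change.director_signed_off:
        problems.append("exception needs director sign-off")
    return (not problems, problems)
```

The point of returning the list of problems rather than a bare yes/no is that several independent people have to say yes, and you can see exactly whose sign-off is missing.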
We have <100 people total in the development org, probably closer to 50. We’re a relatively large company, but a relatively small tech team within a non-tech company (we manufacture stuff, and the SW is to support customers w/ our stuff).
I can’t imagine we’re too far outside the norms as far as big org deployments work. So that means that several people saw this change and decided it was fine. Or at least that’s what should happen with a multi-billion dollar company (much larger than ours).
No, separate groups. We basically have four separate, less-technical groups that are all involved in some way with the process of releasing stuff, and they all have their own motivations and whatnot:
PM - evaluated on consistency of releases, and keeping costs in line with expectations
PO - evaluated on delivering features customers want, and engagement with those features
QA - evaluated on bugs in production vs caught before release
support - evaluated on time to resolve customer complaints
devs - evaluated on reliability of estimates and consistency of work
PM, PO, and QA are involved in feature releases; PM, QA, and support are involved in hotfixes. Each tests in a staging environment before signing off, and tests again just after deploy.
It seems to work pretty well, and as a lead dev, I only need to interact with those groups at release and planning time. If I do my job properly, they’re all happy and releases are smooth (and they usually are). Each group has caught important issues, so I don’t think the redundancy is waste. The only overlap we have is our support lead has started contributing code changes (they cross-trained to FE dev), so they have another support member fill in when there’s a conflict of interest.
My industry has a pretty high cost for bad releases, since a high severity bug could cost customers millions per day, kind of like CrowdStrike, so I must assume they have a similar process for releases.
I do wonder how frequent it is that an individual developer will raise an important issue and be told by management it’s not an issue.
I know of at least one time when that’s happened to me. And other times where it’s just common knowledge that the central bureaucracy is so viscous that there’s no chance of getting such-and-such important thing addressed within the next 15 years. And so no one even bothers to raise the issue.
Issues have to be prioritized so teams don’t miss critical ones. Apparently they thought the risk of something like this happening was being mitigated elsewhere. Oops!
Reminds me of Microsoft’s response when one of their employees kept trying to get them to fix the vulnerability that ultimately led to the SolarWinds hack.
If capitalism insists on those higher up getting exorbitantly more money than those doing the work, then we have to hold them to the other thing they claim they believe in: that those higher up also deserve all the blame.
It’s a novel concept, I know. Leave the Nobels by the doormat, please.
How could one dev commit to prod without other devs reviewing the MR? If you’re not protecting your prod branch, that’s a cultural issue. I don’t know where you’ve worked in the past, or where you’re working now, but once it’s N+1 engineers in a code base there need to be code reviews.