When Good Scripts Go Bad

August 16, 2012

This week, an incident at Knight Capital saw its “trading algorithms” allegedly cost the firm hundreds of millions of dollars. Within hours of the story breaking, various camps were quick to denounce the algorithms and the automation that was supposed to save the day. Of course the problem is in automation, because automation is supposed to reduce errors and prevent outages and boost security and mitigate risks and improve the bottom line and make ice cream sundaes with a cherry on top. Except when it doesn’t, and now this latest mishap only adds to the argument against such automated practices.

The challenge is that automation does not reduce errors like some magical fairy dust. People have rallied around it without realizing some of the side effects, especially when the starting points of so many automated tasks may be severely flawed to begin with. Automation will not clean up your mess and, more importantly, it absolutely removes some of the safety controls that were once present in the manual and belabored efforts. Eliminating the small and frequent human touches, where small and frequent errors are injected, means that one single mistake may now be replicated through the whole system, and its magnitude is magnified manyfold. Fewer small problems, but boy, the one problem will likely be HUGE. Automation’s strengths are speed, scale and cost reduction. “Doing it wrong” is a human mistake, not a weakness of automation. The computer is literal; it does exactly what you command it to do. As a tool, automation must be used to increase coverage, as in QA, and not out of simple laziness, in the belief that the tools will manage themselves.

The duality of being great at engineering is that there is absolutely no recognition (sometimes not even a nod from the software team itself) for achieving a spectacular milestone. However, the Spanish Inquisition will assuredly arrive to find facts on behalf of the management team if things ever go awry. The level of ensuing finger-pointing is directly proportional to the level of publicity the mishap commands. Remember, by that time, it’s not really about identifying causality as much as channeling the embarrassment or anger onto some deflective path, so as to avoid the painful introspection (or revelation) about what truly went wrong. Identifying human errors takes effort, gumption and a heaping bowl of humility. The easier out is to offload the responsibility onto systems and processes, oh, especially automation, because that’s surely where the mistakes were made. If it were up to me, I’d still double down on automation. The alternative is not the answer.

Eddie is a technology enthusiast and a blogger, now, who loves all things Internet and mobile, as if those were two separate things. As part of feyn.com, he's looking to battle the forces of evil, fight crimes and purchase security upgrades to the Metaverse.