I used to work on software which was subject to the DO-178B Level A software development regulations (this was so long ago that it was before DO-178C came out), which are probably one of the biggest operational examples we have of real-world regulation of potentially life-endangering software systems. My impression of them, as a then-junior developer who went on to work on other high-reliability but unregulated systems, is that they were ~20% actually useful stuff, like:
-- stringent, high-coverage testing requirements
-- requiring that you actually write down a failure mode analysis and point to where you were mitigating each failure mode and have that document reviewed by someone
and ~80% bureaucratic CYA and well-intentioned sludge, like:
-- "traceability" requirements from code to multiple levels of documentation and back
-- reviewer "independence" requirements that made it almost impossible to find someone who both knew enough to review the code intelligently and was "independent" enough
-- quantitative fault probability analyses intended to prove that the chance of catastrophic failure was less than 10^-9, which in practice were exercises in making up numbers that were basically impossible to evaluate with any sort of epistemic rigor
Am I being too cynical about DO-178? Either way, can we learn useful things from its practical application history to apply to AI regulation?
Thanks for this comment; I agree, the best approach is to put controls at points where they have high impact. This is where I think OpenAI, if it wanted to, could really stand out: if the majority of controls implemented (maybe 70%, realistically?) are high-impact and the rest are tested, identified as not useful, and then removed or upgraded, they could improve on the usual ratio.
Thanks, I'll take a look and see if I have anything to add, and post something on LinkedIn too in case one of my old colleagues is inspired.
Dredging my old memories for examples of what actually seemed to improve safety in the earlier regulated environment: one particularly effective tactic was stress testing, i.e. throwing enormous numbers of weird or "hard" inputs at the program until you found a crash. AIUI the security folks use similar strategies for things like "fuzzing".
It's worth thinking about what that would entail in an AI context and how one could standardize a best-practice version. It seems like informally the current players are trying to do some fuzzing by releasing their products and letting people play with them to produce things like the recent "evil Bing" type outputs. One would like a more lab-controlled and repeatable version of this, with some reasonable defined process for checking when an output is problematic-- you might have to start by asking human raters "does this seem creepy and potentially dangerous to you?" but could perhaps train a narrow AI to do the job more scalably.
Fascinating. There’s a ground breaking movie, documentary perhaps, just waiting to be made here. My immediate thought: Is the human race an AIG gone rogue? Suddenly I’m thinking of a comedy/drama film, but one that could explore both hazards and great possibilities.
Great topic. Sure raises a lot of questions though. Humanity's experience with regulating complexity is mixed. I can think of four major initiatives in the last century where regulating things with very high risk association. The ones that come to mind are Finance, Light Water Reactors, WMD and Compartmentalizing National Security Information. Each could be a topic unto itself as it relates to AI, ML & AGI. You are in a great and interesting field.
Thanks, Mark! So many potential benefits and risks, and I am apprehensive about the somewhat nonlinear risk AI could pose, but giving up is not a good option--thus these posts to organize some thoughts and provide food for thought.
I agree. The breakdown of the parties and little compromise is very dangerous. The resulting rise of "deregulation good, regulation bad" is broadly accepted by so many we are in genuine danger of unintended consequences. The recent crashes of Boeing jets with (1) extensive software (2) largely self-regulating businesses rather than govt oversight. These were perfect examples of how indispensible government regulation and oversight are. AI/ML/AGI are factors of 1000 times more dangerous. It is difficult to imagine how leaving these decisions in the hands of private firms will not end badly. It is inconceivable how bad it would have been to let GE and Westinghouse decide somewhat independently what was a reasonable design and operation approach for a nuclear power plant. Even with broad oversight, VERY LARGE OMISSIONS and "I didn't think of that" occurred anyhow. Only the US Navy emerged as a successful self-governing entity. Not another single government or entity managed to get it right worldwide.
I used to work on software which was subject to the DO-178B Level A software development regulations (this was so long ago that it was before DO-178C came out), which are probably one of the biggest operational examples we have of real-world regulation of potentially life-endangering software systems. My impression of them, as a then-junior developer who went on to work on other high-reliability but unregulated systems, is that they were ~20% actually useful stuff, like:
-- stringent, high-coverage testing requirements
-- requiring that you actually write down a failure mode analysis and point to where you were mitigating each failure mode and have that document reviewed by someone
and ~80% bureaucratic CYA and well-intentioned sludge, like:
-- "traceability" requirements from code to multiple levels of documentation and back
-- reviewer "independence" requirements that made it almost impossible to find someone who both knew enough to review the code intelligently and was "independent" enough
-- quantitative fault probability analyses intended to prove that the chance of catastrophic failure was less than 10^-9, which in practice were exercises in making up numbers that were basically impossible to evaluate with any sort of epistemic rigor
Am I being too cynical about DO-178? Either way, can we learn useful things from its practical application history to apply to AI regulation?
Thanks for this comment; I agree, the best approach is to put controls at points where they have high impact. This is where I think OpenAI, if it wanted to, could really stand out: if the majority of controls implemented (maybe 70%, realistically?) are high-impact and the rest are tested, identified as not useful, and then removed or upgraded, they could improve on the usual ratio.
Hopefully OpenAI is also contributing to the new NIST AI risk framework, and you can too if you want, the draft Playbook is taking comments until February 27th: https://pages.nist.gov/AIRMF/ and overall contact info for their risk framework is here: https://www.nist.gov/itl/ai-risk-management-framework/ai-risk-management-framework-engage
Thanks, I'll take a look and see if I have anything to add, and post something on LinkedIn too in case one of my old colleagues is inspired.
Dredging my old memories for examples of what actually seemed to improve safety in the earlier regulated environment: one particularly effective tactic was stress testing, i.e. throwing enormous numbers of weird or "hard" inputs at the program until you found a crash. AIUI the security folks use similar strategies for things like "fuzzing".
It's worth thinking about what that would entail in an AI context and how one could standardize a best-practice version. It seems like informally the current players are trying to do some fuzzing by releasing their products and letting people play with them to produce things like the recent "evil Bing" type outputs. One would like a more lab-controlled and repeatable version of this, with some reasonable defined process for checking when an output is problematic-- you might have to start by asking human raters "does this seem creepy and potentially dangerous to you?" but could perhaps train a narrow AI to do the job more scalably.
Fascinating. There’s a ground breaking movie, documentary perhaps, just waiting to be made here. My immediate thought: Is the human race an AIG gone rogue? Suddenly I’m thinking of a comedy/drama film, but one that could explore both hazards and great possibilities.
That would be an interesting film or play!
Great topic. Sure raises a lot of questions though. Humanity's experience with regulating complexity is mixed. I can think of four major initiatives in the last century where regulating things with very high risk association. The ones that come to mind are Finance, Light Water Reactors, WMD and Compartmentalizing National Security Information. Each could be a topic unto itself as it relates to AI, ML & AGI. You are in a great and interesting field.
Thanks, Mark! So many potential benefits and risks, and I am apprehensive about the somewhat nonlinear risk AI could pose, but giving up is not a good option--thus these posts to organize some thoughts and provide food for thought.
I agree. The breakdown of the parties and little compromise is very dangerous. The resulting rise of "deregulation good, regulation bad" is broadly accepted by so many we are in genuine danger of unintended consequences. The recent crashes of Boeing jets with (1) extensive software (2) largely self-regulating businesses rather than govt oversight. These were perfect examples of how indispensible government regulation and oversight are. AI/ML/AGI are factors of 1000 times more dangerous. It is difficult to imagine how leaving these decisions in the hands of private firms will not end badly. It is inconceivable how bad it would have been to let GE and Westinghouse decide somewhat independently what was a reasonable design and operation approach for a nuclear power plant. Even with broad oversight, VERY LARGE OMISSIONS and "I didn't think of that" occurred anyhow. Only the US Navy emerged as a successful self-governing entity. Not another single government or entity managed to get it right worldwide.