You can ask a human subtle oblique questions to see how they reached a decision. You can't do that with an ML system.
You can query a well-designed ML system, too. You can probe what led to its conclusions, and probably get more meaningful output than from the average pleb.
Reference, please. Or are you relying on "well designed" as weasel words?
Most ML systems are split into a training system and a much simpler runtime system. The runtimes are kept as basic as possible, but the systems on which the learning takes place generally have pretty flexible facilities for getting an explanation for a decision.
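For instance, a minimal sketch of the kind of probing that is possible on the training side, assuming a scikit-learn style setup with synthetic data (the specifics are illustrative, not a claim about any particular deployed system):

    # Sketch: ask a trained model which inputs actually drove its decisions.
    # Synthetic data; permutation importance is one of the simpler probes available.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Shuffle one feature at a time and measure how much accuracy drops:
    # a crude but queryable "why" for the model's behaviour.
    result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    for i, score in enumerate(result.importances_mean):
        print(f"feature {i}: importance {score:.3f}")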
Please provide references about those "facilities for getting an explanation". Fundamentally it is a serious research topic, with no clear resolution on the horizon. It will be a fruitful source of PhDs and research grants for decades.
Even where the rules were explicitly coded (i.e. 1980s old-skool AI), in practice it was difficult to determine why a decision was made, and then to modify the rules to make the desired change
and no other. In modern ML systems there are no explicitly coded rules, so even that doesn't work.
I'm sure ML systems are structured that way: the training system determines the weights and interconnections, and the deployed runtime executes them. Hence, as you can see, that distinction is irrelevant to the points I've been making.
The second reason it is irrelevant is that deployed systems don't necessarily retain all the information that caused them to make a decision, so their decision making process can't be "replayed" back in the lab.
With people you can probe the mental model they have of the problem. I accept sometimes it will be just "mental", but that is in itself an adequate result!
You cannot even manage that with ML systems since they do not have an identifiable mental model per se; they just have neurons and weighting factors.
ML systems do "forget". All it takes is:
- you spot a problem in an ML system's output
- you apply more training examples in the hope they will reconfigure some of the pathways and weights
- you have no way of knowing how the new pathways/weights will change previously correct output. That's a real problem (see the sketch after this list)
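A sketch of the regression check this forces on you; the names here (retrain, golden_cases, etc.) are hypothetical placeholders rather than any particular library's API:

    # Sketch: after retraining on new examples, verify that outputs the old model
    # got right have not silently changed. Purely illustrative.
    def check_regressions(model, golden_cases):
        """golden_cases: list of (inputs, expected_output) pairs the old model got right."""
        regressions = []
        for inputs, expected in golden_cases:
            actual = model.predict([inputs])[0]
            if actual != expected:
                regressions.append((inputs, expected, actual))
        return regressions

    # Usage: retrain, then refuse to ship if anything that used to work now fails.
    # new_model = retrain(old_model, extra_examples)        # hypothetical retraining step
    # broken = check_regressions(new_model, golden_cases)
    # assert not broken, f"{len(broken)} previously correct cases now fail"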
We don't usually classify an update in thinking based on new information as forgetting. You can't control a human's learning updates, but it's easy to lock down an ML solution if you want to.
In practice you can't lock down an ML system, because there will always be the requirement to remove newly-discovered edge cases. And there will always be newly-discovered edge cases.
Most ML systems can't be trained, as they only perform the runtime aspects of the problem. So any updates come from the learning system, and the transfer of updates from there to the numerous runtime systems can be as orderly or chaotic as you make it (one orderly approach is sketched below).
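One way to keep the transfer orderly is to pin each runtime to a specific, checksummed model artifact, so nothing changes until someone decides it should. A rough sketch, with the file names and manifest format invented for illustration:

    # Sketch: a runtime that only loads the exact model version it was pinned to.
    import hashlib
    import json

    def load_pinned_model(manifest_path="deploy_manifest.json"):
        with open(manifest_path) as f:
            manifest = json.load(f)   # e.g. {"model": "model_v42.bin", "sha256": "..."}
        with open(manifest["model"], "rb") as f:
            blob = f.read()
        if hashlib.sha256(blob).hexdigest() != manifest["sha256"]:
            raise RuntimeError("model artifact does not match the pinned checksum")
        return blob  # hand the verified weights to whatever inference engine is in use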
ML systems, in the modern meaning of the words, are always "trained by rote" on many, many individual examples. Old-skool AI systems were "taught by general rules".
Over-the-air updates are already a problem with driverless cars, because when you get in a car you can't rely on it behaving the same way that it did yesterday.
That is a management problem. It's a pretty dumb process that pushes updates randomly, and surprises a driver in the middle of a journey with new behaviour, whether it's an ML behaviour or something altered in the car's UI.
No, it is a technical problem and a user problem.
Management ought to remove/avoid such problems, but in practice they are only too eager to turn a blind eye.
Have you - like most young software "engineers" - forgotten that "You can't test quality into a product"? When you put that to people creating ML systems based on training sets (i.e. all of them), first they pull a face, then they go "la-la-la-la-la".
You can't test quality into a product, but you also can't build a complex product to be perfect. That always defeats human capabilities. Even the best-engineered systems do things that surprise their designers, and take years to fully shake out. There is plenty of denial about the vast amount of work it will take to get from, say, a driverless car that needs one or two human interventions per journey to one that might be no more dangerous left on its own than the average human. People find it hard to face the reality of just how difficult awkward cases are compared to more straightforward ones. They are the sort who can't accept just how hard it would be to properly automate the office cleaner's job, and don't understand that the office cleaner is most likely to be laid off because everyone else in the office has been eliminated from their jobs.
Quite right: the first 50% is easy, the last 20% extremely difficult. 50% is acceptable for WarCraft character generation, but
is completely inadequate for important irreversible decisions (e.g. judicial imprisonment, medical diagnosis/treatment, autonomous vehicles, etc).
You are making my points for me. Thanks.
Do us all a favour, and subscribe to comp.risks. It is a low-volume, high-quality curated information source - with a 40-year pedigree to prove it!
EDIT: the RSS feed is
http://catless.ncl.ac.uk/risksrss2.xml but there is also a usenet feed, and I expect you can get it by email (about 2 issues per week).