Note to Self: No, Apple! No Potty Mouth, Please!: A number of the biases in voice recognition systems come from the initial training dataset. Senior Google employees have claimed to me (how serious they were I do not know) that Gmail autocomplete's extraordinary! love! for! exclamation! points! comes from its use of Google engineers as its initial training dataset.

Today I am disturbed that Apple voice recognition keeps hearing “slut” when I say “slack”. What training dataset produces that? From my perspective, Apple voice recognition needs to acquire much less of a potty mouth, or at least to have a potty-mouth-off switch, for it to be useful to me. Someday it is going to do something, and I am not going to catch it...

