Tolga Bolukbasi and colleagues recently posted an article about bias in what is learned with word2vec, trained on the standard Google News crawl (h/t Jack Clark). Essentially what they found is that word embeddings reflect stereotypes regarding gender (for instance, "nurse" is closer to "she" than to "he", and "hero" is the reverse) and race ("black male" is closest to "assaulted" and "white male" to "entitled"). This is not hugely surprising, and it's nice to see it confirmed. The authors additionally present a method for removing those stereotypes with no cost (as measured by analogy tasks) to the accuracy of the embeddings.
There have been a handful of reactions to this work, some questioning the core motivation, essentially variants of "if there are biases in the data, they're there for a reason, and removing them is removing important information." The authors give a nice example in the paper (web search: two identical web pages about CS, one mentioning "John" and the other "Mary"; a query for "computer science" ranks the "John" page higher because of the embeddings), which appeals to a not-universally-held belief that this is bad. This also shows up on Twitter, in embeddings related to hate speech.
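To make the "closer to 'she' than 'he'" claim concrete, here is a minimal sketch of how one might probe pretrained word2vec vectors with gensim and then project out a single he-minus-she direction with numpy. The file name, the probe words, and the one-direction projection are my assumptions for illustration; the paper's actual debiasing procedure (identifying a gender subspace and then neutralizing/equalizing) is more involved than this.

```python
# Sketch only, not the authors' method: probe similarity asymmetries in
# pretrained word2vec vectors and project out a crude "he - she" direction.
# Assumes you have downloaded the pretrained Google News vectors to the
# path below (an assumption; adjust to wherever you saved them).
import numpy as np
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

# The kind of asymmetry the paper reports: similarity to "she" vs. "he".
for word in ["nurse", "hero"]:
    print(word, kv.similarity(word, "she"), kv.similarity(word, "he"))

# A toy version of debiasing: remove the component of a word vector that
# lies along the he-she direction. Expect the she/he gap to shrink.
gender_dir = kv["he"] - kv["she"]
gender_dir /= np.linalg.norm(gender_dir)

def debias(vec, direction=gender_dir):
    """Project out the (unit-norm) bias direction from a word vector."""
    return vec - np.dot(vec, direction) * direction

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

nurse_db = debias(kv["nurse"])
print("after projection:", cos(nurse_db, kv["she"]), cos(nurse_db, kv["he"]))
```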