TY - GEN
T1 - Quantifying and Debiasing Gender Bias in Japanese Gender-specific Words with Word Embedding
AU - Chen, Leisi
AU - Sugimoto, Toru
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Machine learning plays a significant role in modern life. However, the biases and stereotypes embedded in machine learning models have also drawn researchers' attention. Word2Vec, a popular NLP framework that encodes a word's meaning as a real-valued vector, has been used in many machine learning and natural language processing tasks, yet it has also been shown to contain severe biases against women. In this paper, we used Word2Vec to analyze the relationship between gender-specific words and personality adjectives in Japanese in order to uncover the latent gender bias in those gender-specific words. We first found that, in a Word2Vec model trained on Japanese Wikipedia data, some gender-specific occupation words are strongly associated with negative personality adjectives. The experimental results suggest that people commonly use these gender-specific words to criticize women in these occupations. We then eliminated the projection of the personality adjectives' word vectors onto the gender subspace, which reduced the association between negative personality adjectives and gender-specific words through word vector calculation.
KW - Machine Learning
KW - NLP
KW - Word2Vec
KW - gender bias
UR - http://www.scopus.com/inward/record.url?scp=85146687581&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146687581&partnerID=8YFLogxK
U2 - 10.1109/SCISISIS55246.2022.10001950
DO - 10.1109/SCISISIS55246.2022.10001950
M3 - Conference contribution
AN - SCOPUS:85146687581
T3 - 2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2022
BT - 2022 Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - Joint 12th International Conference on Soft Computing and Intelligent Systems and 23rd International Symposium on Advanced Intelligent Systems, SCIS and ISIS 2022
Y2 - 29 November 2022 through 2 December 2022
ER -