A tool aimed at measuring gender bias on Twitter has trouble figuring out users' identities.
Twee-Q, which just launched a U.K. version after starting last summer in Sweden, is an attempt to figure out how often people retweet women compared to how often they retweet men. The goal: to highlight gender bias.
Picking out the gender of a Twitter user isn't straightforward. Twitter does not ask you to specify your gender, so Twee-Q relies on a database of names to determine each user's gender. The database draws on names from Statistics Sweden and U.S. Census data.
That can lead to misidentification, especially when it comes to unisex names.
"twee-q.com automatically assigns me as a boy because of my name. oh the horrible irony!" - andrew mole (@catbeef), May 13, 2013
"Ruining your Twee-Q score with my 2 male character parody accounts & Gender Neutral Twitter Name. Take that, you well-meaning app!" - Scriblit (@Scriblit), May 13, 2013
Here's the skinny: When a unisex name pops up, Twee-Q chooses the gender that's most common.
Because Twee-Q originated in Sweden, there are certain regional quirks that skew the popularity of each name. For instance, the name Jan is likely more commonly female in the U.S. (think Jan Brady) whereas the opposite is true in much of Europe (think Jan Magnussen). On Twee-Q, Jan is male. (Tough luck if your female friends are all named Jan and you retweet them and only them.)
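For the curious, here is a rough sketch of how a most-common-gender rule like that could work. This is not Twee-Q's actual code: the `NAME_COUNTS` table, its numbers, and the `guess_gender` function are all made up for illustration.

```python
from typing import Optional

# Hypothetical counts, invented for illustration only. These are not
# real Statistics Sweden or U.S. Census figures.
NAME_COUNTS = {
    "jan": {"male": 900, "female": 100},
    "anna": {"male": 1, "female": 999},
}


def guess_gender(first_name: str) -> Optional[str]:
    """Return the more common gender for a name, or None if it is unknown."""
    counts = NAME_COUNTS.get(first_name.strip().lower())
    if counts is None:
        # Brands, pseudonyms and anything else not in the table.
        return None
    # Majority rule: a unisex name is always assigned its most common gender.
    return max(counts, key=counts.get)
```

With these invented counts, `guess_gender("Jan")` comes back as `"male"` no matter who is behind the account, and a handle like `"Scriblit"` comes back as `None`.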
Also, Twee-Q does not index retweets of brands or any other accounts where the username does not match up with names in its database. That should cut out many nonsense results.
However, omitting those accounts skews the findings in its own way. If you were to retweet 99 women who use pseudonyms that do not appear in the database and one man who uses his real name, it would appear that you have a strong bias toward retweeting men, which wouldn't be true.
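The arithmetic behind that scenario is easy to check. The snippet below is a minimal sketch, assuming unmatched accounts are simply dropped before the shares are calculated; `retweet_shares` is a hypothetical helper, not part of Twee-Q.

```python
from typing import Dict, List, Optional


def retweet_shares(genders: List[Optional[str]]) -> Dict[str, float]:
    """Share of retweets per gender, ignoring accounts with no name match."""
    # None stands for an account whose name was not found in the database.
    matched = [g for g in genders if g is not None]
    if not matched:
        return {}
    return {g: matched.count(g) / len(matched) for g in set(matched)}


# 99 retweets of pseudonymous accounts (unmatched) and one matched man.
retweets = [None] * 99 + ["male"]
print(retweet_shares(retweets))  # {'male': 1.0}
```

Under that assumption, one matched retweet out of 100 is enough to produce a score of 100 percent men.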
Further muddying the results, the tool doesn't account for those who identify as neither male nor female.
While Twee-Q is certainly an admirable stab at highlighting gender bias, it's already running at a handicap thanks to the difficulty in establishing users' identities.
H/T The Independent | Illustration by Fernando Alfonso III