Statistics for fun - The Motive Center
June 28th, 2006
11:42 pm

Statistics for fun
I recently got into a discussion of statistics on two mystery-related blogs: Murderati and The Lipstick Chronicles. The newly founded International Thriller Writers has run into some trouble with its new awards, with fifteen nominees in three categories -- all of whom are men. I'm capturing some of what I said here, and it's germane both because of how people feel about writing, and because I have a bee in my bonnet about statistics, which I think should be taught in schools. Grade schools.

Many people are assuming that this is bias; in one case it was made pretty clear that it was those evil menfolk conspiring against women everywhere. Here's the odd thing about that theory, however: the team of judges were (deliberately) roughly half women, and when they discovered that a disproportionate percentage of the publishers' nominations were by men, they made an effort to find more thrillers by women, including going to the publishers and asking for them! Ultimately, 29% of the books judged were by women, though none made it to the final nomination list.

I was intrigued by this and ran a chi-square test, which, briefly, compares the expected values with the actual values. That is: if 29% of the books in the pool were by women, and the women's books averaged the same quality as the men's, then about 29% of the nominees should have been women -- an expected 4.35 out of 15, instead of the 0 out of 15 we actually saw. I thought this would be borderline significant. Imagine my surprise to find it was strikingly significant -- a p-value of 0.013, meaning that if chance alone were at work, a result this lopsided would turn up only about 1.3% of the time.
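
For anyone who wants to check the arithmetic, here is a minimal sketch of that test in Python. It assumes a simple one-way (goodness-of-fit) chi-square on the two counts, women and men, with one degree of freedom; the function names are made up for illustration. It also prints the exact probability of drawing zero women in 15 independent picks, which is a side check added here and not part of the original calculation.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail probability for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

nominees = 15
share_women = 0.29  # share of judged books written by women (from the post)

expected = [nominees * share_women, nominees * (1 - share_women)]  # 4.35 and 10.65
observed = [0, 15]                                                 # women, men

chi2 = chi_square_gof(observed, expected)
p = p_value_df1(chi2)

# Side check (an assumption of this sketch): exact chance of zero women
# in 15 independent picks when each pick is a woman with probability 0.29.
exact_zero_women = (1 - share_women) ** nominees

print(f"chi-square = {chi2:.2f}, p = {p:.3f}")        # roughly 6.13 and 0.013
print(f"exact P(no women) = {exact_zero_women:.4f}")  # roughly 0.006
```

With an expected count below 5, the chi-square figure is only an approximation, which is why the exact number comes out somewhat smaller. Either way, the conclusion is the same: unlikely, but not impossible.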



What it does NOT mean is that "the judges were prejudiced." Here's all it means:

"It is highly unlikely -- but not impossible -- that this result came about by chance." This in turn suggests:

"It would be fruitful to seek out causes of this result."

You're probably thinking "well, duh!" But that's not the way most people intuitively respond to a "statistic." There's no magic, no oracle here. ALL it says is that this is an unlikely result. It says nothing about why.

Bias is a reasonable hypothesis, but it need not be outright prejudice. There are several possible sources of bias, all of which could add up to what we see here. For example:

1. Perhaps fewer women write novels defined as "thrillers."
2. Perhaps works written by women tend not to be defined as thrillers, but go into another category.
3. Perhaps some women don't define their own work as thrillers, and therefore didn't consider this (e.g., some romantic suspense writers are more accustomed to being linked to romance)
4. Perhaps publishers tend to buy fewer books by women in the "thriller" category
5. Perhaps it does not occur to these publishers to nominate books by women, either because of some bias against them or simply because they don't define those books that way.
6. Perhaps it was a bad year!

Even minor bias can add up. Let's assume the best case: that half of thriller writers are men and half are women. Then let's assume there is only a slight societal bias at each step -- say, enough to widen the gap between men and women by 10 percentage points. In other words, some unconscious bias, largely but not entirely overcome by conscious thought. Here's how it would play out, simplifying the numbers for ease of use:

1. Books submitted by men and women: 50% men, 50% women
2. Books accepted by agents (10% bias in play): 55% men, 45% women
3. Books accepted by publishers (10% again): 60% men, 40% women
4. Books chosen to be marketed as thrillers (10%): 65% men, 35% women
5. Books nominated for ITW (10%): 70% men, 30% women

Which is almost exactly the split we saw in the books the judges considered (29% by women).
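
The stage-by-stage arithmetic above can be sketched in a few lines of Python. This assumes the simplified model used in the list, where each step moves five percentage points from women to men (a 10-point gap per step); the function name and stage labels are just for illustration.

```python
def compound_split(stages, shift_points=5, start_men=50):
    """Return (stage, % men, % women) rows, shifting the split at every stage."""
    men = start_men
    rows = [("Books submitted", men, 100 - men)]
    for name in stages:
        men += shift_points  # each stage tilts the split a little further
        rows.append((name, men, 100 - men))
    return rows

stages = [
    "Accepted by agents",
    "Accepted by publishers",
    "Marketed as thrillers",
    "Nominated for ITW",
]

rows = compound_split(stages)
for name, men, women in rows:
    print(f"{name}: {men}% men, {women}% women")
```

No single step looks dramatic, but four small tilts in a row turn 50/50 into 70/30.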

I believe the judges had no intention of producing this lopsided result -- and indeed there is evidence that they worked very hard to avoid it! And while 1.3% is small, it is still more than zero -- it could still be random chance. But I think it does suggest that we should be alert to all the possibilities of bias, which is why things like this need to be discussed -- not to demonize the judges, who are at the end of the process, but to find ways to reduce and ultimately eliminate the bias. It is unfortunate that the ITW, which clearly made an effort to be inclusive in its selection of judges and subgenres, has failed to demonstrate that in its nominations -- for whatever reason.

I do have a hypothesis, as it happens. It's complicated. I think that women are not as well represented under the "thriller" category, so there were fewer to choose from, which meant that when they sought out more women's books, they got a wider range -- in other words, the average was lower. Then, when they had to get it down to a mere five per category, none of them had enough votes to get on the list. I'd bet some did get close. Also, some of the judges were among the best thriller writers, and I think you were disqualified if you were a judge -- so if there were fewer women to begin with, you're decreasing the pool even more.

I expect next year it will be different. At least, I hope so!

Current Music: Mother Goose


Comments
 
From: albionwood
Date: July 2nd, 2006 10:29 pm (UTC)
I'm with you on statistics. I avoided the subject throughout my academic career and am now having to catch up; it would have been easier when I was 20-something. Now I'm seeing how important a basic understanding of probability is. It's not intuitive and most people never learn to think in those terms, so they make the kinds of mistakes you allude to.

This confuses me: "Then, when they had to get it down to a mere five per category, none of them had enough votes to get on the list. I'd bet some did." Some did what?

From: stevekelner
Date: July 2nd, 2006 11:47 pm (UTC)
Stats
I didn't "get" statistics until my third course in it, while getting my master's. It just suddenly clicked for me, probably because it is typically taught by statisticians and mathematicians who get it intuitively, and it took that long for my brain to process it. Nowadays, it's second nature to question certain numbers. I strongly recommend Darrell Huff's How to Lie with Statistics as a great introduction to how to think with and about statistics -- instead of just doing them, which is what most courses teach you.

And now I don't even know what I meant by that phrase anymore. I think maybe I should have said "get close" after that. I'll fix.

From: albionwood
Date: July 4th, 2006 03:34 am (UTC)
Re: Stats
I'll look for the How to Lie book.

I was forced to learn the subject because regulations require statistical analysis of water quality data. Unfortunately the regs are badly written and require things that aren't well justified by the nature of the data... in any case I had to come up to speed in order to understand what the regs meant and what was actually justifiable. Now I'm in danger of becoming a statistics geek. (Did you know the Challenger disaster was in large part caused by failure to properly handle censored data?)

From: stevekelner
Date: July 4th, 2006 04:37 am (UTC)
Re: Stats
No! What's the story on that? Censored data? Does that mean they didn't have the data they needed because it was censored?

From: albionwood
Date: July 4th, 2006 04:51 pm (UTC)
Re: Stats
"Censored" in statistics means data that are known only to be above or below some threshold value. In my field, the threshold values are laboratory limits of detection, which "censor" the numbers below the limits. Often, a large proportion of the data are below the detection limit. These limits are based on technical processes, not health risks - so the censored data might be important, yet conventional statistics cannot make use of them.

In the Challenger case, engineers were concerned about the possible failure of O-rings at low temperatures. They had done a number of tests at different temperature ranges, and the night before the launch, they sent a graph to managers in an attempt to convince them to scrub. The graph showed the number of damage incidents as a function of temperature, but it had a major flaw: tests where no damage had occurred were not plotted. (This is very common. In fact, it is the usual way people treat censored data: they exclude all the data below the threshold, because it isn't amenable to conventional analysis.) No data were available for the very low temperatures expected at launch time, because no testing had been done at those temps.

The graph showed only a few points, with no apparent pattern. The Rogers Commission produced a graph, constructed the same way as the original but with the non-detect values added, indicating a pattern of increasing failures with decreasing temperatures. A graph showing the proportion of failures by temperature range makes it strikingly obvious that low temperatures presented a greatly increased risk of failure.

(This is condensed from the introduction to Dennis Helsel's extremely useful book, "Nondetects and Data Analysis," Wiley Interscience 2005. It's the most powerful introduction to a book I've ever read.)
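
To see why leaving out the "no damage" tests hides the pattern, here is a small Python sketch. The numbers are invented purely for illustration and are NOT the actual O-ring test data; the 65-degree cutoff and the function name are also assumptions of the sketch. It compares the share of tests showing damage in a cold band and a warm band, first with every test included and then with the zero-damage tests dropped.

```python
# Invented (temperature in deg F, damage incidents) pairs.
# Illustration only: this is NOT the real Challenger test data.
tests = [
    (50, 2), (55, 1), (60, 1), (60, 0),
    (65, 0), (65, 1), (70, 0), (70, 0), (70, 1),
    (75, 0), (75, 0), (75, 1), (80, 0), (80, 0), (80, 0),
]

def damage_rate(records, low, high):
    """Share of tests with temperature in [low, high) that showed any damage."""
    in_band = [n for t, n in records if low <= t < high]
    return sum(1 for n in in_band if n > 0) / len(in_band)

# What the flawed graph effectively used: only tests where damage occurred.
damaged_only = [(t, n) for t, n in tests if n > 0]

for label, data in (("all tests", tests), ("damage only", damaged_only)):
    cold = damage_rate(data, 50, 65)
    warm = damage_rate(data, 65, 85)
    print(f"{label}: cold {cold:.0%}, warm {warm:.0%}")
```

With every test included, the cold band shows damage far more often than the warm band. With the zero-damage tests dropped, both bands show damage 100% of the time, so the comparison tells you nothing, and the remaining points are spread across the whole temperature range with no obvious trend.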

From: albionwood
Date: July 4th, 2006 05:09 pm (UTC)
A couple of observations on the original post:

1. The effect of taking out some of the best women thriller authors might be very significant. You could run some more statistics to quantify the effect, if you have some idea of the numbers involved (how many thriller writers are there, how many are women, how many were judges). The smaller the number of women thriller writers, the greater the effect of removing even a small number at the upper end of the curve.

2. Next year will undoubtedly be different, if only because there will be an increased bias to select women authors in order to avoid repeating the controversy.

3. Women own the strongest-selling genre in fiction today (romance), so why is it so bad if a smaller market is still male-dominated? Never mind. ;-)