stefan's blurblog

Did blind orchestra auditions really benefit women? by Andrew
Saturday May 11th, 2019 at 4:49 PM

Statistical Modeling, Causal Inference, and Social Science

You’re blind!
And you can’t see
You need to wear some glasses
Like D.M.C.

Someone pointed me to this post, “Orchestrating false beliefs about gender discrimination,” by Jonatan Pallesen criticizing a famous paper from 2000, “Orchestrating Impartiality: The Impact of ‘Blind’ Auditions on Female Musicians,” by Claudia Goldin and Cecilia Rouse.

We’ve all heard the story. Here it is, for example, retold in a news article from 2013 that Pallesen links to and which I also found on the internet by googling *blind orchestra auditions*:

In the 1970s and 1980s, orchestras began using blind auditions. Candidates are situated on a stage behind a screen to play for a jury that cannot see them. In some orchestras, blind auditions are used just for the preliminary selection while others use it all the way to the end, until a hiring decision is made.

Even when the screen is only used for the preliminary round, it has a powerful impact; researchers have determined that this step alone makes it 50% more likely that a woman will advance to the finals. And the screen has also been demonstrated to be the source of a surge in the number of women being offered positions.

That’s what I remembered. But Pallesen tells a completely different story:

I have not once heard anything skeptical said about that study, and it is published in a fine journal. So one would think it is a solid result. But let’s try to look into the paper. . . .

Table 4 presents the first results comparing success in blind auditions vs non-blind auditions. . . . this table unambiguously shows that men are doing comparatively better in blind auditions than in non-blind auditions. The exact opposite of what is claimed.

Now, of course this measure could be confounded. It is possible that the group of people who apply to blind auditions is not identical to the group of people who apply to non-blind auditions. . . .

There is some data in which the same people have applied to both orchestras using blind auditions and orchestras using non-blind auditions, which is presented in table 5 . . . However, it is highly doubtful that we can conclude anything from this table. The sample sizes are small, and the proportions vary wildly . . .

In the next table they instead address the issue by regression analysis. Here they can include covariates such as number of auditions attended, year, etc, hopefully correcting for the sample composition problems mentioned above. . . . This is a somewhat complicated regression table. Again the values fluctuate wildly, with the proportion of women advanced in blind auditions being higher in the finals, and the proportion of men advanced being higher in the semifinals. . . . in conclusion, this study presents no statistically significant evidence that blind auditions increase the chances of female applicants. In my reading, the unadjusted results seem to weakly indicate the opposite, that male applicants have a slightly increased chance in blind auditions; but this advantage disappears with controls.

Hmmm . . . OK, we better go back to the original published article. I notice two things from the conclusion.

First, some equivocal results:

The question is whether hard evidence can support an impact of discrimination on hiring. Our analysis of the audition and roster data indicates that it can, although we mention various caveats before we summarize the reasons. Even though our sample size is large, we identify the coefficients of interest from a much smaller sample. Some of our coefficients of interest, therefore, do not pass standard tests of statistical significance and there is, in addition, one persistent result that goes in the opposite direction. The weight of the evidence, however, is what we find most persuasive and what we have emphasized. The point estimates, moreover, are almost all economically significant.

This is not very impressive at all. Some fine words but the punchline seems to be that the data are too noisy to form any strong conclusions. And the bit about the point estimates being “economically significant” doesn’t mean anything at all. That’s just what you get with a small sample and noisy data: noisy estimates, which can easily yield big numbers.

But then there’s this:

Using the audition data, we find that the screen increases—by 50 percent—the probability that a woman will be advanced from certain preliminary rounds and increases by severalfold the likelihood that a woman will be selected in the final round.

That’s that 50% we’ve been hearing about. I didn’t see it in Pallesen’s post. So let’s look for it in the Goldin and Rouse paper. It’s gotta be in the audition data somewhere . . . Also let’s look for the “increases by severalfold”—that’s even more, now we’re talking effects of hundreds of percent.

The audition data are described on page 734:

We turn now to the effect of the screen on the actual hire and estimate the likelihood an individual is hired out of the initial audition pool. . . . The definition we have chosen is that a blind audition contains all rounds that use the screen. In using this definition, we compare auditions that are completely blind with those that do not use the screen at all or use it for the early rounds only. . . . The impact of completely blind auditions on the likelihood of a woman’s being hired is given in Table 9 . . . The impact of the screen is positive and large in magnitude, but only when there is no semifinal round. Women are about 5 percentage points more likely to be hired than are men in a completely blind audition, although the effect is not statistically significant. The effect is nil, however, when there is a semifinal round, perhaps as a result of the unusual effects of the semifinal round.

That last bit seems like a forking path, but let’s not worry about that. My real question is, Where’s that “50 percent” that everybody’s talkin bout?

Later there’s this:

The coefficient on blind [in Table 10] in column (1) is positive, although not significant at any usual level of confidence. The estimates in column (2) are positive and equally large in magnitude to those in column (1). Further, these estimates show that the existence of any blind round makes a difference and that a completely blind process has a somewhat larger effect (albeit with a large standard error).

Huh? Nothing’s statistically significant but the estimates “show that the existence of any blind round makes a difference”? I might well be missing something here. In any case, you shouldn’t be running around making a big deal about point estimates when the standard errors are so large. I don’t hold it against the authors—this was 2000, after all, the stone age in our understanding of statistical errors. But from a modern perspective we can see the problem.

Here’s another similar statement:

The impact for all rounds [columns (5) and (6)] [of Table 9] is about 1 percentage point, although the standard errors are large and thus the effect is not statistically significant. Given that the probability of winning an audition is less than 3 percent, we would need more data than we currently have to estimate a statistically significant effect, and even a 1-percentage-point increase is large, as we later demonstrate.

I think they’re talking about the estimates of 0.011 +/- 0.013 and 0.006 +/- 0.013. To say that “the impact . . . is about 1 percentage point” . . . that’s not right. The point here is not to pick on the authors for doing what everybody used to do, 20 years ago, but just to emphasize that we can’t really trust these numbers.
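To make the arithmetic concrete, here is a minimal sketch, assuming the "+/-" figures quoted above are standard errors (an assumption, not something confirmed in the text). The function name `summarize` and the 1.96 multiplier for an approximate 95% interval are illustrative choices, not anything taken from the paper.

```python
# Assumption: each pair is (point estimate, standard error), read off the
# "0.011 +/- 0.013" and "0.006 +/- 0.013" figures quoted above.
estimates = [(0.011, 0.013), (0.006, 0.013)]


def summarize(estimate, std_error, z_crit=1.96):
    """Return the z-ratio and an approximate 95% interval for one coefficient."""
    z = estimate / std_error
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    return z, lower, upper


results = [summarize(est, se) for est, se in estimates]

for (est, se), (z, lower, upper) in zip(estimates, results):
    print(
        f"estimate {est:+.3f}, s.e. {se:.3f}: z = {z:.2f}, "
        f"approx. 95% interval ({lower:+.3f}, {upper:+.3f})"
    )
```

Under that assumption, both z-ratios are below 1 and both intervals run from below zero to above 3 percentage points. Against a baseline win probability of under 3 percent, that range is far too wide to pin the effect at "about 1 percentage point."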

Anyway, where’s the damn “50 percent” and the “increases by severalfold”? I can’t find it. It’s gotta be somewhere in that paper, I just can’t figure out where.

Pallesen’s objections are strongly stated but they’re not new. Indeed, the authors of the original paper were pretty clear about its limitations. The evidence was all in plain sight.

For example, here’s a careful take posted by BS King in 2017:

Okay, so first up, the most often reported findings: blind auditions appear to account for about 25% of the increase in women in major orchestras. . . . [But] One of the more interesting findings of the study that I have not often seen reported: overall, women did worse in the blinded auditions. . . . Even after controlling for all sorts of factors, the study authors did find that bias was not equally present in all moments. . . .

Overall, while the study is potentially outdated (from 2001…using data from 1950s-1990s), I do think it’s an interesting frame of reference for some of our current debates. . . . Regardless, I think blinding is a good thing. All of us have our own pitfalls, and we all might be a little better off if we see our expectations toppled occasionally.

So where am I at this point?

I agree that blind auditions can make sense, even if they do not have the large effects claimed in that 2000 paper, or indeed even if they have no aggregate relative effects on men and women at all. What about that much-publicized “50 percent” claim, or for that matter the not-so-well-publicized but even more dramatic “increases by severalfold”? I have no idea. I’ll reserve judgment until someone can show me where that result appears in the published paper. It’s gotta be there somewhere.

P.S. See comments for some conjectures on the “50 percent” and “severalfold.”

stefanetal

2292 days ago

Northern Virginia

This map shows which states in the US are competing to top California-based Uber's $15.7 billion in equity funding by Becky Peterson
Sunday April 21st, 2019 at 12:17 PM

Business Insider

With $15.7 billion in equity funding, Uber is the most highly-funded tech startup in the United States.
This map by CB Insights shows that there's plenty of money to go around.
Florida's Magic Leap, an augmented reality company, has raised $2.4 billion in funding; Illinois's Avant, a financial technology company, has raised $655 million; and Georgia's Kabbage, an online lending platform, has raised $490 million.

With $15.7 billion in equity funding in its pocket, the San Francisco, California-based ride-hailing company Uber has raised more money than any other tech startup in the country. But California isn't the only state in the union harboring highly-funded tech startups.

In this graphic, research firm CB Insights identified the most highly-funded companies in each of the 50 states, plus Washington, D.C. Some of those companies include Florida's Magic Leap, an augmented reality company that's disclosed $2.4 billion in funding, Illinois' Avant, a financial technology company that's raised $655 million, and Georgia's Kabbage, an online lending platform that has raised $490 million.

The list also includes ten different unicorns — companies valued over $1 billion — from Washington, D.C.'s Vox Media to Utah's InsideSales.com.

While huge amounts of equity funding can be found across the country, three of the states didn't have any companies that fit CB Insights' full criteria, which required that companies have raised at least $1 million in equity funding since January 2014. Those states are Alaska, Mississippi, and Wyoming.

Alaska's Resource Data, for instance, has raised $1.59 million in equity funding, but it's raised the sum through multiple small angel rounds.

Here's the full list:

Alabama: AfterSchool, $16.4 million
Alaska: Resource Data, $1.59 million
Arizona: IO Data Centers, $311 million
Arkansas: One Country, $100 million
California: Uber, $15.7 billion
Colorado: Welltok, $339.43 million
Connecticut: Cedar Gate Technologies, $220 million
DC: Vox Media, $324.65 million
Delaware: SevOne, $203.5 million
Florida: Magic Leap, $2.4 billion
Georgia: Kabbage, $490 million
Hawaii: Ibis Networks, $4.83 million
Idaho: CradlePoint, $154.8 million
Illinois: Avant, $655 million
Indiana: Scale Computing, $89.67 million
Iowa: Involta, $79.5 million
Kansas: C2FO, $199.68 million
Kentucky: Lucina Health, $24.49 million
Louisiana: Lucid, $64.22 million
Maine: Tilson Technology Management, $109.4 million
Maryland: Sonatype, $142.6 million
Massachusetts: DraftKings, $727.6 million
Michigan: Llamasoft, $56.1 million
Minnesota: Code42 Software, $137.5 million
Mississippi: Next Gear Solutions, $11.05 million
Missouri: PayIt, $108 million
Montana: Blackmore Sensors & Analytics, $21.5 million
Nebraska: Hudl, $106.19 million
Nevada: PlayStudios, $36.17 million
New Hampshire: FlexEnergy, $46.24 million
New Jersey: Vidyo, $171.91 million
New Mexico: Skorpios Technologies, $45.17 million
New York: Infor, $4.1 billion
North Carolina: Epic Games, $1.6 billion
North Dakota: Myriad Mobile, $10.6 million
Ohio: Root Insurance, $159 million
Oklahoma: SendaRide, $1.74 million
Oregon: Jama Software, $233 million
Pennsylvania: Duolingo, $108.3 million
Rhode Island: Upserve, $191.45 million
South Carolina: Commerce Guys, $46.3 million
South Dakota: Covered Insurance Solutions, $4.63 million
Tennessee: SmileDirectClub, $426.7 million
Texas: WP Engine, $289.2 million
Utah: InsideSales.com, $264.3 million
Vermont: Faraday, $5.49 million
Virginia: Privia Health, $432.84 million
Washington: Rover, $280.9 million
West Virginia: Geostellar, $29.97 million
Wisconsin: EatStreet, $44.74 million
Wyoming: Mountain Origins Design (dba Stio), $17.2 million

stefanetal

2313 days ago

This is pretty scary. I looked for Virginia first and my reaction to Privia's half billion was 'well, that's that.' Surprising how small these numbers get outside of California (only FL, MA, NY and IL are ahead of VA).

Also, how is Infor a startup??

Northern Virginia

sirshannon

2311 days ago

NC!

stefanetal

2311 days ago

Sorry, I missed NC. Yes, makes sense that NC is up there. Still, gaming, almost as bad as MA. And also, how are they a 'startup'?

Those indefatigable Trump defenders by ssumner
Thursday April 18th, 2019 at 3:04 PM

TheMoneyIllusion

Here’s an imaginary conversation, which will be completed in the comments section:

Me: Trump is obviously an ignorant buffoon.

Trumpistas: But look, he’s picking distinguished people for the Fed, such as Powell, Quarles and Clarida.

Me: That’s true.

One year later:

Me: Now Trump’s picking unqualified people for the Fed.

Trumpistas: It doesn’t matter if they are qualified, all that matters is whether they vote the right way.

Me: But these are super hawks who have praised the gold standard.

Trumpistas: Yes, but they are loyal to Trump, so while they were hawkish in the past, today they’ll do whatever it takes to help the President.

Me: But they are appointed for 14 years, and Trump’s term ends in less than 2 years. Will Trump want the Fed to help the next president? After all, he tried to get the Fed to hurt the previous (Democratic) president. Will these super hawks want a monetary policy that makes a socialist president look good in the eyes of the voters?

Trumpistas: TBD, in the comment section.

stefanetal

2315 days ago

The political business cycle.

Northern Virginia

StatsGuru

2315 days ago

The best use of indefatigable was in the "Knights of the Round Table" song from Monty Python and the Holy Grail.

The Power of Words, Social Welfare Edition by Kevin Drum
Wednesday April 17th, 2019 at 6:56 PM

Kevin Drum – Mother Jones

I was browsing through the 2018 General Social Survey again and happened to come across a pretty astounding example of how important question wording can be. Or, perhaps, how important words can be in general. GSS nerds will be unsurprised by this, but here’s how white people feel about helping the poor. It all depends on precisely how you ask:

This is a pretty astounding difference considering that welfare and assistance to the poor are pretty much the same thing. I’m tempted to say that the difference is that whites associate welfare with black families, but it turns out that African Americans show the same gap in attitudes toward the poor. In fact, the gap among African Americans is even bigger than it is for whites.

So how do people really feel about spending on safety net programs? It’s impossible to say. I doubt very much that 70 percent of whites are truly in favor of spending more, but I also doubt very much that only 20 percent are truly in favor of spending more. I guess it all depends on how you sell it.

POSTSCRIPT: It’s also worth noting that these responses are completely divorced from actual spending levels. Spending on poor families has increased by about 300 percent since 1973, but the answer to these questions has stayed rock steady the entire time.

stefanetal

2316 days ago

One alternative interpretation is that many people believe welfare isn't helping the poor.

Northern Virginia

ahofer

2298 days ago

precisely, also the distinction between government and private relief spending.

Fox News Obama Bizarroworld by noreply@blogger.com (digby)
Wednesday April 17th, 2019 at 2:43 PM

Hullabaloo

Fox News Bizarroworld

by digby

This is awesome:

If Fox News Covered Trump like they did Obama.

This is awesome! pic.twitter.com/D1utV3izvj
— Brian Krassenstein (@krassenstein) April 17, 2019

Too funny.

stefanetal

2316 days ago

Northern Virginia

Bad News for IV Estimation by Francis Diebold (noreply@blogger.com)
Wednesday April 10th, 2019 at 12:43 PM

No Hesitations

Alwyn Young has an eye-opening recent paper, "Consistency without Inference: Instrumental Variables in Practical Application". There's a lot going on worth thinking about in his Monte Carlo: OLS vs. IV, robust/clustered s.e.'s vs. not, testing/accounting for weak instruments vs. not, jackknife/bootstrap vs. "conventional" inference, etc. IV as typically implemented comes up looking, well, dubious.

Alwyn's related analysis of published studies is even more striking. He shows that, in a sample of 1359 IV regressions in 31 papers published in the journals of the American Economic Association,

"... statistically significant IV results generally depend upon only one or two observations or clusters, excluded instruments often appear to be irrelevant, there is little statistical evidence that OLS is actually substantively biased, and IV confidence intervals almost always include OLS point estimates."

Wow.

Perhaps the high leverage is Alwyn's most striking result, particularly as most empirical economists seem to have skipped class on the day when leverage assessment was taught. Decades ago, Marjorie Flavin tried to provide some remedial education (and in an IV context yet!) in her 1991 paper, "The Joint Consumption/Asset Demand Decision: A Case Study in Robust Estimation". She concludes that

"Compared to the conventional results, the robust instrumental variables estimates are more stable across different subsamples, more consistent with the theoretical specification of the model, and indicate that some of the most striking findings in the conventional results were attributable to a single, highly unusual observation."

Sound familiar? The non-robustness of IV seems disturbingly robust, from Flavin through Young.

Unfortunately Flavin's paper fell on deaf ears and remains unpublished. Hopefully Young's will not meet the same fate.
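For anyone who did skip that class, here is a minimal sketch of what leverage assessment looks like in the simplest possible case. It is plain one-regressor OLS on made-up data, not IV and not the procedure from Young's or Flavin's papers; the data and the helper names `leverages` and `ols_slope` are hypothetical. In simple regression the leverage of observation i is h_i = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2, and a value close to 1 flags a point that can dictate the fit almost on its own.

```python
# Hypothetical data: nine ordinary points with essentially no trend,
# plus one extreme observation far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
y = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9, 2.0, 8.0]


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    return sxy / sxx


def leverages(xs):
    """Diagonal of the hat matrix for a regression on an intercept and xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in xs]


h = leverages(x)
slope_all = ols_slope(x, y)
slope_without_last = ols_slope(x[:-1], y[:-1])

print("leverages:", [round(hi, 3) for hi in h])
print("slope with all observations:", round(slope_all, 4))
print("slope without the extreme observation:", round(slope_without_last, 4))
```

In this toy example the last observation has leverage of about 0.91, and the fitted slope drops from about 0.22 to essentially zero once it is removed. That is the kind of single-observation dependence both quoted passages describe.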

sarcozona

2323 days ago

Epiphyte City

stefanetal

2324 days ago

Northern Virginia
