Sampling error in statistics and subgroup analysis

An excellent commentary on survey methodology in the NY Times (Aug. 27, 2006 at 10WK) discusses sampling error (See my posts of Dec. 9, 2005 on margin of error; and Jan. 30, 2006 on Kirkpatrick & Lockhart’s “+ 10%” results.). Strictly speaking, the statistical term “sampling error” applies only to a survey whose respondents were randomly sampled; it describes the range within which the survey’s results approximate the true figure for the whole population. The article explains that there is even a formula for calculating the error range when comparing one survey’s results to another survey’s results on a similar question. None of the annual surveys published about law departments has ever done that calculation.
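I do not have the article's exact formula in front of me, so what follows is only a sketch using the standard textbook calculation for the difference between two independent proportions. The function name and the second survey's figures (40% of 400 respondents) are hypothetical, chosen purely for illustration.

```python
import math

def margin_of_difference(p1, n1, p2, n2, z=1.96):
    """Approximate 95% margin of error for the difference between two
    proportions from independent simple random samples (assumed here)."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical comparison: 34% of 400 respondents in one survey,
# 40% of 400 respondents in another survey asking a similar question.
diff_moe = margin_of_difference(0.34, 400, 0.40, 400)
observed_gap = 0.40 - 0.34

print(f"observed gap: {observed_gap * 100:.1f} points")
print(f"margin of error on the gap: +/- {diff_moe * 100:.1f} points")
```

Under these assumed numbers the six-point gap is smaller than the margin of error on the gap, so the two surveys could not be said to differ.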

In this post, though, my point concerns subgroups. I wrote earlier about a survey with responses from about 400 in-house counsel (See my post of Aug. 28, 2006 on the 34% who had fired or considered firing a firm.). Assuming the respondents were a random sample (invited to participate, and participating, without any pattern), the margin of error for that number of responses would be about plus or minus five percentage points. Accordingly, the survey should have pointed out that, at a 95 percent confidence level, the true figure could lie anywhere between 29% and 39%.
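For anyone who wants to check the arithmetic, here is a minimal sketch of the usual margin-of-error calculation for a proportion. It assumes a simple random sample, a 95 percent confidence level (z = 1.96) and the conservative worst case of p = 0.5; the function name is mine, not anything the survey used.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random
    sample of size n. p=0.5 is the conservative (widest) assumption."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(400)            # about 400 respondents
low, high = 0.34 - moe, 0.34 + moe    # the 34% result from the survey

print(f"margin of error: +/- {moe * 100:.1f} points")
print(f"range: {low:.0%} to {high:.0%}")
```

Using the observed 34% in place of 0.5 narrows the margin slightly, to roughly 4.6 points, which still rounds to about five.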

If anyone tried to apply that range to a subgroup, such as law departments with more than five lawyers, the number of respondents in that group would be fewer than 400, so the range of approximation (sampling error) would be wider.
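To put rough numbers on how quickly the range widens, here is a small sketch using the same standard formula (simple random sample, 95 percent confidence, worst case p = 0.5). The subgroup sizes are hypothetical; the survey did not report them.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion, simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical subgroup sizes carved out of about 400 respondents.
subgroup_sizes = [400, 200, 100, 50]
margins = {n: margin_of_error(n) for n in subgroup_sizes}

for n, m in margins.items():
    print(f"{n:>4} respondents: +/- {m * 100:.1f} points")
```

Because the margin shrinks only with the square root of the sample size, cutting a group to a quarter of its size doubles the margin: about 5 points at 400 respondents becomes about 10 points at 100.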