# Articles Posted in Statistics

Published on:

## Weighting survey responses so that the findings better represent underlying demographics

Surveyors sometimes weight their data so that the findings better represent a population whose makeup is known from another source. This point comes through in an article in the New York Times, July 23, 2015 at 83, regarding political polls. Pollsters may get too few responses from some demographic slice, such as farmers, and want to correct for that imbalance when they present conclusions about the entire population. The polling company weights the few farmer respondents more heavily so that the mix of respondents comes closer to the actual mix of residents.

How does this transformation of data apply in surveys for the legal industry? Let’s assume that we know roughly how many companies in the United States have revenue over \$100 million in each major industry. Let’s also assume that a benchmark survey of law departments has gathered compensation data regarding the lawyers in the responding law departments.

If the participants in the law department survey materially under-represent some industry, meaning the proportions in each industry don’t match the proportions that we know to be true, it is not hard to adjust the compensation data. One way would be to replicate the respondents in each under-represented industry enough times to make up the difference. This is what happens when a surveyor weights survey data to present more proportional results.
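The adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the industries, compensation figures, and population shares are invented, and the helper names `industry_weights` and `weighted_mean` are hypothetical. Instead of literally replicating respondents, each respondent gets a weight equal to the industry's known share divided by its share of the sample, which has the same effect.

```python
from collections import Counter


def industry_weights(sample_industries, population_shares):
    """Weight per industry = known population share / share in the sample."""
    counts = Counter(sample_industries)
    n = len(sample_industries)
    return {ind: population_shares[ind] / (count / n) for ind, count in counts.items()}


def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical survey: three technology respondents, one energy respondent.
industries = ["technology", "technology", "technology", "energy"]
compensation = [300_000, 320_000, 340_000, 200_000]

# Hypothetical known population: the two industries are equally common.
population_shares = {"technology": 0.5, "energy": 0.5}

by_industry = industry_weights(industries, population_shares)
weights = [by_industry[ind] for ind in industries]

unweighted = sum(compensation) / len(compensation)
weighted = weighted_mean(compensation, weights)

print(f"unweighted mean: {unweighted:,.0f}")
print(f"weighted mean:   {weighted:,.0f}")
```

In this made-up sample the lone energy respondent receives a weight of 2 and each technology respondent a weight of two-thirds, so the weighted mean (260,000) sits lower than the unweighted mean (290,000), reflecting the under-represented industry's lower pay.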

## Beyond predicting compensation with linear regression: find out the accuracy of the prediction and how closely the data fit a line

Other posts on this blog have reviewed the basic notions of linear regression, using correlations between total compensation and various factors that determine it.  The calculation of a regression also tells how much of total compensation is predicted by each of the factors, such as years of law practice, practice area, size of department, industry, and so forth.

In one example, data from General Counsel Metrics show that years out of law school predicts only about 50 percent of total compensation.

Additionally, software that calculates regressions can tell us how closely the data match the regression line. That number is known as the coefficient of determination (R-squared), and the higher it reaches, the more closely the predictor attribute tracks total compensation.
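As a rough sketch of how that number is computed, the Python below fits a least-squares line to one predictor and returns the coefficient of determination as one minus the ratio of unexplained variation to total variation. The data are invented for illustration (years of practice against compensation in thousands of dollars), and the function names `fit_line` and `r_squared` are hypothetical rather than taken from any particular package.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: years of practice and total compensation (thousands).
years = [1, 2, 3, 4, 5]
on_the_line = [110, 130, 150, 170, 190]   # falls exactly on a straight line
scattered = [120, 125, 160, 150, 195]     # same trend, with scatter around it

r2_exact = r_squared(years, on_the_line)
r2_noisy = r_squared(years, scattered)

print(f"R-squared, exact line: {r2_exact:.3f}")
print(f"R-squared, scattered:  {r2_noisy:.3f}")
```

Points that fall exactly on a line give a value of 1; the more the points scatter around the line, the further the value drops toward 0.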

## How you can predict compensation when you know something about a lawyer, and have lots of similar data from other lawyers

A scatter plot of data, for instance total compensation of a group of lawyers against how many years they have been practicing law, may look like a Milky Way galaxy of points, but much can be learned from the correlation.  Even more can be learned if we also know such facts as the industry of each lawyer and the revenue of their company and their LSAT scores – any facts that might influence income.

We can use linear regression, a statistical calculation, to understand and quantify the relationship between any and all of those “independent variables” and the “dependent variable,” total compensation. Spreadsheet programs can place a straight line within that cloud of dots such that the sum of the squared vertical distances between the dots and the line is as small as possible.

A fascinating outcome of that calculation is the formula for the line. From it you can predict a lawyer’s total compensation if you are given the values of the independent variables for that lawyer. This kind of linear regression for compensation data is what General Counsel Metrics produces as part of its benchmarking reports.