From someone who knows how to do economics. Or at least statistics.

Or maybe it’s even just someone who knows how to use Excel.

I’ve got two sets of data about US States.

One is the population of each state.

The other is the Gini (a measure of inequality) for each State.

What I want is one plotted against the other.

Quite simply, I want to try and see whether there’s a correlation between the amount of inequality and the number of people. I would, after all, expect greater variance in incomes the larger the number of people (as we expect greater variance in height, weight, sexual prowess and all the rest the larger the population).

Anyone able to do that simply?

Update: Thanks, this has now been done. Yes, there does seem to be, eyeballing the graph, a connection between Gini and population size. That’s what I wanted.

## 9 thoughts on “Asking for a favour”

1. Well, can’t you just type one set of values into a column in Excel, then the other dataset in the next column, then tell it to plot a graph for you?

I would, after all, expect greater variance in incomes the larger the number of people

Why? There’s no reason to believe that will be the case at all, once you’re beyond very small populations.

Also, bear in mind that the Gini measures inequality of outcome, and the more free a market is, the more that will reflect inequality of production (that is, of the value of individuals’ production as perceived by the rest of the population). A heavily socialised market will veer somewhat away from this due to interference by government: we should expect Third Way markets to subsidise both low production and wealth transfer to an elite, leading to a hollowed-out middle and greater inequality in the Gini index (that is, more low earners and more high earners at the expense of middle earners). None of which is proportionate to population.

Tim adds: “Well, can’t you just type one set of values into a column in Excel, then the other dataset in the next column, then tell it to plot a graph for you?”

Because I don’t know how to. I have no idea at all how to use Excel.

Gini does not necessarily measure inequality of outcome. It depends whether the numbers are market incomes or adjusted for tax and benefits. But that’s another problem for me to deal with later.

2. Because I don’t know how to. I have no idea at all how to use Excel.

While you’re sitting there waiting for somebody to volunteer to type all those numbers in for you, you could RTFM. 😉

Gini does not necessarily measure inequality of outcome.

That’s the intention of it. It’s a measure of outcomes, without regard to inputs.

Tim adds: Well, yes, but there are still different outcomes. Those for market incomes or those adjusted for tax and benefits?

3. Well, yes, but there are still different outcomes. Those for market incomes or those adjusted for tax and benefits?

Indeed. Two different genies. Which do you intend to plot? Why not both? That might reveal something interesting. Or not.

Here’s an hypothesis; the greater the difference between Gini 1 and Gini 2, the greater the inequality in the Gini 1 coefficient. That is, greater benefits and tax credits increase production inequality, leading to greater “raw” (unadjusted) Gini inequality. I’d be interested to know if that is the case, personally.

4. “I would, after all, expect greater variance in incomes the larger the number of people (as we expect greater variance in height, weight, sexual prowess and all the rest the larger the population).”

No, this doesn’t hold. You might get a greater dispersion between the very highest and lowest, just as if you throw a million darts at a dartboard you are more likely to get 60 than if you throw 10. But there’s no reason for a broader-based measure such as a Gini coefficient to show a greater dispersion.
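The dartboard point can be checked with a quick simulation. This is only a sketch under stated assumptions: the incomes are made up, drawn from a single log-normal distribution whose parameters (10 and 1) were picked arbitrarily, and `gini` and `simulate` are illustrative helper names. If the argument holds, the gap between the highest and lowest income should tend to widen as the sample grows, while the Gini settles around one value.

```python
import random


def gini(values):
    """Gini coefficient of non-negative values, via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n, with ranks 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def simulate(n, rng):
    """Draw n made-up incomes from one fixed log-normal distribution.

    The parameters (10, 1) are arbitrary assumptions, not fitted to real data.
    """
    incomes = [rng.lognormvariate(10, 1) for _ in range(n)]
    return max(incomes) - min(incomes), gini(incomes)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
results = {n: simulate(n, rng) for n in (100, 10_000, 100_000)}
for n, (spread, g) in results.items():
    print(f"n={n:>7}  max-min={spread:>14,.0f}  gini={g:.3f}")
```

The same distribution is used at every sample size, so any change in the max-minus-min column comes from the number of draws alone.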

5. Put the numbers into Google. Plot a scatter graph and then get it to plot whatever sort of trendline you want.
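For anyone who would rather script it than use a spreadsheet, here is a rough sketch of the same idea: a scatter graph with a straight-line (least-squares) trendline. The numbers are placeholders, not real state figures, and the function names and output filename are made up for illustration. The plotting part assumes matplotlib is installed; the line fit itself needs nothing beyond plain Python.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def plot_scatter(xs, ys, path="gini_vs_population.png"):
    """Save a scatter plot with the fitted line; needs matplotlib installed."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt

    slope, intercept = fit_line(xs, ys)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ends = [min(xs), max(xs)]
    ax.plot(ends, [slope * x + intercept for x in ends])
    ax.set_xlabel("Population (millions)")
    ax.set_ylabel("Gini coefficient")
    fig.savefig(path)
    plt.close(fig)


# Placeholder numbers, NOT real state figures.
population_millions = [1.0, 2.0, 3.0, 4.0]
gini_values = [0.41, 0.43, 0.45, 0.47]
slope, intercept = fit_line(population_millions, gini_values)
print(f"trendline: gini = {slope:.3f} * population + {intercept:.3f}")
```

Calling `plot_scatter(population_millions, gini_values)` would write the image file. A straight line is only one choice of trendline; with populations spanning a wide range, plotting against the logarithm of population might be worth trying too.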

6. I think this bears out my theory, held since I looked at the creation of the euro and worked out what would happen: that the larger the currency area, the poorer the population is per head, and the greater are the inequalities within the country.
The US was the anomaly, rich per head because the dollar was the world’s currency.
That may be ceasing to be true, so the US becomes a typical case: poor per head.
Europe will eventually go the same way. Areas with their own currencies will stay rich per head with less inequality, because the shifting currency values iron out differences. The eurozone will be poor per head with some super rich.
And eventually people will notice.

7. There’s a positive correlation of about 0.34, so it’s there but it’s on the weak side of middling.
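A figure like that is usually a Pearson correlation coefficient, though the comment does not say which measure was used, so treat that as an assumption. A minimal sketch of the calculation, with made-up numbers standing in for the population and Gini columns:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Made-up illustration values, NOT the state data discussed in the post.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pearson(xs, ys)
print(f"r = {r:.2f}")
```

The result runs from -1 to +1, with 0 meaning no linear relationship. Spreadsheets offer the same calculation through a `CORREL` function.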

Looking at the plot, I think there might be some mileage in ditching the obvious outliers (the District of Columbia’s Gini is obviously skewed due to the Federal Government presence). I’d also drop the 3 or 4 states with the largest populations, and concentrate on the 30-40 states with broadly comparable populations where there’s a fair degree of variance.
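That trimming step is easy to script. A sketch, assuming each row is a `(name, population, gini)` tuple; the layout, the `trim` helper and every number below are invented for illustration and are not the real figures:

```python
def trim(rows, exclude=("District of Columbia",), drop_largest=4):
    """Drop named outliers, then the `drop_largest` most populous rows.

    Each row is assumed to be a (name, population, gini) tuple.
    """
    kept = sorted(
        (row for row in rows if row[0] not in exclude),
        key=lambda row: row[1],  # sort by population, smallest first
    )
    return kept[: len(kept) - drop_largest] if drop_largest else kept


# Placeholder rows: populations in millions and Ginis are made up.
rows = [
    ("State A", 1.0, 0.42),
    ("State B", 2.0, 0.44),
    ("State C", 3.0, 0.43),
    ("State D", 30.0, 0.47),
    ("State E", 25.0, 0.48),
    ("District of Columbia", 0.6, 0.53),
]
trimmed = trim(rows, drop_largest=2)
print([name for name, _, _ in trimmed])
```

Recomputing the correlation on the trimmed rows would show how much of the headline figure depends on a handful of states.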

I’m wondering if there may be any interesting patterns in terms of regional variations, i.e. differences between the North-East, Mid-West and Deep South, and also whether there are any relationships to the broad economic mix in different states, particularly the balance between the post-industrial, industrial and agrarian sectors and each state’s relative dependence on each.

8. There should be laws against exposing datasets to the web without a CSV or XML export option. I’ve lost count of the number of times I’ve had to bugger about stripping data out of an HTML file in order to get it into a form I can do some number-crunching on. It’s not hard, but it’s boring, and should be unnecessary.
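For the simple case, the stripping can at least be automated. Below is a minimal sketch using only Python's standard library. It assumes the data sits in plain `<tr>`/`<td>` rows with no nested tables or merged cells; the class name, function name and sample table are all invented for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample table; the values are placeholders, not real data.
sample = """
<table>
  <tr><th>State</th><th>Gini</th></tr>
  <tr><td>State A</td><td>0.42</td></tr>
  <tr><td>State B</td><td>0.45</td></tr>
</table>
"""
csv_text = html_table_to_csv(sample)
print(csv_text)
```

Real pages tend to be messier than this, with footnote markers, merged cells and several tables on one page, so the output would still need checking by eye.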