Gini in the Bottle

For many years, the debate about income inequality has seemed to me to behave like some of my more primitive attempts at cooking.  For a while it simmers, warranting almost no attention and sitting there like the proverbial watched pot with nothing happening.  Then, as some political burner turns up the heat, it boils over into something messy like the Occupy movement, suddenly demanding damage control and cleanup.   And much like my aborted attempts at food preparation, neither of these situations ends up leading to anything satisfying.

As a result, my interest in the income inequality debate falls into the same place in my mind as does my interest in the culinary arts – a dusty corner where I vaguely recognize that people are passionate about it, but where I reason that I’ve nothing to contribute to it and it has nothing to contribute to me.  And, so, I basically tuned it out.  This situation has thawed for me this past week (forgive what will be the last food analogy) with the publication of an intriguing nugget in an article from the Washington Post.

In this article, entitled ‘Income Inequality – the issue the Democrats want’, Ed Rogers rails against what he paints as the Democrats’ hypocrisy.  The point that Rogers tries to make is that the Democrats don’t want to address income inequality; they simply want to use it as a political tool to separate their message from the Republican one.  He may be correct – I don’t know – but, just as I was going to stop reading at the end of the third paragraph, he posed the question ‘What exactly is “income inequality”?’  This grabbed my attention, and I found out that income inequality is measured by something called a Gini Coefficient.  Suddenly, there was a possibility of real data and actual statistical analysis, and I got excited.  Rogers also cites, as strong support for his contention, an article in the New York Times entitled ‘Is Life Better in America’s Red States?’  This article by Richard Florida presents a chart of Gini coefficients, calculated by state, which shows that the majority of the 21 worst states in terms of income inequality in 2012 were blue or purple states, whereas the majority were red states in 1979.  Here, at last, was actual tabulated data showing a before-and-after situation.

I then went off to try to understand the Gini Coefficient, which, in a nutshell, is based on something called a Lorenz Curve.   Resolving to go only one more turtle down, I then set myself to understand the Lorenz curve.  Fortunately a Lorenz curve is relatively easy to understand.

The whole machine starts with a table of income distributions.  The simplest presentation I’ve found is from Timothy Taylor’s book Principles of Economics: Economics and the Economy, 2nd edition.  Start by dividing the existing population into quintiles, and then measure the share of total income that each quintile receives for a given year.  For example, in the years 1967 (the first year measured in the US), 1985, and 2005, the income distribution looks like

Percentage Income Distribution
Quintile 1967 1985 2005
1st 4.0 3.9 3.4
2nd 10.8 9.8 8.6
3rd 17.4 16.3 14.6
4th 24.2 24.4 23.0
5th 43.6 45.6 50.4

From the table, one can tell that, in 1967, the bottom 20 percent of the population received 4.0 percent of the income, and that this percentage fell to 3.4 percent by the year 2005.  Likewise, the middle 20 percent also saw a drop in their share of the income from 17.4 percent to 14.6 over the same time span.

The next step is to construct the cumulative income by partially summing down the column.  The corresponding data looks like

Cumulative Percentage of Income
Population Percentage 1967 1985 2005
0 0 0 0
20 4.0 3.9 3.4
40 14.8 13.7 12.0
60 32.2 30.0 26.6
80 56.4 54.4 49.6
100 100 100 100

with the obvious boundary conditions that zero percent of the population receives zero percent of the income and 100 percent of the population receives 100 percent of the income.  The addition of the 0-line will be needed in the next step.  Note carefully how the value at, say, 40 percent of the population is the sum of the 1st and 2nd quintiles, while the value at 60 is the sum of the first 3 quintiles.   The graph of these values then gives the Lorenz Curve as shown below

[Figure: Lorenz curve]
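For anyone who wants to reproduce the partial sums, here is a minimal Python sketch.  The quintile shares are copied from the first table; the function name lorenz_points is simply my own label for illustration and does not come from Taylor’s book.

```python
from itertools import accumulate

# Quintile income shares (percent), copied from the first table.
shares = {
    1967: [4.0, 10.8, 17.4, 24.2, 43.6],
    1985: [3.9, 9.8, 16.3, 24.4, 45.6],
    2005: [3.4, 8.6, 14.6, 23.0, 50.4],
}

def lorenz_points(quintile_shares):
    """Hypothetical helper: running total of the shares, with the 0-line prepended.

    Rounding to one decimal only hides floating-point noise in the sums.
    """
    return [0.0] + [round(c, 1) for c in accumulate(quintile_shares)]

population = [0, 20, 40, 60, 80, 100]  # population percentage at each point
for year, quintile_shares in shares.items():
    print(year, list(zip(population, lorenz_points(quintile_shares))))
```

Each printed pair is one point on that year’s Lorenz curve: population percentage first, cumulative percentage of income second.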

Calculation of the Gini Coefficient is a bit more involved, and requires two new Lorenz curves and a modest amount of computation.  The first curve, called hereafter the ‘perfect curve’, represents a perfectly balanced society with equal income distribution over all segments of its population.  The resulting income distribution and cumulative percentage of income are

Perfect Curve
Quintile   Percentage Income Distribution   Population Percentage   Cumulative Percentage of Income
1st        20                               20                      20
2nd        20                               40                      40
3rd        20                               60                      60
4th        20                               80                      80
5th        20                               100                     100

The second curve, called the ‘imperfect curve’, represents the income distribution of a completely unbalanced society, with only one member receiving all of the income and every other member receiving nothing.

With all the ingredients now in place, the Gini Coefficient is then defined as the ratio of the area between the perfect curve and a given Lorenz curve to the area between the perfect and imperfect curves.  As a formula, if A is the area between the perfect curve and a given Lorenz curve and B the area between a given Lorenz curve and the imperfect curve, then the Gini Coefficient, denoted as G, is given by G = A/(A+B).  This is shown in the figure below with the Lorenz curve given for a linear distribution of income (first quintile has 6.7 percent; the second quintile has 13.3 percent, etc.).

[Figure: annotated Lorenz curve for a linear distribution of income]

The area A between the perfect curve and the given Lorenz curve is most easily found by computing the area B beneath the given Lorenz curve and subtracting it from the total area beneath the perfect curve, A+B, which has the known value of 0.5.  The easiest way to see this is to convert both axes (population percentage and cumulative percentage of income) to fractions by dividing by 100.  The perfect curve is then the 45-degree diagonal of the unit square, and the triangle enclosed by it and the imperfect curve has an area of one half.  The resulting expression for the Gini Coefficient is then G = A/(A+B) = (A+B-B)/(A+B) = 1 - B/(A+B) = 1 - B/0.5 = 1-2B.

So, the computation of the Gini Coefficient comes down to computing the area B by integration.  For a mathematically specified distribution, the functional form of the Lorenz curve is known, and the integration can be carried out using calculus.  For example, the linear distribution results in a Lorenz curve of the form x^2, where x is the population fraction.  Note that the linear distribution, when partially summed, must be normalized so that the curve reaches 1 at x = 1; thus its Lorenz curve is x^2, not x^2/2.  The integral of x^2 is x^3/3, which, when evaluated on the interval [0,1], gives B = 1/3 and G = 1-2B = 1-2/3 = 1/3.
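As a sanity check on the calculus, the same answer can be approached numerically.  The sketch below is only an illustration under my own assumptions: the names gini_trapezoid and linear_lorenz are hypothetical, and the trapezoidal rule stands in for the exact integral.

```python
def gini_trapezoid(lorenz, n):
    """Approximate G = 1 - 2B, with B estimated by the trapezoidal rule.

    lorenz: function mapping a population fraction in [0, 1] to a cumulative income fraction
    n:      number of equal-width intervals (an illustrative parameter)
    """
    xs = [i / n for i in range(n + 1)]
    ys = [lorenz(x) for x in xs]
    area_b = sum((ys[i] + ys[i + 1]) / 2 * (xs[i + 1] - xs[i]) for i in range(n))
    return 1 - 2 * area_b

def linear_lorenz(x):
    """Normalized Lorenz curve for the linear distribution: x^2."""
    return x ** 2

# The estimate should approach 1/3 as the grid is refined.
for n in (5, 50, 1000):
    print(n, gini_trapezoid(linear_lorenz, n))
```

With only a handful of intervals the estimate falls a little below 1/3, because the straight-line segments of the trapezoidal rule lie above the convex curve and so overestimate B; the gap shrinks as n grows.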

For empirical distributions, such as those listed above for the years 1967, 1985, and 2005, the area under the Lorenz curve can be approximated numerically in a variety of ways.  To illustrate, I chose the particularly simple approach of using the trapezoidal rule.  The resulting Gini coefficients are then

Year Gini Coefficient
1967 0.370
1985 0.392
2005 0.434
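The trapezoidal calculation can be sketched in a few lines of Python.  This is a minimal illustration rather than the exact script I used: the cumulative percentages are copied from the table above, and the function name gini_from_cumulative is made up for this example.

```python
# Cumulative percentage of income at 0, 20, 40, 60, 80, and 100 percent of the
# population, copied from the cumulative table.
cumulative = {
    1967: [0.0, 4.0, 14.8, 32.2, 56.4, 100.0],
    1985: [0.0, 3.9, 13.7, 30.0, 54.4, 100.0],
    2005: [0.0, 3.4, 12.0, 26.6, 49.6, 100.0],
}

def gini_from_cumulative(cum_percent):
    """Hypothetical helper: G = 1 - 2B, with B from the trapezoidal rule.

    Assumes the points are equally spaced in population, starting at 0 percent
    and ending at 100 percent.
    """
    ys = [c / 100 for c in cum_percent]   # convert percentages to fractions
    width = 1 / (len(ys) - 1)             # 0.2 for quintile data
    area_b = sum((ys[i] + ys[i + 1]) / 2 * width for i in range(len(ys) - 1))
    return 1 - 2 * area_b

for year, cum in cumulative.items():
    print(year, f"{gini_from_cumulative(cum):.3f}")
```

Run on the three columns, this reproduces the values in the table above to three decimal places.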

Clearly, there is a growing trend towards greater income inequality, but what to make of it?

First, it is important to remember that the Gini Coefficient doesn’t measure poverty.  Everyone in a population can be rich and the Gini Coefficient could still indicate an income distribution far from the perfect curve (think of football players and owners).  Likewise, everyone in a population can be poor and the Gini Coefficient could indicate an income distribution near the perfect curve (think of a native tribe in the Amazon like the Yanomama).

Second, to quote Taylor:

No society should expect or desire complete equality of income at a given point in time, for a number of reasons.  First, most workers receive relatively low earnings in their first few jobs, higher earnings as they reach middle age, and then lower earnings after retirement.  Second, people’s preferences and desires differ.  Some are willing to work long hours to have large income… Others will work fewer hours…  Third, people can be lucky or unlucky. Some decades ago, an economist named Henry Simons tried to find an objective, scientific way to determine how much inequality was appropriate.  After a great deal of thought, he decided that the question had no answer.

 

Okay, so it seems that income inequality is a fixture of life, but is there any way to understand the observed trends?  Taylor attributes the trends in income inequality predominantly to two effects.

The first is a demographic shift amongst the higher income earners, in which they have been preferentially marrying each other (e.g., a lawyer with a lawyer), thereby concentrating more income in the top earners.  This is to be contrasted with an older model in which a high income earner (e.g., a doctor) tended to marry a low income earner (e.g., a school teacher).  I would argue that this change reflects an underlying improvement in American society and the upward mobility of women.

The second effect is as discouraging as the first is encouraging.  There is an educational gap between the highly skilled worker and the low-skilled or unskilled component of society, and it seems to be widening, not shrinking.  Highly skilled workers are in high demand due to the technological advances over the past 30 years, and as more of them enter the marketplace, the pace of technological development and the need for more advanced training increase.  This problem is further exacerbated by the fact that lower income families not only have fewer good educational opportunities, but they also tend, more often than their rich counterparts, to be headed by a single parent, which creates a substantial educational disadvantage.
