In this interview, Nora Lustig discusses how important it is to accurately assess the very top of the income and wealth distributions when analyzing inequality, and how research — including her own — reveals the challenges this entails.

Nora Lustig, a Stone Center affiliated scholar, and the Samuel Z. Stone Professor of Latin American Economics at Tulane University, has focused much of her recent research on the impact of taxation and government spending on inequality and poverty in developing countries.

She is also a leading voice in the ongoing debates about how to accurately measure the very top of the income and wealth distributions. The most affluent often don’t respond to household surveys, or they give inaccurate responses that downplay their income and wealth. How should researchers correct for missing or misleading information? We spoke to Lustig about her paper, “The ‘Missing Rich’ in Household Surveys: Causes and Correction Approaches,” and the importance of capturing this information for designing policies aimed at reducing inequality.

What sparked this enormous explosion of interest in the “missing rich,” and what prompted you to focus so much careful attention on this issue?

I think the explosion of interest is linked to the observed increase in income and wealth inequality — in particular, in the United States. When inequality was not changing in the U.S., the interest in the topic within mainstream economics was quite limited. In fact, for decades and for some economists it was almost a taboo subject. I recall comments from colleagues that what mattered was poverty and not inequality — and being asked if I was a “lefty” because I worked on inequality. (I am a progressive economist, by the way.)1 One of the most emblematic changes happened at the IMF. In the 1980s and 1990s, if anybody raised distributional concerns linked to the IMF-sponsored austerity programs, the staff there was quick to rebut them: distributional concerns were the purview of national governments and it was not the IMF’s place to point them out. In contrast, for more than a decade now, the IMF has become a leading voice regarding the many threats that high inequality poses to economic prosperity and social stability.

With the increased concern came a rising interest in accurately measuring inequality. For a long time, scholars typically relied on information from household surveys to calculate income inequality or wealth inequality. Yet they realized that the people at the top of those surveys reported incomes or wealth much, much lower than what you see in rich lists such as those published by Forbes, or than what you can infer from what people are buying (e.g., the huge wealth that’s in property). It was obvious that household surveys weren’t capturing the top.

Why aren’t household surveys capturing the top? Is it simply a matter of respondents trying to disguise their actual income or wealth?

Not accurately capturing the top incomes in surveys is the result of two main problems in the achieved survey samples: nonresponse and underreporting.2 The nonrespondent population refers to individuals who have a likelihood, however small, of being selected into the sample, but who — if selected — do not or would not respond, because of noncontact, refusal, or other reasons. As such, and unless the statistical institute in charge of the survey is able to replace the nonrespondent individual by a similar subject, the nonrespondent subjects end up not being included in the achieved sample.

Underreporting refers to subjects who are selected and respond to the survey but who, when they respond, report income (or consumption, or wealth) below its actual level. When the rich are included in surveys and do respond, significant underreporting may arise because high-income individuals usually have diversified portfolios with income flows that are difficult to value, such as capital income invested in pension funds or retained by corporations as undistributed profits; or because they may also be more reluctant to disclose their incomes. Underreporting is a case of measurement error: even when people respond, they may misrepresent their income, whether on purpose or by mistake.3

Why is it important to capture accurate information about the very top of the income distribution?

It’s important to capture top incomes well because you want to have an accurate measure of inequality, at any point in time, and you also want to have an accurate description of the evolution of inequality over time.

If you don’t capture the top, your inequality measurements will be biased, and the bias could also result in estimating the wrong trends. The bias can be in either direction although, in most cases, inequality will be higher after correcting for the missing rich.
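To make the direction of the bias concrete, here is a minimal Python sketch. The ten incomes are invented for illustration and do not come from any survey, and the `gini` helper is a standard Lorenz-curve calculation rather than a method taken from the paper. The sketch drops the single richest unit, mimicking nonresponse at the top, and compares the Gini coefficient with and without it.

```python
def gini(incomes, weights=None):
    """Gini coefficient computed from the area under the weighted Lorenz curve."""
    if weights is None:
        weights = [1.0] * len(incomes)
    pairs = sorted(zip(incomes, weights))          # order units from poorest to richest
    total_w = sum(w for _, w in pairs)
    total_y = sum(y * w for y, w in pairs)
    cum_y = 0.0
    area = 0.0
    for y, w in pairs:
        prev_y = cum_y
        cum_y += y * w
        # Trapezoid between two consecutive points of the Lorenz curve.
        area += (w / total_w) * (prev_y + cum_y) / (2.0 * total_y)
    return 1.0 - 2.0 * area


# Hypothetical "true" population: nine modest incomes and one very rich unit.
true_incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]

# Hypothetical achieved sample: the richest unit does not respond.
survey_incomes = true_incomes[:-1]

gini_true = gini(true_incomes)
gini_survey = gini(survey_incomes)
print(f"Gini with the top included: {gini_true:.3f}")
print(f"Gini from the survey alone: {gini_survey:.3f}")
```

In this toy case the survey understates inequality, which is the more common outcome. As noted above, the bias can in principle go in either direction.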

There’s another reason why it is important to know the extent of concentration of income and wealth at the top. At my Commitment to Equity Institute, we do fiscal incidence analysis to assess the impact of fiscal policies — tax and transfers — on inequality and poverty. The studies use household surveys to obtain various measures of pre-fiscal and post-fiscal income distribution and determine the extent to which the fiscal system as a whole (as well as specific interventions) is equalizing or not. However, if a significant portion of incomes at the top are missed by household surveys, our measures of redistribution might also be biased. Moreover, this flaw in surveys prevents estimating how much inequality could potentially be reduced by introducing a more progressive personal income tax, a policy reform that many countries should consider.

Another reason to know the extent of concentration of income and wealth is that you may be worried about what people call state capture. To what extent does the fact that you have so much wealth and income concentrated in a small group of people make this group inordinately influential in determining what happens in terms of politics, laws, and regulations?

Your paper details various ways of trying to capture this missing information. Is there a particular factor that you feel is most important?

When it became obvious that top incomes were not captured in the surveys, there was a shift to try to find other sources of information. Professor Tony Atkinson, the late brilliant inequality scholar, led the way. The source Atkinson and researchers who followed in his steps frequently used was information from tax returns.4 The information is not based on individual records, but on information released in the U.S. (some is available since 1913!) and in other advanced (and some middle- and low-income) countries that allows analyzing the distribution of income based on tax returns. Also, the U.S. Internal Revenue Service as well as the tax authorities in other countries sometimes prepare more detailed datasets for small groups of approved researchers. Working with tax records, however, is not always possible and, moreover, is not a panacea. For instance, due to informality (a widespread phenomenon in labor markets in low- and middle-income countries), tax avoidance, and tax evasion, tax records can also suffer from similar problems to those observed in household surveys.5

Given that using tax records is not the ultimate solution, a number of alternative correction approaches have been proposed in the literature. The most promising methods are those that combine survey data with information from external sources such as tax records, national accounts, rich lists, or other external information. The methods, as described in my paper cited above, can correct by replacing top incomes or increasing their weight (reweighting), or a combination of both. However, there is little or no guidance from theory or statistical testing regarding which specific method might be best in getting us closer to the true distribution of income.
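The two families of corrections mentioned here, replacing and reweighting, can be illustrated with a small Python sketch. All numbers are invented, the function names are hypothetical, and the external value of 1000 and the factor of 2 stand in for whatever tax records or rich lists might suggest. This is one simple way to express each idea under those assumptions, not the specific procedures reviewed in the paper.

```python
def top_share(incomes, weights, threshold):
    """Share of total (weighted) income held by units at or above the threshold."""
    total = sum(y * w for y, w in zip(incomes, weights))
    top = sum(y * w for y, w in zip(incomes, weights) if y >= threshold)
    return top / total


def replace_top(incomes, external_top):
    """Replacing: swap the k highest survey incomes for k external values.

    Returns incomes in ascending order, so it assumes equal weights
    (otherwise the weights would have to be reordered as well).
    """
    k = len(external_top)
    ordered = sorted(incomes)
    return ordered[:len(ordered) - k] + sorted(external_top)


def reweight_top(incomes, weights, threshold, factor):
    """Reweighting: multiply the weights of top units by `factor` and shrink
    the remaining weights so that the total population stays the same."""
    total = sum(weights)
    top_w = sum(w for y, w in zip(incomes, weights) if y >= threshold)
    new_top_w = top_w * factor
    if new_top_w >= total:
        raise ValueError("factor leaves no weight for the rest of the sample")
    scale_rest = (total - new_top_w) / (total - top_w)
    return [w * factor if y >= threshold else w * scale_rest
            for y, w in zip(incomes, weights)]


# Hypothetical survey: the richest respondent reports 200.
incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 200]
weights = [1.0] * len(incomes)
threshold = 200

base_share = top_share(incomes, weights, threshold)

# Replacing: assume an external source suggests the top income is really 1000.
replaced = replace_top(incomes, [1000])
replaced_share = top_share(replaced, weights, threshold)

# Reweighting: assume the top unit should represent twice as many people.
new_weights = reweight_top(incomes, weights, threshold, factor=2.0)
reweighted_share = top_share(incomes, new_weights, threshold)

print(base_share, replaced_share, reweighted_share)
```

Both corrections raise the measured top share here, but by different amounts, which is a small-scale version of the problem of not knowing which method gets closer to the true distribution.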

Ideally, to help solve to a certain extent the issue of missing information on top incomes based on surveys, data made available to researchers would include some sort of code that permits linking an (anonymized) individual in the household survey with the same (anonymized) individual in tax records. With linked data, for example, we would be able to replace some individuals’ underreported income in surveys with their higher and presumably more accurate incomes recorded in their tax returns.

That would be my message to data producers: there should be agreements between the agencies in charge of producing the surveys and the agencies that have the tax data that allow this linking to take place so that scholars can better measure inequality and its trends, and policymakers can assess the impact of tax reforms more comprehensively. The ability to use linked data would make a huge difference.
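A rough sketch of what linked data would make possible, written in Python: the person identifiers and amounts are hypothetical, and the rule of using the tax value only when it is higher than the survey value is an assumption chosen to mirror the description above, not a documented procedure.

```python
# Hypothetical anonymized records keyed by a shared linking code.
survey = {"a": 30_000, "b": 55_000, "c": 120_000, "d": 400_000}
tax = {"b": 58_000, "c": 150_000, "d": 900_000}   # "a" has no tax record


def correct_with_linked(survey_income, tax_income):
    """Replace a survey income with the linked tax income when the latter is higher."""
    corrected = {}
    for person_id, reported in survey_income.items():
        from_tax = tax_income.get(person_id)
        if from_tax is not None and from_tax > reported:
            corrected[person_id] = from_tax       # presumed underreporting in the survey
        else:
            corrected[person_id] = reported       # no link, or survey value not lower
    return corrected


corrected = correct_with_linked(survey, tax)
print(corrected)
```

Without the linking code, there is no way to know which survey respondent corresponds to which tax record, so a person-by-person replacement of this kind is not possible.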

Does the U.S. make that type of linked data available?

For tax and survey data, not as a general practice.6 In the U.S. you do find some types of linked data. For example, Meyer and Mittag (2019) used survey data linked with benefit data and found that, if you don’t correct for what the administrative data — data created by government agencies to track beneficiaries — show, your conclusions on poverty in the U.S. based on the survey data alone may be quite biased.

Uruguay should be mentioned here as an exemplary case. The authorities there have made available to academics linked data for a subsample of their household survey and tax returns. This information is pure gold for researchers because one is able to directly observe what the same individuals report to the survey and on their tax returns. With this information, my coauthor Andrea Vigorito and I have been able to show in a forthcoming paper what we all suspect: that individuals in the upper half of the income distribution tend to report less labor income in household surveys than those same individuals earn according to tax returns, and underreporting is increasing in income.7

Since you don’t have the linked data (which would allow you to replace survey incomes with the presumed more accurate ones recorded in, for instance, tax returns), you have to find alternative ways in which you can correct the surveys to include top incomes that are closer to actual ones. As I mentioned already, there are a number of approaches and methods that are described in my paper. Because we do not know what the true distribution is, one big problem we have is: How do we choose among those methods? And that’s where we are. How do we know that one method is more appropriate than another?

And that question remains unresolved?

Yes. As Vigorito and I show in our forthcoming paper that uses linked data for Uruguay, if we take as the true distribution the corrected data using incomes from tax records for individuals who underreport to the survey, both replacing and reweighting can lead to overestimation of true inequality. Furthermore, one correction method overcorrects the bias for some indicators while other methods do it for other indicators (e.g., Gini, top 10%, top 1%, etc.).

In other words, there is no single method that approximates the true distribution better than all the others. As a result, studies that correct survey data using a single correction method to assess the level and trend of inequality — as well as studies that make no correction and rely only on incomes reported in surveys — should be interpreted with caution. In fact, given the importance of the subject, The Journal of Economic Inequality is soliciting submissions for a special issue aimed at addressing issues at the “upper tail” of the distribution.8

What if there were some way to get the completely correct information? What would researchers do with that information if they had it?

We would measure the extent of inequality and its evolution accurately throughout the world. This would allow us to analyze the causes of inequality more comprehensively: for example, how some regulations distort markets and result in extraordinary income and wealth accruing to some groups in society. We would be able to assess the impact on inequality of having excessive concentration of power in certain groups. We would also be able to better evaluate policy options to address high levels of inequality. Governments would be able to assess the impact of progressive tax systems on inequality and poverty more accurately, and we would be able to assess and tackle tax evasion more forcefully.

Related Research: The “Missing Rich” in Household Surveys: Causes and Correction Approaches

Footnotes

1 My impression is that the fall of the Berlin Wall — the collapse of communism — was also a factor in turning the concern with inequality into a valid one in almost all quarters of our profession in the West. Why? Because being concerned with inequality was no longer viewed as being sympathetic to socialism or, worse, “siding with the enemy.”

2 A third reason that top incomes may not appear in surveys most of the time is that the very rich are a low-probability event (like 7-foot-tall individuals). By definition, samples will more often than not exclude extreme values because the latter are rare. Nonresponse and underreporting, however, while not uncommon, are not considered “normal” and there is great effort invested by data producers and scholars in minimizing and correcting for them.

3 For a survey of causes and correction methods see Lustig (2019). For evidence that underreporting is correlated with income, see Higgins, Lustig, and Vigorito (2018).

4 Inspired by the pioneering work for the United States by Simon Kuznets (1953) and by Tony Atkinson and Alan Harrison (1978), this approach has been pursued by Piketty (2001) to study the long-run distribution of top incomes in France, by Piketty and Saez (2003) for the United States, and in a series of other country studies collected in the two volumes on top incomes edited by Atkinson and Piketty (2007, 2010).

5 To address some of these shortcomings, the DINA (Distributional National Accounts) project, led by Thomas Piketty at the Paris School of Economics and Emmanuel Saez at the University of California, Berkeley, combines tax data with other information sources such as income and wealth surveys (including those made available through LIS) and National Accounts. Another approach combines tax data with household surveys to incorporate incomes from the informal sector.

6 One notable exception is the subsample for Uruguay used by Higgins, Lustig, and Vigorito, op. cit.

7 See Lustig and Vigorito (forthcoming).

8 “Finding the Upper Tail: Empirical Strategies and Methods for the Top of the Distribution”, edited by Frank Cowell, Nora Lustig and Daniel Waldenström, is aimed at publishing research that specifically addresses these and related questions. Submissions must be made online at https://www.springer.com/journal/10888/updates/18049752. Deadline for submissions is December 15, 2020.

References

Atkinson, A. B. and A. J. Harrison (1978), Distribution of Personal Wealth in Britain, Cambridge, UK: Cambridge University Press.

Atkinson, A. B. and T. Piketty (2007), Top Incomes in the Twentieth Century, Oxford: Oxford University Press.

Atkinson, A. B. and T. Piketty (2010), Top Incomes. A Global Perspective, Oxford: Oxford University Press.

Higgins, S., N. Lustig and A. Vigorito (2018), “The Rich Underreport Their Income: Assessing Biases In Inequality Estimates And Correction Methods Using Linked Survey And Tax Data.” CEQ Working Paper 70, CEQ Institute, Tulane University, September.

Kuznets, S. (1953), Economic Change, New York: Norton.

Lustig, N. (2019), “The ‘Missing Rich’ in Household Surveys: Causes and Correction Approaches.” CEQ Working Paper 75, CEQ Institute, Tulane University, November.

Lustig, N. and A. Vigorito (forthcoming), “The Rich Underreport their Income: Assessing Bias in Inequality Estimates and Correction Methods using Linked Survey and Tax Data.” Chapter in Commitment to Equity Handbook: Estimating the Impact of Fiscal Policy on Inequality and Poverty, edited by Nora Lustig (Brookings Institution Press and CEQ Institute, Tulane University), second edition.

Meyer, B. and N. Mittag (2019), “Combining Administrative And Survey Data To Improve Income Measurement.” NBER, Working Paper 25738, April.

Piketty, T. (2001), Les Hauts revenus en France au 20e siècle: inégalités et redistribution, 1901-1998, Paris: Ed. Grasset.

Piketty, T. and E. Saez (2003), “Income Inequality in the United States 1913-1998,” Quarterly Journal of Economics 118(1), pp. 1-39.