A New Threat to Economic Data
A Commerce Department policy risks damaging the public’s understanding of the economy.
If you are even just a casual user of the economic data produced by the government, or if you simply care about the integrity of the data, you should be distressed by a new policy announced by the Commerce Department last week.
The policy bans the use of something called noise infusion, a necessary tool used by statistical agencies, in particular the Census Bureau, to publish granular data about individuals and businesses without disclosing their identities, thereby protecting their privacy.
The likely reason for the policy change is that the Trump administration wants to improve the precision of certain data series. But the likely result of the policy will be that the government publishes less data, weakening the public’s ability to understand what is happening in their local communities and the American economy.
This consequence would be particularly damaging because individuals, businesses, and policymakers rely on these data to make decisions. Workers use these data to decide whether or where to move, which skills to acquire, and which jobs to seek. Businesses use these data to decide whether and what type of investments to make. And policymakers use these data to design policies and evaluate their effects on constituents. Without them, they are all flying blind.
I agree with the goal of making data more precise. This just isn’t the way to do it.
Disclosure Avoidance Primer
Disclosure avoidance refers to the set of methods used by the federal stats agencies to protect the privacy of individuals and businesses that contribute to statistical estimates. The agencies are required by law to protect the data of respondents to their surveys.1 In practice, this means they must ensure that it is impossible to back out the information of a specific individual or business from any statistical product.
An example is the County Business Patterns (CBP) data, which provides statistics on business activity by industry and geography. Consider the following three scenarios:
There is only one brewery in a small county. If the CBP published the exact count of brewery employees in that county, it would be disclosing the information of one business (how many workers it employs), a clear violation of the law.2
There are two breweries in a small county, and the CBP again publishes the exact count of brewery employees. If I own one of those breweries, I could learn how many employees my competitor has, again violating the law.
There are more than two breweries in a small county, but the CBP chooses not to publish the total number of brewery employees out of concern that it might compromise the privacy of the businesses. If I’m a prospective brewery owner, I may deem the project too risky to pursue without information about the market I’m entering.
These hypotheticals illustrate a core tradeoff of disclosure avoidance: Granular data is often more useful than aggregate data, but it also risks disclosing private information.
The stats agencies have several tools to manage this tradeoff, including coarse aggregations (releasing less data), cell suppression (removing certain statistics from the data that is released), and noise infusion (fuzzing the data).
It is the third option, noise infusion, that is being banned by the Commerce Department. When I say that noise infusion means “fuzzing the data,” I’m not using a technical term. What I mean is that noise infusion is a method of slightly altering granular data in such a way that the privacy and identity of individual people and businesses are protected in the statistics derived from them. The data are not so much altered, however, that they become unhelpful.
Another example helps to explain how it works. The CBP uses a type of noise infusion known as multiplicative noise, in which the employment of each business in a county is multiplied by a random number, increasing or decreasing it by a small amount. Go back to our hypothetical county, and this time there are 20 total breweries. Noise infusion allows the CBP to publish the total number of brewery workers in that county because it is not based on each brewery’s exact employment. But the data on total brewery employment in the county will be close enough to its actual value for it to be useful to anyone who needs this information to make decisions.
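To make the mechanics concrete, here is a minimal Python sketch of multiplicative noise. It is not the Census Bureau's actual procedure: the function name `infuse_noise`, the 5 to 15 percent distortion range, and the brewery sizes are all assumptions chosen for illustration.

```python
import random


def infuse_noise(employment, rng, low=0.05, high=0.15):
    """Multiply each business's employment by a random factor near 1.

    Each factor is 1 + d or 1 - d, with d drawn uniformly from [low, high].
    Both bounds are hypothetical values picked for this sketch.
    """
    noisy = []
    for value in employment:
        distortion = rng.uniform(low, high)
        sign = rng.choice([-1, 1])
        noisy.append(value * (1 + sign * distortion))
    return noisy


rng = random.Random(0)

# Hypothetical county with 20 breweries of varying size.
breweries = [rng.randint(5, 60) for _ in range(20)]
noisy = infuse_noise(breweries, rng)

true_total = sum(breweries)            # never published
published_total = round(sum(noisy))    # what a data user would see

print("true total:     ", true_total)
print("published total:", published_total)
```

Because every brewery's figure moves by no more than 15 percent in this sketch, the published total can never be off by more than 15 percent, and it typically lands much closer because some breweries are nudged up and others down.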
For any statistic that covers a large number of people or businesses, the noise largely cancels out and the statistic ends up very close to the actual value. For a specific industry in a small county with only five or ten businesses, the published and actual values may differ. But the differences tend to be small, and even in those cases it is better to have some data than none. In the absence of noise infusion, for example, the stats agencies might be forced to give the count of brewery employees only at the state level. This information is far less useful to local policymakers and businesses, or to a prospective owner looking to build a new brewery in a specific county with unique business conditions that may differ from other parts of the state.
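The cancellation described above can be checked with a small simulation. This is a sketch under assumed parameters (a 5 to 15 percent distortion per business, employment between 5 and 60, 200 simulated areas per size), none of which come from an actual Census product.

```python
import random


def noisy_total_error(n_businesses, rng):
    """Relative error of the noisy total for one hypothetical area."""
    true_total = 0.0
    noisy_total = 0.0
    for _ in range(n_businesses):
        size = rng.randint(5, 60)               # hypothetical employment
        distortion = rng.uniform(0.05, 0.15)    # assumed noise range
        factor = 1 + rng.choice([-1, 1]) * distortion
        true_total += size
        noisy_total += size * factor
    return abs(noisy_total - true_total) / true_total


def average_error(n_businesses, trials=200, seed=1):
    """Average relative error across many simulated areas of the same size."""
    rng = random.Random(seed)
    errors = [noisy_total_error(n_businesses, rng) for _ in range(trials)]
    return sum(errors) / trials


small_county = average_error(5)
large_area = average_error(2000)

print(f"5 businesses:    {small_county:.2%} average error")
print(f"2000 businesses: {large_area:.2%} average error")
```

The individual distortions are the same size in both cases. What changes is how many of them offset each other in the sum, which is why the error on the larger total shrinks.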
The practice of noise infusion became contentious around the 2020 decennial census because of the Census Bureau’s introduction of a new disclosure avoidance method called differential privacy, which is another type of noise infusion.
Differential privacy is similar to multiplicative noise infusion in that it adds noise to protect the privacy of people and businesses. But differential privacy is fundamentally different in that it gives the Census Bureau a better understanding (through complicated math) of how changes in the amount of noise added will affect the risk of disclosing too much private information.
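For readers who want a feel for that math, here is a sketch of the textbook Laplace mechanism applied to a simple count. It is an illustration of the general idea, not the algorithm the Census Bureau used in production. The parameter `epsilon` is the standard privacy-loss parameter (smaller means more privacy and more noise), and the count of 120 and the helper names are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Textbook Laplace mechanism for a count.

    Adding or removing one respondent changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon, trials=2000, seed=2):
    """Average distance between the noisy and true count."""
    rng = random.Random(seed)
    true_count = 120  # hypothetical count
    errors = [
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return sum(errors) / trials


strong_privacy = mean_abs_error(epsilon=0.1)
weak_privacy = mean_abs_error(epsilon=1.0)

print(f"epsilon=0.1: average error {strong_privacy:.1f}")
print(f"epsilon=1.0: average error {weak_privacy:.1f}")
```

The point of the sketch is the explicit dial: the expected error equals `sensitivity / epsilon`, so the agency can state in advance how much accuracy it gives up for a given privacy guarantee.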
The worry on the part of data users was that the Census Bureau would use — or misuse, in their view — differential privacy to err too far in the direction of protecting privacy, thus compromising the precision of the data.
Importantly, as I’ve already explained, differential privacy is not the only type of noise infusion used by the stats agencies. The County Business Patterns, Business Dynamics Statistics, and Quarterly Workforce Indicators have been using the simpler multiplicative noise infusion for decades. But the new policy from the Commerce Department would ban multiplicative noise infusion in addition to differential privacy.
Why I’m Worried
Noise infusion is a vital tool in an agency’s toolkit to release more data than it otherwise would. An excerpt from Abowd, Stephens, and Vilhuber (2005) makes this point nicely for the Quarterly Workforce Indicators (QWI) data:
“Because of the fine detail offered by the published statistics and the confidential nature of the micro-data used to compile the statistics, confidentiality protection is a critical and integral part of the design of the QWI system. Only the application of state-of-the-art protection methods allows the Census Bureau to publish these statistics.”
Noise infusion makes possible the granular tabulations of QWI data, which are used to inform the public about employment, hires, separations, and earnings by industry, geography, and worker characteristics including age, sex, race, and education.
If the Commerce Department policy stands and noise infusion is banned, which disclosure avoidance tools could the agencies use?
The policy states: “Coarsening shall be the preferred category of Disclosure Avoidance methods for all statistical products.”
And: “Suppression shall be permitted as a last resort.”
As a reminder, coarsening means releasing less granular data, and suppression means no longer publishing certain statistics altogether.
And that is why you should be worried. Either the agencies decide it’s actually okay to publish data for the county with only a handful of breweries, or, more likely, they decide that there will simply be no county-level brewery data at all. The data might get rolled up to the state level, which is far less detailed and helpful.
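The two permitted options can be sketched in a few lines of Python. Everything here is hypothetical: the counties, the employment figures, and the rule that a cell needs at least three businesses to be published are assumptions for illustration, not the agencies' actual thresholds.

```python
from collections import defaultdict

# Hypothetical cells: (state, county, number of breweries, brewery employment)
CELLS = [
    ("State A", "County 1", 1, 14),
    ("State A", "County 2", 2, 37),
    ("State A", "County 3", 20, 410),
    ("State B", "County 4", 6, 95),
]


def suppress(cells, min_firms=3):
    """Publish county cells, replacing risky ones with None (suppressed)."""
    table = {}
    for state, county, firms, employment in cells:
        table[(state, county)] = employment if firms >= min_firms else None
    return table


def coarsen(cells):
    """Roll county cells up to state totals, dropping county detail."""
    totals = defaultdict(int)
    for state, _county, _firms, employment in cells:
        totals[state] += employment
    return dict(totals)


county_table = suppress(CELLS)
state_table = coarsen(CELLS)

print(county_table)
print(state_table)
```

In this toy table, suppression blanks out two of the four counties and coarsening removes county detail entirely. A real system would also need to check that a suppressed cell cannot be recovered by subtracting published cells from a published total, which usually means suppressing even more.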
Now what?
Remember again the fundamental tradeoff.
On the one hand, the statistical agencies must adhere to strict legal requirements that ensure the privacy of individuals and businesses. Those privacy guarantees, importantly, encourage survey responses by giving respondents confidence that their data won’t be made public.
On the other hand, the statistical agencies are charged with producing quality data about the nation’s people and economy. Excessive focus on disclosure risks can degrade the quality of data, reducing its value to the individuals, businesses, and policymakers that rely on it.
Outside of clear-cut cases like our small county with just one or two breweries, the statistical agencies exercise significant discretion in terms of how strict the disclosure-avoidance protocols should be. In the brewery example, how many breweries does a county need for the data to be considered safe? If one brewery were much bigger than the others, could the smaller ones back out the data of the bigger one?
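The dominance question can be made concrete. The sketch below uses the p-percent rule, one standard test from the statistical disclosure literature: a cell is flagged if the second-largest contributor, by subtracting its own value from the total, could estimate the largest contributor to within p percent. Whether any agency applies this rule to a given product, and with what value of p, is not something the policy or this article specifies, so the 10 percent default and the employment figures are assumptions.

```python
def rival_upper_bound(contributions):
    """Upper bound the second-largest firm can place on the largest one.

    It subtracts its own value from the published total.
    Assumes at least two contributors.
    """
    ordered = sorted(contributions, reverse=True)
    return sum(ordered) - ordered[1]


def p_percent_sensitive(contributions, p=10.0):
    """Flag a cell under the p-percent rule (p is a hypothetical choice).

    The rival's estimate overshoots the largest firm by the combined size of
    all the other firms. If that overshoot is under p percent of the largest
    firm, the estimate is considered too accurate to publish.
    """
    ordered = sorted(contributions, reverse=True)
    largest = ordered[0]
    remainder = sum(ordered[2:])
    return remainder < (p / 100) * largest


dominated = [400, 12, 9, 8, 6]     # one brewery dwarfs the rest
balanced = [60, 55, 50, 45, 40]    # five breweries of similar size

print(rival_upper_bound(dominated), p_percent_sensitive(dominated))
print(rival_upper_bound(balanced), p_percent_sensitive(balanced))
```

Both hypothetical counties have five breweries, yet only the balanced one passes. A simple count threshold would treat them identically, which is part of why the agencies end up exercising judgment.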
This ambiguity leaves a lot of room for choice. The agencies can choose to publish as much data as possible given legal constraints, or they can avoid the risk and design rules that reduce the amount of data released. The agencies often choose the latter.
The 2022 proposal to change the Current Population Survey public-use microdata files is an instructive example. In early 2022, the Census Bureau announced a plan to more than double the population threshold for suppression and to round wages. By December of 2022, the Bureau had walked back those changes after pushback from data users, finding alternative approaches that were less damaging to the usability of the data.
Given this tendency towards limiting data release, which I myself observed over years working within the statistical system,3 it is entirely reasonable for policymakers to rebalance the scales and place greater emphasis on producing more and higher-quality data.
But if the administration’s goal is to pursue that rebalancing, banning a specific disclosure avoidance method is not the way to do it.
Without changing the laws governing privacy protections at the statistical agencies, or promoting a culture and leadership more focused on the quality and volume of data they produce, removing noise infusion as a disclosure avoidance technology will lead to less data overall.
Worse yet, the prospect of less data comes at exactly the wrong moment. There is growing bipartisan agreement that we need significant investments in our statistical system to improve our measurement of AI’s impact on the economy.
I hope to be proven wrong, and that the rollout turns out to be more sensible than I’ve imagined it here. But this new policy has me worried.
1. The relevant codes are Title 13 and Title 26.
2. See Title 13 U.S. Code § 9, which prohibits making “any publication whereby the data furnished by any particular establishment or individual under this title can be identified”.
3. I spent nearly 17 years at the Census Bureau, having joined in 2008 as a business analyst, supporting the processing of economic surveys. After completing graduate school I transitioned to Economist in the Census Bureau’s Center for Economic Studies. In 2022 I became Principal Economist, using Census microdata for research and creating new public-use data products. I left the Census Bureau in 2025 to become Research Director at the Economic Innovation Group.



Is this a new CFR? Is there a comment period? Any suggestions for action?
As you have described, this is concerning. For my research, I am trying to tease out the best course of action for myself, as a 55-year-old woman who has spent most of the last 20 years caring for children and the elderly, while home educating and working as a delivery driver. I have a BA that I have never used. I am trying to determine the fastest return on both time and money: should I earn another bachelor's degree, seek a license or certification, in what field, or...? I am not 20, with time on my side to pay back a loan. I need to drastically, quickly, improve my financial standing. I need good information to make these decisions.