US statistical agencies deserve support and funding. They also need to reform.
A Q&A with John Haltiwanger
On July 29th, the Economic Innovation Group sent a letter to Congress on behalf of 87 economists, calling on policymakers to support and properly fund the US statistical agencies. The bipartisan group of signatories included big names such as Paul Romer and David Autor, as well as former heads of the agencies themselves, including Steven Landefeld and Erica Groshen.
“Our statistical agencies are outstanding — steadfast and prolific producers of the most consumed and scrutinized economic indicators in the world,” we wrote. “The economy is changing rapidly, however. Without focused and funded efforts to modernize how these essential statistics are collected and produced, the quality and quantity of the system’s output are at risk.”
But just three days after we sent the letter, President Trump shocked the economic statistics community by firing Erika McEntarfer, the commissioner of the Bureau of Labor Statistics, in response to the bad news in the July employment situation report. Rather than supporting and safeguarding the integrity and independence of the agencies, in other words, the administration attacked one of them.
The subsequent controversy has distracted from a separate but equally important conversation: how to reform the statistical agencies for the new era of data-gathering in which they now find themselves.
They face unprecedented strain, a point we also emphasized in the letter: “Most agencies have had flat or declining budgets in real terms for more than a decade. More recent workforce reductions have brought headcounts to historic lows.”
As a follow-up to the letter, we got in touch with John Haltiwanger, Distinguished University Professor in the Department of Economics at the University of Maryland. Few economists have a deeper understanding of how federal statistics are made or their importance for the functioning of the economy.
Over email, we discussed with John the specific challenges now confronting the statistical agencies and what they should do to modernize and improve the collection and processing of federal statistics. The exchange has been lightly edited for clarity and brevity.
[EIG] We're seeing more and more reporting about the statistical agencies being in crisis. Is this a fair assessment or an exaggeration, and how would you disentangle the short- and long-term challenges facing the agencies?
[John Haltiwanger] We are in a period in which the short-term and long-term challenges facing the statistical agencies are increasingly converging. The U.S. federal statistical system continues to produce timely and high-quality economic indicators, but it is built on a foundation that was largely conceptualized and developed in the mid-20th century.
This system remains heavily survey-centric, relying on a mix of high- and low-frequency surveys of households and businesses.
With the digitization of nearly all aspects of modern life, administrative records and private-sector digital data are now more accessible than ever. Federal statistical agencies recognize this opportunity, but given limited investment and resources the integration of these data sources has been slow and fragmented.
At the same time, the current system is under increasing strain and requires fundamental re-engineering. Survey response rates have been steadily declining, a trend that accelerated during the pandemic. Reaching households by phone has become more difficult, and many businesses find that survey instruments — even online forms — do not align with their internal information systems. In a telling development, the U.K.'s Office for National Statistics (ONS) recently and temporarily suspended the publication of unemployment estimates based on household surveys due to critically low response rates.
Beyond these operational challenges, the accelerating pace of economic change increasingly strains the capacity of current statistical systems to measure innovation and productivity growth effectively. Accounting for the effects of product turnover and quality change in estimates of inflation, real output, and productivity has become increasingly difficult under the prevailing survey-based framework.
Federal statistical agencies are acutely aware of these challenges and the need to modernize. However, the imperative to maintain the flow of official statistics, combined with limited investment and resources, has permitted only incremental steps toward modernization.
The time has come to invest in a 21st-century statistical system that fully harnesses the potential of the digital economy. Such a system would deliver more accurate, timely, and detailed data while reducing the reporting burden on households and businesses. During the transition, it will be essential to maintain continuity and comparability; for example, legacy and modern systems will need to operate in parallel for a period of time. In addition, a modern system will likely blend survey, administrative, and private-sector data. Investing now is critical to building a future-ready infrastructure for economic measurement.
You mention the importance of alternative data sources and methods as a way to address both declining response rates and the need to modernize the federal statistical system. What are some of the most promising innovations you’ve seen?
There are significant opportunities to leverage both private-sector and administrative data across multiple dimensions.
One of the most promising is the use of item-level transactions data on prices, quantities, and product attributes to enhance key economic indicators, such as the Consumer Price Index (CPI) and the Personal Consumption Expenditures (PCE) statistics on prices and consumption. Currently, these indicators are derived from fragmented data sources: the Bureau of Labor Statistics (BLS) collects price data, while the consumption expenditure measures used in the PCE rely on surveys of retail trade activity conducted by the Census Bureau.
The Bureau of Economic Analysis (BEA) faces considerable challenges in integrating these disparate sources to produce the PCE statistics — a critical component of GDP measurement.
With barcode-level tracking of retail activity now widespread, it has become feasible to collect and process internally consistent, high-frequency data on prices, quantities, and product attributes. Moving beyond the survey-centric methods of the 20th century, the statistical system must now evolve to harness these digitized data sources.
This modernized approach offers the potential to account for product turnover and quality change at scale in measuring inflation and real economic activity, and it could substantially improve how innovation and growth in the economy are tracked.
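To make the measurement idea concrete, here is a minimal sketch of a matched-model Törnqvist price index computed from barcode-level transactions, in the spirit of the approach John describes. The data layout and function are illustrative only; they simplify away the sampling, weighting, and quality-adjustment machinery an agency would actually need.

```python
from math import log, exp

def tornqvist_index(period0, period1):
    """Matched-model Tornqvist price index between two periods.

    period0, period1: dicts mapping barcode -> (price, quantity).
    Only products sold in both periods enter the index; entering and
    exiting products are dropped here, which is precisely where the
    harder quality-adjustment work on product turnover begins.
    """
    common = period0.keys() & period1.keys()
    if not common:
        raise ValueError("no matched products between periods")

    # Expenditure shares within the matched set, for each period.
    spend0 = {b: period0[b][0] * period0[b][1] for b in common}
    spend1 = {b: period1[b][0] * period1[b][1] for b in common}
    total0, total1 = sum(spend0.values()), sum(spend1.values())

    # Tornqvist: geometric mean of price relatives, weighted by the
    # average of the two periods' expenditure shares.
    log_index = 0.0
    for b in common:
        share = 0.5 * (spend0[b] / total0 + spend1[b] / total1)
        log_index += share * log(period1[b][0] / period0[b][0])
    return exp(log_index)

# Toy month-over-month barcode data: (price, quantity) per product.
jan = {"0001": (2.00, 100), "0002": (5.00, 40), "0003": (1.50, 200)}
feb = {"0001": (2.10, 90), "0002": (5.25, 45), "0004": (1.40, 180)}  # 0003 exits, 0004 enters
print(f"Matched-model index: {tornqvist_index(jan, feb):.4f}")  # ~1.05
```

Because prices and quantities arrive together in such data, the expenditure weights are internally consistent by construction, which is exactly what the fragmented survey-based pipeline struggles to achieve.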
Private sector data sounds like it holds a lot of promise. Are there unique challenges to using that type of data in economic indicators? To what extent can we substitute private data for survey data?
It is essential to recognize that the expanded use of administrative and private-sector data can only succeed if federal statistical agencies conduct the collection and processing. Only these agencies have developed and maintained the comprehensive household, job, and business frames needed to ensure that statistics are representative of the full population. These frames are critical for benchmarking.
Moreover, federal statistical agencies have both the mandate and the incentive to produce indicators that are consistent and comparable over long horizons, enabling analysts and policymakers to track trends across very different economic conditions.
By contrast, private-sector data providers typically focus on the immediate present, producing statistics optimized for short-term relevance rather than historical continuity. Closely related, though distinct, is the principle that the independence and scientific integrity of the federal statistical agencies are essential for producing reliable and accurate economic statistics.
From this perspective, proposals to use digitized private-sector data should be seen as a way for statistical agencies to harvest information more directly than under the current survey-centric model. Businesses already rely on digital platforms that track core metrics such as sales, prices, employment, and expenditures. Rather than requesting data through survey forms, agencies could develop APIs or other tools to extract this information directly from widely used platforms.
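As a purely hypothetical sketch of what such direct extraction could look like, the example below defines a standardized record that a commerce or payroll platform might transmit to an agency collection endpoint, then rolls it up to an industry aggregate. Every type, field, and identifier here is invented for illustration; no such agency API currently exists.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class BusinessReport:
    """Hypothetical standardized record a platform could transmit."""
    ein: str          # employer identification number, for matching to the business frame
    period_end: date  # reference period covered by the report
    naics: str        # industry classification code
    sales: float      # gross sales over the period
    employment: int   # headcount at period end
    payroll: float    # total payroll over the period

def sales_by_industry(reports):
    """Roll platform microdata up to a publishable industry aggregate."""
    totals = {}
    for r in reports:
        totals[r.naics] = totals.get(r.naics, 0.0) + r.sales
    return totals

reports = [
    BusinessReport("12-3456789", date(2025, 6, 30), "445110", 1.2e6, 14, 2.4e5),
    BusinessReport("98-7654321", date(2025, 6, 30), "445110", 8.0e5, 9, 1.6e5),
]
print(sales_by_industry(reports))  # {'445110': 2000000.0}
```

The identifier field matters: matching platform records against the agencies' business frames is what makes the resulting statistics representative rather than merely convenient.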
A complementary approach is to leverage third-party data providers. Many firms already rely on payroll processors and data aggregators to manage and track employment, pricing, and sales. Instead of asking firms to report these data manually, statistical agencies could obtain them directly from such intermediaries with the firms’ consent. Some progress has already been made in this direction, but agencies remain constrained by limited investment and resources.
Several issues must be addressed before moving further. First, the strict confidentiality and privacy protections that govern survey data must also apply to alternative sources. Second, agencies must ensure coverage, maintain transparency, and preserve continuity of measurement.
These challenges are substantial, but traditional surveys face equally serious difficulties, most notably declining response rates. For this reason, adopting alternative data sources should not be seen as optional but as a necessary adaptation to the evolving data landscape.
Surveys, however, will remain indispensable, particularly for capturing contextual information that administrative and private-sector data cannot. For instance, while such sources can identify who is employed and who is not, they cannot explain why someone is not working. Measuring core economic concepts such as unemployment (not employed but actively seeking work) and the labor force (employed plus unemployed) requires survey-based information. Surveys must nonetheless evolve to remain effective. The Census Bureau's Business Trends and Outlook Survey offers a model of a more nimble, responsive approach to survey design.
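To make the definitional point concrete, here is a minimal sketch of why the unemployment rate cannot be computed from employment records alone; the respondent records are invented.

```python
# Administrative or platform data can show who holds a job, but only a
# survey reveals whether a non-employed person is actively seeking work
# (unemployed) or out of the labor force entirely.
respondents = [
    {"employed": True, "seeking_work": False},
    {"employed": False, "seeking_work": True},   # unemployed
    {"employed": False, "seeking_work": False},  # not in the labor force
    {"employed": True, "seeking_work": False},
    {"employed": False, "seeking_work": True},   # unemployed
]

employed = sum(r["employed"] for r in respondents)
unemployed = sum(not r["employed"] and r["seeking_work"] for r in respondents)
labor_force = employed + unemployed  # excludes those not seeking work

print(f"Unemployment rate: {unemployed / labor_force:.1%}")  # 2 / 4 = 50.0%
```

Without the seeking_work field, which only a survey can supply, the two non-employed groups are indistinguishable and the rate cannot be computed.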
As you noted, the digitization of the economy has greatly increased the supply of data. At the same time, statistical agencies face growing demand for more timely and granular information. How have the statistical agencies dealt with this in the past and what more should they be doing to meet rising demand for data?
Over the past few decades, statistical agencies have made substantial progress in utilizing the vast array of available administrative data. These data have long been used to construct survey frames for both businesses and households.
In addition, established programs such as the Bureau of Labor Statistics’ (BLS) Quarterly Census of Employment and Wages (QCEW) and the Census Bureau’s County Business Patterns (CBP) are rooted in administrative records and provide the kind of granular information that data users increasingly demand.
Building on these foundations, statistical agencies have developed more detailed longitudinal data products — namely the BLS’s Business Employment Dynamics (BED) and the Census Bureau’s Business Dynamics Statistics (BDS). These relatively recent data products have yielded novel insights into business dynamism, particularly the role of startups and young firms in driving U.S. economic growth.
These administrative data products are rich and highly granular, offering detailed coverage by industry and geographic location. However, their key limitation is timeliness. The current inflow and processing methods used by statistical agencies take a substantial amount of time, despite the fact that businesses often submit tax forms and other filings promptly.
Modernizing these procedures could substantially improve the timeliness of administrative data products. Faster processing of administrative data would also enable smaller revisions, since benchmark statistics would be available on a more frequent and timely basis.
Revisions are an inherent feature of real-time economic indicators, such as monthly statistics, because the initial data are incomplete and require time to be refined as more comprehensive information becomes available. In a highly dynamic economy, households, businesses, and policymakers demand timely, granular statistics.
Early estimates, though noisy, still provide valuable insights, especially when decision-makers understand their limitations. Reducing both the noise in early estimates and the magnitude of subsequent revisions is therefore highly beneficial.
There are already examples of timely administrative data use, such as the Department of Labor’s weekly Unemployment Insurance claims and the Census Bureau’s Business Formation Statistics, both of which rely on prompt filings.
To expand this capability, agencies must address technical and institutional challenges and explore ways to blend data from multiple sources to enable more complete real-time releases of economic indicators. This could include integrating the digitized data sources discussed above, enhancing the value and responsiveness of official statistics.
This is great. You’ve given us a lot to think about. To wrap things up, I want to finish with a topic constantly making headlines — Artificial Intelligence. What impacts do you think AI could have on the statistical agencies? What do you think they should do to prepare for or take advantage of AI?
Artificial intelligence (AI) appears to be the latest general-purpose technology, transforming business operations, the organization of work, and interactions with the digital world. Its adoption has expanded rapidly in a remarkably short period, yet, consistent with previous general-purpose technologies, there remains considerable uncertainty about its most effective applications.
This uncertainty arises from several factors. First, despite significant advances, AI remains a technology in active development. Second, historical experience with computing and information technology in the 1980s and 1990s shows that substantial productivity and economic growth gains often materialize only after years of experimentation and complementary investments in intangible capital, such as changes in organizational structures and business processes. These dynamics are likely to characterize the trajectory of AI as well.
Monitoring these developments is of critical importance for statistical agencies. As with the earlier information and communication technology revolution, the advent of AI is likely to influence both the broader economy and the functioning of statistical systems. However, the scale and timing of AI’s economic impact, and its implications for statistical operations, remain difficult to predict.
AI is already being incorporated into statistical agency activities. For instance, machine learning is employed to classify new businesses and products into appropriate industry and product categories. In the re-engineering of key economic indicators, such as measures of inflation and real growth, machine learning can improve the capacity to account for product quality changes and turnover at scale. AI and machine learning could greatly expand this capability, particularly when combined with item-level transactions data containing real-time information on prices, quantities, and product attributes.
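As a stylized example of the classification task John mentions, the sketch below trains a simple text classifier to assign free-text business descriptions to industry codes using scikit-learn. The training data are toy stand-ins and the pipeline is far simpler than anything an agency would deploy; it only illustrates the shape of the problem.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: self-described business activities paired with
# industry codes (NAICS-style labels used purely for illustration).
descriptions = [
    "retail grocery store selling fresh produce and packaged foods",
    "supermarket with bakery and deli counters",
    "software development and cloud consulting services",
    "custom web application programming for enterprise clients",
    "full service restaurant serving dinner and drinks",
    "fast casual restaurant with takeout and delivery",
]
codes = ["445110", "445110", "541511", "541511", "722511", "722511"]

# Bag-of-words features feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(descriptions, codes)

# Classify a newly registered business from its reported activity.
print(model.predict(["neighborhood grocery with organic produce"])[0])
```

In production, the same idea can scale to millions of records arriving through business registration filings, reducing the manual coding burden that contributes to processing lags.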
Nonetheless, the ultimate impact of AI on statistical agency operations remains uncertain. As with private-sector adoption, realizing the full potential of AI is likely to require substantial complementary investments in intangible assets such as organizational and human capital. For statistical agencies in particular, the imperative of transparency necessitates that any application of AI be designed and implemented with this principle at the forefront.