
A new NBER working paper demonstrates that AI agents can compile multi-country economic databases for the cost of a standard LLM subscription, threatening the pricing power of incumbent data vendors. The next catalyst is whether asset managers or vendors publicly adopt similar methods.
A new National Bureau of Economic Research working paper demonstrates that AI agents can compile multi-country economic databases at a cost the authors compare to a few hours of research-assistant work. The methodology, called Deep Research in a Loop (DRIL), produced 129 distinct sources and 136 evidence records for a 2025 update of the Global Tax Expenditures Database covering eight Latin American and Caribbean countries. The run covered 22 qualitative fields fully and six quantitative estimate types with documented gaps, all for the price of a standard large-language-model subscription. The paper, by Santiago Afonso, Sebastian Galiani, Ramiro H. Gálvez, and Raul A. Sosa, is not an academic abstraction. It is a signal that the cost floor for creating usable economic data sets is dropping fast, and that shift has direct consequences for the market value of incumbent data vendors and the demand trajectory for AI infrastructure.
The DRIL architecture separates the design of a research instrument – what variables to collect, how to code them – from the implementation that AI agents carry out. The design stage sets the rules; the agents then retrieve, extract, and cite publicly available information, flagging gaps and uncertainty along the way. The result is a reproducible, auditable data set at a fraction of the cost of manual collection.
For companies that sell curated economic data, such as S&P Global, Moody’s Analytics, and FactSet, the paper demonstrates that a small team with no dedicated data-collection staff can replicate work that previously took months of analyst time. If the marginal cost of assembling a high-quality cross-country data set falls to essentially zero, the pricing power of franchises built on hard-to-replicate data curation erodes. This is a structural headwind that equity analysts have not yet priced into long-term revenue growth assumptions for traditional data vendors. The mechanism is not full automation; the paper is careful to note that design remains human-intensive. Even partial automation, however, reduces the labor component of data work sharply. The authors argue that this can shift the production function of empirical economics. For stock pickers, the question becomes which companies are most exposed to the commoditization of economic data, and which are positioned to capture the savings from adopting such methods.
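To make the design/implementation split described above concrete, here is a minimal sketch of what a DRIL-style loop could look like. The paper does not publish code; the schema, field names, and the run_deep_research helper below are hypothetical stand-ins for a real agent API, and the sketch uses a tiny subset of the paper's eight countries and 22 fields.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EvidenceRecord:
    country: str
    field_name: str
    value: Optional[str]        # None when no usable source was found
    source_url: Optional[str]   # citation the agent attaches to the value
    uncertain: bool             # flagged when the agent hedges its answer

# Design stage (human): a fixed instrument -- which countries, which fields,
# and the coding rules the agents must follow.
INSTRUMENT = {
    "countries": ["Argentina", "Brazil", "Chile"],
    "fields": ["tax_type", "legal_basis", "policy_objective"],
}

def run_deep_research(country: str, field_name: str) -> dict:
    # Placeholder for one agent run (e.g., a "deep research" call to an
    # LLM provider). A real implementation would search, read, and cite
    # public documents; this stub just returns a dummy payload.
    return {"value": None, "source": None, "uncertain": True}

def collect(instrument: dict) -> list[EvidenceRecord]:
    """Implementation stage (agents): retrieve, extract, cite, flag gaps."""
    records = []
    for country in instrument["countries"]:
        for field_name in instrument["fields"]:
            payload = run_deep_research(country, field_name)
            records.append(EvidenceRecord(
                country=country,
                field_name=field_name,
                value=payload["value"],
                source_url=payload["source"],
                uncertain=payload["uncertain"],
            ))
    return records

if __name__ == "__main__":
    records = collect(INSTRUMENT)
    gaps = [r for r in records if r.value is None]
    print(f"{len(records)} records, {len(gaps)} documented gaps")
```

The point of the pattern is auditability: every cell in the instrument either carries a cited value or an explicit gap flag, which is what makes the resulting data set reproducible.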
The flip side is that the paper makes a concrete case for enterprise adoption of AI agents. If a standard LLM subscription can replace dozens of research-assistant hours, the return on investment for firms that build internal agent-based data pipelines is immediate. That strengthens the demand case for cloud computing, LLM inference capacity, and the hardware layer underneath. The NBER paper does not name specific vendors, but every major cloud provider – and the companies that supply them with GPUs and networking gear – stands to gain if large-scale data construction moves from manual teams to automated loops. The run described in the paper is small, covering only eight countries. Scaling up would require more compute, not more researchers. That is a signal that current AI capital-expenditure forecasts may still be too low if adoption in research and finance accelerates. For traders tracking the AI buildout, the paper provides a tangible example of how agent-based workflows can convert a labor-intensive process into a compute-intensive one, reinforcing the demand narrative for NVIDIA and other hardware suppliers.
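The ROI claim is simple arithmetic. The sketch below runs the comparison with illustrative placeholder numbers; none of the figures come from the paper.

```python
# Back-of-envelope cost comparison: manual collection vs. an agent loop.
# All figures are illustrative assumptions, not numbers from the paper.

ra_hours = 200            # assumed research-assistant hours for one update
ra_hourly_cost = 35.0     # assumed fully loaded hourly cost, USD
manual_cost = ra_hours * ra_hourly_cost

llm_subscription = 200.0  # assumed monthly cost of a "pro" LLM plan, USD
extra_compute = 100.0     # assumed incremental cloud/inference spend, USD
agent_cost = llm_subscription + extra_compute

print(f"manual: ${manual_cost:,.0f}  agents: ${agent_cost:,.0f}  "
      f"ratio: {manual_cost / agent_cost:.0f}x")
```

Under assumptions like these, the labor line shrinks by more than an order of magnitude while a new, recurring compute line appears – which is exactly the shift in cost structure the demand narrative rests on.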
The paper is still a working paper: it has not been peer reviewed, and its methods have not yet seen broad adoption. The next concrete marker for traders is whether major data vendors or large asset managers publicly adopt similar methodologies. If announcements appear that a firm has cut its data-collection costs using agent-based pipelines, the re-rating of data-exposed stocks will accelerate. Conversely, if incumbents dismiss the paper and continue to invest in manual curation, the disruption may take longer to play out. The DRIL paper is not a trade signal. It is a catalyst that pulls forward the timeline for the commoditization of economic data and strengthens the investment case for AI infrastructure. For a broader view on how shifting production functions affect equity sectors, see our stock market analysis.
Drafted by the AlphaScala research model and grounded in primary market data – live prices, fundamentals, SEC filings, hedge-fund holdings, and insider activity. Each story is checked against AlphaScala publishing rules before release. Educational coverage, not personalized advice.