water-aging-pipe-network-failure-prediction
Tier 2
2026-02-12

Cities Cannot Predict Which of Their Millions of Buried Water Pipes Will Fail Next

water, infrastructure

Problem Statement

The United States has approximately 2 million miles of buried water pipes, most installed after World War II and now approaching or exceeding their design lifespan. The EPA estimates that fixing these systems will cost $625 billion, but cities cannot spend that money rationally because they don't know which pipes are most likely to fail. Houston lost 32 billion gallons of drinking water to leaks in a single year. Atlanta experienced more than 176 pipe breaks in nine months in 2024, including dramatic failures that triggered a 72-hour state of emergency. The fundamental problem is that pipes are buried, their condition is invisible, and the historical data that would enable failure prediction (installation date, material, soil conditions, pressure history, repair records) is incomplete, inconsistent, or locked in paper records accumulated over decades of municipal administration.

Why This Matters

Water main breaks cause immediate public safety hazards (sinkholes, flooding, contamination), disrupt transportation and commerce, and damage surrounding infrastructure. The median age of U.S. water utility workers is nearly 50, and more than half will approach retirement in the next decade, taking institutional knowledge of buried infrastructure with them. McKinsey estimates that waste and water infrastructure globally requires $6 trillion in investment through 2040. The problem is not confined to the U.S.: aging water infrastructure affects every developed country, and rapidly urbanizing developing countries are building systems that will face the same challenges within decades. Current replacement strategies are often reactive or politically driven (replacing pipes after visible failures) rather than risk-optimized, so limited budgets are spent on pipes that just happened to fail rather than on those most likely to fail next.

What’s Been Tried

Statistical pipe failure models using age, material, and diameter as predictors have been developed since the 1990s but consistently underperform because they can't account for the site-specific factors (soil corrosivity, traffic loading, water pressure transients, proximity to tree roots) that dominate actual failure risk. Acoustic leak detection, which listens along pipes for the noise that escaping water makes, is the current standard for finding active leaks, but it is reactive (it finds leaks after they start) rather than predictive, and it requires trained operators with specialized equipment to walk every segment. AI-based satellite leak detection, a newer approach, uses satellite imagery and machine learning to identify subsurface moisture patterns, but a 2025 comparative study in Atlanta found it was 50% less cost-effective than conventional acoustic methods over a 3-year period. In-pipe inspection robots (e.g., SmartBall, PipeDiver) can assess condition directly but are expensive and slow, and they can only access pipes large enough to enter; most of the distribution network consists of small-diameter pipes that existing robots cannot reach. Digital twin approaches show promise but require accurate asset data that most utilities don't have.
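To make the limitation of attribute-only models concrete, here is a minimal sketch of a cohort break-rate baseline. It is an illustration, not a reconstruction of any published model: the record layout, the 25-year age bands, the 5-year observation window, and all the numbers are assumptions, and diameter is left out to keep it short.

```python
from collections import defaultdict

# Hypothetical records: (material, install_year, length_km, breaks_in_window)
PIPES = [
    ("cast_iron", 1955, 1.2, 3),
    ("cast_iron", 1960, 0.8, 1),
    ("ductile_iron", 1985, 2.0, 1),
    ("ductile_iron", 1990, 1.0, 0),
    ("pvc", 2005, 1.5, 0),
]

def cohort_key(material, install_year, ref_year=2025, band=25):
    """Bucket a pipe by material and age band, e.g. ('cast_iron', '50-74')."""
    age = ref_year - install_year
    low = (age // band) * band
    return material, f"{low}-{low + band - 1}"

def cohort_break_rates(pipes, window_years=5):
    """Breaks per km per year for each (material, age band) cohort."""
    length_km = defaultdict(float)
    breaks = defaultdict(int)
    for material, install_year, km, n_breaks in pipes:
        key = cohort_key(material, install_year)
        length_km[key] += km
        breaks[key] += n_breaks
    return {key: breaks[key] / (length_km[key] * window_years) for key in length_km}

rates = cohort_break_rates(PIPES)
for key, rate in sorted(rates.items()):
    print(key, round(rate, 3))
```

Every pipe in a cohort receives the same rate, so two cast-iron mains of the same age look identical to this model even if one sits in corrosive soil under a truck route. That is the gap the site-specific factors above would need to fill.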

What Would Unlock Progress

A low-cost, scalable approach to estimating pipe condition without direct inspection would transform water infrastructure management. This could combine multiple indirect data sources — soil type from geological surveys, traffic data from transportation departments, historical repair records (even incomplete ones), water pressure data from existing SCADA systems, weather patterns, and satellite/aerial imagery — into a machine learning model that predicts failure probability at the pipe segment level. The key insight is that even a rough risk ranking that correctly identifies the top 10% highest-risk pipes would dramatically improve capital allocation compared to the current reactive approach. Utilities need tools that work with the incomplete, messy data they actually have, not tools that require comprehensive digital asset inventories they'll never build.
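One way to check whether a rough ranking is useful in the sense described above is to ask what share of observed breaks falls in the top 10% of pipes by predicted risk. The sketch below assumes one score and one break count per pipe segment; the function name `capture_at_top` and the sample values are hypothetical.

```python
def capture_at_top(scores, breaks, fraction=0.10):
    """Share of observed breaks that occur in the top `fraction` of pipes by score."""
    if not scores or len(scores) != len(breaks):
        raise ValueError("scores and breaks must be non-empty and the same length")
    n_top = max(1, round(len(scores) * fraction))
    # Indices of pipes ordered from highest to lowest predicted risk
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total = sum(breaks)
    if total == 0:
        return 0.0
    return sum(breaks[i] for i in ranked[:n_top]) / total

# Hypothetical scores and later-observed break counts for ten pipe segments
scores = [0.9, 0.2, 0.4, 0.1, 0.3, 0.05, 0.6, 0.15, 0.25, 0.35]
breaks = [3, 0, 1, 0, 0, 0, 1, 0, 0, 0]

top10 = capture_at_top(scores, breaks, fraction=0.10)
top20 = capture_at_top(scores, breaks, fraction=0.20)
print(top10, top20)
```

A ranking that carried no information would be expected to capture about 10% of breaks in the top 10% of pipes, so the distance above that level is a simple measure of how much the ranking could improve capital allocation. Ranking by pipe length rather than pipe count may be the fairer comparison when segments differ widely in size.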

Entry Points for Student Teams

A student team could build a pipe failure risk model for a specific city using publicly available data: water main break reports (many cities publish these as open data), soil maps from USDA Web Soil Survey, street and traffic data from OpenStreetMap, and pipe age/material data from utility GIS layers (some cities publish these). The team would train a spatial model to predict break probability and validate it against held-out break data. Even a model that modestly outperforms age-based prediction would demonstrate the value of multi-source data integration. Skills in geospatial analysis, machine learning, and civil/environmental engineering would be most relevant. A more design-oriented team could prototype a mobile app for utility field crews to capture pipe condition data during routine work, building the asset database that currently doesn't exist.
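The validation step could be sketched as a temporal hold-out: breaks before a cutoff year serve as history, breaks from the cutoff onward serve as the test set, and an age-only ranking is compared with a multi-source score. Everything in the sketch is assumed for illustration, including the pipe IDs, the soil corrosivity scale, the break log, and the weights. The weights are set by hand as a stand-in for the trained spatial model, and the toy data were made up so the comparison is visible; the output is not evidence of real-world performance.

```python
from collections import Counter

CUTOFF = 2020  # breaks before this year are history; 2020 onward is held out

# Hypothetical pipes: id -> (install_year, soil corrosivity on a 0-1 scale)
PIPES = {
    "A": (1950, 0.2), "B": (1960, 0.9), "C": (1975, 0.8),
    "D": (1955, 0.1), "E": (1995, 0.3), "F": (2005, 0.2),
}
# Hypothetical break log: (pipe_id, year)
BREAKS = [("A", 2012), ("B", 2015), ("B", 2018), ("C", 2019),
          ("B", 2021), ("C", 2022)]

def split_breaks(break_log, cutoff):
    """Count pre-cutoff breaks per pipe and collect pipes that broke afterwards."""
    past = Counter(pid for pid, year in break_log if year < cutoff)
    future = {pid for pid, year in break_log if year >= cutoff}
    return past, future

def age_score(pid):
    """Baseline: older pipes rank as riskier."""
    return CUTOFF - PIPES[pid][0]

def multi_score(pid, past):
    """Hand-weighted blend of age, soil, and prior breaks (weights are assumptions)."""
    install_year, soil = PIPES[pid]
    return (CUTOFF - install_year) / 100 + soil + 0.5 * past[pid]

def auc(scores, labels):
    """Chance that a pipe that broke outranks one that did not (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need both broken and unbroken pipes in the test period")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

past, future = split_breaks(BREAKS, CUTOFF)
ids = sorted(PIPES)
labels = [pid in future for pid in ids]
auc_age = auc([age_score(pid) for pid in ids], labels)
auc_multi = auc([multi_score(pid, past) for pid in ids], labels)
print(f"age-only AUC: {auc_age:.2f}, multi-source AUC: {auc_multi:.2f}")
```

Splitting by time rather than at random keeps information about future breaks out of the features, which matters because prior break count is one of the inputs. With real data, the same comparison would be run with weights fitted on the pre-cutoff period only.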

Genome Tags

Constraint
data, economic, infrastructure
Domain
water, infrastructure
Scale
regional
Failure
ignored-context, unrepresentative-data, lab-to-field-gap
Breakthrough
algorithm, sensing, data-integration
Stakeholders
institutional
Temporal
worsening
Tractability
proof-of-concept

Source Notes

- McKinsey's "The Infrastructure Moment" (September 2025) provides the investment context. The $625B EPA estimate is from the most recent Drinking Water Infrastructure Needs Survey and Assessment.
- The Atlanta pipe break crisis (176+ breaks in 9 months, 72-hour emergency) is well-documented in local reporting and illustrates the real-world consequences.
- The 2025 MDPI study comparing acoustic vs. satellite leak detection in Atlanta is a valuable reference for understanding the current state of detection technology and its limitations.
- Cross-domain connection: this problem shares structure with `infrastructure-cascading-failure-modeling`, since water main breaks can cascade to road failures, building damage, and service disruptions. Also related to `energy-grid-battery-scale-failure-prediction` in that both involve predicting failure in large, distributed infrastructure networks with heterogeneous assets and incomplete data.
- The workforce retirement wave (median age ~50, half retiring in the next decade) makes this problem time-sensitive: institutional knowledge of buried infrastructure is being lost.
- Many cities are beginning to publish water main break data as open data, creating an opportunity for external analysis that didn't exist five years ago.

Source

"The Infrastructure Moment," McKinsey Global Institute, September 2025. https://www.mckinsey.com/industries/infrastructure/our-insights/the-infrastructure-moment (accessed 2026-02-12). Supplemented with "Global Risks Report 2025," WEF, January 2025; ASCE 2025 Infrastructure Report Card data; and "Evaluating Acoustic vs. AI-Based Satellite Leak Detection in Aging US Water Infrastructure," *Smart Cities*, MDPI, 8(4):122, 2025.