The Race to Eliminate Lead-Contaminated Drinking Water

Millions of U.S. families are exposed to lead through contaminated water piped into their homes; children are particularly at risk of neurological damage and developmental delays. But the data needed for efficient remediation is often incomplete or difficult to access, so our Data Science team stepped in to streamline lead removal efforts in four U.S. cities, building a Lead Pollution Dashboard and partnering with BlueConduit, whose mapping algorithms help accurately predict the location of lead pipes.

Investing in Data Access and Accuracy

America’s aging infrastructure means that residents in up to 10 million homes risk exposure to lead from water that arrives through toxic lead service lines, most often in low-income communities and communities of color. Troublingly, public health officials are warning that Covid-19-linked closures of schools and daycare centers may have further increased lead exposure for many more children.

Exposure to lead—even in low doses—has long been linked to irreversible neurological damage and developmental delays in children and can seriously harm people of any age. Lead poisoning can impair functioning of the heart, brain, kidneys, reproductive organs, nervous and immune systems, and more.

Water systems records are often inaccessible, missing, or inaccurate, making it difficult to remove toxic lead pipes quickly and efficiently. So The Rockefeller Foundation’s Data Science team created a publicly accessible Lead Pollution Dashboard to quickly identify disadvantaged communities that also have high levels of lead pollution. Using the Dashboard, the Data Science team selected four highly impacted U.S. cities to receive targeted support from our partner BlueConduit. BlueConduit’s predictive mapping algorithm has achieved 70% accuracy in locating toxic lead pipes for removal, compared to only 14% accuracy using more conventional methods – a fivefold improvement. The introduction of machine learning into the process translates into faster and more efficient lead pipe removal and has already saved one city tens of millions of dollars.

“This is not just about collecting numbers, this data represents the health of our nation’s children,” said Zia Khan, Senior Vice President of Innovation for The Rockefeller Foundation. “BlueConduit’s algorithm helps public officials pinpoint where to act, and fast, to eliminate lead in drinking water, particularly in lower income and minority communities, where children are often more exposed to lead.”

A Persistent, National Public Health Hazard

Since the 1970s, U.S. regulations have sought to reduce or eliminate sources of lead exposure—including from paint, water, air, and soil—to prevent lead poisoning. Despite these efforts, multiple sources of lead are still present in millions of homes, particularly in older buildings inhabited by low-income families.

Lead-contaminated water is a risk for the 6 million to 10 million U.S. households that still receive drinking water through lead service lines. The Centers for Disease Control and Prevention (CDC) and the U.S. Environmental Protection Agency (EPA) have found that no amount of lead is safe in drinking water; even low levels of lead exposure can be harmful to human health.

Over the course of the Covid-19 pandemic, the CDC has reported steep declines in childhood lead screenings, and officials are concerned that up to 10,000 children who may have had increased exposure to lead due to stay-at-home orders won’t receive essential interventions that could mitigate harm.

The race to eliminate lead-contaminated drinking water across the U.S. could be significantly boosted by the Biden Administration’s proposed infrastructure plan, which, if passed, would allocate $45 billion towards the goal of eliminating all lead water service lines in the country. Now is the time to scale up the use of new tools that can strengthen and speed up lead remediation efforts and ensure smart, data-driven resource allocation on a national scale.

Our Approach: Innovative Tools and Strategic Partnerships

The difficulty of accessing and aggregating relevant datasets, including federal reporting on local water system characteristics, tap water lead test results, and safety violations, inspired the Data Science team to create a publicly available Lead Pollution Dashboard. The Dashboard integrates cleaned-up data from several datasets and couples it with socioeconomic variables from the American Community Survey.

At the municipal level, outdated, missing, and inaccurate water systems records create a great deal of uncertainty about where lead pipes are located. Unreliable data is a major driver of cost in an already expensive process: it costs $3,000 per excavation to determine whether a home has a lead service line and, if lead is confirmed, the replacement adds another $2,000, on average. Without accurate data, cities risk conducting far too many unnecessary excavations as they attempt to replace hazardous lines.
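To see how prediction accuracy drives these costs, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation, not a published cost model: it assumes that the reported accuracy figures (70% for the predictive model, 14% for conventional methods) can be read as the share of excavations that actually find lead, and that every excavation costs the average $3,000 whether or not lead is found. The function and constant names are hypothetical.

```python
# Average figures quoted in the article (U.S. dollars).
EXCAVATION_COST = 3_000   # cost of digging to check one service line
REPLACEMENT_COST = 2_000  # added cost when lead is confirmed and replaced


def expected_cost_per_lead_line(hit_rate: float) -> float:
    """Expected spend to find and replace one lead service line.

    Assumption: each excavation finds lead with probability `hit_rate`,
    so on average 1 / hit_rate excavations are needed per lead line found.
    """
    if not 0 < hit_rate <= 1:
        raise ValueError("hit_rate must be in (0, 1]")
    return EXCAVATION_COST / hit_rate + REPLACEMENT_COST


if __name__ == "__main__":
    for label, rate in [("predictive model", 0.70), ("conventional methods", 0.14)]:
        cost = expected_cost_per_lead_line(rate)
        print(f"{label}: ${cost:,.0f} per lead line replaced")
```

Under these assumptions, the expected cost per replaced lead line comes to roughly $6,300 at a 70% hit rate versus roughly $23,400 at 14%, with the difference coming entirely from excavations that find no lead.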

In 2019, The Rockefeller Foundation began working with BlueConduit, a water infrastructure analytics company that uses machine learning to inventory water systems and make home-by-home predictions on where lead service lines are most likely to be found. BlueConduit’s predictive modeling algorithm was first deployed in Flint, Michigan.

Using the Lead Pollution Dashboard to identify resource-strapped cities with high levels of lead pollution, and factoring in BlueConduit’s knowledge of municipalities ready to begin or already engaged in lead remediation, four cities were selected to receive targeted financial support from The Rockefeller Foundation: Benton Harbor and Detroit in Michigan; Toledo, Ohio; and Trenton, New Jersey. The work with BlueConduit will jumpstart lead remediation for water systems that collectively serve more than one million residents, many of whom live in disadvantaged communities with a high risk of lead exposure.

The predictive maps and updated inventories that BlueConduit helps create are critical tools in estimating the costs of lead pipe replacement. Once in place, water systems inventories can be used to secure funding, track progress, increase transparency, and support community engagement.

“The city of Trenton has committed to replacing 100 percent of all lead pipes, but if we assume that every unknown water pipe contains lead, then the scale of the problem could be significantly overestimated,” explains Kristin Epstein, Assistant Director at the Department of Water in Trenton, NJ. “BlueConduit’s predictive mapping work will enable us to more accurately plan, fund and implement the critical work of lead pipe replacement throughout the city of Trenton.”

BlueConduit is also working to help cities secure additional funding where needed; for example, in Toledo, BlueConduit helped obtain federal funds for collaborative work with communities facing environmental justice challenges related to lead pollution.

While predictive models can guide cities in prioritizing areas to search for lead pipes, machine learning models should not be the only tool used to determine which houses are inspected for lead exposure; close consultation with community stakeholders is needed to ensure that data is appropriately used in the context of environmental justice and socioeconomic equity.

BlueConduit’s Impact in Flint, Michigan

One of the most notorious public health crises caused by lead-contaminated water in recent history unfolded in Flint, Michigan. In 2014, the city’s water source was changed due to a budget crisis; officials failed to properly treat the water, exposing almost 100,000 people to lead that leached from aging pipes into the water supply. This crisis would eventually lead to one of the largest water service line infrastructure remediation projects ever attempted.

By 2016, lead service pipe replacement was underway, aided by the volunteer efforts of two university professors, Jacob Abernethy (Assistant Professor in Computer Science at Georgia Tech) and Eric Schwartz (Assistant Professor of Marketing at the University of Michigan), who designed a machine learning model to guide excavation decisions.

Their algorithm included data from multiple sources, including the age, value, and location of homes; historic records of water pipe materials; maintenance and repair records from the utility and city; and reports from commissioned targeted hydrovac inspections (a less disruptive method of physically verifying the presence of lead pipes). Data from inspections and excavations was continually fed into the machine learning model to improve house-by-house predictions.
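The feedback loop described above, in which each inspection result sharpens later predictions, can be illustrated with a deliberately simple sketch. This is not the professors' model, which the article describes only at a high level. It swaps in a basic smoothed frequency estimate per group of homes, and the class name, group labels, and prior values are all hypothetical.

```python
from collections import defaultdict


class LeadLikelihoodTracker:
    """Toy estimate of how often inspections find lead, per group of homes."""

    def __init__(self, prior_lead: float = 1.0, prior_clear: float = 1.0):
        # Pseudo-counts keep the estimate defined before any inspections.
        self.prior_lead = prior_lead
        self.prior_clear = prior_clear
        self.lead = defaultdict(int)       # inspections that found lead
        self.inspected = defaultdict(int)  # all inspections

    def record(self, group: str, found_lead: bool) -> None:
        """Feed one inspection or excavation result back into the estimate."""
        self.inspected[group] += 1
        if found_lead:
            self.lead[group] += 1

    def estimate(self, group: str) -> float:
        """Smoothed share of inspections in this group that found lead."""
        numerator = self.lead[group] + self.prior_lead
        denominator = self.inspected[group] + self.prior_lead + self.prior_clear
        return numerator / denominator


# Hypothetical inspection results, grouped by the era a home was built.
tracker = LeadLikelihoodTracker()
for group, found in [
    ("pre-1950", True),
    ("pre-1950", True),
    ("pre-1950", False),
    ("post-1980", False),
]:
    tracker.record(group, found)

# Rank groups so the likeliest lead locations are checked first.
groups = ["pre-1950", "1950-1980", "post-1980"]
ranked = sorted(groups, key=tracker.estimate, reverse=True)
```

A group with no inspections yet, such as the hypothetical "1950-1980" group here, stays at the neutral prior of 0.5 rather than being treated as lead-free. A production model would combine many more signals, such as home value, location, and utility records, rather than a single grouping.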

This predictive mapping process played a critical role in accurately gauging the scale of the problem in Flint by pinpointing where lead service lines were likely to be found and boosting efficiencies in the speed and cost of replacement efforts.

In 2018, the city engaged a national water engineering firm for the project, and the algorithmic approach was abandoned, causing a steep decline in the efficiency of replacements and significantly increasing the cost per replaced pipe. In 2019, a court-ordered settlement forced the city to reinstate the predictive models. Over the course of the project, BlueConduit’s algorithmic predictions achieved an accuracy rate of 70%, compared to a 14% accuracy rate achieved during the engineering firm’s excavations in 2018.

Overall, the algorithmic model saved the city of Flint tens of millions of dollars by reducing the number of unnecessary excavations, while also reducing the total number of days that residents lived with lead-contaminated water supply lines.

The potential scalability of this machine learning approach motivated Dr. Abernethy and Dr. Schwartz to found BlueConduit with a social mission of supporting cities in eliminating lead from drinking water.

A critical component of the work in Flint, Michigan and in the four cities where The Rockefeller Foundation and BlueConduit are collaborating is strong community engagement and consultation. Machine-based algorithms must always be contextualized with real-world historical, social, environmental, and economic data. Without adequate transparency, public communication, and community consultation, distrust of computer-based algorithms can erode public confidence.

“When you have inequality or other kinds of distributional issues in a city, it’s very important to collect random samples that provide unbiased estimates of where lead is,” explains Dr. Schwartz. “If the model only includes data from places where records exist, it’s going to reinforce existing biases.”

Scaling Up to Meet National Ambitions

The collaboration between The Rockefeller Foundation, BlueConduit, and officials in Michigan, New Jersey, and Ohio is yielding important lessons that will be critical to scaling up data-driven approaches to lead remediation nationwide. Three key takeaways are:

  1. Develop a toolkit of publicly accessible, accurate data repositories
  2. Develop strong, collaborative partnerships with communities and stakeholders
  3. Support efforts to secure additional funding for lead remediation

Local governments grappling with the risks and repercussions of lead exposure and the need to replace aging infrastructure desperately need funding to accelerate lead remediation programs, particularly in vulnerable communities. If enacted into law by Congress, the Biden Administration’s proposed infrastructure plan would allocate $45 billion towards the goal of eliminating all lead water service lines across the country at no cost to the homeowner.

Equipping public health officials with tools to enable data-driven planning for this injection of federal funding will be critical in helping cities not only to accurately estimate the cost and scope of removing lead pipes, but also to ensure efficient replacement that can rapidly reduce lead exposure risks in their communities.

Reducing exposure to lead-contaminated drinking water is a social challenge that can greatly benefit from data-driven decision-making. Balancing technical advances with community needs will help ensure that when data science innovations are deployed at scale, the result is improved social outcomes. It’s time to deploy new data-driven approaches towards securing clean, safe drinking water for all Americans.
