
The Clean Water Voice

Leveraging AMI Data for Fair and Equitable Rate Structures

Mar 13, 2025

By: Raftelis

For public utilities, setting fair and equitable rates is a complex and ever-evolving challenge. Traditionally, cost-of-service studies have relied on historical assumptions, monthly meter reads, and estimates based on small subsamples of customer usage data. With the advent of Advanced Metering Infrastructure (AMI), however, utilities now have access to an unprecedented level of granular data: hourly usage for every customer.

Palm Beach County Water Utilities Department (PBCWUD), which serves approximately 635,000 residents, embarked on a groundbreaking initiative to incorporate hourly AMI data into its rate-setting approach. While this approach has the potential to create a more equitable cost-of-service allocation, it also presents significant data management and analytical challenges. We partnered with PBCWUD to process, analyze, and incorporate this data into its cost-of-service model.

Managing and utilizing AMI data

Historically, cost-of-service studies have determined how different customer classes contribute to a utility’s costs based on estimated usage patterns. This ensures that customers pay their fair share and that the utility recovers its revenue requirements without unfairly burdening any specific group.

However, these estimates rely on broad assumptions. For example, peaking factors (the ratio of maximum demand to average demand) have traditionally been calculated from small sample sets, industry norms, and professional judgment. With AMI, utilities can now calculate these factors from actual hourly consumption data rather than assumptions.
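The calculation itself is simple once hourly reads are available. Below is a minimal sketch in Python with pandas (the tools used later in this project); the sample data and column names are illustrative, not PBCWUD's.

```python
import pandas as pd

# Minimal sketch: a peaking factor is maximum demand divided by average
# demand. The hourly reads below are illustrative, not PBCWUD data.
hourly = pd.DataFrame({
    "read_time": pd.date_range("2023-06-01", periods=8, freq="h"),
    "usage_cf": [100, 120, 95, 300, 110, 105, 98, 102],  # cubic feet
})

peaking_factor = hourly["usage_cf"].max() / hourly["usage_cf"].mean()
print(f"Peaking factor: {peaking_factor:.2f}")  # 300 / 128.75 = 2.33
```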

Working with such a large AMI dataset presented significant technical challenges. The initial data delivery from PBCWUD consisted of:

  • Three fiscal years of data split into monthly CSV files, each averaging 375MB
  • A second dataset consisting of two TXT files per month, each approximately 170MB

Attempting to process this volume of data in Excel proved infeasible due to row and column limits, slow computation, and frequent crashes. Realizing AMI’s full potential required a new analytical approach.

A data-driven approach using Python and Tableau

There are several options for efficiently managing and analyzing data of this size. After considering multiple approaches, Raftelis opted for a combination of Python and Tableau.

Data pre-processing with Python

Python is a good option for analyzing and filtering large datasets because it avoids Excel’s hard row limits and rendering overhead: Excel must load and display the entire dataset at once, whereas Python works with data in memory through structures such as pandas DataFrames. We used Python, particularly the pandas and NumPy libraries, to summarize PBCWUD’s data into a single row per customer class containing total usage for each hour across multiple fiscal years. This drastically reduced the size of the dataset, which we then exported back to Excel for integration into the cost-of-service model.
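The sketch below illustrates this summarization step. The file names and columns (account_id, customer_class, read_time, usage_cf) are hypothetical stand-ins, not PBCWUD's actual schema or pipeline.

```python
import pandas as pd

# Illustrative sketch: collapse raw hourly AMI reads (one row per meter
# per hour) into total usage per customer class per hour. File names and
# columns here are hypothetical, not PBCWUD's.
csv_files = ["fy2023_oct.csv", "fy2023_nov.csv"]  # one file per month

frames = []
for path in csv_files:
    # Read only the columns needed and parse timestamps on load.
    df = pd.read_csv(
        path,
        usecols=["account_id", "customer_class", "read_time", "usage_cf"],
        parse_dates=["read_time"],
    )
    # Collapse meter-level reads to class-level hourly totals per file.
    frames.append(
        df.groupby(["customer_class", "read_time"], as_index=False)["usage_cf"].sum()
    )

# Combine the monthly summaries, then pivot to one row per customer class
# with one column per hour -- small enough to hand back to Excel.
hourly_by_class = (
    pd.concat(frames)
    .groupby(["customer_class", "read_time"], as_index=False)["usage_cf"]
    .sum()
)
wide = hourly_by_class.pivot(
    index="customer_class", columns="read_time", values="usage_cf"
)
wide.to_excel("class_hourly_totals.xlsx")  # requires the openpyxl package
```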

Data visualization and exploration with Tableau

Another option for analyzing and filtering large datasets is Tableau. We used Tableau Prep to merge the monthly account details, creating a database of approximately 1.4 billion records. Using Tableau’s visualization capabilities, we created summary figures that helped identify patterns, peak demand periods, and anomalies within the data. We also analyzed time-of-day consumption characteristics to understand how different customer classes contribute to system demand at various times, along with statistical measures related to peaking, such as average, maximum, minimum, and total usage by account type and date. Visually exploring the data helped surface key observations and suggested additional directions for analysis.
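While we did this exploration in Tableau, the same summary statistics can be reproduced in pandas for readers who prefer a code-based workflow. The sketch below uses randomly generated sample data; the class names and columns are illustrative.

```python
import numpy as np
import pandas as pd

# Sketch of the peaking statistics explored in Tableau (average, maximum,
# minimum, and total usage by account type and date), reproduced in pandas.
# The sample data, class names, and columns are illustrative.
rng = np.random.default_rng(0)
reads = pd.DataFrame({
    "read_time": pd.date_range("2023-06-01", periods=96, freq="h").repeat(2),
    "customer_class": ["Residential", "Commercial"] * 96,
    "usage_cf": rng.uniform(50, 500, size=192),
})

daily_stats = (
    reads.assign(date=reads["read_time"].dt.date)
    .groupby(["customer_class", "date"])["usage_cf"]
    .agg(["mean", "max", "min", "sum"])
)
print(daily_stats.head())
```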

Overcoming data quality challenges

One of the biggest hurdles in this project was ensuring data accuracy. Upon initial analysis, we identified extraordinarily high peaking factors for fiscal years (FY) 2021 and 2022, numbers that were physically impossible given the system’s capacity. Further investigation revealed faulty meters registering implausibly high hourly usage (e.g., 250 million cubic feet in a single hour).

To resolve this, we explored several statistical techniques:

  • Outlier removal based on standard deviations and interquartile range
  • Max-to-average ratio analysis at both the individual account level and the customer class level
  • Comparative analysis across fiscal years to detect and remove erroneous data

Ultimately, we implemented a threshold-based approach that removed any hourly data point exceeding four times the annual average for its customer class. This approach preserved over 99% of the valid dataset while eliminating problematic outliers. Additionally, due to persistent discrepancies in FY21 and FY22, the final analysis relied exclusively on FY23 data, ensuring reliability while laying the foundation for future refinements.
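A minimal sketch of that threshold filter in pandas, using illustrative data rather than PBCWUD's schema, might look like this:

```python
import pandas as pd

# Minimal sketch of the threshold filter described above: drop any hourly
# read that exceeds four times the average for its customer class. The
# data and column names are illustrative, not PBCWUD's schema.
reads = pd.DataFrame({
    "customer_class": ["Residential"] * 5 + ["Commercial"] * 5,
    "usage_cf": [100, 110, 95, 105, 250_000_000,  # faulty-meter spike
                 400, 380, 420, 390, 410],
})

# Per-row class average; transform() keeps the original row count. In
# practice the average might be computed after an initial cleaning pass.
class_avg = reads.groupby("customer_class")["usage_cf"].transform("mean")
filtered = reads[reads["usage_cf"] <= 4 * class_avg]

print(f"Retained {len(filtered) / len(reads):.0%} of hourly reads")  # 90%
```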

A more accurate and equitable cost-of-service model

The refined dataset provided critical insight into actual customer demand patterns. By comparing hourly usage across customer classes, PBCWUD was able to develop peaking factors rooted in real data rather than assumptions. Key findings included:

  • Commercial customers exhibited significantly higher peak-to-average usage ratios than previously estimated.
  • Residential customers demonstrated relatively stable usage patterns with moderate peaks.
  • Wholesale customers had lower peaks relative to their average usage, contradicting prior assumptions.

These findings directly influenced cost allocation decisions, ensuring that each customer class pays a fair share based on actual demand. The enhanced accuracy also enables PBCWUD to plan infrastructure investments more efficiently, optimizing capital expenditures for peak demand periods.

Lessons learned

The PBCWUD case study offers several important takeaways for other local governments and utilities considering AMI integration:

  • Excel is not sufficient for large-scale AMI data processing. Python or other advanced data analytics tools are essential for handling massive datasets efficiently.
  • Data visualization is crucial for understanding consumption patterns. Tools like Tableau enable utilities to extract meaningful insights from AMI data.
  • Data quality control is a necessary step. AMI data may contain anomalies due to faulty meters or transmission errors, requiring validation and filtering.
  • Transitioning to data-driven cost-of-service models improves fairness and transparency. Replacing assumptions in rate setting with measured data leads to more equitable customer charges and better financial sustainability.
  • Continuous refinement is needed. As AMI adoption grows, utilities should periodically revisit their datasets and methodologies to incorporate new insights and improve accuracy.

The integration of AMI data into cost-of-service studies represents a paradigm shift for the utility industry. Palm Beach County Water Utilities Department demonstrated that although transitioning from assumption-based analysis to data-driven decision-making presents challenges, the benefits far outweigh the difficulties.

By leveraging advanced analytics tools like Python and Tableau, PBCWUD successfully processed massive amounts of AMI data, improved the accuracy of its peaking factors, and is now developing a more equitable rate structure. As more utilities embrace this approach, the industry will move toward fairer, more transparent, and more financially sustainable rate-setting practices.

For more information on using AMI datasets for rate setting, please contact Rocky Craley at rcraley@raftelis.com.

The views expressed in this resource are those of the individual contributors, and do not necessarily reflect those of NACWA.  