The Water Program Portal contains a geographical hierarchy that lets users drill up and down to explore local factors not evident at higher levels of data aggregation and to see the “forest for the trees.” The software we use for the Water Program Portal, Microsoft Power BI, supports hierarchical datasets natively. It does not store geography, however, so we must add our own geographical data to the Water Program Portal. This is a complex process, which we undertake to improve accuracy and usability. As the figure below shows, the ZIP Code serves as the base geography for the Water Program Portal. For each ZIP Code, we define a corresponding primary city or town, county, electric utility, combined statistical area, state, and one of several regional divisions. In essence, the result is a large table with a single row for each ZIP Code and a column for each of the other geographic categories.

Diagram of Water Program Portal geography

The Water Program Portal’s geographical hierarchy allows users to quickly drill up and down to gain unique insights without having to crunch the numbers. The ZIP Code is the base geography, and all subsequent geographies map directly to a ZIP Code.

Why ZIP Codes?

ZIP Codes in the United States are a great way to explore local data below the level of a city or town. They’re also an easy way to identify a location, since every ZIP Code is unique. There are exceptions to the ease of using ZIP Codes, of course, as some cross municipal boundaries. At this time, there is no better way to connect a broad swath of datasets at the local level than the ZIP Code. For quantitative datasets (like building surveys), the hierarchy lets users quickly add up data or break it down. The Water Program Portal blends this data with the geography data on the Hub and allows users to easily add up the number of vehicles by city, county, electric utility territory, or state.
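The roll-up described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Power BI model itself: the ZIP Codes, place names, and counts are made up, and `roll_up` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical geography table: one row per ZIP Code (all values are made up).
GEOGRAPHY = {
    "00001": {"city": "City A", "county": "County X", "utility": "Utility 1", "state": "ST"},
    "00002": {"city": "City A", "county": "County X", "utility": "Utility 2", "state": "ST"},
    "00003": {"city": "City B", "county": "County Y", "utility": "Utility 1", "state": "ST"},
}

# Hypothetical quantitative dataset keyed by ZIP Code.
COUNTS_BY_ZIP = {"00001": 120, "00002": 80, "00003": 50}


def roll_up(counts_by_zip, geography, level):
    """Sum ZIP-level counts up to the requested level of the hierarchy."""
    totals = defaultdict(int)
    for zip_code, count in counts_by_zip.items():
        # Each ZIP Code maps to exactly one value at every level.
        totals[geography[zip_code][level]] += count
    return dict(totals)


print(roll_up(COUNTS_BY_ZIP, GEOGRAPHY, "city"))     # {'City A': 200, 'City B': 50}
print(roll_up(COUNTS_BY_ZIP, GEOGRAPHY, "utility"))  # {'Utility 1': 170, 'Utility 2': 80}
```

Because every ZIP Code carries one value for each level, the same ZIP-level dataset can be summed to any level without a separate lookup for each.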

Potential Errors in Geography

Microsoft Power BI supports only a one-to-many relationship, which means a ZIP Code can map to only a single city, county, electric utility, combined statistical area, state, and regional division. As a result, we had to make tradeoffs when assigning the geography for each ZIP Code, which can introduce error when the data is aggregated. The structure of the geography does allow for some overlap, however. A single city can be listed in more than one county, a county can be listed in more than one electric utility territory, and so forth, as long as each instance can be connected to a unique ZIP Code. For example, both a municipal utility and American Electric Power operate in Columbus, Ohio. The Water Program Portal distinguishes these two territories by ZIP Code, though that may not align well with the actual operating territories of the two utilities. Since the base geography is the ZIP Code, users should exercise caution when using the Water Program Portal to aggregate data and use it for statistical analyses.
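As a rough illustration of the constraint, the Python sketch below checks that each ZIP Code appears only once while still allowing a city to appear in more than one county. The rows and the helper names (`duplicate_zips`, `cities_spanning_counties`) are hypothetical and are not taken from the Water Program Portal.

```python
from collections import Counter, defaultdict

# Hypothetical rows of (ZIP Code, city, county); all values are made up.
ROWS = [
    ("00001", "City A", "County X"),
    ("00002", "City A", "County Y"),  # same city, different county: allowed
    ("00003", "City B", "County Y"),
]


def duplicate_zips(rows):
    """Return ZIP Codes that appear more than once (these break the model)."""
    counts = Counter(zip_code for zip_code, _, _ in rows)
    return sorted(z for z, n in counts.items() if n > 1)


def cities_spanning_counties(rows):
    """Return cities listed in more than one county (allowed, but worth knowing)."""
    counties = defaultdict(set)
    for _, city, county in rows:
        counties[city].add(county)
    return {city: sorted(c) for city, c in counties.items() if len(c) > 1}


print(duplicate_zips(ROWS))            # []
print(cities_spanning_counties(ROWS))  # {'City A': ['County X', 'County Y']}
```

A city that spans counties is a reminder that a county total for that city reflects ZIP Code assignments, not the city's true boundary.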

Methodology to Map ZIP Codes to Electric Utility Territories

ZIP Codes are the base geography for the Water Program Portal. Electric utility territories in the United States are not neatly divided by ZIP Code or by most other geographic boundaries. In fact, several electric utilities can operate in the same ZIP Code, such as an investor-owned utility and a municipal utility, each serving different segments of a city. The Water Program Portal uses a two-step process to make a best approximation of the primary electric utility in each ZIP Code. We rely on data from the U.S. Energy Information Administration (EIA-860 and EIA-861), which lists all electric utilities and the counties they operate in. We also use the utility-ZIP mapping from OpenEI. For each ZIP Code, we first choose the utility defined on OpenEI. If it’s not there, we fall back on the utility-county lookup data from the U.S. EIA. If more than one utility operates in a ZIP Code according to OpenEI, or in a county according to the EIA surveys, we choose the largest utility as measured in megawatt-hours served, according to EIA data. The data is updated annually, and the mapping of utility to ZIP Code can be reassigned based on the data from OpenEI and EIA.

As a last step in this process, we apply a custom override table for ZIP Codes, using data provided directly by electric utilities in some states.
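The selection logic described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual pipeline: the function name `assign_utility`, the dictionary layouts, and all sample values are hypothetical, and real OpenEI and EIA files would need to be parsed into these shapes first.

```python
# Hypothetical inputs; every name and number below is made up for illustration.
OPENEI_BY_ZIP = {"00001": ["Utility 1", "Utility 2"], "00002": ["Utility 3"]}
EIA_BY_COUNTY = {"County X": ["Utility 1", "Utility 4"]}
MWH_SERVED = {"Utility 1": 500.0, "Utility 2": 900.0, "Utility 3": 100.0, "Utility 4": 50.0}
OVERRIDES = {"00002": "Utility 5"}  # supplied directly by a utility


def assign_utility(zip_code, county, openei_by_zip, eia_by_county, mwh_served, overrides):
    """Pick one primary utility for a ZIP Code, or None if no source covers it."""
    # Last step in the text, checked first here: a custom override always wins.
    if zip_code in overrides:
        return overrides[zip_code]
    # Step 1: utilities listed for the ZIP Code (OpenEI-style mapping).
    # Step 2: otherwise fall back to utilities listed for the county (EIA-style lookup).
    candidates = openei_by_zip.get(zip_code) or eia_by_county.get(county) or []
    if not candidates:
        return None
    # Several candidates: keep the one with the most megawatt-hours served.
    return max(candidates, key=lambda name: mwh_served.get(name, 0.0))


args = (OPENEI_BY_ZIP, EIA_BY_COUNTY, MWH_SERVED, OVERRIDES)
print(assign_utility("00001", "County X", *args))  # Utility 2
print(assign_utility("00003", "County X", *args))  # Utility 1
```

In this sketch the override is checked first so that it takes precedence over both data sources, which has the same effect as applying the override table as a final pass.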

Corrections to the Geography Data

No one’s perfect and we know ZIP Codes, utility territories, etc. can change. Please fill out the form below and tell us if you’ve identified a problem and how we can fix it.