Department of Energy Power Outage

Tools used in this project

Power BI Dashboard

About this project

OBJECTIVE:

This project was made for Maven's Power Outage Challenge, in which participants act as a Senior Analytics Consultant hired by the U.S. Department of Energy. The task was to clean the data and design a report or dashboard that highlights trends and significant insights from two decades of event-level power outage data.

GOAL:

My goal is to create a one-page summary report that compares demand loss with customers affected and gives the DOE insight into which areas to improve.

DATA CLEANING:

This project involved working primarily with a messy dataset. I decided to use Python for the data cleaning process because it provides more options and a more efficient way of cleaning the data. The dataset was divided into multiple sheets, one for each year from 2002 to 2023. Here are the steps I took to clean the columns in the dataset.

Date Event Began, Date of Restoration, Time Event Began, Time of Restoration: Dates and times came in different formats, so I converted them to a single format for all years: (MM-DD-YYYY) for dates and (HH-MM-SS) for times.
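
As a rough illustration of this step (not the exact script used in the project), the normalization can be sketched with the standard library alone. The lists of input formats and the function names below are assumptions for illustration; the real sheets may contain other formats.

```python
from datetime import datetime

# Hypothetical input formats; extend these after inspecting the real sheets.
DATE_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%m/%d/%y", "%B %d, %Y"]
TIME_FORMATS = ["%I:%M %p", "%H:%M:%S", "%H:%M"]


def _parse(value, formats):
    """Try each format in turn and return the first successful parse."""
    text = str(value).strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None  # leave unparseable values for manual review


def normalize_date(value):
    """Return the date as MM-DD-YYYY, or None if no format matches."""
    parsed = _parse(value, DATE_FORMATS)
    return parsed.strftime("%m-%d-%Y") if parsed else None


def normalize_time(value):
    """Return the time as HH-MM-SS (24-hour), or None if no format matches."""
    parsed = _parse(value, TIME_FORMATS)
    return parsed.strftime("%H-%M-%S") if parsed else None
```

Returning None instead of guessing keeps unparseable cells visible, so they can be checked by hand rather than silently converted to a wrong date.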

NERC Region: Referring to this page https://www.nerc.com/AboutNERC/keyplayers/Pages/default.aspx, I removed all values that do not appear in the picture on that page and replaced the remaining variants with their respective NERC Region.

Area Affected and Customers Affected: Each value combines the part of the state that the power outage affected with the state itself, so I wrote a script to keep only the state. Most of the records list multiple affected areas in one cell. I separated these areas into one row each and copied the other columns into every new row. Assumption: Because the new rows repeat the original values, demand loss and customers affected would be counted more than once. To avoid this, I divided the original row's demand loss and customers affected by the number of rows created from it, so each new row carries an equal share.
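
The split-and-divide step could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual script: the short STATES list, the function names, and the column names "Demand Loss" and "Customers Affected" are placeholders.

```python
import re

# Illustrative subset only; a real run would list every state and territory.
STATES = ["California", "Texas", "Michigan", "New York", "Washington"]
STATE_PATTERN = re.compile("|".join(re.escape(s) for s in STATES))

# Assumed names of the numeric columns that must be shared between new rows.
SHARED_COLUMNS = ("Demand Loss", "Customers Affected")


def extract_states(area_text):
    """Return the distinct state names found in the text, in order of appearance."""
    found = []
    for match in STATE_PATTERN.finditer(area_text):
        if match.group() not in found:
            found.append(match.group())
    return found


def split_row(row):
    """Create one row per state and give each an equal share of the totals."""
    states = extract_states(row["Area Affected"])
    if not states:
        return [dict(row)]  # nothing recognized: keep the row unchanged
    new_rows = []
    for state in states:
        new_row = dict(row)
        new_row["Area Affected"] = state
        for column in SHARED_COLUMNS:
            if new_row.get(column) is not None:
                new_row[column] = new_row[column] / len(states)
        new_rows.append(new_row)
    return new_rows
```

For example, a row with "Northern California; Southeast Texas" and a demand loss of 300 becomes two rows, one for California and one for Texas, each with a demand loss of 150, so the total across rows is unchanged.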

Event Type: Many values use different wording for the same kind of event. I grouped these values into categories such as Severe Weather, Natural Disasters, Vandalism, etc.
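
One simple way to do this kind of grouping is keyword matching, sketched below. The keyword lists, the "Other" fallback, and the function name are assumptions for illustration; the categories actually used in the project may be defined differently.

```python
# Hypothetical keyword rules, checked in order; the first match wins.
CATEGORY_KEYWORDS = [
    ("Vandalism", ["vandalism", "sabotage"]),
    ("Natural Disasters", ["earthquake", "wildfire", "flood"]),
    ("Severe Weather", ["storm", "hurricane", "tornado", "weather", "lightning"]),
]


def categorize_event(event_type):
    """Map a free-text event description to a category name."""
    text = str(event_type).lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"  # assumed fallback for values that match no rule
```

Keeping the rules in one ordered list makes it easy to review which wording ends up in which category and to add rules as new variants turn up.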

Demand Loss: Some cells contain multiple values. For those rows, I kept only the maximum value.
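
A minimal sketch of this rule, assuming the values in a cell are written as plain numbers separated by text such as "-" or ";" (the function name and the pattern are illustrative, not the project's actual code):

```python
import re

# Matches numbers such as "500", "1,200" or "12.5".
# Assumption: a comma inside a number is a thousands separator.
NUMBER_PATTERN = re.compile(r"\d[\d,]*(?:\.\d+)?")


def max_demand_loss(cell):
    """Return the largest number found in the cell, or None if there is none."""
    if cell is None:
        return None
    values = [
        float(token.replace(",", ""))
        for token in NUMBER_PATTERN.findall(str(cell))
    ]
    return max(values) if values else None
```

Note that under the thousands-separator assumption a cell like "300,500" is read as one number, so cells that use commas to separate values would need a different rule.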

KEY INSIGHTS:

Most power outages occur in the late months of the year. This reflects seasonal weather patterns in the United States, when severe weather events are most frequent.

California, Texas, Carolina, Michigan, New York and Washington are the leading states for demand loss and customers affected. A likely reason is that these states have a large number of businesses in operation, so each outage affects many people.

Outages most commonly begin between 12 pm and 6 pm, while most restorations are completed between 12 pm and 5 pm, which suggests that outages usually do not take long to fix.

Yearly Demand Loss has an upward trend, which indicates that more outages are occurring over the years. However, Yearly Customers Affected has a downward trend, which points to improvements in the power grid and in operators' response times. It could also mean that businesses are using better and more efficient backup power to lessen the impact of these outages.
