INFO511 Fall 2024 Project Proposal

Exploring evolving electric vehicle charging efficiency

The team selected the EV Charging Dataset for this project in order to explore an evolving technical space and build predictive models of charging efficiency.
Author: ChilePepper

Affiliation: School of Information, University of Arizona

import warnings
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

warnings.filterwarnings("ignore")

Introduction to the Data

url = "https://raw.githubusercontent.com/INFO-511-F24/final-project-ChilePeppers/main/data/ev_charging_patterns.csv"
ev_charging = pd.read_csv(url)

url = "https://raw.githubusercontent.com/INFO-511-F24/final-project-ChilePeppers/main/data/Spotify_Most_Streamed_Songs.csv"

Spotify_db = pd.read_csv(url)

url = "https://raw.githubusercontent.com/INFO-511-F24/final-project-ChilePeppers/main/data/Mobile_user_behavior_dataset.csv"

mobile_data = pd.read_csv(url)

Dataset - Electric Vehicle Charging

  • Source of Data: https://www.kaggle.com/datasets/valakhorasani/electric-vehicle-charging-patterns

  • Description of Observations: This dataset records electric vehicle (EV) charging patterns and user behavior. It contains 1,320 charging sessions, each described by metrics such as energy consumed, charging duration, charging cost, and vehicle details. Each row is one session, which supports both exploratory analysis and predictive modeling.

  • Ethical Concerns: The dataset has user IDs and specific charging station locations, which means there’s a chance it could reveal patterns in people’s movements and behaviors. To protect privacy, it’s important to keep user IDs anonymous and possibly generalize location data so individuals can’t be tracked. Researchers also need to handle this information carefully and follow data protection rules to use it responsibly.

  • Questions:

    1. How do vehicle model, user type, and starting state of charge influence the cost and duration of EV charging sessions at public stations?
    2. What patterns appear in energy consumption and charging behavior?
    3. Can predictive models be built for charging efficiency?
  • Importance: Understanding the costs and durations associated with different EV types and user profiles can help:

    • Consumers make cost-effective charging decisions.
    • Charging service providers optimize station usage and pricing strategies by identifying patterns in energy demand and time usage.
  • Hypotheses:

    • Vehicle Model: Models with larger battery capacity will have longer charging times and higher costs.
    • User Type: Frequent users (such as commuters) may incur lower costs per session due to shorter, more regular charging patterns.
    • Starting State of Charge: Lower starting charge levels are expected to lead to longer and more costly charging sessions.
  • Variable Types:

    • Categorical Variables: Vehicle Model, User Type
    • Quantitative Variables: Charging Cost (USD), Charging Duration (hours), State of Charge (Start %)
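As a first look at the question and hypotheses above, cost and duration could be summarized by user type and compared against the starting state of charge. The sketch below is illustrative only: rather than loading the full file, it rebuilds a tiny DataFrame from the five sessions shown in Table 1 (values rounded to two decimals), and the names `sample`, `summary`, and `soc_vs_duration` are our own.

```python
import pandas as pd

# Five sessions copied from Table 1 (values rounded); NOT the full dataset.
sample = pd.DataFrame({
    "Vehicle Model": ["BMW i3", "Hyundai Kona", "Chevy Bolt",
                      "Hyundai Kona", "Hyundai Kona"],
    "User Type": ["Commuter", "Casual Driver", "Commuter",
                  "Long-Distance Traveler", "Long-Distance Traveler"],
    "Charging Cost (USD)": [13.09, 21.13, 35.67, 13.04, 10.16],
    "Charging Duration (hours)": [0.59, 3.13, 2.45, 1.27, 2.02],
    "State of Charge (Start %)": [29.37, 10.12, 6.85, 83.12, 54.26],
})

# Mean cost and duration for each user type.
summary = (
    sample.groupby("User Type")[["Charging Cost (USD)", "Charging Duration (hours)"]]
    .mean()
)
print(summary)

# Pearson correlation between starting state of charge and session duration.
soc_vs_duration = sample["State of Charge (Start %)"].corr(
    sample["Charging Duration (hours)"]
)
print(f"Pearson r (start SoC vs. duration): {soc_vs_duration:.2f}")
```

Five rows are far too few to support any conclusion; the same calls would be applied to the full `ev_charging` DataFrame.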

Glimpse of the Data: Dataset 1 - EV Charging

print(f'Table 1: EV Charging Dataset, Summary of Column Headings:\n\n')
ev_charging.head()
Table 1: EV Charging Dataset, Summary of Column Headings:

# | User ID | Vehicle Model | Battery Capacity (kWh) | Charging Station ID | Charging Station Location | Charging Start Time | Charging End Time | Energy Consumed (kWh) | Charging Duration (hours) | Charging Rate (kW) | Charging Cost (USD) | Time of Day | Day of Week | State of Charge (Start %) | State of Charge (End %) | Distance Driven (since last charge) (km) | Temperature (°C) | Vehicle Age (years) | Charger Type | User Type
0 | User_1 | BMW i3 | 108.463007 | Station_391 | Houston | 1/1/2024 0:00 | 1/1/2024 0:39 | 60.712346 | 0.591363 | 36.389181 | 13.087717 | Evening | Tuesday | 29.371576 | 86.119962 | 293.602111 | 27.947953 | 2.0 | DC Fast Charger | Commuter
1 | User_2 | Hyundai Kona | 100.000000 | Station_428 | San Francisco | 1/1/2024 1:00 | 1/1/2024 3:01 | 12.339275 | 3.133652 | 30.677735 | 21.128448 | Morning | Monday | 10.115778 | 84.664344 | 112.112804 | 14.311026 | 3.0 | Level 1 | Casual Driver
2 | User_3 | Chevy Bolt | 75.000000 | Station_181 | San Francisco | 1/1/2024 2:00 | 1/1/2024 4:48 | 19.128876 | 2.452653 | 27.513593 | 35.667270 | Morning | Thursday | 6.854604 | 69.917615 | 71.799253 | 21.002002 | 2.0 | Level 2 | Commuter
3 | User_4 | Hyundai Kona | 50.000000 | Station_327 | Houston | 1/1/2024 3:00 | 1/1/2024 6:42 | 79.457824 | 1.266431 | 32.882870 | 13.036239 | Evening | Saturday | 83.120003 | 99.624328 | 199.577785 | 38.316313 | 1.0 | Level 1 | Long-Distance Traveler
4 | User_5 | Hyundai Kona | 50.000000 | Station_108 | Los Angeles | 1/1/2024 4:00 | 1/1/2024 5:46 | 19.629104 | 2.019765 | 10.215712 | 10.161471 | Morning | Saturday | 54.258950 | 63.743786 | 203.661847 | -7.834199 | 1.0 | Level 1 | Long-Distance Traveler
print(f'Table 2: EV Charging Dataset, Variables and their Type (Dtype)\n\n')

ev_charging.info()
Table 2: EV Charging Dataset, Variables and their Type (Dtype)


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1320 entries, 0 to 1319
Data columns (total 20 columns):
 #   Column                                    Non-Null Count  Dtype  
---  ------                                    --------------  -----  
 0   User ID                                   1320 non-null   object 
 1   Vehicle Model                             1320 non-null   object 
 2   Battery Capacity (kWh)                    1320 non-null   float64
 3   Charging Station ID                       1320 non-null   object 
 4   Charging Station Location                 1320 non-null   object 
 5   Charging Start Time                       1320 non-null   object 
 6   Charging End Time                         1320 non-null   object 
 7   Energy Consumed (kWh)                     1254 non-null   float64
 8   Charging Duration (hours)                 1320 non-null   float64
 9   Charging Rate (kW)                        1254 non-null   float64
 10  Charging Cost (USD)                       1320 non-null   float64
 11  Time of Day                               1320 non-null   object 
 12  Day of Week                               1320 non-null   object 
 13  State of Charge (Start %)                 1320 non-null   float64
 14  State of Charge (End %)                   1320 non-null   float64
 15  Distance Driven (since last charge) (km)  1254 non-null   float64
 16  Temperature (°C)                          1320 non-null   float64
 17  Vehicle Age (years)                       1320 non-null   float64
 18  Charger Type                              1320 non-null   object 
 19  User Type                                 1320 non-null   object 
dtypes: float64(10), object(10)
memory usage: 206.4+ KB

Analysis Plan: EV Charging Dataset

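The EV charging dataset comprises 20 columns: 10 object variables and 10 float64 variables. Three columns (Energy Consumed, Charging Rate, and Distance Driven) each have 1,254 non-null values out of 1,320 rows, so 66 missing values per column must be handled. The analysis plan will be completed in three steps to answer the questions and test the hypotheses stated above.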

Step 1: Wrangle the data to clean up the DataFrame, then explore the dataset graphically.
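A minimal sketch of the wrangling in Step 1, assuming the column names shown in Table 2. To stay self-contained it uses three rows in the layout of Table 1; the timestamps come from Table 1, while the missing value is inserted for illustration. The month/day/year reading of the timestamps and the median imputation are assumptions, not decisions the team has made.

```python
import numpy as np
import pandas as pd

# Three rows in the layout of Table 1; the NaN is inserted for illustration.
raw = pd.DataFrame({
    "Charging Start Time": ["1/1/2024 0:00", "1/1/2024 1:00", "1/1/2024 2:00"],
    "Charging End Time": ["1/1/2024 0:39", "1/1/2024 3:01", "1/1/2024 4:48"],
    "Energy Consumed (kWh)": [60.712346, np.nan, 19.128876],
    "Charger Type": ["DC Fast Charger", "Level 1", "Level 2"],
})

clean = raw.copy()

# Parse the timestamp strings (object dtype) into datetimes.
# Assumption: the strings are month/day/year.
for col in ["Charging Start Time", "Charging End Time"]:
    clean[col] = pd.to_datetime(clean[col], format="%m/%d/%Y %H:%M")

# Count missing values per column before deciding how to handle them.
missing = clean.isna().sum()
print(missing[missing > 0])

# One simple option: fill numeric gaps with the column median.
energy = "Energy Consumed (kWh)"
clean[energy] = clean[energy].fillna(clean[energy].median())

# Store low-cardinality text columns as categoricals.
clean["Charger Type"] = clean["Charger Type"].astype("category")

# Derive elapsed time from the timestamps as a cross-check
# against the recorded Charging Duration (hours) column.
clean["Elapsed (hours)"] = (
    clean["Charging End Time"] - clean["Charging Start Time"]
).dt.total_seconds() / 3600
print(clean.dtypes)
```

Dropping rows with missing values is an equally plausible choice; which option is appropriate depends on whether the gaps turn out to be random.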

Step 2: Analyze the dataset statistically using (i) analysis of variance and (ii) multivariable regression to understand patterns within the data.
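The two techniques named in Step 2 could look roughly like the sketch below. It runs on synthetic stand-in data generated with a fixed seed, not on the real dataset, so the printed numbers say nothing about actual EV charging. The choice of `scipy.stats.f_oneway` for the ANOVA and plain least squares via `numpy.linalg.lstsq` for the regression is ours; the team may prefer another library.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data (NOT the real dataset), using Table 2 column names.
rng = np.random.default_rng(0)
n = 120
demo = pd.DataFrame({
    "User Type": rng.choice(
        ["Commuter", "Casual Driver", "Long-Distance Traveler"], size=n
    ),
    "Battery Capacity (kWh)": rng.uniform(50, 110, size=n),
    "State of Charge (Start %)": rng.uniform(5, 90, size=n),
})
# Made-up relationship used only to give the regression something to recover.
demo["Charging Cost (USD)"] = (
    5
    + 0.2 * demo["Battery Capacity (kWh)"]
    - 0.1 * demo["State of Charge (Start %)"]
    + rng.normal(0, 2, size=n)
)

# (i) One-way ANOVA: does mean cost differ across user types?
groups = [g["Charging Cost (USD)"].to_numpy() for _, g in demo.groupby("User Type")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3f}")

# (ii) Multivariable regression by ordinary least squares:
# cost ~ intercept + battery capacity + starting state of charge.
X = np.column_stack([
    np.ones(n),
    demo["Battery Capacity (kWh)"],
    demo["State of Charge (Start %)"],
])
y = demo["Charging Cost (USD)"].to_numpy()
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "battery_kwh", "soc_start"], coef.round(3))))
```

On the real data, categorical predictors such as Vehicle Model and User Type would need to be encoded (for example, as indicator columns) before entering the regression.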

Step 3: Build a predictive model from the data to understand and predict user trends with respect to electric vehicle charging.
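A hedged sketch of what Step 3 might involve: hold out part of the data, fit a model on the rest, and report the error on the held-out sessions. The proposal does not yet define an efficiency metric, so the target used here (Charging Duration) is a placeholder, the data are synthetic, and linear regression from scikit-learn is just one candidate model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real dataset), using Table 2 column names.
rng = np.random.default_rng(1)
n = 200
demo = pd.DataFrame({
    "Battery Capacity (kWh)": rng.uniform(50, 110, size=n),
    "State of Charge (Start %)": rng.uniform(5, 90, size=n),
})
# Placeholder target with a made-up linear relationship plus noise.
demo["Charging Duration (hours)"] = (
    1.0
    + 0.02 * demo["Battery Capacity (kWh)"]
    - 0.01 * demo["State of Charge (Start %)"]
    + rng.normal(0, 0.2, size=n)
)

features = ["Battery Capacity (kWh)", "State of Charge (Start %)"]
X = demo[features]
y = demo["Charging Duration (hours)"]

# Hold out 25% of the sessions to estimate out-of-sample error.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred)
print(f"Held-out MAE: {mae:.3f} hours")
```

Evaluating on held-out sessions rather than on the training data gives a more honest picture of how well the model would predict new charging sessions.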