Project Proposal
U.S. Natural Resources Revenue (2003-2023)
Introduction and data
Source: Kaggle - https://www.kaggle.com/datasets/saurabhbadole/u-s-natural-resources-revenue-2003-2023
Collection date/method: The Department of the Interior’s Office of Natural Resources Revenue collects and manages the data, which begin in 2003. The Kaggle snapshot used here covers 2003 through 2023.
Description: This dataset records revenue generated from natural resources on U.S. federal lands, federal waters, and Native American lands, along with attributes of each revenue record. The attributes include land class and category, mineral lease type, revenue type, and the commodity and product extracted.
Ethical concerns: Revenue from Native American lands is reported only at the national level to protect private and sensitive information.
Research Question
Research question: In what ways have revenue patterns from renewable versus non-renewable resource extraction (e.g., geothermal, oil, and gas) evolved over the past two decades, and how does the interaction between resource type and land category (onshore versus offshore) influence these revenue trends across different regions?
Importance: This question leverages the dataset’s extensive time span and diverse resource categories, addressing the critical topic of renewable versus non-renewable energy. It enables an exploration of long-term trends, analysis by land type, and comparisons across revenue sources, providing a relevant perspective on the economics of sustainable energy.
Description: This research examines revenue trends from renewable and non-renewable resources on federal and Native American lands over the past two decades, focusing on differences between the onshore and offshore land categories. The study aims to reveal whether renewable resource revenue, particularly from geothermal energy, has grown over time relative to non-renewables such as oil and gas. We hypothesize that (1) revenue from renewable resources has increased over time, (2) offshore lands generate more of their revenue from non-renewables while onshore lands account for most renewable revenue, and (3) non-renewable revenue remains high but fluctuates with market and policy changes.
Variable types:
Categorical: Land Category, State, Commodity
Quantitative: Revenue, Calendar Year
Glimpse of data
| | Calendar Year | Land Class | Land Category | State | County | FIPS Code | Offshore Region | Revenue Type | Mineral Lease Type | Commodity | Product | Revenue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2003 | Federal | Onshore | Pennsylvania | Armstrong | 42005.0 | NaN | Royalties | Oil & Gas | Gas | Unprocessed (Wet) Gas | 341.47 |
| 1 | 2003 | Federal | Onshore | Louisiana | Natchitoches | 22069.0 | NaN | Other revenues | Oil & Gas | Oil & gas (pre-production) | NaN | 331.30 |
| 2 | 2003 | Federal | Onshore | Missouri | Iron | 29093.0 | NaN | Royalties | Hardrock | Copper | Copper Concentrate | 57929.02 |
| 3 | 2003 | Federal | Onshore | Missouri | Iron | 29093.0 | NaN | Rents | Hardrock | Hardrock | NaN | -51533.57 |
| 4 | 2003 | Federal | Onshore | Missouri | Iron | 29093.0 | NaN | Royalties | Hardrock | Hardrock | Copper Concentrate | 14834.41 |
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48413 entries, 0 to 48412
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   Calendar Year       48413 non-null  int64
 1   Land Class          48413 non-null  object
 2   Land Category       48413 non-null  object
 3   State               46458 non-null  object
 4   County              46458 non-null  object
 5   FIPS Code           46458 non-null  float64
 6   Offshore Region     933 non-null    object
 7   Revenue Type        48413 non-null  object
 8   Mineral Lease Type  48326 non-null  object
 9   Commodity           48413 non-null  object
 10  Product             26254 non-null  object
 11  Revenue             48413 non-null  float64
dtypes: float64(2), int64(1), object(9)
memory usage: 4.4+ MB
Analysis plan
Plan: We will first wrangle the data to group U.S. states into geographic regions and to classify each resource as renewable or non-renewable. With these groupings in place, we will compare how Revenue changes across Calendar Years by Commodity, Land Category, and geographic region. The comparisons will be made primarily through data visualizations.
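The year-by-group comparison described in the plan could be prepared along the following lines. This is only a sketch under stated assumptions: the helper name `yearly_revenue` and the toy revenue figures are hypothetical, and only the column names (`Calendar Year`, `Land Category`, `Revenue`) come from the dataset.

```python
import pandas as pd


def yearly_revenue(df: pd.DataFrame, by: str) -> pd.DataFrame:
    """Sum Revenue per Calendar Year for each level of the column `by`.

    Returns a wide table: one row per year, one column per group level.
    """
    return (
        df.groupby(["Calendar Year", by])["Revenue"]
        .sum()
        .unstack(by, fill_value=0)  # year/group pairs with no rows become 0
        .sort_index()
    )


# Toy rows with made-up revenue values, used only to show the output shape.
toy = pd.DataFrame(
    {
        "Calendar Year": [2003, 2003, 2004, 2004],
        "Land Category": ["Onshore", "Offshore", "Onshore", "Onshore"],
        "Revenue": [100.0, 250.0, 40.0, 60.0],
    }
)

table = yearly_revenue(toy, "Land Category")
print(table)
```

A wide table of this shape can be passed to a plotting library to draw one line per group. The same helper would apply to any other categorical column, including the derived ones.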
Variables to be created: One categorical variable will identify whether a resource is renewable or non-renewable. A second categorical variable will assign each state to a geographic region (e.g., Texas would be in the Southern region).
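One way the two derived variables might be built is sketched below. Everything beyond the dataset's own column names is an assumption made for illustration: the new column names `Resource Type` and `Region`, the choice to treat only a commodity labeled `Geothermal` as renewable, the partial state-to-region lookup, and the made-up third sample row. The actual commodity values and the region scheme would need to be checked against the data before use.

```python
import pandas as pd

# Assumption: only geothermal (the renewable named in the research question)
# is classified as renewable; the real commodity labels must be verified.
RENEWABLE_COMMODITIES = {"Geothermal"}

# Assumption: a partial, illustrative lookup; a full analysis needs all states.
STATE_TO_REGION = {
    "Texas": "South",
    "Louisiana": "South",
    "Pennsylvania": "Northeast",
    "Missouri": "Midwest",
}


def add_derived_variables(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with hypothetical Resource Type and Region columns."""
    out = df.copy()
    out["Resource Type"] = (
        out["Commodity"]
        .isin(RENEWABLE_COMMODITIES)
        .map({True: "Renewable", False: "Non-renewable"})
    )
    # Rows without a State, or with a state missing from the lookup,
    # are labeled "Unassigned" rather than dropped.
    out["Region"] = out["State"].map(STATE_TO_REGION).fillna("Unassigned")
    return out


# Two rows modeled on the data glimpse plus one made-up geothermal row.
sample = pd.DataFrame(
    {
        "State": ["Pennsylvania", "Missouri", None],
        "Commodity": ["Gas", "Copper", "Geothermal"],
        "Revenue": [341.47, 57929.02, 100.0],
    }
)

labeled = add_derived_variables(sample)
print(labeled)
```

Keeping rows with no state under an explicit label, instead of dropping them, leaves their revenue available for national totals even though they cannot be placed in a region.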
External data: No external data to be merged.