Project Proposal

U.S. Natural Resources Revenue (2003-2023)

Author: Indecision Scientists

Affiliation: College of Information Science, University of Arizona

import numpy as np
import seaborn as sns
import pandas as pd

Introduction and data

  • Source: Kaggle - https://www.kaggle.com/datasets/saurabhbadole/u-s-natural-resources-revenue-2003-2023

  • Collection date/method: Data spanning 2003 to the present, collected and managed by the U.S. Department of the Interior’s Office of Natural Resources Revenue

  • Description: This dataset comprises revenue generated from natural resources on U.S. federal lands and waters and on Indigenous lands, along with attributes of those resources. The data include land classification, lease type, revenue type, and the commodities and products extracted.

  • Ethical concerns: Revenue data for Indigenous/Native American resources is reported only at the national level to protect private and sensitive information.

    natResRev = pd.read_csv("data/us_natResources_revenue.csv")

Research Question

  • Research question: How have revenue patterns from renewable (e.g., geothermal) versus non-renewable (e.g., oil and gas) resource extraction evolved over the past two decades, and how does the interaction between resource type and land category (onshore versus offshore) influence these revenue trends across different regions?

  • Importance: This question leverages the dataset’s extensive time span and diverse resource categories, addressing the critical topic of renewable versus non-renewable energy. It enables an exploration of long-term trends, analysis by land type, and comparisons across revenue sources, providing a relevant perspective on the economics of sustainable energy.

  • Description: This research examines revenue trends from renewable and non-renewable resources on federal and Native American lands over the past two decades, focusing on differences between onshore and offshore land categories. The study aims to reveal whether renewable resource revenue, particularly from geothermal energy, has grown over time relative to non-renewables like oil and gas. We hypothesize that (1) revenue from renewable resources has increased over time, (2) offshore lands generate more non-renewable revenue while onshore lands contribute more renewable revenue, and (3) non-renewable revenue remains high but fluctuates with market and policy changes.

  • Variable types:

    • Categorical: Land Category, State, Commodity

    • Quantitative: Revenue, Calendar Year

Glimpse of data

natResRev.head()
   Calendar Year  Land Class  Land Category  State         County        FIPS Code  Offshore Region  Revenue Type    Mineral Lease Type  Commodity                   Product                Revenue
0  2003           Federal     Onshore        Pennsylvania  Armstrong     42005.0    NaN              Royalties       Oil & Gas           Gas                         Unprocessed (Wet) Gas  341.47
1  2003           Federal     Onshore        Louisiana     Natchitoches  22069.0    NaN              Other revenues  Oil & Gas           Oil & gas (pre-production)  NaN                    331.30
2  2003           Federal     Onshore        Missouri      Iron          29093.0    NaN              Royalties       Hardrock            Copper                      Copper Concentrate     57929.02
3  2003           Federal     Onshore        Missouri      Iron          29093.0    NaN              Rents           Hardrock            Hardrock                    NaN                    -51533.57
4  2003           Federal     Onshore        Missouri      Iron          29093.0    NaN              Royalties       Hardrock            Hardrock                    Copper Concentrate     14834.41
natResRev.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48413 entries, 0 to 48412
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Calendar Year       48413 non-null  int64  
 1   Land Class          48413 non-null  object 
 2   Land Category       48413 non-null  object 
 3   State               46458 non-null  object 
 4   County              46458 non-null  object 
 5   FIPS Code           46458 non-null  float64
 6   Offshore Region     933 non-null    object 
 7   Revenue Type        48413 non-null  object 
 8   Mineral Lease Type  48326 non-null  object 
 9   Commodity           48413 non-null  object 
 10  Product             26254 non-null  object 
 11  Revenue             48413 non-null  float64
dtypes: float64(2), int64(1), object(9)
memory usage: 4.4+ MB
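
The non-null counts above show that State, County, and FIPS Code are missing for 1,955 of the 48,413 rows (48,413 minus 46,458). This may reflect offshore records and the nationally reported Native American records, but that should be checked rather than assumed. A minimal sketch of such a check follows; the helper name missing_state_by_group is hypothetical and not part of the original proposal.

```python
import pandas as pd


def missing_state_by_group(df: pd.DataFrame) -> pd.DataFrame:
    """Count rows, and rows lacking a State, per Land Class / Land Category pair."""
    # Flag rows where State is missing (True/False, never NaN).
    flagged = df.assign(state_missing=df["State"].isna())
    return (
        flagged.groupby(["Land Class", "Land Category"])["state_missing"]
        # count = rows in the group; sum of booleans = rows missing a State
        .agg(rows="count", missing="sum")
        .reset_index()
    )
```

Calling missing_state_by_group(natResRev) would show which land groups account for the missing states, which matters later when states are mapped to regions.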

Analysis plan

  • Plan: Data wrangling will first be required to create a variable that groups U.S. states into geographic regions, so that we can compare Revenue patterns across Calendar Years by Commodity, Land Category, and geographic region. We will also classify each resource as renewable or non-renewable. The comparisons will then be made primarily through a variety of data visualizations.

  • Variables to be created: A categorical variable will identify whether a resource is considered renewable or non-renewable. A second categorical variable will identify the geographic region of the state associated with each record (e.g., Texas would be in the Southern region).

  • External data: No external data to be merged.
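
The wrangling steps in the plan above can be sketched roughly as follows. This is an illustration under stated assumptions, not the final implementation: the set of renewable commodity labels and the state-to-region lookup are both hypothetical and incomplete, and the names RENEWABLE_COMMODITIES, STATE_TO_REGION, add_derived_columns, and yearly_revenue are introduced here only for the example. The region names follow the U.S. Census Bureau regions as one possible choice.

```python
import pandas as pd

# Assumption: commodity labels treated as renewable. Verify against
# natResRev["Commodity"].unique() before relying on this set.
RENEWABLE_COMMODITIES = {"Geothermal", "Wind"}

# Assumption: partial, illustrative lookup using U.S. Census Bureau regions.
# It would need to be extended to every state present in the data.
STATE_TO_REGION = {
    "Pennsylvania": "Northeast",
    "Louisiana": "South",
    "Texas": "South",
    "Missouri": "Midwest",
    "Wyoming": "West",
}


def add_derived_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with 'Resource Type' and 'Region' columns added."""
    out = df.copy()
    is_renewable = out["Commodity"].isin(RENEWABLE_COMMODITIES)
    out["Resource Type"] = is_renewable.map(
        {True: "Renewable", False: "Non-renewable"}
    )
    # Rows with no State, and states absent from the lookup, become "Unassigned"
    # so they are not silently dropped by later group-bys.
    out["Region"] = out["State"].map(STATE_TO_REGION).fillna("Unassigned")
    return out


def yearly_revenue(df: pd.DataFrame) -> pd.DataFrame:
    """Sum Revenue per year, resource type, land category, and region."""
    keys = ["Calendar Year", "Resource Type", "Land Category", "Region"]
    return df.groupby(keys, as_index=False)["Revenue"].sum()
```

With these helpers, summary = yearly_revenue(add_derived_columns(natResRev)) would give one row per year and group, which could then be passed to a plot such as sns.lineplot(data=summary, x="Calendar Year", y="Revenue", hue="Resource Type", style="Land Category").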