Time Axis Manipulation
Data Transforms Module
This module contains functions for transforming PV power data, including time-axis standardization and 2D-array generation.
- solardatatools.time_axis_manipulation.fix_daylight_savings_with_known_tz(df, tz='America/Los_Angeles', inplace=False)
- solardatatools.time_axis_manipulation.get_index_timezone(df: DataFrame) → str | None
Returns the timezone name or offset amount (in hours) of the index of a pandas DataFrame. This function was written with ChatGPT and checked by Bennet Meyers.
Parameters: df (pandas.DataFrame): The DataFrame to check.
Returns: str or None: The timezone name or offset amount (in hours) of the index, or None if the index is not timezone aware.
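The idea of checking whether an index is timezone aware can be illustrated with plain pandas. The sketch below is not the library's implementation: `index_timezone_sketch` is a hypothetical helper that only reports the timezone name, and the sample data is made up for illustration.

```python
import pandas as pd


def index_timezone_sketch(df):
    """Hypothetical stand-in: return the index timezone as a string, or None."""
    # A DatetimeIndex exposes its timezone via `.tz`; other index types do not.
    tz = getattr(df.index, "tz", None)
    if tz is None:
        return None
    return str(tz)


# Illustrative data: a naive index and a timezone-aware copy of it.
naive = pd.DataFrame(
    {"power": [0.0, 1.0]},
    index=pd.date_range("2020-01-01", periods=2, freq="15min"),
)
aware = naive.tz_localize("America/Los_Angeles")

print(index_timezone_sketch(naive))
print(index_timezone_sketch(aware))
```

A naive index yields `None`, which mirrors the documented return value for an index that is not timezone aware.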
- solardatatools.time_axis_manipulation.make_time_series(df, return_keys=True, localize_time=-8, timestamp_key='ts', value_key='meas_val_f', name_key='meas_name', groupby_keys=['site', 'sensor'], filter_length=200)
Accepts a Pandas data frame extracted from a relational or Cassandra database. These queries often result in data with repeated timestamps, as you might have multiple columns stacked into rows in the database. Defaults are intended to work with GISMo’s VADER Cassandra database implementation.
Returns a data frame with a single timestamp index and the data from different systems split into columns.
- Parameters:
df – A pandas data frame generated from a query of the VADER Cassandra database
return_keys – If true, return the mapping from data column names to site and system ID
localize_time – If non-zero, localize the time stamps. Default is PST or UTC-8
filter_length – The number of non-null data values a single system must have to be included in the output
- Returns:
A time-series data frame
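The reshaping described above, from stacked rows with repeated timestamps to one column per system, can be sketched with plain pandas. This is an illustration of the general technique, not the library's code: the column names follow the documented defaults (`ts`, `meas_val_f`, `site`, `sensor`), while the sample values, the `"_"`-joined column names, and the threshold of 2 (standing in for `filter_length`) are assumptions.

```python
import pandas as pd

# Illustrative "long" query result: several systems share each timestamp.
long_df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2020-01-01 00:00",
                "2020-01-01 00:00",
                "2020-01-01 00:00",
                "2020-01-01 00:05",
                "2020-01-01 00:05",
            ]
        ),
        "site": ["A", "B", "C", "A", "B"],
        "sensor": ["s1", "s1", "s1", "s1", "s1"],
        "meas_val_f": [0.0, 0.1, 0.2, 1.5, 1.7],
    }
)

# One row per timestamp, one column per (site, sensor) pair.
wide = long_df.pivot_table(index="ts", columns=["site", "sensor"], values="meas_val_f")

# Keep a mapping from flat column names back to (site, sensor), then flatten.
keys = {"_".join(col): col for col in wide.columns}
wide.columns = list(keys)

# Drop systems with too few non-null values (assumed threshold for this toy data).
min_count = 2
filtered = wide.loc[:, wide.notna().sum() >= min_count]
print(filtered)
```

In this toy data, system `C` reports only one value and is dropped by the filter, while `A` and `B` are kept.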
- solardatatools.time_axis_manipulation.remove_index_timezone(df)
Removes the timezone information from the index of a pandas DataFrame, if it is timezone aware. This function was written with ChatGPT and checked by Bennet Meyers.
Parameters: df (pandas.DataFrame): The DataFrame to modify.
Returns: pandas.DataFrame: The modified DataFrame.
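In plain pandas, dropping timezone information from an index can be done with `tz_localize(None)`, which keeps the local wall-clock time. The sketch below assumes that behavior for illustration; `drop_index_timezone_sketch` is a hypothetical helper, and whether the library function keeps local time or converts to UTC first is not stated here.

```python
import pandas as pd


def drop_index_timezone_sketch(df):
    """Hypothetical stand-in: return a frame whose index has no timezone."""
    if getattr(df.index, "tz", None) is not None:
        # tz_localize(None) removes the timezone and keeps local wall time.
        return df.tz_localize(None)
    return df


# Illustrative timezone-aware data.
aware = pd.DataFrame(
    {"power": [3.2, 3.4]},
    index=pd.date_range(
        "2020-06-01 12:00", periods=2, freq="15min", tz="America/Los_Angeles"
    ),
)

result = drop_index_timezone_sketch(aware)
print(result.index)
```

A frame with a naive index passes through unchanged, matching the "if it is timezone aware" condition in the description.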
- solardatatools.time_axis_manipulation.standardize_time_axis(df, timeindex=True, power_col=None, datetimekey=None, correct_tz=True, verbose=True)
This function takes in a pandas data frame containing tabular time series data, likely generated with a call to pandas.read_csv(). It is assumed that each row of the data frame corresponds to a unique date-time, though not necessarily on standard intervals. The function attempts to convert a user-specified column of time stamps to Python datetime objects, assigns this column to the index of the data frame, and then standardizes the index over time. Here, standardizing means reconstructing the index at regular intervals, starting at midnight of the first day of the data set. This addresses two common errors in raw data: (1) missing data points from skipped scans in the data acquisition system, and (2) time stamps recorded at irregular exact times, including fractional seconds.
- Parameters:
df – A pandas data frame containing the tabular time series data
datetimekey – An optional key corresponding to the name of the column that contains the time stamps
- Returns:
A new data frame with a standardized time axis
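The standardization described above can be sketched with plain pandas: round irregular time stamps, then reindex onto a regular grid that starts at midnight of the first day. This is a minimal illustration rather than the library's algorithm. The 5-minute interval, the column names `ts` and `power`, and the sample values are assumptions; real data may also need duplicate handling after rounding.

```python
import pandas as pd

# Illustrative raw data: fractional seconds and a skipped scan at 00:15.
raw = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            [
                "2020-01-01 00:05:00.4",
                "2020-01-01 00:10:00.1",
                "2020-01-01 00:20:00.7",
            ]
        ),
        "power": [1.0, 2.0, 4.0],
    }
)

df = raw.set_index("ts")

# Snap each time stamp to the nearest 5-minute mark (assumed interval).
step = pd.Timedelta(minutes=5)
df.index = df.index.round("5min")

# Build a regular grid from midnight of the first day to the end of the last day.
start = df.index[0].normalize()
end = df.index[-1].normalize() + pd.Timedelta(days=1) - step
grid = pd.date_range(start, end, freq="5min")

# Reindexing inserts NaN rows wherever a scan was skipped.
standardized = df.reindex(grid)
print(standardized.head())
```

The skipped scan at 00:15 appears as a NaN row in the result, so downstream code sees an explicit gap instead of an irregular index.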