Loss Factor Analysis#

Combined Soiling and Degradation Estimation Module

This module estimates degradation and soiling losses from unlabeled daily energy production data. The model has the form

y_t = x_t * d_t * s_t * c_t * w_t, for t in K

where y_t [kWh] is the measured daily energy on day t, x_t [kWh] is an ideal yearly baseline of performance, and d_t, s_t, c_t, and w_t are the loss factors for degradation, soiling, capacity changes, and weather respectively. K is the set of “known” index values, i.e., the days for which we have good energy production values (not missing or corrupted).
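To make the model form concrete, here is a minimal synthetic sketch of the multiplicative decomposition. All component shapes and numeric values (the seasonal baseline, the 0.5 %/yr degradation, the 90-day soiling cycle, the capacity step, the weather noise) are illustrative assumptions, not values used by the module. It also shows that taking logarithms turns the product into a sum, which is one common way to make a multiplicative model additive.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 730                      # two years of daily values
t = np.arange(T)

# Hypothetical component signals (illustrative values only)
x = 30.0 + 10.0 * np.cos(2 * np.pi * (t - 172) / 365.25)  # yearly baseline [kWh]
d = 1.0 - 0.005 * t / 365.25                               # linear degradation
s = 1.0 - 0.05 * (t % 90) / 90                             # sawtooth soiling
c = np.where(t < 400, 1.0, 0.9)                            # capacity change on day 400
w = np.clip(1.0 - np.abs(rng.normal(0.0, 0.15, T)), 0.05, 1.0)  # weather factor

# Multiplicative model: y_t = x_t * d_t * s_t * c_t * w_t
y = x * d * s * c * w

# K: boolean mask of days with usable measurements
known = np.ones(T, dtype=bool)
known[100:110] = False       # pretend ten days are missing
y_obs = np.where(known, y, np.nan)

# In log space the product becomes a sum of components
log_y = np.log(x) + np.log(d) + np.log(s) + np.log(c) + np.log(w)
```

Every loss factor here is at most 1, so the modeled energy never exceeds the baseline.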

Author: Bennet Meyers

class solardatatools.algorithms.loss_factor_analysis.LossFactorAnalysis(energy_data, capacity_change_labels=None, outage_flags=None, **kwargs)#

Bases: object

estimate_degradation_rate(max_samples=500, median_tol=0.005, confidence_tol=0.01, fraction_hold=0.2, method='median_unbiased', verbose=False)#

This function runs a Monte Carlo simulation to estimate the uncertainty in the degradation rate estimate based on the loss model. Each iteration randomly samples the problem parameters (quantile level and soiling stiffness weight) while randomly holding out a fraction of the days (fraction_hold, 20% by default). The algorithm exits when the estimates of the median, 2.5th percentile, and 97.5th percentile have stabilized, or when max_samples is reached. Results are stored in the following class attributes:

self.degradation_rate#

self.degradation_rate_lb#

self.degradation_rate_ub#

self.MC_results#

Parameters:

max_samples – maximum number of MC samples to generate (typically exits before this)
median_tol – tolerance for median estimate stability
confidence_tol – tolerance for outer percentile estimate stability
fraction_hold – fraction of values to holdout in each sample
method – quantile estimation method (see: https://numpy.org/doc/stable/reference/generated/numpy.quantile.html)
verbose – control print statements

Returns:

None
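The stopping logic described above can be sketched as follows. This is not the library's implementation: the function name mc_until_stable, the draw_sample callback (standing in for one full iteration of sampling parameters, holding out days, and solving for a degradation rate), and the min_samples and window arguments are all hypothetical. The stability criterion used here (the running percentile estimates vary by less than the tolerance over the last few iterations) is an assumption as well.

```python
import numpy as np


def mc_until_stable(draw_sample, max_samples=500, median_tol=0.005,
                    confidence_tol=0.01, method="median_unbiased",
                    min_samples=20, window=10):
    """Draw samples until the running percentile estimates stop moving."""
    levels = [0.025, 0.5, 0.975]
    samples = []
    history = []  # running (2.5th, 50th, 97.5th) percentile estimates
    for _ in range(max_samples):
        samples.append(draw_sample())
        if len(samples) < min_samples:
            continue
        history.append(np.quantile(samples, levels, method=method))
        if len(history) >= window:
            recent = np.array(history[-window:])
            spread = recent.max(axis=0) - recent.min(axis=0)
            # Assumed criterion: all three estimates are stable over the window
            if spread[1] < median_tol and max(spread[0], spread[2]) < confidence_tol:
                break
    lb, med, ub = np.quantile(samples, levels, method=method)
    return {"median": med, "lb": lb, "ub": ub, "n_samples": len(samples)}


# Stand-in for the real per-iteration estimate: noisy draws around -0.5 %/yr
rng = np.random.default_rng(42)
result = mc_until_stable(lambda: rng.normal(-0.5, 0.05))
```

The method argument is passed straight to numpy.quantile, matching the role of the method parameter documented above.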

estimate_losses(solver='CLARABEL', verbose=False)#

holdout_validate(seed=None, solver='CLARABEL')#

make_problem(tau=0.9, num_harmonics=4, deg_type='linear', include_soiling=True, weight_seasonal=0.1, weight_soiling_stiffness=0.5, weight_soiling_sparsity=0.01, weight_deg_nonlinear=100000.0, deg_rate=None, use_capacity_change_labels=True)#

Construct the signal decomposition problem for estimation of loss factors in PV energy data.

Parameters:

tau (float) – the quantile level to fit
num_harmonics (int) – the number of harmonics to include in model for yearly periodicity
deg_type (str) – the type of degradation to model (“linear”, “nonlinear”, or “none”)
include_soiling (bool) – whether to include a soiling term
weight_seasonal (float) – the weight on the seasonal penalty term (higher is stiffer)
weight_soiling_stiffness (float) – the weight on the soiling stiffness (higher is stiffer)
weight_soiling_sparsity (float) – the weight on the soiling sparsity (higher is sparser)
weight_deg_nonlinear (float) – only used if ‘nonlinear’ degradation model is selected
deg_rate (None or float [%/yr]) – pass to set a known degradation rate rather than have the SD problem estimate it

Returns:

a gfosd.Problem instance
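For intuition about the tau parameter: fitting a quantile level is commonly expressed through the pinball (quantile) loss. Whether make_problem uses exactly this form internally is an assumption; the sketch below only shows that minimizing the pinball loss over a constant recovers the empirical tau-quantile of the data.

```python
import numpy as np


def pinball_loss(residual, tau):
    """Quantile loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return np.where(residual >= 0, tau * residual, (tau - 1) * residual)


rng = np.random.default_rng(1)
data = rng.normal(size=2000)
tau = 0.9

# Grid search over a constant fit; the loss is convex in the candidate value
candidates = np.linspace(-3, 3, 601)
total = [pinball_loss(data - cand, tau).sum() for cand in candidates]
best = candidates[int(np.argmin(total))]

# The minimizer should sit close to the empirical 0.9 quantile
empirical = np.quantile(data, tau)
```

With tau=0.9 the fit is pulled toward the upper envelope of the data, since residuals above the fit are penalized nine times more heavily than residuals below it.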

plot_decomposition(plot_capacity_component=True, figsize=(16, 8.5))#

Creates a figure with subplots illustrating the estimated signal components found through decomposition

Parameters:

figsize – size of figure (tuple)

Returns:

matplotlib figure

plot_mc_by_tau(figsize=None, title=None)#

Creates a scatterplot of the Monte Carlo samples versus tau (quantile level) and colors the points by the weight of the soiling stiffness term

Parameters:

figsize – size of figure (tuple)
title – title for figure (string)

Returns:

matplotlib figure

plot_mc_by_weight(figsize=None, title=None)#

Creates a scatterplot of the Monte Carlo samples versus weight (soiling stiffness) and colors the points by the tau (quantile level)

Parameters:

figsize – size of figure (tuple)
title – title for figure (string)

Returns:

matplotlib figure

plot_mc_histogram(figsize=None, title=None)#

Creates a histogram of the Monte Carlo samples and annotates the chart with mean, median, mode, and confidence intervals.

Parameters:

figsize – size of figure (tuple)
title – title for figure (string)

Returns:

matplotlib figure

plot_pie(figsize=None)#

Create a pie plot of losses

Returns:

matplotlib figure

plot_waterfall(plot_capacity_component=True, figsize=(10, 4))#

Create a waterfall plot of losses

Returns:

matplotlib figure

report()#

Creates a machine-readable dictionary of results from the loss factor analysis.

Returns:

dictionary

class solardatatools.algorithms.loss_factor_analysis.SetEqual(val, *args, **kwargs)#

Bases: GraphComponent

solardatatools.algorithms.loss_factor_analysis.attribute_losses(energy_model, use_ixs)#

This function assigns a total attribution to each loss factor, given a multiplicative loss factor model relative to a baseline, using Shapley attribution.

Parameters:

energy_model (2d numpy array of shape n x T, where T is the number of days and n is the number of model factors) – a multiplicative decomposition of PV daily energy, with the baseline first, i.e., baseline, degradation, soiling, capacity changes, and weather (residual)
use_ixs (1d numpy boolean array) – a numpy boolean index where False records a system outage

Returns:

a list of energy loss attributions, in the input order

Return type:

1d numpy float array
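As a rough illustration of Shapley attribution for a multiplicative model, the sketch below averages each loss factor's marginal effect on total energy over every order in which the factors can be applied. The function name shapley_losses is hypothetical, and two choices are assumptions rather than documented behavior: outage days (False in use_ixs) are simply excluded, and attributions are returned for the loss factors only (not the baseline).

```python
import itertools
import math

import numpy as np


def shapley_losses(energy_model, use_ixs):
    """Average marginal energy change per loss factor over all orderings."""
    baseline = energy_model[0, use_ixs]
    factors = energy_model[1:, use_ixs]
    n = factors.shape[0]

    def total_energy(active):
        # Total energy when only the loss factors listed in `active` are applied
        prod = np.prod(factors[list(active)], axis=0) if active else 1.0
        return float(np.sum(baseline * prod))

    attributions = np.zeros(n)
    for order in itertools.permutations(range(n)):
        active = []
        previous = total_energy(active)
        for i in order:
            active.append(i)
            current = total_energy(active)
            attributions[i] += current - previous
            previous = current
    return attributions / math.factorial(n)


# Tiny example: baseline plus three loss factors over six days, one outage day
example_model = np.vstack([
    np.full(6, 10.0),            # baseline [kWh]
    np.full(6, 0.98),            # degradation
    np.linspace(1.0, 0.9, 6),    # soiling
    np.full(6, 0.95),            # weather
])
example_ixs = np.array([True, True, False, True, True, True])
example_losses = shapley_losses(example_model, example_ixs)
```

A useful property of this scheme is that the attributions sum exactly to the difference between the fully modeled energy and the baseline energy, so no loss is double counted or left unassigned. The cost grows as n!, which is acceptable for a handful of factors.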

solardatatools.algorithms.loss_factor_analysis.enumerate_paths(n, dtype=<class 'int'>)#

Enumerates all possible paths from the origin to the ones vector in R^n.

solardatatools.algorithms.loss_factor_analysis.enumerate_paths_full(origin, destination, path=None)#

Recursive algorithm for generating all possible monotonically increasing paths between two points on an n-dimensional hypercube.
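A monotonically increasing path from the origin to the ones vector switches exactly one coordinate from 0 to 1 at each step, so each path corresponds to one ordering of the n coordinates and there are n! paths in total. The sketch below builds them from permutations rather than recursion; the function name monotone_paths and the returned layout (a list of arrays of shape (n + 1, n)) are assumptions, not the library's output format.

```python
import itertools

import numpy as np


def monotone_paths(n, dtype=int):
    """List every monotone path from the origin to the ones vector."""
    paths = []
    for order in itertools.permutations(range(n)):
        point = np.zeros(n, dtype=dtype)
        path = [point.copy()]
        for axis in order:
            point[axis] = 1          # switch on one coordinate per step
            path.append(point.copy())
        paths.append(np.array(path))
    return paths


paths_3d = monotone_paths(3)
```

These are the same orderings that a Shapley attribution averages over: moving along one edge of the hypercube corresponds to applying one additional factor.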

solardatatools.algorithms.loss_factor_analysis.make_sawtooth_dictionary(T)#

solardatatools.algorithms.loss_factor_analysis.make_st(k, phase, t)#

solardatatools.algorithms.loss_factor_analysis.model_wrapper(energy_model, use_ixs)#

solardatatools.algorithms.loss_factor_analysis.waterfall_plot(data, index, figsize=(10, 4))#

Create a waterfall plot to visualize the breakdown of energy loss factors.

This function generates a waterfall plot to display the cumulative impact of sequential loss factors. Each bar in the plot represents a specific loss factor.

Parameters:

data (pd.Series or pd.DataFrame) – Data to be plotted. This should be a pandas.Series or pandas.DataFrame where the index represents categories and the values represent the amounts.
index (pd.Index) – Index to use for the plot. Should match the length of the data.
figsize (tuple of int, optional) – Size of the figure to create, given as (width, height) in inches. Defaults to (10, 4).

Returns:

The figure object containing the waterfall plot.

Return type:

matplotlib.figure.Figure
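The geometry behind a waterfall chart is simple: each bar starts where the previous running total ended. The helper below, waterfall_bars, is a hypothetical name and only computes bar positions from a baseline and a sequence of signed changes; it is a sketch of the general technique, not of how waterfall_plot is implemented.

```python
import numpy as np


def waterfall_bars(baseline, changes):
    """Return (bottoms, heights, final_total) for floating waterfall bars."""
    steps = np.asarray(changes, dtype=float)
    # Running total before and after each change
    running = baseline + np.concatenate([[0.0], np.cumsum(steps)])
    bottoms = np.minimum(running[:-1], running[1:])
    heights = np.abs(steps)
    return bottoms, heights, running[-1]


# Illustrative numbers only: 100 kWh baseline with three losses
bottoms, heights, final_total = waterfall_bars(100.0, [-5.0, -3.0, -2.0])
```

The resulting arrays can be passed to matplotlib's Axes.bar as the height and bottom arguments to draw the floating bars.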