Demonstration of statistical clear sky fitting#

6/24/2025

This notebook demontrates how to use the newly updated clear sky fitting method, based on smooth, periodic quantile estimation. The basic steps are:

load data
run pipeline
estimate clear sky model

Notebook setup and library imports#

[1]:

from solardatatools import DataHandler, plot_2d
from solardatatools.dataio import get_pvdaq_data

Load data#

We use NREL’s PVDAQ API with a function from the Solar Data Tools DataIO module

[2]:

data_frame = get_pvdaq_data(sysid=34, year=range(2011, 2015), api_key="DEMO_KEY")

[============================================================] 100.0% ...queries complete in 3.2 seconds

Run main pipeline#

[3]:

dh = DataHandler(data_frame)
dh.run_pipeline(power_col="ac_power")


            *********************************************
            * Solar Data Tools Data Onboarding Pipeline *
            *********************************************

            This pipeline runs a series of preprocessing, cleaning, and quality
            control tasks on stand-alone PV power or irradiance time series data.
            After the pipeline is run, the data may be plotted, filtered, or
            further analyzed.

            Authors: Bennet Meyers and Sara Miskovich, SLAC

            (Tip: if you have a mosek [https://www.mosek.com/] license and have it
            installed on your system, try setting solver='MOSEK' for a speedup)

            This material is based upon work supported by the U.S. Department
            of Energy's Office of Energy Efficiency and Renewable Energy (EERE)
            under the Solar Energy Technologies Office Award Number 38529.

task list: 100%|██████████████████████████████████| 7/7 [00:09<00:00,  1.38s/it]



total time: 9.67 seconds
--------------------------------
Breakdown
--------------------------------
Preprocessing              2.18s
Cleaning                   0.11s
Filtering/Summarizing      7.38s
    Data quality           0.08s
    Clear day detect       0.16s
    Clipping detect        3.24s
    Capacity change detect 3.89s

Fit clear sky model#

The nvals_dil parameter controlls how many data points used each day in the model. The higher the number of values, the longer the fit time (and it scales linearly).

5-minute data has 288 data points per day, roughly 50% of which are nighttime zeros, on average, and so the default is set to nvals_dil=101, to give you 100 data points between sunrise and sunset plus one “nighttime” value.

This data example happens to have 15-minute data, or 96 data points per data. Setting nvals_dil=31 avoids excessive upsampling of the data and reduces the fit time for the clear sky model by 1/3.

the regularization parameter controls the “stiffness” of the fit, with high values resulting in less “wiggly” fits. The defualt value is 0.1, and we bumped it up to 1.0 for this example (simple data set with not a lot of variability or shade effects).

[16]:

dh.fit_statistical_clear_sky_model(nvals_dil=31, regularization=1)

100%|█████████████████████████████████████████████| 1/1 [01:06<00:00, 66.79s/it]

[17]:

dh.plot_daily_signals(show_clear_model=True, start_day=650);

../../_images/getting_started_notebooks_clearsky_estimation_demo_8_0.png

The estimated clear sky signal is available in the self.scsf attribute

[18]:

dh.scsf

[18]:

array([808.74757358, 808.74757358, 808.74757358, ...,   0.        ,
         0.        ,   0.        ], shape=(140256,))

The clear sky fitting module is a wrapper on the more general quantile estimation module. As seen below, estimating the clear sky model is the same as estimating the 0.9 quanile.

[19]:

dh.quantile_object.quantiles_original

[19]:

{np.float64(0.9): array([808.74757358, 808.74757358, 808.74757358, ...,   0.        ,
          0.        ,   0.        ], shape=(140256,))}

Compare actual and estimated power heatmaps#

measured power:

[20]:

dh.plot_heatmap('raw');

../../_images/getting_started_notebooks_clearsky_estimation_demo_14_0.png

estimated clear sky system response:

[21]:

plot_2d(dh.scsf.reshape(dh.raw_data_matrix.shape, order='F')/1000);

../../_images/getting_started_notebooks_clearsky_estimation_demo_16_0.png