PV Analytics

The prediction problem

Predicting how much power a solar panel will actually generate on a given day is surprisingly hard. Cloud cover, temperature, humidity, panel angle, dust buildup: dozens of variables interact in ways that aren’t obvious.

For solar plant operators, inaccurate forecasts mean either wasted capacity or unmet energy commitments. Both cost money.

Why neural networks

This was an academic project, but I approached it like a real business problem: can we predict daily solar output accurately enough to be useful for grid planning?

Traditional regression models struggle here because the relationships aren’t linear. A 10% increase in cloud cover doesn’t mean a 10% drop in output. It depends on cloud type, altitude, time of day, and a dozen other factors. A neural network can learn those relationships from the data on its own. I don’t have to tell the model that overcast mornings in winter behave differently from overcast mornings in summer.
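To make the nonlinearity concrete, here is a small sketch using one commonly cited empirical relation between cloud cover and irradiance (the Kasten-Czeplak form, where the remaining fraction of clear-sky irradiance is 1 - 0.75 * c^3.4 for cloud fraction c). This is an illustration only: the project did not necessarily use this formula, and the function names are made up for the example.

```python
def cloud_attenuation(cloud_fraction: float) -> float:
    """Fraction of clear-sky irradiance left under a given cloud fraction (0 to 1).

    Uses the Kasten-Czeplak empirical form as an illustrative assumption.
    """
    if not 0.0 <= cloud_fraction <= 1.0:
        raise ValueError("cloud_fraction must be between 0 and 1")
    return 1.0 - 0.75 * cloud_fraction ** 3.4


def drop_from_ten_points(cloud_fraction: float) -> float:
    """Irradiance lost when cloud cover rises by 10 percentage points."""
    return cloud_attenuation(cloud_fraction) - cloud_attenuation(cloud_fraction + 0.10)


if __name__ == "__main__":
    # The same 10-point increase costs very little under mostly clear skies
    # and a lot under mostly overcast skies.
    for start in (0.1, 0.4, 0.8):
        print(f"cloud {start:.0%} -> {start + 0.10:.0%}: "
              f"lose {drop_from_ten_points(start):.3f} of clear-sky irradiance")
```

Even in this one-variable toy, the same change in cloud cover has a very different effect depending on where you start, which is exactly what a straight-line fit can't capture.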

What I built

I built a pipeline in Python to clean up messy weather station data and align it with solar panel output records. Timestamps alone were a nightmare. From there, I engineered features around time of day, season, day length, and weather combos. The model itself was a TensorFlow neural network trained on the historical data. Nothing exotic, but it worked.
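A minimal sketch of what time-based features like these could look like, using only the standard library. It is not the project's actual code: the function names, the feature names, and the choice of Cooper's approximation for solar declination are all assumptions made for illustration.

```python
import math
from datetime import datetime


def cyclical(value: float, period: float) -> tuple[float, float]:
    """Encode a periodic value as (sin, cos) so 23:00 and 00:00 end up close together."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def day_length_hours(day_of_year: int, latitude_deg: float) -> float:
    """Approximate hours of daylight from day of year and latitude.

    Solar declination uses Cooper's approximation (an assumption here).
    """
    declination = math.radians(23.45) * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)
    latitude = math.radians(latitude_deg)
    cos_hour_angle = -math.tan(latitude) * math.tan(declination)
    # Clamp to handle polar day (24 h) and polar night (0 h).
    cos_hour_angle = max(-1.0, min(1.0, cos_hour_angle))
    return 2.0 * math.degrees(math.acos(cos_hour_angle)) / 15.0


def time_features(ts: datetime, latitude_deg: float) -> dict[str, float]:
    """Build time-of-day, season, and day-length features for one timestamp."""
    hour = ts.hour + ts.minute / 60.0
    day_of_year = ts.timetuple().tm_yday
    hour_sin, hour_cos = cyclical(hour, 24.0)
    season_sin, season_cos = cyclical(day_of_year, 365.25)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "season_sin": season_sin,
        "season_cos": season_cos,
        "day_length_h": day_length_hours(day_of_year, latitude_deg),
    }


if __name__ == "__main__":
    # Hypothetical site at 50 degrees north, around the summer solstice.
    print(time_features(datetime(2023, 6, 21, 12, 0), latitude_deg=50.0))
```

The sin/cos pairs matter because a raw hour or day-of-year number tells the model that midnight and 23:00 are far apart, when physically they are neighbors.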

Results

The model beat linear regression baselines. It turns out the weather-to-energy relationship is too messy for simple models. The neural net picked up on patterns I couldn’t have coded by hand.

The real work was data cleaning. The weather data had gaps, the solar output data had sensor errors, and aligning timestamps across sources was its own mini-project. By the time I had clean, aligned data, the modeling part was almost straightforward.
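Here is a rough sketch of the kind of cleaning and alignment step described above: drop implausible sensor readings, then match each output reading to the nearest weather record, skipping anything that falls in a weather gap. The record layout, the rated-capacity check, and the 15-minute tolerance are assumptions for illustration, and both sources are assumed to already be in the same time zone.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def is_plausible(power_kw, rated_kw: float) -> bool:
    """Treat missing, negative, or above-rated readings as sensor errors (an assumed rule)."""
    return power_kw is not None and 0.0 <= power_kw <= rated_kw


def align(weather, output, tolerance: timedelta, rated_kw: float):
    """Join output readings to the nearest weather record within `tolerance`.

    weather: list of (timestamp, weather_dict)
    output:  list of (timestamp, power_kw)
    Returns a list of (timestamp, weather_dict, power_kw) rows.
    """
    weather = sorted(weather, key=lambda record: record[0])
    times = [ts for ts, _ in weather]
    rows = []
    for ts, power_kw in output:
        if not is_plausible(power_kw, rated_kw):
            continue  # sensor error: drop the reading
        i = bisect_left(times, ts)
        # The nearest weather record is either just before or just after ts.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue  # no weather data at all
        nearest = min(candidates, key=lambda j: abs(times[j] - ts))
        if abs(times[nearest] - ts) > tolerance:
            continue  # weather gap: no record close enough to trust
        rows.append((ts, weather[nearest][1], power_kw))
    return rows


if __name__ == "__main__":
    day = datetime(2023, 6, 21)
    weather = [
        (day.replace(hour=10), {"cloud": 0.2}),
        (day.replace(hour=11), {"cloud": 0.5}),
        (day.replace(hour=13), {"cloud": 0.9}),  # the 12:00 record is missing
    ]
    output = [
        (day.replace(hour=10, minute=5), 5.0),
        (day.replace(hour=11, minute=2), -3.0),  # negative power: sensor error
        (day.replace(hour=12), 6.0),             # falls inside the weather gap
        (day.replace(hour=13, minute=10), 7.0),
    ]
    for row in align(weather, output, timedelta(minutes=15), rated_kw=10.0):
        print(row)
```

Dropping rows is the simplest policy. Interpolating short weather gaps is another option, but it trades lost rows for invented values, so it is worth deciding per variable.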

