(IGS Energy via Snowflake)

Overview

Kermit Troy Berry spearheaded the development of a Snowflake-based demand forecasting and anomaly detection platform for IGS Energy, a major retail utilities provider. This platform unified hundreds of disparate forecasting models into a single Snowflake ML pipeline, dramatically simplifying operations. By leveraging five years of historical consumption data and advanced machine learning, the solution increased forecasting accuracy and provided early detection of underperforming solar assets, aligning with IGS Energy’s mission of a sustainable, reliable energy future.

 

Objective

The objective was to improve the accuracy and efficiency of energy demand forecasts while reducing maintenance overhead and cost. IGS Energy needed to replace its legacy on-premises forecasting processes with a scalable cloud solution that could forecast energy consumption more precisely and identify anomalies (like solar panel underperformance) in real time. The goal was a unified forecasting model that could handle all customer accounts, cutting complexity and cost without sacrificing accuracy, and an integrated anomaly detection system to enhance customer experience.

 

Role and Responsibilities

In his role as Senior Machine Learning Engineer (consulting via Snowflake), Kermit was responsible for end-to-end solution delivery:

  • Architecting ML Pipelines: Designed and implemented Snowpark ML pipelines for time-series forecasting, transitioning from many per-customer models to one unified model.
  • Data Engineering and ELT: Automated data ingestion and transformation using Snowflake Streams and Tasks to continuously feed the latest consumption data into models.
  • Model Development: Trained and fine-tuned forecasting models on 5+ years of energy consumption and weather data, improving the Mean Absolute Percentage Error (MAPE) by an estimated 22% (indicating significantly higher prediction accuracy).
  • Anomaly Detection: Developed an anomaly detection module to flag irregularities in solar generation output, utilizing SQL window functions and Python UDFs (user-defined functions) within Snowflake to pinpoint solar panel performance issues.
  • Data Governance: Enforced strict data security and privacy measures. Implemented GDPR-compliant role-based access control (RBAC) and dynamic data masking to protect sensitive customer information in analytical views.
  • Visualization & Stakeholder Enablement: Delivered interactive Streamlit dashboards for scenario analysis, allowing business stakeholders to simulate “what-if” scenarios (e.g., demand spikes, weather events) and explore forecast outcomes via a user-friendly interface.
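The MAPE metric cited above can be made concrete with a short, self-contained sketch. The demand and forecast numbers below are invented for illustration only, not IGS data:

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, expressed in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

# Hypothetical hourly demand (MWh) with forecasts from a legacy model
# and a newer unified model.
actual  = np.array([120.0, 135.0, 150.0, 160.0, 140.0])
legacy  = np.array([100.0, 150.0, 130.0, 180.0, 120.0])
unified = np.array([117.0, 138.0, 148.0, 163.0, 141.0])

legacy_mape = mape(actual, legacy)
unified_mape = mape(actual, unified)
# Relative improvement, as in "improved MAPE by X%".
improvement = (legacy_mape - unified_mape) / legacy_mape * 100
```

A lower MAPE means the forecast tracks actual usage more closely; the reported 22% figure is a relative reduction of this error, computed as in `improvement` above.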

 

Approach and Technology Stack

Architecture: Kermit’s approach centered on Snowflake’s Data Cloud as the unified data and compute platform. He migrated fragmented forecasting jobs (previously running in siloed environments) into Snowflake, harnessing Snowpark (Snowflake’s Python API) to develop and deploy the models directly where the data resides. This consolidation eliminated data transfer overhead and ensured consistency. Moving from hundreds of per-account models to a single multi-account model made training far more efficient, yielding a 75% reduction in model training costs.

Data Pipeline: A robust ELT pipeline was built using Snowflake Streams and Tasks. Incoming usage data (e.g., smart meter readings, weather feeds) was continuously captured: Streams tracked new rows in ingestion tables, and Tasks orchestrated scheduled merges and transformations. This automation ensured the forecasting model always trained on up-to-date data without manual intervention.
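The Streams-and-Tasks pattern can be illustrated with a minimal local stand-in (plain Python, no Snowflake connection): a "stream" exposes only rows appended since its last read, and a scheduled "task" merges those rows into a curated table. All names and values here are hypothetical:

```python
class AppendStream:
    """Local stand-in for a Snowflake Stream on an append-only table:
    tracks an offset so each consume() sees only rows added since the last."""

    def __init__(self, source_rows):
        self.source = source_rows   # the raw ingestion table (list of dicts)
        self.offset = 0             # position already consumed

    def consume(self):
        new_rows = self.source[self.offset:]
        self.offset = len(self.source)
        return new_rows

def merge_task(target, stream):
    """Stand-in for a scheduled Task running a MERGE: upsert new readings
    into the curated table, keyed by (meter_id, hour)."""
    for row in stream.consume():
        target[(row["meter_id"], row["hour"])] = row["kwh"]

raw = [{"meter_id": "m1", "hour": 0, "kwh": 1.2}]
stream = AppendStream(raw)
curated = {}
merge_task(curated, stream)                              # first scheduled run
raw.append({"meter_id": "m1", "hour": 1, "kwh": 1.5})    # new meter reading lands
merge_task(curated, stream)                              # next run picks up only the new row
```

The key property mirrored here is incrementality: each run processes only the delta, which is what kept the real pipeline cheap as data volumes grew.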

Time Series Modeling: The core forecasting model was a time-series ML algorithm tailored to utility demand patterns. Training data spanned five years of hourly consumption across 1M+ customers, incorporating seasonality and weather features. Training one unified model on this rich dataset achieved high-granularity forecasts while avoiding duplicated effort. Model experimentation was done in Snowpark using Python (pandas, statsmodels/PyTorch), and the final model was deployed as a Snowflake UDAF (user-defined aggregate function) for scalable inference across all accounts.
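As a toy illustration of the unified-model idea (not the production algorithm), a single regression can serve many accounts at once by giving each account its own intercept while sharing the seasonality and weather coefficients. All data below is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly demand for three accounts: a per-account base load,
# a shared daily cycle, and a shared weather (temperature) effect.
hours = np.tile(np.arange(24), 3)
account = np.repeat([0, 1, 2], 24)
temp = rng.normal(15, 5, hours.size)            # temperature feature
base = np.array([5.0, 8.0, 6.5])                # per-account base load
day_sin = np.sin(2 * np.pi * hours / 24)        # daily seasonality feature
demand = base[account] + 2.0 * day_sin + 0.3 * temp \
         + rng.normal(0, 0.1, hours.size)       # small noise

# One design matrix for ALL accounts: one-hot account intercepts plus
# shared seasonality and weather columns -- one fit instead of one per account.
onehot = np.eye(3)[account]
X = np.column_stack([onehot, day_sin, temp])
coef, *_ = np.linalg.lstsq(X, demand, rcond=None)
# coef[0:3] ~ per-account base loads; coef[3] ~ 2.0; coef[4] ~ 0.3
```

The point of the sketch is structural: pooling accounts into one fit lets shared effects (weather, seasonality) be learned once from all the data, which is why a unified model can match or beat many small per-account models.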

Anomaly Detection: For solar panel performance monitoring, Kermit implemented a hybrid rule-based and ML approach. Snowflake’s SQL window functions were used to compute rolling generation baselines and detect deviations per panel. Detected anomalies were then fed into a Python UDF running a predictive model (considering weather and panel specs) to confirm true anomalies, minimizing false alerts. This approach replaced a manual, Excel-based process and more precisely pinpointed underperforming installations, triggering proactive maintenance tickets.
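The rolling-baseline stage can be sketched in a few lines of Python. The window length and drop threshold below are invented for illustration; in production this logic ran as SQL window functions inside Snowflake:

```python
import numpy as np

def flag_anomalies(output_kwh, window=7, drop_frac=0.5):
    """Flag readings that fall below drop_frac of the trailing-window mean,
    mimicking an AVG(...) OVER (ROWS BETWEEN window PRECEDING AND 1 PRECEDING)
    baseline per panel. Returns indices of suspect readings."""
    out = np.asarray(output_kwh, dtype=float)
    flags = []
    for i in range(window, out.size):
        baseline = out[i - window:i].mean()   # trailing baseline, excludes today
        if baseline > 0 and out[i] < drop_frac * baseline:
            flags.append(i)
    return flags

# Ten days of steady generation, then a sharp drop on day 10.
daily = [30, 31, 29, 30, 32, 30, 31, 30, 29, 30, 12]
flag_anomalies(daily)  # -> [10]
```

In the two-stage design described above, readings flagged here would then pass through a second, weather-aware check before an alert is raised, which is what suppressed false positives from ordinary cloudy days.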

Tech Stack: The solution leveraged Snowflake Data Cloud (for data warehousing, Snowpark execution, Streams/Tasks automation), Python (Snowpark API, data science libraries), SQL (analytics queries, window functions), and Streamlit (for web UI dashboards). Integration with Snowflake’s security features (RBAC, masking policies) ensured compliance. All development and deployment were done in an agile manner, with iterative model improvements and dashboard feedback cycles.

 

Challenges and Solutions

Scaling & Complexity: The team faced the challenge of maintaining hundreds of individual forecast models (one per customer), which was complex and costly. Kermit’s solution was to refactor this approach into a single unified model, leveraging Snowflake’s power to handle large-scale data. This not only cut model training infrastructure costs by 75% but also simplified maintenance. By proving that one model could achieve comparable or better accuracy (indeed, a double-digit improvement), he gained stakeholder buy-in for this approach.

Data Volume and Performance: Handling terabytes of time-series data (billions of rows) for training and inference required careful optimization. Kermit addressed this by using Snowflake’s elastic compute and partitioning strategies. Data was partitioned by region/time, and the Snowpark pipeline was optimized to process partitions in parallel. The adoption of in-database ML (via Snowpark UDFs) eliminated data movement, ensuring that even with massive data, training and forecasting ran in minutes rather than hours.
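The partition-parallel idea can be mimicked locally with a thread pool. In the real pipeline the parallelism came from Snowflake’s elastic compute; the regions and readings below are made up:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-region partitions of meter readings (kWh). Each partition
# is summarized independently, mirroring how the Snowpark pipeline processed
# region/time partitions in parallel inside the warehouse.
readings = {
    "midwest":   [1.2, 1.4, 1.1],
    "northeast": [2.0, 2.2],
    "south":     [0.9, 1.0, 1.1, 0.8],
}

def summarize(item):
    """Independent work unit: aggregate one partition."""
    region, values = item
    return region, sum(values) / len(values)

# Partitions share no state, so they can be fanned out freely.
with ThreadPoolExecutor(max_workers=4) as pool:
    averages = dict(pool.map(summarize, readings.items()))
```

Because partitions are independent, throughput scales with the number of workers (or, in Snowflake, with warehouse size) without any change to the per-partition logic.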

Anomaly Detection Precision: Previously, anomaly detection had both false negatives (missed issues) and false positives (unnecessary field checks). Distinguishing true solar panel issues from normal variance was challenging. Kermit solved this by incorporating domain knowledge (e.g., expected output given weather conditions) into the model. The two-stage detection (SQL baseline + ML verification) significantly improved precision. This increased customer satisfaction, as maintenance teams now trust that an alert warrants action, and it reduced wasted truck rolls to check false alarms.

Data Privacy Compliance: Working with customer energy usage data in the cloud introduced compliance requirements (GDPR, state regulations). Kermit made data governance a design cornerstone: implementing column-level encryption and masking for personal identifiers and using Snowflake’s RBAC to strictly limit access on a need-to-know basis. He also established audit logging for data access. These measures ensured the project met all regulatory requirements and built customer trust, without hindering analytical capabilities.
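Dynamic masking can be approximated in plain Python to show the intent: the value a reader sees depends on their role, much as a Snowflake masking policy decides per query. The role names and mask format below are hypothetical:

```python
def mask_email(value, role):
    """Return the email as-is for authorized roles, a partial mask otherwise.
    Local sketch of a role-aware masking policy; roles are illustrative."""
    if role in {"ANALYST_FULL", "COMPLIANCE"}:   # authorized to see raw PII
        return value
    local, _, domain = value.partition("@")
    return local[0] + "***@" + domain            # e.g. "j***@example.com"

mask_email("jane.doe@example.com", "FORECASTING_RO")   # masked for this role
```

In Snowflake the same decision lives in a masking policy attached to the column, so every consumer (dashboards, UDFs, ad hoc queries) gets consistent, role-dependent output with no application-side changes.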

Stakeholder Adoption: Another challenge was making the advanced ML outputs usable for business decision makers (energy traders, planners). Kermit’s introduction of Streamlit dashboards addressed this by presenting forecasts and anomaly flags in an intuitive web app format. He iterated with end users on the UI/UX, adding features like scenario sliders (for testing assumptions) and alert summaries. This rapid feedback loop (enabled by Streamlit’s quick deployment) led to enthusiastic adoption — business users could now interact with the model results directly rather than sifting through reports, thus using the insights more effectively.

 

Results and Impact (with Metrics)

The Demand Forecasting Platform delivered substantial business and technical benefits:

Higher Forecast Accuracy: By deploying the unified model, IGS Energy achieved more accurate demand predictions, reflected in a 22% improvement in MAPE (forecast error) compared to the legacy approach. This accuracy boost means more optimal energy procurement decisions, reducing over or under buying of power.

Cost Savings: Simplifying hundreds of models into one yielded significant cost reduction. Training resource costs dropped by roughly 75%, saving infrastructure spend and time. Additionally, maintenance effort was slashed, as one model is far easier to update than many.

Faster Insights: Forecast generation that used to take ~30 minutes for all models now runs in just a few minutes on Snowflake. This speed enables near-real-time forecasting and the ability to run many simulations (e.g., scenario forecasting) on demand. Stakeholders can quickly get answers to “what-if” questions (like extreme weather impact), improving strategic agility.

Improved Anomaly Response: The new anomaly detection system is catching solar panel issues that were previously missed, while avoiding false alarms. IGS can now intervene promptly when a customer’s solar array underperforms, minimizing downtime. This has led to an increase in customer satisfaction and trust – customers see that their green energy investment is actively monitored and kept optimal. Internally, the operations team saved labor by not having to do manual comparisons; they can allocate those resources to more value-added tasks.

Regulatory Compliance and Security: The platform operated fully within GDPR guidelines, with zero incidents of data privacy breach. By baking compliance into the design, Kermit ensured IGS Energy avoided potential fines or reputational damage, while safely leveraging customer data for ML. This also positioned the company well for upcoming privacy requirements, future-proofing the solution.

Business Impact: The forecasting improvements and anomaly management collectively support IGS’s business goals. More accurate forecasts translate to cost savings in energy purchasing and a competitive pricing advantage. Better anomaly detection protects revenue (as solar customers remain satisfied and continue their service). Moreover, the intuitive dashboards bridged the gap between data science and business teams, accelerating data driven decision making across the organization. This project has been highlighted in Snowflake’s customer story library as a model for leveraging AI in the utility sector.

 

Lessons Learned

This case study offered rich learnings for Kermit and the team:

Unification Over Fragmentation: Consolidating many small models into a single robust model can greatly reduce complexity and cost, provided that data volume and diversity are handled carefully. The success here underlined that a well designed unified model can perform as well as many specialized ones, challenging the assumption that more models always mean better accuracy.

Embed ML in the Data Ecosystem: Running ML directly within the data warehouse (in-database ML) proved to be a game changer. It eliminated engineering friction and ensured models were always close to fresh data. The team learned the value of choosing tools that minimize data movement and leverage existing platforms, an approach that sped up development and simplified governance.

Importance of Domain Context: Incorporating domain-specific factors (like weather patterns for solar output) significantly improved ML outcomes. The lesson was that pure algorithms aren’t enough for complex real-world problems; blending domain knowledge with ML techniques yields more practical and credible results, especially for anomaly detection in niche areas.

Data Governance is Non-Negotiable: Kermit’s experience reinforced that building in security/privacy from day one is far easier than retrofitting it later. The project’s smooth compliance audit and zero security incidents underscored how proactive data governance (RBAC, masking, audit trails) not only avoids problems but can even accelerate adoption (stakeholders were more comfortable using the system knowing controls were in place).

User-Centric Delivery: Finally, the success of the Streamlit apps highlighted the importance of presenting technical work in a user-friendly way. By engaging end users early and often, and iterating on visualizations and interface, the team ensured the sophisticated backend actually translated into actionable insight on the front end. In essence, the effectiveness of an ML project is measured not just by model metrics, but by how well its outputs drive decisions and value for the business.

 

Visual Summary

To communicate this project, the following visuals are suggested:

Architecture Diagram: A schematic showing the end-to-end pipeline: data sources (smart meters, weather APIs, solar sensors) flowing into Snowflake, the Snowpark ML pipeline performing forecasting, and the output streaming into dashboards and alerts. This diagram would highlight components like Snowflake Streams/Tasks (for ELT), the unified forecasting model, and the anomaly detection loop, illustrating how data and results move through the system.

Forecast vs Actual Chart: A time series line graph depicting predicted vs actual energy demand over a sample period, demonstrating the improved accuracy. An annotation can highlight where the new model’s prediction closely tracks actual usage (as opposed to the old method’s wider error margin). This could also include markers where anomalies were detected (e.g., a dip in solar output that the model caught).

Streamlit Dashboard Snapshot: A screenshot of the interactive dashboard used by business stakeholders. For example, a view of the scenario analysis interface with sliders (temperature, consumption growth, etc.) and resulting forecast changes, or a dashboard page listing sites with solar array health status. This visual emphasizes how the complex analytics were made accessible and actionable for decision makers.
