Carl Data Solutions employs Machine Learning to manage water resources in Columbia Basin

Canada is in a privileged position, holding 20% of the world’s freshwater supply and 9% of the world’s renewable fresh water. But climate change is affecting this precious resource. In northern regions, it is already having devastating impacts on ecological, economic and human systems. To help mitigate these impacts, Carl Data Solutions Inc. led a Fresh Water Data Commons (FWDC) project which is now deployed in Anderson Creek in the Columbia Basin near the city of Nelson, BC.


This region of British Columbia presents a diverse set of data types and monitoring conditions, with geographies that range from remote areas to locations with large industrial water users. Advanced water monitoring is therefore essential to understanding and characterizing the challenges. Collecting regional data to provide a real-time view of conditions is fundamental to helping water authorities and communities forecast future conditions.

Achieving better water balance is one of the project objectives. By developing and using a machine learning predictive analytics platform, Carl Data Solutions’ team of software engineers is gaining a deeper understanding of environmental conditions and the impact of development, industry and climate change in target areas. Breakthrough features of the platform enable its users to:
  • Connect to private data sources such as corporate systems, SCADA, and other cloud platforms

  • Connect to public data sources such as Environment Canada and other government Application Program Interfaces (APIs)

  • Connect to remote sensors and dataloggers deployed in the field

  • Monitor user data through a GIS-based web interface

  • Create real-time data alerts to notify users when events occur

  • Visualize data with graphs and reports

  • Share data with other users and organizations with full control of permissions and access to data

  • Transform data using Python scripts deployed in the platform to build real-time analytics, predictive analytics and Machine Learning (ML) models

  • Build applications and services using a REST API that supports integration with applications like Excel and PowerBI, cloud-based services like Azure and AWS, or on-premises ERP systems

The platform employs pre-configured and user-defined algorithms and calculations that run against collected data and instantaneously merge other information sources to create predictive data channels. Quality control algorithms are available to flag suspect data points, minimizing errors in the alarms and reports produced in real time.
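As an illustration of what flagging suspect data points can look like, here is a minimal sketch that combines a range check with a rate-of-change check. It is not the platform's actual algorithm: the function name `flag_suspect_points`, the thresholds and the sample readings are all assumptions made for this example.

```python
from typing import List, Optional


def flag_suspect_points(
    values: List[Optional[float]],
    valid_min: float,
    valid_max: float,
    max_step: float,
) -> List[bool]:
    """Return one flag per reading; True marks a suspect data point.

    A reading is flagged when it is missing, falls outside
    [valid_min, valid_max], or differs by more than max_step from the
    last reading that passed the checks.
    """
    flags: List[bool] = []
    last_good: Optional[float] = None
    for value in values:
        suspect = value is None or not (valid_min <= value <= valid_max)
        if not suspect and last_good is not None:
            suspect = abs(value - last_good) > max_step
        flags.append(suspect)
        if not suspect:
            last_good = value
    return flags


# Hypothetical stream flow readings in m3/s; the spike and the gap are made up.
readings = [0.31, 0.33, 4.80, 0.34, None, 0.35]
flags = flag_suspect_points(readings, valid_min=0.0, valid_max=2.0, max_step=0.5)
print(flags)  # [False, False, True, False, True, False]
```

Comparing each reading with the last good value, rather than with the previous raw value, keeps a single spike from also flagging the normal reading that follows it.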

All data is accessible either through a web service or a SaaS-based application to end users who have been assigned the appropriate security permissions. Additional data sources from third-party systems (e.g., SCADA) can also be added, so end users have a comprehensive, standardized view of all data sources required for analysis without the need to consult multiple, disparate applications and data sources.


Benefits of the machine learning predictive analytics platform


The platform was designed and built to be deployed using the Microsoft Azure cloud computing service and its architecture was designed with six layers: Collect, Ingest, Transform, Store, Serve, and Present.


Some of the notable outputs of the FWDC project are measured in operational efficiencies, because the highly customizable predictive analytics platform was developed specifically for environmental data monitoring in rugged natural or industrial environments. Accurately predicting future volumes of water flowing out of the Anderson Creek watershed is an important benefit to water authorities.


Predictive channels are produced through a dynamic water balance model calibrated for the freshet period. The freshet represents the period with the highest potential for peak annual flows in Anderson Creek. Based on the 2021 freshet from April 1st, 2021, to August 1st, 2021:

  • The average observed volume over the period from Environment Canada was 12,900 cu.m/d

  • The corresponding average modeled volume was 11,569 cu.m/d. The platform yielded 89.7% of the observed volume for the freshet period

  • For the freshet peak occurring on May 18th, 2021, the observed 24-hour average peak flow was 0.4607 m³/s. The 24-hour average peak flow of the model was 0.3475 m³/s

  • The platform yielded 75.4% of the observed peak flow during the freshet
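The two percentages above follow directly from the reported figures. The short sketch below reproduces the arithmetic; the variable and function names are chosen for this example, and the conversion of the peak flow to a daily volume is added here only to put both measures in the same unit.

```python
SECONDS_PER_DAY = 86_400

observed_volume = 12_900.0  # cu.m/d, average observed volume (Environment Canada)
modeled_volume = 11_569.0   # cu.m/d, average modeled volume
observed_peak = 0.4607      # m3/s, observed 24-hour average peak flow
modeled_peak = 0.3475       # m3/s, modeled 24-hour average peak flow


def percent_of_observed(modeled: float, observed: float) -> float:
    """Express a modeled value as a percentage of the observed value."""
    return 100.0 * modeled / observed


volume_pct = percent_of_observed(modeled_volume, observed_volume)
peak_pct = percent_of_observed(modeled_peak, observed_peak)

# Convert the observed peak from m3/s to cu.m/d for comparison with the average.
observed_peak_per_day = observed_peak * SECONDS_PER_DAY

print(f"Volume: {volume_pct:.1f}% of observed")  # Volume: 89.7% of observed
print(f"Peak flow: {peak_pct:.1f}% of observed")  # Peak flow: 75.4% of observed
print(f"Observed peak: {observed_peak_per_day:.0f} cu.m/d")  # Observed peak: 39804 cu.m/d
```

Expressed as a daily volume, the observed peak is roughly three times the average observed volume for the freshet period.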


Predicted versus actual water volume

The platform automates most of the tasks required for data acquisition and preparation. Its Machine Learning algorithm also optimizes the underlying model, which improves accuracy as more data is collected. The combined impact is a significant saving in the time and resources required to produce similar results through less automated methods.

New predictive analytics products created from the project


Software developed as part of the FWDC project is now being offered in a series of new predictive analytics products from Carl Data Solutions that are also aimed at Smart City and industrial infrastructure customers. These include Auto I&I and Auto QA/QC.


Auto I&I was successfully Beta-tested with AECOM in York Region's long-term flow monitoring program. Auto I&I detects storm events that match defined rainfall criteria. By automating Inflow and Infiltration (I&I) data gathering and reports, municipal water treatment engineers can quickly see which areas of their systems are most impacted by a storm event and visualize real-time I&I metrics on a Geographic Information System (GIS) map.
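To make the idea of matching storm events against defined rainfall criteria concrete, here is a minimal sketch. It assumes two criteria that are not taken from Auto I&I itself: wet readings no more than `max_dry_gap` apart belong to the same event, and an event qualifies when its total depth reaches `min_total_mm`. The function name and the sample rainfall values are also hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Reading = Tuple[datetime, float]  # (timestamp, rainfall depth in mm)
Event = Tuple[datetime, datetime, float]  # (start, end, total depth in mm)


def detect_storm_events(
    readings: List[Reading],
    max_dry_gap: timedelta,
    min_total_mm: float,
) -> List[Event]:
    """Group wet readings into events and keep those meeting the criteria."""
    events: List[Event] = []
    start = end = None
    total = 0.0
    for timestamp, depth_mm in sorted(readings):
        if depth_mm <= 0:
            continue  # dry readings only matter through the gap they create
        if start is not None and timestamp - end > max_dry_gap:
            # The dry gap closed the previous event; keep it if it qualifies.
            if total >= min_total_mm:
                events.append((start, end, total))
            start, total = None, 0.0
        if start is None:
            start = timestamp
        end = timestamp
        total += depth_mm
    if start is not None and total >= min_total_mm:
        events.append((start, end, total))
    return events


# Made-up hourly rainfall depths (hour offset, mm) for illustration only.
base = datetime(2021, 5, 18)
samples = [(0, 0.0), (1, 2.0), (2, 5.5), (3, 4.0), (4, 0.0),
           (12, 0.5), (20, 3.0), (21, 8.0)]
readings = [(base + timedelta(hours=h), mm) for h, mm in samples]

events = detect_storm_events(readings, timedelta(hours=6), min_total_mm=10.0)
for start, end, total in events:
    print(start, end, total)
```

With these sample values, the isolated 0.5 mm reading at hour 12 forms its own event and is dropped because it falls below the depth criterion, leaving two qualifying events.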


Auto I&I detects storm events that match defined rainfall criteria for Inflow & Infiltration

Auto QA/QC detects errors in sensor data and can prevent them from generating poor results or triggering false alerts. Data anomalies are flagged for further investigation by engineers and data analysts in their manual workflow. Certain anomalies may be deemed acceptable and automatically eliminated from the data stream. When errors are flagged and confirmed, replacement values that the application suggests based on past patterns may be used to fill them in.
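One simple way to suggest a replacement value from past patterns is to take the median of earlier good readings recorded at the same hour of day. The sketch below illustrates that approach only; the source does not describe how Auto QA/QC derives its suggestions, and the function name and sample data are assumptions.

```python
from statistics import median
from typing import Dict, List, Optional


def suggest_replacements(
    hours: List[int],
    values: List[float],
    flagged: List[bool],
) -> Dict[int, Optional[float]]:
    """Map the index of each flagged reading to a suggested value.

    The suggestion is the median of earlier unflagged readings taken at
    the same hour of day, or None when no such history exists yet.
    """
    history: Dict[int, List[float]] = {}
    suggestions: Dict[int, Optional[float]] = {}
    for index, (hour, value, is_flagged) in enumerate(zip(hours, values, flagged)):
        if is_flagged:
            earlier = history.get(hour)
            suggestions[index] = median(earlier) if earlier else None
        else:
            history.setdefault(hour, []).append(value)
    return suggestions


# Hypothetical readings at 06:00 and 18:00 over four days; 9.99 is a confirmed error.
hours = [6, 18, 6, 18, 6, 18, 6, 18]
values = [0.30, 0.42, 0.32, 0.44, 0.31, 0.43, 9.99, 0.45]
flagged = [False, False, False, False, False, False, True, False]

suggestions = suggest_replacements(hours, values, flagged)
print(suggestions)  # {6: 0.31}
```

Flagged readings are never added to the history, so a confirmed error cannot influence later suggestions.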


Auto QA/QC detects and prevents bad sensor data from generating results or triggering false alerts

Lessons learned and benefits derived from the project


At the beginning of the FWDC project, a significant amount of effort was put into developing and integrating advanced tools for predictive analytics. What became obvious over time was that the highest-value features were linked to automating tasks, streaming analytics and access to information through a solid API.


There have been many attempts to consolidate data collection. Impediments to the consolidation of disparate data sets include normalization of time series data, queuing, mass storage and access challenges, and ownership of the data.


The platform addresses each of these areas. It can accommodate data sources produced in real time, make any updates required and then store information alongside data collected from different sources and systems. The platform is best suited to collect and aggregate data using advanced analytics features to create new channels of information. Analysis using connected datasets from other analytics applications can be used for deeper insight without the need for lengthy data preparation prior to analysis. The platform easily handles data aggregation, scrubbing and enhancement.


Importantly, the organization that collects the data remains the owner of that data but has the ability to extend access as much as it likes to other organizations or even make the data publicly available. What data to collect and who to share it with is up to each organization’s system administrator. Continuing innovation accelerates as the sources, frequency and volumes of data coming into the system expand.


Another goal of the FWDC project was to make things easier for data analysts by providing access to tools that can replace repetitive, time-consuming tasks like finding similar patterns of events over time.


The platform offers significant benefit to smaller municipalities that need to start monitoring water. As communities grow, new development puts pressure on existing water infrastructure as it is expanded to serve a larger area. Just like big cities, smaller centres can’t continue to grow without expanding services. The platform can assist in both estimating the current capacity and determining if additions are required before development begins. If infrastructure is ageing, the application can be used to estimate the decline in capacity and prioritize capital replacement projects, focusing on the sections with the greatest need.


The platform can accept all time series environmental inputs and arrange data automatically, so a comparison of variables is possible. The application automatically joins disparate data sets. The associated time savings means certain data gathering tasks that were resource-intensive can now be performed quickly and automated to re-run on a scheduled basis for sharing information, trigger-based notification or regulatory reporting. All of this makes it much easier to alert the appropriate group if a negative event is detected or simply to remain in compliance with regulations specific to a particular industry or infrastructure environment.
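Joining disparate time series usually means bringing them onto a shared time grid first. The following sketch shows one common approach, averaging each series into hourly buckets and keeping the hours present in both. It is an illustration under assumed names and sample data, not a description of how the platform arranges data internally.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple

Series = List[Tuple[datetime, float]]


def hourly_means(series: Series) -> Dict[datetime, float]:
    """Average all readings that fall within the same clock hour."""
    buckets = defaultdict(list)
    for timestamp, value in series:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}


def join_on_hour(left: Series, right: Series) -> List[Tuple[datetime, float, float]]:
    """Align two series on an hourly grid, keeping hours present in both."""
    left_hourly = hourly_means(left)
    right_hourly = hourly_means(right)
    shared = sorted(left_hourly.keys() & right_hourly.keys())
    return [(hour, left_hourly[hour], right_hourly[hour]) for hour in shared]


# Hypothetical inputs: flow every 15 minutes (m3/s) and rainfall every hour (mm).
flow = [
    (datetime(2021, 5, 18, 0, 0), 0.25),
    (datetime(2021, 5, 18, 0, 15), 0.50),
    (datetime(2021, 5, 18, 0, 30), 0.75),
    (datetime(2021, 5, 18, 0, 45), 0.50),
    (datetime(2021, 5, 18, 1, 0), 0.50),
]
rainfall = [
    (datetime(2021, 5, 18, 0, 0), 1.0),
    (datetime(2021, 5, 18, 1, 0), 2.0),
    (datetime(2021, 5, 18, 2, 0), 0.0),
]

rows = join_on_hour(flow, rainfall)
for hour, flow_mean, rain_mm in rows:
    print(hour, flow_mean, rain_mm)
```

Once both variables share the same timestamps, they can be compared row by row, and the same join can be re-run on a schedule as new data arrives.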


The Machine Learning predictive analytics technology developed for the FWDC project offers a technological toolset to help every part of Canada, and the world, use data to protect water, people, ecosystems, and the economy as the demand for fresh water continues to rise from impacts of climate change.


About Carl Data Solutions


Trusted since 2014 to provide environmental monitoring to many of the largest water utilities in the U.S. and Canada, Carl Data Solutions has evolved into a leader in AI-driven predictive analytics for industrial and Smart City infrastructure applications that rely on time-series data. The company serves its customers via a trusted partner network that includes engineering and IT services companies like AECOM, Core & Main, Kerr Wood Leidal, K2 Geospatial and CSL Services.


Carl Data Solutions software performs real-time analysis, checks flow monitoring status, sets alarms through a single interface, accepts all types of data from any source and offers predictive and prescriptive analytics. From real-time, historic, wireless, satellite and SCADA data to public data sets including USGS, NOAA and weather forecasts – it doesn’t matter where the data originates – Carl Data Solutions transforms it into actionable information.