Loading...
HomeMy WebLinkAbout20210512Avista to Staff 147 Attachment B.pdfProject Name: Data Science App 2020 - Pkg. 2 - 09906625 Clarity Project ID: PR00015550 Acctg Project #: 09906625 Business Case Name: Enterprise Data Science BI ER/BI: 5038-38X09 Item # Approved Scope Item Scope Item Detail Business Value 1 Customer Housing & Demographics a) Extracted, fused and prepared dataset that combines information from Avista's CC&B system with 3rd-party demographic housing data, allowing DSM team to better understand energy efficiency outreach opportunities, particularly for low-income 'at risk' customers. Dataset prepared using newly developed algorithms written in programming language R. b) Visualization of above developed dataset enables users to explore our customers in visual manner, accelerating the understanding of information for more effective customer outreach. Newly fused data is displayed with interoperability of attributes, allowing users to isolate certain subsets of prepared data for targeted analysis. Enables enterprise users to more effectively segment our customers, particularly for energy efficiency and/or low- income energy programs 2 Identify usage for 'net aggregated accounts' that have multiple Service Agreements corresponding with only one common Service Point Implement programmatic (SQL) solution to identify the handful of customer accounts that are deemed 'net aggregated accounts' and insert into the DSW (Data Science Warehouse) tables as a result from data fusion using CC&B and MDM. Net aggregated accounts are unique from nearly all other customers, in that 'net aggregated accounts' have one service point (SP) applied to many different Service Agreements (SAs). Usage statistics from net aggregated Yes Captures and makes available usage statistics for non-traditional customer-types (community assets) that are likely to grow in counts. Staff_PR_147 Attachment B Page 1 of 7 accounts can inadvertently get omitted when deploying typical SQL code against CC&B/MDM databases. 3 Budget dashboard reporting to VP Community & Economic Vitality Visualization tool to monitor expense budgets of various groups reporting to VP Community & Economic Vitality. Added non-labor YTD actual, budget and variance for seven Orgs that report to the VP. Yes Provides visibility for the VP and direct reports for actual expenses 4 Certified Data Set: Substation & Feeder data Certified Data Sets (CDS) are a collection of data that has been prepared (joined) with other data for use by Analysts. These data sets have been tested and published to allow for easy access to data for Analysts. All data requires approval prior to accessing. Substation & Feeder Data: This CDS associates each active Service Point ID to an Avista substation and Avista feeder, with certain intentional exclusions (such as the downtown Spokane network and non-metered service such as streetlights). Avista developed SQL and FME code to prepare this CDS, with weekly updates based upon changes to geodatabase the prior week. Yes The Substation & Feeder Certified Data Set enables enterprise analysts to perform as-desired analytics to advance understanding of customers, our distribution system, how to better respond in emergencies, etc. This CDS is a foundational element for enhanced AMI analytics. 5 Tracker analysis for Deployment Services team Analytics of Avista's Tracker ticket system, displaying performance statistics in visual format on how long Tracker tickets take to complete. Display allows managerial insights into workload and task duration by team, worker, etc. SQL code was written to prepare dataset for Yes Allows users to understand workflow of Tracker tasks to better configure resources to meet demands of operational teams. Staff_PR_147 Attachment B Page 2 of 7 Tableau to consume and display to end- users in Deployment Services team. 6 Generations metrics reporting Hydro and thermal generation metrics (of Avista owned or controlled generation) that assist GPSS team monitor performance of generation units thru visualization interface. Metrics display both (i) summary of hydro and thermal units performance statistics, as well as (ii) unit specific performance statistics. Historical summary dates back to 2018, with automation built in tool to populate for more current periods. Data preparation includes joining and merging datasets in order to show plant attributes and monetary value of lost generation. Tool contains visual interoperability that assists users better understand how plant attributes relate to generation performance metrics. Formulas and algorithms were implemented, with GPSS subject matter experts providing necessary business knowledge, with Data Science automating the calculation and visual display for users. Yes Hydro and thermal generation metrics that assist GPSS team monitor performance thru visualization interface. 7 Application Org annual cost dashboard Expose and visualize software licensing cost data that tracks our monthly spend on application licensing; enables users to better understand and manage budgetary forecasts using different filters (i.e., by business unit, vendor, expenditure type, etc.) Yes Allows for analysis of projected growth in A09 (Application Org) software licensing costs for 2021 and 2022. 8 Application Org Software Expense Allows for review of amortization schedules for individual product purchases. Reports on forecasted spend by vendor/budget owner/project/etc. Allows for a comparison of forecasts from Yes Allows for review of amortization schedules for individual product purchases and comparison of forecasts Staff_PR_147 Attachment B Page 3 of 7 one forecast period to another. FME process pulls data from SharePoint, insets into Budget schema tables. from one period to another. 9 Vulnerable Populations Data Fusion delivered to Power Supply (thru GIS) Assembled and fused dataset of Avista residential customers 5-year historical monthly bills (looking at each electric, gas, and combined electric & gas), both bill amount(s) and energy volume consumed, in comparison to income metrics from either 3rd party vendor and/or US Census data. Fused data delivered to Power Supply for Integrated Resource Planning requirements as legislated by State of WA. GIS used data to consolidate into pre-defined geographic zones designated as 'vulnerable populations' and/or 'Pollution Zones' by State of WA. Yes Developed insights into our customers "energy burden", and how the results are distributed throughout pre-defined geographic zones designated by State of WA. Information was used for TAC (Technical Advisory Committee) meetings with stakeholders, as part of Avista's IRP (Integrated Resource Planning). 10 Enable alation to crawl SDEPRD data source Identified and pointed to source data sets for enterprise data catalogue. For reference, crawling SDEPRD includes DISTOPS. Yes With proper permissions, these data sets can be used by various users in the organization. The value of pointing to these source data sets is to avoid one-off extractions made from non-source data, and to conform users into looking at the same source data for all potential business applications, thereby eliminating discrepancies between users. Staff_PR_147 Attachment B Page 4 of 7 11 811 Risk Tool Developed 811 Excavation Risk tool to calculate consequence risk of excavations as called into 811 "Call Before You Dig" service for OR & WA service territories. Consequence Model developed and deployed in FME (811 Gas Risk Tool) based on DIMP model provided by Gas Compliance using spatial intersection of intended dig area with existing Avista gas facilities. Additionally, an ArcGIS dashboard was created in ArcGIS Online for the Gas Compliance team. Work product included API call for data from ELM (3rd party call center), ArcGIS tool displaying risk information as calculated by proprietary algorithms developed. Yes Enables Avista to identify and evaluate high-risk excavation tickets to assist gas compliance and operations with mitigation strategies. 12 Capital Cost Plan ETL for EPBCS a) Extracted and fused information from Avista's Clarity and Oracle Financials to generate Construction Work in Progress (CWIP) and Transfer to Plant (TTP) for monthly upload into EPBCS. Process developed using algorithms written in R (coding language). b) Developed visualization to support capital planning. Yes Enabled PMO and Finance to obtain more accurate data and gain efficiency through a more automated process. Staff_PR_147 Attachment B Page 5 of 7 13 Certified Data Set: 2019 and 2020 Usage Certified Data Sets (CDS) are a collection of data that has been prepared (joined) with other data for use by Analysts. These data sets have been tested and published to allow for easy access to data for Analysts. All data requires approval prior to accessing. 2019 and 2020 Usage Data: Depending upon meter device type, usage data has been prepared to easily obtain hourly, daily or monthly usage, customer type (i.e., residential, commercial, industrial, irrigation, etc.), rate schedule, and revenue class. Data is further partitioned into service territory or other geographical classifications such as State, Zip, City, County, including Latitude and Longitude to facilitate data integration more easily into other systems (such as ArcGIS, Tableau, etc.). Avista developed proprietary SQL and FME codes to prepare this CDS, which includes automated updates as additional usage information is captured in source systems (i.e., MDM, CC&B, etc.) Yes The usage Certified Data Set contains high quality data, useful for trending analysis and for exploring insights that may assist with decision making. 14 Visualization Tool 2019 & 2020 Usage Tableau visualization developed by Data Science, displays delivered usage for the following: - YTD usage for the current year as compared to the previous year for both kWh and Therms - Usage as compared to weather over a user-specified time period - Interactive map that includes filters for location, revenue class, rate schedule & commodity - Usage/Weather calendars that further show relationship between weather and Yes Enables business users to understand usage trends that span periods, as well as better understand the relationship between weather and usage. The dashboard also enables users to better understand usage patterns for multiple different filters, such as location (state, city, or zip), revenue class, rate schedule, and commodity. Staff_PR_147 Attachment B Page 6 of 7 usage - Up to date meter counts 15 COVID-19 adjusted usage by Revenue Class & by State Modeling script developed in R on the Trove server. Builds both linear and polynomial regressions on pre-COVID-19 (03/01/2020 and prior) usage and weather data using weather as the predictor to determine a relationship. Then applies model to all 2020 weather data to generate usage values which could be expected in a world without COVID-19's effects. Compares modeled values to actual usage values to determine what the effect of COVID-19 is on usage. Yes Enables business users to see the impact of COVID- 19 on Avista usage. 16 Certified Data Set: Customer Demographic & Housing Info Certified Data Sets (CDS) are a collection of data that has been prepared (joined) with other data for use by Analysts. These data sets have been tested and published to allow for easy access to data for Analysts. All data requires approval prior to accessing. Customer Demographics (3rd Party): 3rd party data attributes that are matched (fused) to Avista's customers for subsequent analytics. Resulting 'fused' dataset contains 800+ attributes for each matched customer account. Avista developed ETL process to facilitate data preparation, including translation of 3rd party data records into meaningful values for enterprise analytics. Yes The demographic Certified Data Set contains 3rd party data, fused with Avista's high quality customer data, useful for enterprise analysts seeking to understand customer segmentation and/or to inform marketing outreach. Staff_PR_147 Attachment B Page 7 of 7