HomeMy WebLinkAbout20210512Avista to Staff 147 Attachment B.pdfProject Name: Data Science App 2020 - Pkg. 2 - 09906625
Clarity Project ID: PR00015550
Acctg Project #: 09906625
Business Case Name: Enterprise Data Science BI
ER/BI: 5038-38X09
Item
#
Approved
Scope Item
Scope Item Detail Business Value
1 Customer
Housing &
Demographics
a) Extracted, fused and prepared dataset
that combines information from Avista's
CC&B system with 3rd-party
demographic housing data, allowing DSM
team to better understand energy
efficiency outreach opportunities,
particularly for low-income 'at risk'
customers. Dataset prepared using
newly developed algorithms written in
programming language R.
b) Visualization of above developed
dataset enables users to explore our
customers in visual manner, accelerating
the understanding of information for
more effective customer outreach.
Newly fused data is displayed with
interoperability of attributes, allowing
users to isolate certain subsets of
prepared data for targeted analysis.
Enables enterprise users
to more effectively
segment our customers,
particularly for energy
efficiency and/or low-
income energy programs
2 Identify usage
for 'net
aggregated
accounts' that
have multiple
Service
Agreements
corresponding
with only one
common
Service Point
Implement programmatic (SQL) solution
to identify the handful of customer
accounts that are deemed 'net
aggregated accounts' and insert into the
DSW (Data Science Warehouse) tables as
a result from data fusion using CC&B and
MDM. Net aggregated accounts are
unique from nearly all other customers,
in that 'net aggregated accounts' have
one service point (SP) applied to many
different Service Agreements (SAs).
Usage statistics from net aggregated
Yes Captures and makes
available usage statistics
for non-traditional
customer-types
(community assets) that
are likely to grow in
counts.
Staff_PR_147 Attachment B Page 1 of 7
accounts can inadvertently get omitted
when deploying typical SQL code against
CC&B/MDM databases.
3 Budget
dashboard
reporting to VP
Community &
Economic
Vitality
Visualization tool to monitor expense
budgets of various groups reporting to VP
Community & Economic Vitality. Added
non-labor YTD actual, budget and
variance for seven Orgs that report to the
VP.
Yes Provides visibility for the
VP and direct reports for
actual expenses
4 Certified Data
Set:
Substation &
Feeder data
Certified Data Sets (CDS) are a collection
of data that has been prepared (joined)
with other data for use by Analysts.
These data sets have been tested and
published to allow for easy access to data
for Analysts. All data requires approval
prior to accessing.
Substation & Feeder Data: This CDS
associates each active Service Point ID to
an Avista substation and Avista feeder,
with certain intentional exclusions (such
as the downtown Spokane network and
non-metered service such as
streetlights). Avista developed SQL and
FME code to prepare this CDS, with
weekly updates based upon changes to
geodatabase the prior week.
Yes The Substation & Feeder
Certified Data Set enables
enterprise analysts to
perform as-desired
analytics to advance
understanding of
customers, our
distribution system, how
to better respond in
emergencies, etc. This
CDS is a foundational
element for enhanced
AMI analytics.
5 Tracker
analysis for
Deployment
Services team
Analytics of Avista's Tracker ticket
system, displaying performance statistics
in visual format on how long Tracker
tickets take to complete. Display allows
managerial insights into workload and
task duration by team, worker, etc. SQL
code was written to prepare dataset for
Yes Allows users to
understand workflow of
Tracker tasks to better
configure resources to
meet demands of
operational teams.
Staff_PR_147 Attachment B Page 2 of 7
Tableau to consume and display to end-
users in Deployment Services team.
6 Generations
metrics
reporting
Hydro and thermal generation metrics (of
Avista owned or controlled generation)
that assist GPSS team monitor
performance of generation units thru
visualization interface. Metrics display
both (i) summary of hydro and thermal
units performance statistics, as well as (ii)
unit specific performance statistics.
Historical summary dates back to 2018,
with automation built in tool to populate
for more current periods. Data
preparation includes joining and merging
datasets in order to show plant attributes
and monetary value of lost generation.
Tool contains visual interoperability that
assists users better understand how plant
attributes relate to generation
performance metrics. Formulas and
algorithms were implemented, with GPSS
subject matter experts providing
necessary business knowledge, with Data
Science automating the calculation and
visual display for users.
Yes Hydro and thermal
generation metrics that
assist GPSS team monitor
performance thru
visualization interface.
7 Application Org
annual cost
dashboard
Expose and visualize software licensing
cost data that tracks our monthly spend
on application licensing; enables users to
better understand and manage
budgetary forecasts using different filters
(i.e., by business unit, vendor,
expenditure type, etc.)
Yes Allows for analysis of
projected growth in A09
(Application Org) software
licensing costs for 2021
and 2022.
8 Application Org
Software
Expense
Allows for review of amortization
schedules for individual product
purchases. Reports on forecasted spend
by vendor/budget owner/project/etc.
Allows for a comparison of forecasts from
Yes Allows for review of
amortization schedules for
individual product
purchases and
comparison of forecasts
Staff_PR_147 Attachment B Page 3 of 7
one forecast period to another. FME
process pulls data from SharePoint,
insets into Budget schema tables.
from one period to
another.
9 Vulnerable
Populations
Data Fusion
delivered to
Power Supply
(thru GIS)
Assembled and fused dataset of Avista
residential customers 5-year historical
monthly bills (looking at each electric,
gas, and combined electric & gas), both
bill amount(s) and energy volume
consumed, in comparison to income
metrics from either 3rd party vendor
and/or US Census data. Fused data
delivered to Power Supply for Integrated
Resource Planning requirements as
legislated by State of WA. GIS used data
to consolidate into pre-defined
geographic zones designated as
'vulnerable populations' and/or 'Pollution
Zones' by State of WA.
Yes Developed insights into
our customers "energy
burden", and how the
results are distributed
throughout pre-defined
geographic zones
designated by State of
WA. Information was
used for TAC (Technical
Advisory Committee)
meetings with
stakeholders, as part of
Avista's IRP (Integrated
Resource Planning).
10 Enable alation
to crawl
SDEPRD data
source
Identified and pointed to source data sets
for enterprise data catalogue. For
reference, crawling SDEPRD includes
DISTOPS.
Yes With proper permissions,
these data sets can be
used by various users in
the organization. The
value of pointing to these
source data sets is to
avoid one-off extractions
made from non-source
data, and to conform
users into looking at the
same source data for all
potential business
applications, thereby
eliminating discrepancies
between users.
Staff_PR_147 Attachment B Page 4 of 7
11 811 Risk Tool Developed 811 Excavation Risk tool to
calculate consequence risk of excavations
as called into 811 "Call Before You Dig"
service for OR & WA service territories.
Consequence Model developed and
deployed in FME (811 Gas Risk Tool)
based on DIMP model provided by Gas
Compliance using spatial intersection of
intended dig area with existing Avista gas
facilities. Additionally, an ArcGIS
dashboard was created in ArcGIS Online
for the Gas Compliance team. Work
product included API call for data from
ELM (3rd party call center), ArcGIS tool
displaying risk information as calculated
by proprietary algorithms developed.
Yes Enables Avista to identify
and evaluate high-risk
excavation tickets to assist
gas compliance and
operations with mitigation
strategies.
12 Capital Cost
Plan ETL for
EPBCS
a) Extracted and fused information from
Avista's Clarity and Oracle Financials to
generate Construction Work in Progress
(CWIP) and Transfer to Plant (TTP) for
monthly upload into EPBCS. Process
developed using algorithms written in R
(coding language).
b) Developed visualization to support
capital planning.
Yes Enabled PMO and Finance
to obtain more accurate
data and gain efficiency
through a more
automated process.
Staff_PR_147 Attachment B Page 5 of 7
13 Certified Data
Set:
2019 and 2020
Usage
Certified Data Sets (CDS) are a collection
of data that has been prepared (joined)
with other data for use by Analysts.
These data sets have been tested and
published to allow for easy access to data
for Analysts. All data requires approval
prior to accessing.
2019 and 2020 Usage Data: Depending
upon meter device type, usage data has
been prepared to easily obtain hourly,
daily or monthly usage, customer type
(i.e., residential, commercial, industrial,
irrigation, etc.), rate schedule, and
revenue class. Data is further partitioned
into service territory or other
geographical classifications such as State,
Zip, City, County, including Latitude and
Longitude to facilitate data integration
more easily into other systems (such as
ArcGIS, Tableau, etc.). Avista developed
proprietary SQL and FME codes to
prepare this CDS, which includes
automated updates as additional usage
information is captured in source systems
(i.e., MDM, CC&B, etc.)
Yes The usage Certified Data
Set contains high quality
data, useful for trending
analysis and for exploring
insights that may assist
with decision making.
14 Visualization
Tool
2019 & 2020
Usage
Tableau visualization developed by Data
Science, displays delivered usage for the
following:
- YTD usage for the current year as
compared to the previous year for both
kWh and Therms
- Usage as compared to weather over a
user-specified time period
- Interactive map that includes filters for
location, revenue class, rate schedule &
commodity
- Usage/Weather calendars that further
show relationship between weather and
Yes Enables business users to
understand usage trends
that span periods, as well
as better understand the
relationship between
weather and usage. The
dashboard also enables
users to better
understand usage
patterns for multiple
different filters, such as
location (state, city, or
zip), revenue class, rate
schedule, and commodity.
Staff_PR_147 Attachment B Page 6 of 7
usage
- Up to date meter counts
15 COVID-19
adjusted usage
by Revenue
Class & by
State
Modeling script developed in R on the
Trove server. Builds both linear and
polynomial regressions on pre-COVID-19
(03/01/2020 and prior) usage and
weather data using weather as the
predictor to determine a relationship.
Then applies model to all 2020 weather
data to generate usage values which
could be expected in a world without
COVID-19's effects. Compares modeled
values to actual usage values to
determine what the effect of COVID-19 is
on usage.
Yes Enables business users to
see the impact of COVID-
19 on Avista usage.
16 Certified Data
Set:
Customer
Demographic &
Housing Info
Certified Data Sets (CDS) are a collection
of data that has been prepared (joined)
with other data for use by Analysts.
These data sets have been tested and
published to allow for easy access to data
for Analysts. All data requires approval
prior to accessing.
Customer Demographics (3rd Party): 3rd
party data attributes that are matched
(fused) to Avista's customers for
subsequent analytics. Resulting 'fused'
dataset contains 800+ attributes for each
matched customer account. Avista
developed ETL process to facilitate data
preparation, including translation of 3rd
party data records into meaningful values
for enterprise analytics.
Yes The demographic Certified
Data Set contains 3rd
party data, fused with
Avista's high quality
customer data, useful for
enterprise analysts
seeking to understand
customer segmentation
and/or to inform
marketing outreach.
Staff_PR_147 Attachment B Page 7 of 7