Learn how the newest data quality functionality in Syncsort’s Trillium Software System will ensure your enterprise data - from CRM to data lakes and everything in between - provide complete, integrated, fit-for-purpose information you can trust.
View this webcast on demand to discover how to:
• Process data quality jobs at massive scale with Trillium Quality for Big Data, running natively within Big Data frameworks like Hadoop MapReduce and Apache Spark, with no end-user coding and no system tuning or re-coding required
• Tightly integrate data quality assessment with third-party tools using our newly published Trillium Discovery REST API
• Ensure data processing complies with data governance policy through out-of-the-box data discovery integration with Collibra Data Governance Center
• Eliminate duplicate, incomplete CRM data using Trillium Quality for Dynamics CRM and achieve a true single view of the customer
What’s New in Syncsort’s Trillium Software System (TSS) 15.7
1. WHAT’S NEW - TRILLIUM DATA QUALITY
Harald Smith
February 2018
2. Speaker
Harald Smith
▪ Director of Product Management, Syncsort
▪ 20 years in Information Management focused on data quality,
integration, and governance
▪ Consulting, product management, software & solution development
▪ Co-author of Patterns of Information Management, as well as
two Redbooks on Information Governance and Data Integration
▪ Current Blog on InfoWorld: “Data Democratized”
3. Agenda
Syncsort Confidential and Proprietary - do not copy or distribute
▪ Big Data Quality
▪ Data Governance
▪ Operational Data Quality
▪ Production Processing & Application Integration
4. Data is Top of Mind
Volume and Complexity Is Growing
Compliance Demands are Broader and Deeper
Trust and Confidence in Data Is Decreasing
“Only 3% of the DQ scores in our study can be rated ‘acceptable’ using the loosest-possible standard.”
-- Harvard Business Review, September 2017
5. Insights from Syncsort’s 2017 Big Data Trends survey
▪ Data Quality is recognized as a mission-critical success factor for the Data Lake
▪ Data Quality tops the list of challenges of data lake implementation, followed closely by Data Governance
▪ But… not everyone is making the connection between Data Quality and Big Data success
▪ Participants who did not include data quality as a top 3 priority for implementing the data lake expressed the most interest in analytically-intensive data lake uses… which are highly dependent on proper data quality
▪ Financial services and insurance industries are the most focused on Data Quality and Data Governance
▪ Named Data Quality as top priority 50% more often than participants from other industries
▪ Also identified Data Governance as a top priority at more than twice the rate of those from other industries
6. Redefine the value of quality data
▪ Enable business leaders and users to seamlessly anticipate opportunities, uncover hidden risks
and make better decisions by rapidly providing complete, accurate and trusted data in
everything they do
▪ Enable governance of critical data elements through integration of data quality in the right place
at the right time
▪ Enable enrichment, validation, & verification of data central to the Customer 360 view,
including Big Data environments (e.g. Hadoop and Spark)
▪ Simplify User Experience by focusing on core use cases and patterns
▪ Provide consistent processing and results on premise and in the cloud
7. Trillium Software Product Portfolio
Trillium Software System
On Premise or via Trillium Cloud
Deploy any or all products to the cloud
Completely managed SaaS in AWS or Azure deployed in 30 days or less
Trillium Discovery 15.7
Automated data profiling and discovery tool that identifies data quality issues, facilitates business rule management, and provides data quality metrics
Trillium Quality 15.7
Data quality engine that provides data cleansing, matching, and enrichment for multi-domain, global data (including global address validation)
Trillium Precise 1.0
Integrates data enrichment for all Trillium products for key data elements including phone, email, IP address, and person
Trillium Solutions: CRM, ERP, MDM
Customized solutions for leading platforms:
• Trillium Quality for Dynamics CRM 2.3
• Trillium Quality for SAP
Discovery Center/Administration Center
Web-browser based UIs for specific users
Trillium Quality for Big Data 15.7
Enables data quality processing including cleansing, matching, and global data enrichment on Big Data platforms
Trillium Director/TSI - Real-time integration
Enables real-time, secure data quality within any application via web services or APIs
8. Trillium Software Functional Overview
[Diagram: functional overview]
Trillium Discovery: Data Profiling; Business Rules & Data Quality Assessment
Trillium Quality: Data Validation, Standardization, Linking & more; Data Verification & Enrichment
Operational Integrations: CRM; Customer 360
Data Governance; Analytics & Reporting
9. Trillium Discovery – Profiling and Monitoring
▪ Measure and mitigate risk and cost
associated with poor data quality
▪ Profile data sources to understand
current conditions and quality issues
▪ Report on data quality metrics for
accuracy, consistency and completeness
▪ Create and validate business rules
▪ Monitor data quality thresholds and
trends over time
▪ Quantify, annotate, and prioritize data
quality issues
▪ Generate recode and lookup tables, and
prepare remediated files
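To make the profiling metrics above concrete, here is a minimal Python sketch of a column-level completeness check. It is not Trillium Discovery code; the function name `profile_column`, the metric names, and the sample postal codes are all assumptions made for illustration.

```python
from collections import Counter


def profile_column(values):
    """Return simple profiling metrics for one column of raw values.

    Illustrative only: this is a hypothetical helper, not a Trillium API.
    """
    total = len(values)
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": total,
        "missing": total - len(present),
        "completeness_pct": round(100.0 * len(present) / total, 1) if total else 0.0,
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical sample data: two of five postal codes are missing.
postal_codes = ["02110", "02110", "", None, "10001"]
profile = profile_column(postal_codes)
print(profile)
```

A completeness percentage like this is the kind of figure that can then be tracked against a threshold over time.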
10. Trillium Quality – Cleansing, Verification, Enrichment, and Matching
▪ Develop workflows to transform, parse,
standardize, match and survive best record
▪ Consolidate data sources on input
▪ Match on party, household, business or any
custom identifiers
▪ De-duplicate and unify data sources to create a
single golden record
▪ Global address validation with individual
country postal rules to clean, correct and
complete name and address data
▪ Enrich missing postal information,
latitude/longitude and other reference data
▪ Deploy in batch, real-time, Hadoop or in
multiple applications
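The standardize, match, and survive steps above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Trillium Quality engine: it assumes exact matching on a standardized name plus postal code and a "first non-empty value wins" survivorship rule, and every function and field name here is hypothetical.

```python
from collections import defaultdict


def standardize(record):
    """Normalize the fields used for matching (illustrative rules only)."""
    return {
        "name": " ".join(record["name"].upper().split()),
        "postal": record["postal"].replace(" ", "").upper(),
        "email": record.get("email", "").strip().lower(),
    }


def match_key(std):
    # Assumption: exact match on standardized name + postal code.
    # Production matching is typically fuzzier than this.
    return (std["name"], std["postal"])


def survive(group):
    """Build one golden record: first non-empty value per field."""
    golden = {}
    for field in ("name", "postal", "email"):
        golden[field] = next((r[field] for r in group if r[field]), "")
    return golden


def deduplicate(records):
    """Group standardized records by match key and survive one per group."""
    groups = defaultdict(list)
    for rec in records:
        std = standardize(rec)
        groups[match_key(std)].append(std)
    return [survive(g) for g in groups.values()]


# Hypothetical input: the first two records describe the same person.
records = [
    {"name": "jane  doe", "postal": "02110", "email": ""},
    {"name": "Jane Doe", "postal": "02110 ", "email": "Jane.Doe@example.com"},
    {"name": "John Roe", "postal": "10001"},
]
golden = deduplicate(records)
print(golden)
```

The two Jane Doe records collapse into one golden record that keeps the email address found on only one of them.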
11. Trillium Director/TSI
Trillium Director – Real-time Data Quality
[Diagram: Trillium Director/TSI (Real-time DQ Application Server). Client requests from customer applications and custom applications (C, Java, Web Service, XML over HTTPS) are served by core engines that apply rules to cleanse and match.]
▪ Deploy batch or real-time Trillium Quality services across multiple platforms, servers, and applications through one interface
▪ Integrate into multiple workflows to provide data quality services to applications throughout the enterprise
▪ REST, SOAP web services
▪ Monitor the availability of each Director and facilitate fail-over if a Director becomes unavailable
12. TRILLIUM QUALITY FOR BIG DATA
New Release!
13. Trillium Quality for Big Data
Benefits
▪ Data Lake is the source of TRUSTED data for analytics
▪ Robust data quality processing at Big Data scale to meet SLAs, support use cases like Customer 360
▪ No coding or tuning saves time and resources – and helps address Big Data skills shortages
▪ Save time and network resources by keeping data in place in the data lake
Key Challenges
▪ Big Data projects require:
▪ Massive scalability
▪ Low latency
▪ Many data sources for a complete view
▪ Data quality processing using a standalone server is no longer adequate to keep up:
▪ Millions of transactions per day now very common
▪ Critical for data quality processing to meet end user SLAs and/or key success factors
Solution
▪ Trillium Quality for Big Data executes data quality jobs natively within Big Data frameworks (Hadoop MapReduce, Apache Spark)
▪ Leverages the DMX-h execution framework (Intelligent Execution)
▪ No need to move/copy huge volumes of data for quality processing; Big Data remains in place
▪ No coding or tuning; jobs are automatically optimized
14. New: Trillium Quality for Big Data – Key Features
▪ 1st release of Trillium Quality for Big Data leveraging the Syncsort DMX-h execution
framework
▪ What’s Included:
▪ Support for Hortonworks Data Platform (HDP) and Cloudera
▪ Dynamically leverages MapReduce and Apache Spark (1.0, 2.0)
▪ Standard OOTB support for cleansing, address verification, and matching, including multi-match
implementations
▪ Project Deployment to Big Data from the Trillium Control Center including Software and Postal
Directories offering Global project support
▪ UUID functionality supported and automated from the Trillium Control Center
▪ Comprehensive documentation set - including install and developer guides - is available with the
release
15. Trillium Quality for Big Data – Functional Architecture
[Diagram: Development Environment (Trillium Control Center; Data Quality Processing over INPUT flat files and REF. data) → Deploy to Hadoop → Runtime Environment (Trillium Quality on a Hadoop cluster; Data Quality Processing on nodes 1 to n)]
Develop Once – Deploy Anywhere
▪ Reuse existing Trillium Data Quality projects
▪ Reuse existing skills and experience in Trillium Software
▪ Harness Trillium Software functions and preconfigured workflows and rules
▪ Maintain parsing, standardization, matching, postal enrichment rule sets
▪ Easy-to-deploy via automation
▪ Leverages DMX-h Intelligent Execution to determine optimal execution: Hadoop or Spark
(No recoding required!)
16. Trillium Quality for Big Data
Focus on Data Quality, not the Big Data platform
▪ Use existing Data Quality skills and expertise
▪ No need to worry about mappers, reducers, big side or small side of joins, etc.
▪ Automatic optimization for best performance, load balancing, etc.
▪ No changes or tuning required, even if you change execution frameworks
▪ Future-proof job designs for emerging compute frameworks, e.g. Spark 2.x
▪ Run multiple execution frameworks in a single job
Single GUI Execute Anywhere!
Intelligent Execution - Insulate your organization from underlying complexities of Hadoop
18. Trillium Integrations for Data Governance
Benefits
▪ End-to-End Data Governance:
▪ From defining policies/rules for data quality…
▪ To technical implementation of data testing and metrics to ensure rules are complied with in data management
▪ Data quality metrics results not meeting thresholds can alert data steward(s) to take corrective action or provide remediated data
▪ Automated integration saves time and resources – and helps ensure trust in data
Key Challenges
▪ Organizations invest in data governance solutions to support compliance, ensure data is actionable by the business, etc.
▪ Many of these solutions define and manage data quality rules, but don’t provide the processing through which the rules are executed on the data, or the quality is measured
Solution
▪ Trillium Discovery provides REST API integration with solutions such as Collibra DGC and ASG Enterprise Data Intelligence or BI tools (e.g. Qlik, Tableau)
▪ Automated delivery of policy-based rules to Trillium Discovery
▪ Automated delivery of rule or profiling results to data governance solution
▪ APIs for support of custom integrations
19. TSS v15.7 REST APIs and Governance Integrations
▪ Data Quality is a core component of a Data Governance process that addresses business compliance, risk, and data management requirements and standards.
▪ What’s included:
▪ Published Trillium Discovery APIs with full
documentation
▪ All features in the Discovery Center UI are available via API: data source metadata, profiling results, data quality rules & rule results, join and dependency analysis
▪ Standard GET/POST functions using JSON
▪ Filter & sort rows, select columns of interest, and page
result sets
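As a rough illustration of the filter, sort, column-selection, and paging options listed above, the following Python sketch builds a GET request URL. The deck does not give the actual API paths or parameter names, so the base URL, the `/entities/{id}/rows` path, and the `columns`, `sort`, `filter`, `page`, and `pageSize` parameters are all hypothetical placeholders; consult the published Trillium Discovery API documentation for the real ones.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical host and path; not taken from the Trillium documentation.
BASE_URL = "https://tss.example.com/discovery/api"


def build_rows_url(entity_id, columns, sort_by, filter_expr, page, page_size):
    """Build a GET URL that selects columns, filters, sorts, and pages rows.

    All parameter names are assumptions made for this sketch.
    """
    params = {
        "columns": ",".join(columns),
        "sort": sort_by,
        "filter": filter_expr,
        "page": page,
        "pageSize": page_size,
    }
    return f"{BASE_URL}/entities/{entity_id}/rows?{urlencode(params)}"


url = build_rows_url(
    entity_id=42,
    columns=["CUST_ID", "POSTAL_CODE"],
    sort_by="CUST_ID",
    filter_expr="POSTAL_CODE is null",
    page=1,
    page_size=100,
)
# Parse the query string back to confirm what would be sent.
query = parse_qs(urlsplit(url).query)
print(url)
```

The sketch only constructs the URL; sending it, authenticating, and parsing the JSON response are left out because the deck does not describe them.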
20. Trillium Software System
Governance Integration – Functional Architecture
TS Discovery 15.7
Automated data profiling and discovery that identifies data quality issues, facilitates business rule management, and provides data quality metrics
Trillium Clients
▪ Control Center (rich)
▪ Discovery Center (browser)
▪ Administration Center (browser)
[Diagram: integration paths from the Trillium Repository]
▪ Interfaces: Export data grid rows to file; ODBC Reporting Adapter (SQL); REST APIs (GET/POST); Import/export rule sets
▪ Formats: data extracts (.csv); data interchange (.json); rule set packages (.xml)
▪ Consumers: Excel; queries; application integration; BI & reporting tools
21. Data Governance integration with Collibra
▪ Trillium Discovery and Collibra DGC:
▪ Bi-directional linkage of Collibra data
quality rules with Trillium Discovery
business rules
▪ Packaged workflow can run OOTB
▪ Develop and modify in Collibra and transfer
those rules to Discovery to apply
appropriate syntax and connect to data
sources
▪ Collibra data quality rules become available
in Trillium Discovery
▪ Automatically deliver results from associated data sources to Collibra as Data Quality Metrics
22. Use Case: Data Governance Policy Management
Collibra DGC
▪ Lets non-technical users define business policies and data rules in plain language
Trillium Discovery
▪ Converts DGC rules into technically executable data quality rules
▪ Constantly runs data quality metrics on a near real-time basis
Bi-directional connectivity to constantly sync:
▪ Collibra rulebooks and Discovery Center rules
▪ Results of Discovery quality tests
Closing the loop with DGC…
▪ If Data Quality metrics fall below defined thresholds, Collibra users are alerted via their dashboards
▪ Data stewards can review in Trillium Discovery and take corrective actions
23. Trillium Discovery-Collibra Integration – Functional Architecture
Components: Collibra DGC (DQ Rule, DQ Metric), Collibra Connect, Trillium Discovery
In Collibra DGC:
1. Create the ‘semantic’ Data Quality Rule in the Rulebook
2. Optionally edit the Predicate (expression) and Threshold
In Trillium Discovery:
3. Edit the DQ Rule Expression and associate to 1 or more data sources
4. Run the ‘technical’ Data Quality Rule and generate Passing/Failing counts
In Collibra DGC:
5. Review results for all associated Data Quality Metrics
Collibra Connect:
• Ongoing bi-directional polling or scheduling
• Linked on Collibra domain & object
identifiers
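Step 4 above, running a technical rule and generating passing and failing counts, can be pictured with the short Python sketch below. It is not Trillium Discovery's rule syntax: the `run_rule` helper, the five-digit postal code predicate, and the sample rows are assumptions chosen only to show the shape of the result.

```python
import re


def run_rule(rows, predicate):
    """Apply a rule predicate to each row and count passes and failures."""
    passing = sum(1 for row in rows if predicate(row))
    return {"passing": passing, "failing": len(rows) - passing}


def postal_is_five_digits(row):
    """Hypothetical 'technical' rule: postal code must be exactly five digits."""
    return re.fullmatch(r"\d{5}", row.get("postal") or "") is not None


# Hypothetical data source rows.
rows = [
    {"postal": "02110"},
    {"postal": "2110"},
    {"postal": None},
    {"postal": "10001"},
]
counts = run_rule(rows, postal_is_five_digits)
print(counts)
```

Counts of this form are what a governance tool would then receive as a data quality metric for the linked rule.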
24. Data Governance integration with ASG
▪ Trillium Discovery and ASG Enterprise
Data Intelligence:
▪ Uni-directional delivery of Trillium
Discovery profiling results
▪ ASG scanner runs OOTB
▪ End-to-End data transparency using
data lineage/data relationship
graphs (ASG) with data quality
metrics (Trillium Discovery) through
each transformation
▪ Visual evidence that data processing
is in fact taking place where & when
intended, with no unexpected
results/impairment to data quality
levels through a process flow
25. Use Case: Data Governance Compliance Tracing
Validity
Sun 05/01/2016 12:00:00 PM MDT
Threshold: 100
Pass: 96
Dimensions: Accuracy
Completeness
Sun 05/01/2016 12:00:00 PM MDT
Threshold: 98
Pass: 96
Dimensions: Completeness
ASG Enterprise Data Intelligence
Lets users trace data quality issues
with critical data elements through the
data lineage graph to find where &
when the issue appeared
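The two metric readings on this slide can be checked against their thresholds with a small Python sketch. It assumes that "Threshold" and "Pass" are on the same scale and that a metric is in breach when its pass value is below its threshold; the `MetricResult` class is hypothetical and not part of either product.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """One data quality metric reading (hypothetical structure)."""
    name: str
    threshold: float  # minimum acceptable pass value
    passed: float     # observed pass value

    @property
    def breached(self):
        # Assumption: a reading below its threshold counts as a breach.
        return self.passed < self.threshold


# Values as shown on the slide for Sun 05/01/2016.
results = [
    MetricResult("Validity", threshold=100, passed=96),
    MetricResult("Completeness", threshold=98, passed=96),
]
alerts = [r.name for r in results if r.breached]
print(alerts)
```

Under these assumptions both metrics fall short, which is the situation in which a data steward would trace the issue back through the lineage graph.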
26. TRILLIUM QUALITY FOR DYNAMICS CRM
New Features!
27. Trillium Quality for Dynamics CRM
Benefits
▪ Rapid time to value – can be operational within a day
▪ High-quality data is constantly maintained exactly where it is needed the most - directly within MS Dynamics CRM
Key Challenges
▪ Organizations invest in CRM to drive insight for sales, customer support in order to provide better service and increase revenue.
▪ Duplicates, old data, incomplete data, misspellings, data in wrong fields, ...
▪ Poor data quality is a primary reason for 40% of all business initiatives failing to achieve their target*
Solution
▪ Trillium Quality for Dynamics CRM
▪ Real-time cleansing & matching right in MS Dynamics CRM
▪ Comprehensive batch cleansing of the entire CRM database (including resolution from prior data migrations)
▪ Leads Analysis tool for insight into the value of prospect lists before loading into CRM
*Source: Gartner Research
28. New v15.7 MS Dynamics CRM 2.3 – New Key Features
▪ Seamless Data Quality integration directly in the Dynamics CRM environment enabling
evaluation, cleansing, de-duplication, and merging of global customer and leads
records both on premise and in the Cloud.
▪ What’s included:
▪ Enable Cross-Entity Matching for Leads
▪ Match Leads to Contacts for Newly Entered Leads
▪ Match Link Table for Leads shows Matches to both Leads and Contacts
▪ Enable Merge Between Leads and Contacts
▪ Enable Editing of Web Resource XML (Deployment Manager)
▪ New Leads Analysis tool – Examine leads and directly import only qualified, new records into CRM
▪ End User Master Batch Availability for cleansing full CRM database
▪ Microsoft Dynamics 365 support
▪ Deploy Global Data Quality solution, including email/phone verification via Trillium Precise
29. Trillium for MS Dynamics CRM v2.3 – Online Process
Integration directly in Dynamics CRM screens
▪ Enter Contacts or Leads as normal
▪ Popup validation of Address (optionally email & phone)
▪ Match to existing Contacts and Leads (cross-entity) with
cross-population of validated fields
30. Trillium for MS Dynamics CRM v2.3 – Batch Process
Integration into Batch Upload
▪ Use the Trillium web interface to analyze a lead list against the CRM instance to determine new or existing leads
▪ Establish valid patterns for matches
▪ Confirm the data prior to import
31. TRILLIUM SOFTWARE SYSTEM
Postal Maintenance & Real-time Integrations
32. Trillium Quality Postal Download Web Service
▪ A web-based application to download, activate, and update all postal and geocoding directories. TSS administrators can manage licensing, view status, and perform maintenance on directories as scheduled or required (such as monthly or quarterly), so that processing runs against the freshest possible data with minimal downtime.
▪ What’s included:
▪ Support for ASCII directories for TSS Control Center and 32-bit processing
▪ Support for UTF-8 directories for 64-bit processing with EDQ
▪ Support for Trillium Cloud to perform updates in AWS
▪ Interface to view/examine current directory status on a per country or per directory file basis
▪ User defined 1-Step or 2-Step process for directory Update and Activation through the User Interface
▪ Implementation via a User Interface or an Automated Script run on a user-defined schedule
▪ User-defined transfer processing speed - via number of workers, delay, buffer-size
▪ Enabled via a secure postal access key
33. Trillium Quality Postal Update – Functional Process
Readily managed Postal Table update process
▪ Managed through the TSI Web Server Administration browser UI
▪ Set configuration for the Postal Download Service
▪ Can choose to use 1- or 2-step process
▪ Check on Postal Directory Status and see what is out-of-date
▪ Review Postal License Management and request new countries
▪ Track updates to the Postal Directories as they happen
34. TSS v15.7 Real-time REST Web Services
▪ New Web Service methods and samples are available for real-time data cleansing and
matching through the Trillium Server Interface (TSI) for application integration
▪ What’s included:
▪ Trillium Server Interface supports the industry standard REST web services requests (in addition to
SOAP) with full JSON support
▪ Adds REST real-time Cleanse and Match (both Reference and Window match) support via Apache
Tomcat
▪ Includes Software Development Kit (SDK) for REST and SOAP web services in support of both Java
and .Net C# (including sample files to facilitate upgrading from the Director)
▪ Incorporates SSL (Secure Socket Layer) implementation to ensure secure data transfer with TSI Web
Services
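For readers integrating an application, the general shape of a REST cleanse call with a JSON body over HTTPS might look like the Python sketch below. The endpoint URL, the `record` wrapper, and the field names are hypothetical; the deck does not show the real TSI request format, so treat this only as a pattern and refer to the SDK samples for the actual contract.

```python
import json
from urllib.request import Request

# Hypothetical HTTPS endpoint; the real TSI path is not given in the deck.
TSI_URL = "https://tsi.example.com:8443/tsi/rest/cleanse"


def build_cleanse_request(record):
    """Build (but do not send) a JSON POST request for one input record.

    The payload layout is an assumption made for this sketch.
    """
    body = json.dumps({"record": record}).encode("utf-8")
    return Request(
        TSI_URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


req = build_cleanse_request(
    {"name": "jane doe", "address": "1 main st", "city": "boston"}
)
# Decode the body again to show what would go over the wire.
payload = json.loads(req.data)
print(req.get_method(), req.full_url)
```

The request is only constructed here, not sent; passing it to `urllib.request.urlopen` would perform the call against a real server.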
35. TRILLIUM CLOUD
All features available on-premise or in Cloud
36. Trillium Cloud
▪ Entire Trillium product portfolio is
available via the Cloud
▪ Cloud based solutions licensed on a
‘subscription basis’
▪ Complete Infrastructure & Data
Center Facilities
▪ Program or Project Management at
a Technical Level
▪ Technical Operations and Monitoring
of Infrastructure / Solution
▪ Trillium Cloud Solution Benefits:
▪ No Long Term Capital Investment
▪ Faster ROI
▪ Removal of Technical Complexity
37. Questions and Next Steps
▪ For more information on Trillium Software and our data quality solutions, please visit:
www.trilliumsoftware.com/products
▪ For the latest Trillium Software release, please visit our Customer Portal or contact us at:
www.trilliumsoftware.com/contact-us
▪ Contact Info:
Harald Smith, Director of Product Management, Syncsort
Harald.Smith@trilliumsoftware.com
https://www.linkedin.com/in/harald-smith-71028b
twitter: @haraldsmith1