Choosing the Right Data Architecture
for Your Big Data Projects
Acknowledgements
Planning Your Enterprise Data Strategy
John Ladley
President
IMCue Solutions
Metrics for Information Management
Business Analysis Techniques for Data Professionals
Alec Sharp
Senior Consultant
Clariteq Systems Consulting
Steps to a Successful Enterprise Information Management Program
Michael F. Jennings
Executive Director - Data Governance
Walgreens
Meta Data Requirements for the Enterprise
David Loshin
President
Knowledge Integrity
Advanced MDM: Moving to the Next Level of MDM Success
Key Ideas
One Big Data database cannot accommodate all the Big Data types
One size DOES NOT fit all.
You need to know the data type and data architecture to select the most
appropriate Big Data database.
Choosing a Big Data Architecture
[Diagram: Big Data Platform, Big Data Architecture]
What is Big Data?
Big Data is about textual analytics (deriving data from unstructured content),
not dimension or fact tables. It spans:
• Web data: clickstream data, social network data
• Semi-structured data: email
• Unstructured content: comments, tweets, text messages
• Sensor data
• Vertical industries: structured transaction data
Choosing a Big Data Architecture
What do we need to consider when classifying Big Data?
• Analysis type: real time, batch
• Processing methodology: predictive analytics, analytical, querying & reporting, misc.
• Data type: metadata, master data, historical, transactional
• Data frequency: on-demand feeds, continuous feeds, real-time feeds, time series
• Content format: structured, semi-structured, unstructured
• Data sources: web and social media, machine generated, human generated, transaction data, biometric data, internal data sources (via data providers or via data originators)
• Data consumers: humans, business processes, other enterprise applications, other data repositories
• Hardware: commodity hardware, state-of-the-art hardware
Choosing a Big Data Architecture
Classify Big Data Type According to the Business Needs
Big data business problems by type

Business problem: Utilities: Predict power consumption
Big Data type: Machine-generated data
Description: Utility companies have rolled out smart meters to measure the consumption of water, gas, and electricity at regular intervals of one hour or less. These smart meters generate huge volumes of interval data that needs to be analyzed. Utilities also run big, expensive, and complicated systems to generate power. Each grid includes sophisticated sensors that monitor voltage, current, frequency, and other important operating characteristics. To gain operating efficiency, the company must monitor the data delivered by the sensors. A big data solution can analyze power generation (supply) and power consumption (demand) data using smart meters.

Business problem: Telecommunications: Customer churn analytics
Big Data type: Web and social data; transaction data
Description: Telecommunications operators need to build detailed customer churn models that include social media and transaction data, such as CDRs, to keep up with the competition. The value of the churn models depends on the quality of customer attributes (customer master data such as date of birth, gender, location, and income) and the social behavior of customers. Telecommunications providers who implement a predictive analytics strategy can manage and predict churn by analyzing the calling patterns of subscribers.

Business problem: Marketing: Sentiment analysis
Big Data type: Web and social data
Description: Marketing departments use Twitter feeds to conduct sentiment analysis to determine what users are saying about the company and its products or services, especially after a new product or release is launched. Customer sentiment must be integrated with customer profile data to derive meaningful results. Customer feedback may vary according to customer demographics.
Choosing a Big Data Architecture
Big data business problems by type

Business problem: Customer service: Call monitoring
Big Data type: Human-generated data
Description: IT departments are turning to big data solutions to analyze application logs to gain insight that can improve system performance. Log files from various application vendors are in different formats; they must be standardized before IT departments can use them.

Business problem: Retail: Personalized messaging based on facial recognition and social media
Big Data type: Web and social data; biometrics
Description: Retailers can use facial recognition technology in combination with a photo from social media to make personalized offers to customers based on buying behavior and location. This capability could have a tremendous impact on retailers' loyalty programs, but it has serious privacy ramifications. Retailers would need to make the appropriate privacy disclosures before implementing these applications.

Business problem: Retail and marketing: Mobile data and location-based targeting
Big Data type: Machine-generated data; transaction data
Description: Retailers can target customers with specific promotions and coupons based on location data. Solutions are typically designed to detect a user's location upon entry to a store or through GPS. Location data combined with customer preference data from social networks enables retailers to target online and in-store marketing campaigns based on buying history. Notifications are delivered through mobile applications, SMS, and email.

Business problem: FSS, Healthcare: Fraud detection
Big Data type: Machine-generated data; transaction data; human-generated data
Description: Fraud management predicts the likelihood that a given transaction or customer account is experiencing fraud. Solutions analyze transactions in real time and generate recommendations for immediate action, which is critical to stopping third-party fraud, first-party fraud, and deliberate misuse of account privileges. Solutions are typically designed to detect and prevent myriad fraud and risk types across multiple industries, including: credit and debit payment card fraud, deposit account fraud, technical fraud, bad debt, healthcare fraud, Medicaid and Medicare fraud, property and casualty insurance fraud, workers' compensation fraud, insurance fraud, and telecommunications fraud.
Classify Big Data Type According to the Business Needs
Key Idea
There are guidelines to help suggest the Big Data Types that are
commonly used by each industry.
Choosing a Big Data Architecture
Classify Big Data Type According to the Business Needs
Critical Success Factor
Validate that the data being collected has business value.
55% of Big Data projects don’t get completed,
…and many others fall short of their objectives.
http://www.infochimps.com/resources/report-cios-big-data-what-your-it-team-wants-you-to-know-6/
Report: CIOs & Big Data: What Your IT Team Wants You to Know
Choosing a Big Data Architecture
[Diagram: Big Data Business Needs by type, Big Data Architecture, Big Data Platform]
Ten Big Data Schemas
Relational - Graph
A graph database stores data in a graph, the most generic of data structures, capable of
elegantly representing any kind of data in a highly accessible way.
Graph databases can help you harvest more value from your data by examining
its relationships.
Provides index-free adjacency: every element contains a direct pointer to its
adjacent elements, so no index lookups are necessary.
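Index-free adjacency can be sketched in a few lines of Python (an illustration, not any particular product's implementation): each node holds direct references to its neighbors, so a two-hop traversal chases pointers without any index lookups.

# Minimal sketch of index-free adjacency: each node keeps direct
# references to its neighbors, so traversal never consults an index.

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []          # direct pointers to adjacent nodes

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

def friends_of_friends(node):
    """Walk two hops by chasing pointers, with no index lookups."""
    return {fof.name
            for friend in node.neighbors
            for fof in friend.neighbors
            if fof is not node}

alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)
print(friends_of_friends(alice))   # {'Carol'}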
Ten Big Data Schemas
Relational - Analytics / MPP Columnar
Column-oriented storage organization, which increases performance of
sequential record access at the expense of common transactional operations
such as single record retrieval, updates, and deletes
Shared nothing architecture, which reduces system contention for shared
resources and allows gradual degradation of performance in the face of hardware
failure
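The trade-off can be sketched with plain Python lists (synthetic data, no particular product): a column scan reads one contiguous list, while a single-record delete must touch every column.

# Toy contrast between row-oriented and column-oriented layouts.
rows = [
    {"id": 1, "region": "east", "sales": 100},
    {"id": 2, "region": "west", "sales": 250},
    {"id": 3, "region": "east", "sales": 175},
]
columns = {
    "id":     [1, 2, 3],
    "region": ["east", "west", "east"],
    "sales":  [100, 250, 175],
}

# Analytical scan: the row store touches every whole record,
# the column store reads only the contiguous "sales" list.
print(sum(r["sales"] for r in rows))     # 525
print(sum(columns["sales"]))             # 525

# Single-record delete: cheap in the row store, but the column store
# must touch every column list, hence the transactional penalty.
del rows[1]
for col in columns.values():
    del col[1]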
Ten Big Data Schemas
Relational - Analytics / MPP
Delivers extreme performance and scalability for all your database applications,
including Online Transaction Processing (OLTP), data warehousing (DW), and mixed
workloads.
Ten Big Data Schemas
Relational - NewSQL
Scale out relational databases by virtualizing a distributed database environment.
Provides organizations with relational data integrity combined with the
scalability and flexibility of a modern distributed, multi-site database,
supporting an unlimited number of users, larger data volumes, and extremely
high transactions per second (TPS).
Ten Big Data Schemas
PolyStructured - Document Indexing
Provides full-text search, hit highlighting, faceted search, dynamic clustering, database
integration, and rich document (e.g., Word, PDF) handling.
Provides distributed search and index replication
Highly scalable
Ten Big Data Schemas
PolyStructured - Document
Document databases completely embrace the web.
Store data with JSON documents.
Access documents and query indexes with web browsers, via HTTP.
Index, combine, and transform documents with JavaScript.
Works well with modern web and mobile apps.
Serve web apps directly.
On-the-fly document transformation and real-time change notifications
Ten Big Data Schemas
PolyStructured - Document
Document databases lack a schema, or rigid pre-defined data structures such as tables.
Data is commonly stored as JSON documents, with JavaScript used to define
MapReduce indexes.
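A hypothetical sketch of this HTTP/JSON access pattern, assuming a CouchDB-style server on localhost:5984; the orders database, document id, and fields are invented, and the requests library is an assumed dependency.

# Hypothetical sketch against a CouchDB-style HTTP API on localhost:5984;
# the "orders" database and document id are made up for illustration.
import requests

base = "http://localhost:5984/orders"

requests.put(base)                            # create the database
requests.put(f"{base}/order-1001",            # store a JSON document
             json={"customer": "Alice", "total": 42.50, "items": 3})

doc = requests.get(f"{base}/order-1001").json()
print(doc["customer"], doc["total"])          # the document comes back as JSON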
Ten Big Data Schemas
PolyStructured - Key Value Store - In-Memory - Data Grid
In-memory accelerator for Apache Hadoop, high-performance computing, streaming,
and databases (HDFS and MongoDB); eliminates MapReduce overhead.
Dynamically caches, partitions, replicates, and manages application data and business
logic across multiple servers.
Fully elastic memory-based storage grid. Virtualizes the free memory of a
potentially large number of Java virtual machines and makes it behave like a
single key-addressable storage pool for application state.
IBM WebSphere eXtreme Scale
Ten Big Data Schemas
PolyStructured - Key Value Store - In-Memory - Caching
Run atomic operations like appending to a string; incrementing the value in a hash; pushing
to a list; computing set intersection, union and difference; or getting the member with
highest ranking in a sorted set.
With an in-memory dataset, depending on your use case, you can persist it either by
dumping the dataset to disk every once in a while, or by appending each command to
a log.
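These operations match what Redis exposes; a minimal sketch with the redis-py client, assuming a Redis server on localhost (keys and values invented).

# Sketch of the atomic operations above using the redis-py client,
# assuming a Redis server is running on localhost.
import redis

r = redis.Redis()

r.append("log", "user signed in; ")          # append to a string
r.hincrby("page:views", "home", 1)           # increment a hash field
r.lpush("recent:orders", "order-1001")       # push onto a list
r.sadd("buyers:web", "alice", "bob")
r.sadd("buyers:store", "bob", "carol")
print(r.sinter("buyers:web", "buyers:store"))    # set intersection

r.zadd("leaderboard", {"alice": 310, "bob": 275})
print(r.zrevrange("leaderboard", 0, 0))          # highest-ranked member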
Ten Big Data Schemas
PolyStructured - Key Value Store - Columnar
Random, real-time read/write access to your Big Data.
Hosting of very large tables (billions of rows by millions of columns) atop
clusters of commodity hardware.
Ten Big Data Schemas
PolyStructured - Distributed File System
Storage and large-scale processing of data-sets on clusters of commodity hardware.
Distributed, scalable, and portable file-system
Key Ideas
Hadoop is the #1 distributed file system used for Big Data Projects
Hadoop is used as the shared data source platform to merge and
standardize big data with legacy data
Data as a Service
Single System Management
APIs
Applications (APIs) should be based on a single data source platform.
[Diagram: data sources (web and social media, machine generated, human generated, transaction data, biometric data, internal data sources; via data providers or via data originators) feeding a single shared data platform]
Key Ideas
Hadoop is the #1 distributed file system used for Big Data Projects
Hadoop is used as the shared data source platform to merge and
standardize big data with legacy data
Hadoop is an excellent choice to start building your shared data source
platform
Hadoop can become your System of Record (SOR) for Big Data and part of
your Master Data Management system (MDM)
Critical Success Factors
The date time format must be standardized across the data platform.
International Standard ISO 8601 specifies numeric representations of date and
time. The format YYYY-MM-DDThh:mm:ss.sTZD (e.g., 1997-07-16T19:20:30.45+01:00)
is suggested and preferred.
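As an illustration (not part of the standard itself), Python's standard library emits this layout directly:

# Emitting the suggested ISO 8601 layout with the standard library.
from datetime import datetime, timezone, timedelta

# Local time with an explicit UTC offset, e.g. 1997-07-16T19:20:30.45+01:00
tz = timezone(timedelta(hours=1))
t = datetime(1997, 7, 16, 19, 20, 30, 450000, tzinfo=tz)
print(t.isoformat())          # 1997-07-16T19:20:30.450000+01:00

# Or normalize everything to UTC before it lands on the platform.
print(datetime.now(timezone.utc).isoformat(timespec="milliseconds"))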
Unique identifiers (domain keys) must be clearly
described using friendly terminology
For example:
‘ID’ should never be a column name
‘Sales ID’ is too generic
‘Sales Representative Reporting ID’ is friendly
and clearly named
Key Idea
Hadoop is used as the shared analytical platform to merge and
standardize analytics
Single System Management
Analytics as a Service
Analytics should be based on a single data source platform.
[Diagram: analytics layered on IBM WebSphere eXtreme Scale]
Key Ideas
Hadoop is used as the shared analytical platform to merge and standardize
analytics
There are guidelines to help suggest the analytics, KPIs, and Profit Drivers
for Big Data that are commonly used by each industry.
Examples of tasks and algorithms to use (2)

Predicting a discrete attribute
• Flag the customers in a prospective buyers list as good or poor prospects.
• Calculate the probability that a server will fail within the next 6 months.
• Categorize patient outcomes and explore related factors.
Algorithms: Decision Trees, Naive Bayes, Clustering, Neural Network

Predicting a continuous attribute
• Forecast next year's sales.
• Predict site visitors given past historical and seasonal trends.
• Generate a risk score given demographics.
Algorithms: Decision Trees, Time Series, Linear Regression

Predicting a sequence
• Perform clickstream analysis of a company's Web site.
• Analyze the factors leading to server failure.
• Capture and analyze sequences of activities during outpatient visits, to formulate best practices around common activities.
Algorithm: Sequence Clustering

Finding groups of common items in transactions
• Use market basket analysis to determine product placement.
• Suggest additional products to a customer for purchase.
• Analyze survey data from visitors to an event, to find which activities or booths were correlated, to plan future activities.
Algorithms: Association, Decision Trees

Finding groups of similar items
• Create patient risk profile groups based on attributes such as demographics and behaviors.
• Analyze users by browsing and buying patterns.
• Identify servers that have similar usage characteristics.
Algorithms: Clustering, Sequence Clustering
Key Ideas
Hadoop is used as the shared analytical platform to merge and standardize
analytics
There are guidelines to help suggest the analytics, KPI’s and Profit Drivers
for Big Data that are commonly used by each industry.
You do not need to know how the algorithm works or is designed. You
only need to know the parameters needed to run them.
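To make that concrete, a sketch using scikit-learn (an assumed tool, not named in this deck) with invented churn-style data; the analyst only chooses parameters and calls fit and predict.

# Illustrating the point above: you supply parameters and training data;
# the algorithm internals stay hidden.
from sklearn.tree import DecisionTreeClassifier

# Made-up features: [monthly_minutes, support_calls, years_as_customer]
X = [[300, 0, 5], [120, 4, 1], [450, 1, 7], [90, 6, 1],
     [500, 0, 9], [110, 5, 2], [280, 2, 4], [100, 7, 1]]
y = [0, 1, 0, 1, 0, 1, 0, 1]           # 1 = churned, 0 = stayed

# The only decisions we make are the parameters.
model = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2)
model.fit(X, y)

print(model.predict([[105, 5, 1]]))    # likely a churner (illustrative output)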
Data Mining Tasks (4)

Task: Market Basket Analysis
Description: Discover items sold together to create recommendations on-the-fly and to determine how product placement can directly contribute to your bottom line.
Algorithms: Association, Decision Trees

Task: Churn Analysis
Description: Anticipate customers who may be considering canceling their service and identify the benefits that will keep them from leaving.
Algorithms: Decision Trees, Linear Regression, Logistic Regression

Task: Market Analysis
Description: Define market segments by automatically grouping similar customers together. Use these segments to seek profitable customers.
Algorithms: Clustering, Sequence Clustering

Task: Forecasting
Description: Predict sales and inventory amounts and learn how they are interrelated to foresee bottlenecks and improve performance.
Algorithms: Decision Trees, Time Series

Task: Data Exploration
Description: Analyze profitability across customers, or compare customers that prefer different brands of the same product to discover new opportunities.
Algorithms: Neural Network

Task: Unsupervised Learning
Description: Identify previously unknown relationships between various elements of your business to inform your decisions.
Algorithms: Neural Network

Task: Web Site Analysis
Description: Understand how people use your Web site and group similar usage patterns to offer a better experience.
Algorithms: Sequence Clustering

Task: Campaign Analysis
Description: Spend marketing funds more effectively by targeting the customers most likely to respond to a promotion.
Algorithms: Decision Trees, Naïve Bayes, Clustering

Task: Information Quality
Description: Identify and handle anomalies during data entry or data loading to improve the quality of information.
Algorithms: Linear Regression, Logistic Regression

Task: Text Analysis
Description: Analyze feedback to find common themes and trends that concern your customers or employees, informing decisions with unstructured input.
Algorithms: Text Mining

Data Mining Algorithms (Analysis Services - Data Mining)
Choosing an Algorithm by Task
To help you select an algorithm for use with a specific task, the following table provides suggestions for the types of tasks for which each algorithm is traditionally used.
Examples of tasks and Microsoft algorithms to use

Predicting a discrete attribute
• Flag the customers in a prospective buyers list as good or poor prospects.
• Calculate the probability that a server will fail within the next 6 months.
• Categorize patient outcomes and explore related factors.
Algorithms: Microsoft Decision Trees, Microsoft Naive Bayes, Microsoft Clustering, Microsoft Neural Network

Predicting a continuous attribute
• Forecast next year's sales.
• Predict site visitors given past historical and seasonal trends.
• Generate a risk score given demographics.
Algorithms: Microsoft Decision Trees, Microsoft Time Series, Microsoft Linear Regression

Predicting a sequence
• Perform clickstream analysis of a company's Web site.
• Analyze the factors leading to server failure.
• Capture and analyze sequences of activities during outpatient visits, to formulate best practices around common activities.
Algorithm: Microsoft Sequence Clustering

Finding groups of common items in transactions
• Use market basket analysis to determine product placement.
• Suggest additional products to a customer for purchase.
• Analyze survey data from visitors to an event, to find which activities or booths were correlated, to plan future activities.
Algorithms: Microsoft Association, Microsoft Decision Trees

Finding groups of similar items
• Create patient risk profile groups based on attributes such as demographics and behaviors.
• Analyze users by browsing and buying patterns.
• Identify servers that have similar usage characteristics.
Algorithms: Microsoft Clustering, Microsoft Sequence Clustering
Analytic Algorithm Categories
Regression
a powerful and commonly used algorithm that evaluates the relationship of one variable, the
dependent variable, with one or more other variables, called independent variables. By measuring exactly how
large and significant each independent variable has historically been in its relation to the dependent variable,
the future value of the dependent variable can be estimated. Regression models are widely used in applications,
such as seasonal forecasting, quality assurance and credit risk analysis.
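A minimal sketch of the idea with numpy (an assumed tool; the sales figures are synthetic): fit a line to historical values of the dependent variable, then extrapolate.

# A minimal seasonal-forecasting flavor: ordinary least squares with numpy.
import numpy as np

np.random.seed(0)
months = np.arange(1, 13)                                  # independent variable
sales = 50 + 4.2 * months + np.random.normal(0, 3, 12)     # dependent variable

slope, intercept = np.polyfit(months, sales, deg=1)
print(f"month 13 forecast: {intercept + slope * 13:.1f}")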
Analytic Algorithm Categories
Clustering /
Segmentation
the process of grouping items together to form categories. You might look at a
large collection of shopping baskets and discover that they are clustered
corresponding to health food buyers, convenience food buyers, luxury food
buyers, and so on. Once these characteristics have been grouped together, they
can be used to find other customers with similar characteristics. This
algorithm is used to create groups for applications such as customers for
marketing campaigns, rate groups for insurance products, and crime statistics
groups for law enforcement.
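The shopping-basket example above, sketched with scikit-learn's k-means (an assumed tool, with invented counts):

# Grouping shopping baskets with k-means.
from sklearn.cluster import KMeans

# Made-up basket features: [health_items, convenience_items, luxury_items]
baskets = [[8, 1, 0], [7, 2, 1], [1, 9, 0], [0, 8, 1], [1, 1, 7], [0, 2, 9]]

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(baskets)
print(kmeans.labels_)   # three groups emerge; label numbering is arbitrary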
Analytic Algorithm Categories
Nearest Neighbor
quite similar to clustering, but it will only look at others records in the dataset that are “nearest” to a chosen
unclassified record based on a “similarity” measure. Records that are “near” to each other tend to have similar
predictive values as well. Thus, if you know the prediction value of one of the records, you can predict its
nearest neighbor. This algorithm works similar to the way that people think – by detecting closely matching
examples. Nearest Neighbor applications are often used in retail and life sciences applications.
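A small sketch with scikit-learn (an assumed tool, with invented retail features): the unclassified record takes the majority label of its three nearest neighbors.

# Classifying an unlabeled record by its nearest labeled neighbors.
from sklearn.neighbors import KNeighborsClassifier

# Made-up retail features: [visits_per_month, avg_basket_value]
X = [[2, 15], [3, 20], [12, 90], [10, 110], [11, 95], [1, 10]]
y = ["occasional", "occasional", "loyal", "loyal", "loyal", "occasional"]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[9, 100]]))      # -> ['loyal']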
Analytic Algorithm Categories
Association Rules
detects related items in a dataset. Association analysis identifies and groups together similar
records that would otherwise go unnoticed by a casual observer. This type of analysis is often used for market
basket analysis to find popular bundles of products that are related by transaction, such as low-end digital
cameras being associated with smaller capacity memory sticks to store the digital images.
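The camera-and-memory-stick example reduces to counting item pairs across baskets; a plain-Python sketch with invented baskets:

# Counting item pairs across baskets, the core of market basket analysis.
from itertools import combinations
from collections import Counter

baskets = [
    {"camera", "memory_stick", "case"},
    {"camera", "memory_stick"},
    {"case", "tripod"},
    {"camera", "memory_stick", "tripod"},
]

pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(sorted(basket), 2))

# Support of a pair = co-occurrence frequency across all baskets.
for pair, n in pair_counts.most_common(2):
    print(pair, f"support={n / len(baskets):.2f}")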
Analytic Algorithm Categories
Decision Tree
a tree-shaped graphical predictive algorithm that represents alternative sequential decisions and the possible outcomes for
each decision. This algorithm provides alternative actions that are available to the decision maker, the probabilistic events
that follow from and affect these actions, and the outcomes that are associated with each possible scenario of actions and
consequences. Their applications range from credit card scoring to time series predictions of exchange rates.
Analytic Algorithm Categories
Sequence Association
detects causality and association between time-ordered events, although the associated events may be spread
far apart in time and may seem unrelated. Tracking specific time-ordered records and linking these records to a
specific outcome allows companies to predict a possible outcome based on a few occurring events. A sequence
model can be used to reduce the number of clicks customers have to make when navigating a company’s
website.
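A minimal sketch of the time-ordered counting this relies on, using invented website click paths:

# Counting time-ordered transitions between page views on a website.
from collections import Counter

sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "product"],
]

transitions = Counter()
for path in sessions:
    transitions.update(zip(path, path[1:]))

# The most common next step after each page suggests clicks to shortcut.
print(transitions.most_common(3))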
Analytic Algorithm Categories
Neural Network
a sophisticated pattern detection algorithm that uses machine learning
techniques to generate predictions. This technique models itself after the
process of cognitive learning and the neurological functions of the brain, and
is capable of predicting new observations from other known observations. Neural
networks are very powerful, complex, and accurate predictive models that are
used in detecting fraudulent behavior, in predicting the movement of stocks and
currencies, and in improving the response rates of direct marketing campaigns.
Choosing a Big Data Architecture
[Diagram: Big Data Business Needs by type, Big Data Architecture, Big Data Platform, Big Data Analytical Platform, Big Data Analytics]
Analytics Data Sources
Analytics as a Service
Analytics should be based on a single data source platform.
[Diagram: analytics layered on IBM WebSphere eXtreme Scale]
Analytics As A Service
When you write data to a traditional database, either through loading external data,
writing the output of a query, doing UPDATE statements, etc., the database has total
control over the storage. The database is the "gatekeeper." An important implication
of this control is that the database can enforce the schema as data is written. This is
called schema on write.
Hive has no such control over the underlying storage. There are many ways to create,
modify, and even damage the data that Hive will query. Therefore, Hive can only
enforce queries on read. This is called schema on read.
So what if the schema doesn’t match the file contents? Hive does the best that it can
to read the data. You will get lots of null values if there aren’t enough fields in each
record to match the schema. If some fields are numbers and Hive encounters
nonnumeric strings, it will return nulls for those fields. Above all else, Hive tries to
recover from all errors as best it can.
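A rough Python imitation of that behavior (illustrative only, not Hive's actual code path): the schema is applied at read time, and missing or non-matching fields come back as nulls.

# A rough imitation of schema-on-read: the schema is applied while
# reading, and unparseable or missing fields become None.
schema = [("user_id", int), ("country", str), ("spend", float)]

def read_with_schema(line):
    fields = line.rstrip("\n").split("\t")
    record = {}
    for (name, cast), raw in zip(schema, fields + [None] * len(schema)):
        try:
            record[name] = cast(raw) if raw is not None else None
        except (TypeError, ValueError):
            record[name] = None        # recover instead of failing
    return record

print(read_with_schema("42\tDE\t19.99"))   # clean row parses fully
print(read_with_schema("oops\tUS"))        # bad int + missing field -> Nones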
Analytics As A Service
Benefits of schema on write:
• Better type safety and data cleansing done for the data at rest
• Typically more efficient (storage size and computationally) since the data is already parsed
Downsides of schema on write:
• You have to plan ahead of time what your schema is before you store the data (i.e., you have to do ETL)
• Typically you throw away the original data, which could be bad if you have a bug in your ingest process
• It's harder to have different views of the same data
Benefits of schema on read:
• Flexibility in defining how your data is interpreted at load time
• This gives you the ability to evolve your "schema" as time goes on
• This allows you to have different versions of your "schema"
• This allows the original source data format to change without having to consolidate to one data format
• You get to keep your original data
• You can load your data before you know what to do with it (so you don't drop it on the ground)
• Gives you flexibility in being able to store unstructured, unclean, and/or unorganized data
Downsides of schema on read:
• Generally it is less efficient because you have to reparse and reinterpret the data every time (this can be expensive with
formats like XML)
• The data is not self-documenting (i.e., you can't look at a schema to figure out what the data is)
• More error prone and your analytics have to account for dirty data
http://nosql.mypopescu.com/post/48638541973/schema-on-writes-vs-schema-on-reads-apache-hadoop-and
Reporting users make their own schemas and naming standards.
Reporting users run their own analytics, as many times as they want.
Key Ideas - Summary
One Big Data database cannot accommodate all the Big Data types
You need to know the data type and data architecture to select the most
appropriate Big Data database.
There are guidelines to help suggest the Big Data Types that are commonly
used by each business type.
Hadoop is used as the shared data source platform to merge and
standardize big data with legacy data
Hadoop is used as the shared analytical platform to merge and standardize
analytics
Hadoop is an excellent choice to start building your shared data source
platform
Hadoop can become your System of Record (SOR) for Big Data and part of
your Master Data Management system (MDM)
Hadoop is used to standardize and centralize the Key Performance
Indicators (KPIs) and Profit Drivers for an Enterprise Analytical Platform
There are guidelines to help suggest the analytics, KPIs, and Profit Drivers
for Big Data that are commonly used by each industry.
Schema on read
Critical Success Factors - Summary
Validate the data being collected has business value.
The date time format must be standardized across the data platform.
Unique identifiers (domain keys) must be clearly described using friendly
terminology
References

1) Pervasive insights produce better business decisions: opening access to business intelligence by
embedding analytics capabilities into everyday software tools pays substantial dividends.
By Lauren Gibbons Paul
2) Data Mining Algorithms (Analysis Services - Data Mining)
http://msdn.microsoft.com/en-us/library/ms175595.aspx
3) Data Mining Query Task
http://msdn.microsoft.com/en-us/library/ms141728.aspx
4) Predictive Analysis with SQL Server 2008 - White Paper - Microsoft - Published: November 2007
5) Predictive Analytics for the Retail Industry - White Paper - Microsoft - Writer: Matt Adams
Technical Reviewer: Roni Karassik, Published: May 2008
6) Breakthrough Insights using Microsoft SQL Server 2012 - Analysis Services
https://www.microsoftvirtualacademy.com/tracks/breakthrough-insights-using-microsoft-sql-server-2012-a
7) Useful DAX Starter Functions and Expressions
http://thomasivarssonmalmo.wordpress.com/category/powerpivot-and-dax/
8) Stairway to PowerPivot and DAX - Level 1: Getting Started with PowerPivot and DAX
By Bill_Pearson, 2011/12/21
9) Data Mining Tool
http://technet.microsoft.com/en-us/library/ms174467.aspx
10) DAX Cheat Sheet
http://powerpivot-info.com/post/439-dax-cheat-sheet
11) Big Data Landscape - http://arnon.me/2012/11/nosql-landscape-diagrams/
On the Internet, the World Wide Web Consortium (W3C) uses ISO 8601 in defining a profile of the standard that restricts the supported date and time formats
to reduce the chance of error and the complexity of software.[19]
RFC 3339 defines a profile of ISO 8601 for use in Internet protocols and standards. It explicitly excludes durations and dates before the common era. The more
complex formats such as week numbers and ordinal days are not permitted.[20]
RFC 3339 deviates from ISO 8601 in allowing a zero timezone offset to be specified as "-00:00", which ISO 8601 forbids. RFC 3339 intends "-00:00" to carry the
connotation that it is not stating a preferred timezone, whereas the conforming "+00:00" or any non-zero offset connotes that the offset being used is
preferred. This convention regarding "-00:00" is derived from earlier RFCs, such as RFC 2822 which uses it for timestamps in email headers. RFC 2822 made no
claim that any part of its timestamp format conforms to ISO 8601, and so was free to use this convention without conflict. RFC 3339 errs in adopting this
convention while also claiming conformance to ISO 8601.
http://www.w3.org/TR/NOTE-datetime
http://stackoverflow.com/questions/16307563/utc-time-explanation
International Standard ISO 8601 specifies numeric representations of date and time.
YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00) where:
YYYY = four-digit year
MM = two-digit month (01=January, etc.)
DD = two-digit day of month (01 through 31)
hh = two digits of hour (00 through 23) (am/pm NOT allowed)
mm = two digits of minute (00 through 59)
ss = two digits of second (00 through 59)
s = one or more digits representing a decimal fraction of a second
TZD = time zone designator (Z or +hh:mm or -hh:mm)
Times are expressed in UTC (Coordinated Universal Time), with a special UTC designator ("Z"). Times are expressed in local time, together with a time zone
offset in hours and minutes. A time zone offset of "+hh:mm" indicates that the date/time uses a local time zone which is "hh" hours and "mm" minutes ahead of
UTC. A time zone offset of "-hh:mm" indicates that the date/time uses a local time zone which is "hh" hours and "mm" minutes behind UTC.