4. Teradata Company Highlights
• Founded 1979 – West LA
• First product to market – 1984
• First Terabyte system – 1987
• Acquired by AT&T and merged with acquired NCR – 1992
• Tri-vested as part of NCR - 1997
• Teradata Corporation – (re)Launched October 1, 2007
– Global Leader in Enterprise Data Warehousing
• EDW/ADW Database Technology
• Analytic Solutions
– Positioned in Gartner’s Leaders Quadrant in data warehousing since 1999
• Top 10 U.S. publicly-traded software company
– S&P 500 Member
– Listed NYSE: “TDC”
– 2007 - $1.7B revenue
7. Continuous (R)evolution
• Sell the HW, give everything else away
• Sell the SW with some HW to run on
• Sell solving business problems – and technology to solve them
• Sell applications with consulting, SW and HW inside
9. Scale
• Every dimension of the technology must scale to meet today’s requirements
– Data, Data model complexity, Users, Performance, queries, Data loading, …
• What is a big Data Warehouse?
• Total spinning disk?
– 2.5 Petabytes
• Big table?
– 150 billion rows
• Number of tables?
– 300,000
• Insert/Update per day?
– 5 billion records
• Identified users?
– 100,000
• Queries per day?
– 5 million
• Data Turnover rate?
– 1TB per 5 seconds
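The daily totals above are easier to compare once they are converted to per-second rates. The following is a minimal arithmetic sketch, assuming the load is spread evenly over 24 hours and that 1 TB means 1000 GB; both are simplifications that the slide does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day: float) -> float:
    """Average rate per second for a daily total (assumes an even spread)."""
    return per_day / SECONDS_PER_DAY


updates_per_sec = per_second(5e9)  # 5 billion inserts/updates per day
queries_per_sec = per_second(5e6)  # 5 million queries per day
turnover_gb_per_sec = 1000 / 5     # 1 TB per 5 seconds, taking 1 TB = 1000 GB

print(f"{updates_per_sec:,.0f} inserts/updates per second")
print(f"{queries_per_sec:,.1f} queries per second")
print(f"{turnover_gb_per_sec:.0f} GB per second")
```

Real workloads peak well above the daily average, so these figures are a lower bound on what the system has to sustain.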
10. The Problem
Operational Systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR, Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory …
Decision Makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales, Operations, Inventory, Call Center …
11. The EDW Solution
Operational Systems: Accts. Payable, Accts. Receivable, Invoicing, Sales/Orders, Finance G/L, Customer Support, HR, Payroll, Purchasing, Order Fulfillment, Manufacturing, Inventory …
Enterprise Data Warehouse (EDW)
Decision Makers: Marketing, Supply Chain, Finance, Risk Management, Maintenance, Sales, Operations, Inventory, Call Center …
12. Active Enterprise Intelligence™
An Obvious Trend: More Speed, More Users
Strategic Intelligence: Enterprise Data Warehouse; BI Tools & reports; Analysis & visualization; Predictive Analytics
Operational Intelligence: EDW Enterprise Integration; Mixed workload management; SOA, BPMS, IDEs; Portals/composite applications
[Axis: Days to Seconds]
13. Active Enterprise Intelligence™ enabled by an Active Data Warehouse™
[Diagram: Strategic Intelligence and Operational Intelligence. Users: Suppliers, Customers, Call Center, Logistics, Marketing, Finance, Product/Services, Executive. Layers: Business Intelligence Tools and Applications; Workflow & Applications; Active Events; Active Access; Active Enterprise Integration; Teradata Warehouse with Active Availability, Active Workload Management, and Active Load.]
14. Active Enterprise Intelligence™ in Retail
Detecting Retail Fraud
Situation
Thieves make copies of cash register receipts, walk into
the store, pick up merchandise, and return items for
cash.
Problem
Associates in returns department did not have historical
POS receipt retrieval access to verify against previously
“returned” receipts or to do returns without receipts.
Solution
Associates query Teradata to quickly check if a return
has already occurred on that receipt number. Also used
by analysts to understand and prevent excessive
returns.
Impact (for 500-store chain)
• 100% ROI in 5 months
• Stopped a crime ring on the first day of rollout
• “Cost savings have been huge”
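The receipt check described under Solution can be sketched as a single lookup keyed by receipt number. This is only an illustration: an in-memory SQLite database stands in for the Teradata warehouse, and the table name `pos_transactions`, its columns, and the helper `return_allowed` are hypothetical names, not taken from the customer's system.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pos_transactions ("
    " receipt_no TEXT, item_sku TEXT, txn_type TEXT)"  # txn_type: 'SALE' or 'RETURN'
)
conn.executemany(
    "INSERT INTO pos_transactions VALUES (?, ?, ?)",
    [
        ("R-1001", "SKU-1", "SALE"),
        ("R-1001", "SKU-1", "RETURN"),  # this item has already been returned
        ("R-1002", "SKU-2", "SALE"),
    ],
)


def return_allowed(conn, receipt_no, item_sku):
    """Allow a return only if the item was sold on this receipt and not yet returned."""
    sold, returned = conn.execute(
        "SELECT COALESCE(SUM(txn_type = 'SALE'), 0),"
        "       COALESCE(SUM(txn_type = 'RETURN'), 0)"
        " FROM pos_transactions WHERE receipt_no = ? AND item_sku = ?",
        (receipt_no, item_sku),
    ).fetchone()
    return sold > returned


print(return_allowed(conn, "R-1001", "SKU-1"))  # copied receipt, already used
print(return_allowed(conn, "R-1002", "SKU-2"))  # genuine first return
```

The check is only as good as the freshness of the data, which is why the approach depends on sales and returns reaching the warehouse within seconds.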
15. Active Enterprise Intelligence™ in Retail
Single View of the Customer Across All Channels
Situation
Needed to add Web channel for selling shoes.
Problem
Too much time and cost to keep multiple customer
systems synchronized. Realized they needed just
one customer database, not one more for the Web,
in addition to Call Center, and POS/Store databases.
Solution
Adopted an ADW strategy, moved all customer data
to one Teradata system, revised data models to
cover all channels, added web channel for
commerce, used web services, added TASM to
handle multiple workload types
Impact
• 1M tactical hits to the EDW per day from the POS, Call Center, and Web with 0.11 sec response time
• Runs simultaneously with back-office BI, reports, and ETL workloads
• Eliminated all other customer data systems
16. What is the Measure of a Great Architecture?
Handle huge changes of underlying technologies and
dependent components while continuing to deliver the
key value proposition.
18. Processor Roadmap: CPU power radically increasing
[Figure: process technology roadmap, 2003 to 2011: 90nm, 65nm, 45nm, 32nm, and 22nm processes; Hyper-Threading, then Dual Core, then Multi Core. Inset chart: SPECint2000, 2000 to 2008+, single-core performance versus dual/multi-core performance, with 2004 and 2007 marked and a 5X label.]
20. Teradata MPP Server Architecture
• Nodes
– Incrementally scalable to 1024 nodes
• Operating System
– Linux, Windows, Unix
• Storage
– Independent I/O
– Scales per node
• BYNET Interconnect
– Fully scalable bandwidth
• Connectivity
– Fully scalable
– Channel – ESCON/FICON
– LAN, WAN
• Server Management
– One console to view the entire system
[Diagram: four SMP nodes (SMP Node1 to SMP Node4), each with CPU1, CPU2, memory, and an operating system, joined by dual BYNET interconnects and a server management console.]
21. Shared Nothing - Dividing the Work
• “Virtual processors” (vprocs) do the work
• Two types
– AMP: owns and operates on the data
– PE: handles SQL and external interaction
• Configure multiple vprocs per hardware node
– Take full advantage of SMP CPU and memory
• Each vproc has many threads of execution
– Many operations executing concurrently
– Each thread can do work for any user, transaction
• Software is equivalent regardless of configuration
– No user changes as system grows from small SMP to huge MPP
22. Shared Nothing - Dividing the Work
• Basis of Teradata scalability
– Each AMP owns an equal slice of the disk
– Only that AMP reads that slice
• No single point of control for any operation
– I/O, Buffers, Locking, Logging, Dictionary
– Nothing centralized
– Exponential communication costs avoided
[Chart: coordination cost versus number of nodes, with Teradata's curve labeled. The AMPs each handle their own logs, locks, buffers, and I/O.]
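The division of work between PEs and AMPs can be sketched as a scatter and merge. This is a toy model under stated assumptions: the `Amp` class and `parsing_engine_sum` function are hypothetical names, threads stand in for vprocs, and nothing here reflects Teradata's actual internals.

```python
from concurrent.futures import ThreadPoolExecutor


class Amp:
    """Owns one slice of the rows; nothing else reads that slice."""

    def __init__(self, rows):
        self.rows = rows

    def partial_sum(self, column):
        # Each AMP scans only its own slice and returns a small partial result.
        return sum(row[column] for row in self.rows)


def parsing_engine_sum(amps, column):
    """Send the same step to every AMP in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(amps)) as pool:
        partials = list(pool.map(lambda amp: amp.partial_sum(column), amps))
    return sum(partials)


amps = [
    Amp([{"amount": 10}, {"amount": 5}]),
    Amp([{"amount": 7}]),
    Amp([{"amount": 1}, {"amount": 2}]),
]
total = parsing_engine_sum(amps, "amount")
print(total)
```

Because no AMP touches another AMP's rows, adding AMPs adds capacity without adding shared locks or buffers to coordinate.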
23. Teradata Data Distribution
• Rows automatically distributed evenly by hash partitioning
– Even distribution results in scalable performance
– Done in real-time as data are loaded, appended, or changed.
– Hash map defined and maintained by the system
• 2**32 hash codes, 64K buckets distributed to AMPs
– Primary Index (PI) column(s) are hashed
– Hash is always the same - for the same values
– No reorgs, repartitioning, space management
[Diagram: rows of Table A, Table B, and Table C pass through the Teradata Parallel Hash Function on the Primary Index and are spread across AMP1, AMP2, AMP3, AMP4, … AMPn. Row layout: RowHash (Hash Bucket) | Data Fields.]
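The path from a Primary Index value to an AMP can be sketched in a few lines. This is a minimal illustration, not Teradata's implementation: MD5 is only a stand-in for the hash function, taking the high-order 16 bits as the bucket is an assumption, and the round-robin hash map and the helper names (`row_hash`, `hash_bucket`, `build_hash_map`, `amp_for`) are hypothetical.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 64 * 1024  # 64K hash buckets, as on the slide


def row_hash(primary_index_value) -> int:
    """32-bit hash code of the Primary Index value (MD5 is only a stand-in)."""
    digest = hashlib.md5(repr(primary_index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def hash_bucket(hash_code: int) -> int:
    """Assumption: the bucket is the high-order 16 bits of the 32-bit hash code."""
    return hash_code >> 16


def build_hash_map(num_amps: int) -> list:
    """Assumption: buckets are dealt round-robin to AMPs."""
    return [bucket % num_amps for bucket in range(NUM_BUCKETS)]


def amp_for(primary_index_value, hash_map) -> int:
    """Same value, same hash, same bucket, same AMP."""
    return hash_map[hash_bucket(row_hash(primary_index_value))]


hash_map = build_hash_map(num_amps=4)
rows_per_amp = Counter(amp_for(customer_id, hash_map) for customer_id in range(100_000))
print(sorted(rows_per_amp.items()))
```

Since only the hash map decides which AMP owns a bucket, a system could change the number of AMPs by reassigning buckets in the map while the hash of each row stays the same.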
24. Disk Capacity Exploding with Little Increase in Performance
Disk Drive Capacity | Disk Drive Bandwidth (MB/Sec) | Performance per Capacity (MB/Sec/GB)
36 GB | 5.5 | .155
73 GB | 6.0 | .080
146 GB | 6.4 | .044
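The performance-per-capacity figure is simply bandwidth divided by capacity. The short sketch below recomputes it from the three drives on the chart; the function name `perf_per_capacity` is a hypothetical helper.

```python
# (capacity in GB, bandwidth in MB/sec) for the three drive generations on the chart
drives = [(36, 5.5), (73, 6.0), (146, 6.4)]


def perf_per_capacity(capacity_gb, bandwidth_mb_s):
    """MB/sec of bandwidth available per GB stored on the drive."""
    return bandwidth_mb_s / capacity_gb


ratios = {cap: perf_per_capacity(cap, bw) for cap, bw in drives}
for cap, ratio in ratios.items():
    print(f"{cap:>4} GB: {ratio:.3f} MB/sec/GB")
```

The computed ratios come to roughly 0.153, 0.082, and 0.044, close to the chart's .155, .080, and .044: capacity quadruples while bandwidth per gigabyte falls to less than a third.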
25. Platform Change
• Focus used to be
– Optimization of expensive CPU cycles
– Micro-management of precious disk space
• Now
– Manage I/O
– Balance CPU power to the I/O capacity
– Find new ways to optimize I/O, trading for CPU use as necessary
– Pulling 2.5GB/sec per node continuous
• Discontinuity coming
– SSDs become price competitive and reliable
26. File System
• Teradata wrote a new rule book
– Old one written by IBM 35 years ago, used by all mainstream DBMSs today - except Teradata
• File system built of raw slices
• Rows stored in blocks
– Variable length
– Grow and shrink on demand
– Rows located dynamically
• May be moved to reclaim space, defrag
– Maximum block size is configurable
• System default or per table
• 8K to 128K
• Change dynamically
• Indexes are just rows in tables
• Has evolved from direct management of single spindles to completely virtualized storage, not even knowing spindle location
27. Workload Management Evolution
• 1984 – pure timeshare
• 1987 – 4 priorities, defined by user
• 1995 – multiple priorities in multiple partitions
• 2000 – weighted workload groups
• 2004 – queuing, reserved resources, focus on tactical work
• 2009 – Visualization and detailed workgroup management
• Future – Set service level goals, our job to deliver
28. Active Workload Management
• Manage workloads
– Reduce server congestion
• Dynamically adjust in-flight task priority
– Turn the dial – change priorities
• Fast active access queries
– Performance, performance, performance
• Get maximum throughput
[Figure: Active Data Warehouse workloads (Active Events, Active Access, Query and Reporting, Active Load) shown with speed-limit signs: Speed 10, Speed 60, Speed 75, Speed 25.]
30. Availability Requirements
[Chart: number of users (10 to 1,000,000) by user type, from Strategic Intelligence to Operational Intelligence. User groups: IT, Finance, Planners, Power Users, Data Miners; Executives, Middle Managers, Marketing; Category Managers, Line Managers, Service Managers; Operational Employees; Suppliers, B2B; Consumers. Availability levels marked: Mission Critical, Dual Active.]
31. “Always ON” – An Elusive Challenge
• Unplanned downtime
– Hardware faults
– Software faults
– Hangs
• Planned downtime
– Software upgrade
– Hardware upgrade
– Data center maintenance
• “Disasters”
– Multi-component failures
– Building disasters
– Area disasters
• And optimize resource value to the business
• And avoid hidden costs and surprises
– E.g., major performance variations
• Major opportunity for research – but must be holistic
– Reaches far beyond core database
32. Real time Operational Actions
Strategic Intelligence / Operational Intelligence
1. Customer makes multi-segment travel reservation.
2. Flight rerouted, causing missed connections.
“Active” Enterprise Data Warehouse:
3. What is each customer’s flying history?
4. How profitable is each customer?
5. Which customers experienced delays or other problems in the last 6 months?
Messaging: WebSphere MQ, Oracle AQ, Microsoft MSMQ
6. Customer re-booked and notified.
7. Airport operations adjusted.
33. Real Time Customer Management
Strategic Intelligence / Operational Intelligence
1. Customer inserts Total Rewards Card at Slot Machine.
“Active” Enterprise Data Warehouse:
2. What is the customer’s past spending history in all our casinos?
3. What is a significant loss for this person based on market segment, past and predicted behavior?
4. Is this customer approaching the predicted loss rate for their segment?
5. What offers are available for this customer?
6. Message sent to floor Luck Ambassador with customer offer to prevent additional losses.
Messaging: TIBCO
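The decision in steps 3 to 6 can be sketched as a small rule evaluated on each card event. Everything below is illustrative: the `PlayerProfile` fields, the `OFFERS` table, the 0.8 threshold, and the `offer_for` function are hypothetical, and in the real flow the profile would come from a warehouse query rather than a literal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlayerProfile:
    player_id: str
    segment: str
    session_loss: float          # losses so far in this visit
    predicted_loss_limit: float  # "significant loss" predicted for this segment


# Hypothetical offer table keyed by market segment.
OFFERS = {"gold": "Free dinner for two", "silver": "Free buffet"}


def offer_for(profile: PlayerProfile, threshold: float = 0.8) -> Optional[str]:
    """Return a message for the floor ambassador, or None if no contact is needed."""
    # Step 4: is the customer approaching the predicted loss for their segment?
    if profile.session_loss < threshold * profile.predicted_loss_limit:
        return None
    # Step 5: what offer is available for this customer?
    offer = OFFERS.get(profile.segment)
    if offer is None:
        return None
    # Step 6: message for the floor ambassador.
    return f"Player {profile.player_id}: offer '{offer}'"


print(offer_for(PlayerProfile("P-17", "gold", session_loss=450.0, predicted_loss_limit=500.0)))
```

The rule itself is trivial; the hard part the slide points to is answering the profile questions fast enough that the message arrives while the customer is still at the machine.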
34. That’s a Wrap!
• Business requires a new level of decision making
– Many more decisions by many more people much faster
– Current representation of the state of the enterprise
• Data Warehouse must evolve to support the requirements of Active
Enterprise Intelligence
• Technology must evolve to deal with the new requirements
– Rich area for research and innovation
– Change view of what data warehouse/BI means
• Teradata driving an aggressive roadmap to meet real business
requirements
36. For more information:
http://vibranttechnologies.co.in/teradata-classes-in-mumbai.html
Thank you!
Editor's notes
Retail fraud is a $16B-a-year problem in the USA alone. With web receipts and better copying capabilities, thieves can make multiple copies of a single receipt and make multiple returns for cash or other merchandise. Or they can bring back shoplifted items and try to exchange them for cash.
The problem is that the associates in the Returns department often don’t have access to past sales information and can’t easily keep track of returned merchandise. This is especially problematic if the policy is to accept returns without receipts.
So the solution is straightforward: hook up the Point of Sale systems so that within seconds the Teradata data warehouse is updated with sales, return, exchange, and void data, and provide the Returns department with the entire history of purchases by that customer, so they can ensure that a sold product can only be returned once.
<Click>
The impact? Huge, according to one Teradata customer who has already built this system. They stopped a crime ring on the first day of their rollout, a group that had defrauded the company of thousands of dollars. They saw a 100% payback on their investment in just 5 months, and continue to reap the benefits of this example use of Active Enterprise Intelligence.
(CLICK)
In this chart, we have 3 different disk drive sizes, and you can see that per generation, disk drive bandwidth hasn’t increased very much.
(CLICK)
As disk capacities get larger (36 GB, 73 GB, 146 GB), the performance-per-capacity ratio (disk bandwidth divided by capacity, on the right side of the chart) declines significantly.
The key metric on this slide is performance per capacity (MB/sec/GB).
Look at this slide! Capacity is doubling, but throughput per gigabyte is diminishing! If you fill all the drives up with data, you will not have enough I/O or bandwidth!
Choosing twice as much storage capacity in a configuration, but not increasing the number of physical disks (to keep I/O constant), will result in performance degradation.
Assuming workloads are categorized, this illustration shows “speed limits” which are actually resource limits for each workload. Each workload is allowed to consume a limited amount of resources at any given time to ensure other workloads get their rightful share.
Dynamic Resource Prioritization
Inside every fully utilized active data warehouse, there’s a major turf battle going on. Each job in the database is engaged in an ongoing struggle for more and more resources for its own work, often competing against other diverse activities. In most databases, these me-first conflicts result in short, resource-light queries falling victim to the heavier jobs. Those batch fraud-detection reports and long-running market share analysis queries essentially take ownership of the database and all it has to give. But Teradata Database lets your specific business needs determine how your precious database resources are divided. Once a definition for equitable sharing of database assets is in place, it automatically controls what percent of the CPU and disk I/O those batch reports and complex queries, as well as those vulnerable short queries, will receive. When there’s a handful of users on the system, Teradata Database spreads available resources out relative to the priorities and assignments that have been made to those particular users, without a single sub-second of CPU being wasted.
Teradata Database has made job scheduling and prioritization of the work a core competency since 1988. And recently, that technology has deepened and matured offering even more flexibility. Teradata’s Priority Scheduler can be used to ensure that the event-driven work coming from the web is allowed to cut into line to grab the CPU it needs to get that promotion back to the client quickly. For example, if the tactical query that comes up with that promotion returns an answer in 1 second when running alone in the database, that same query, if armed with a high Teradata Database priority, can maintain a similar turnaround even if multiple complex inventory adjustment queries begin executing at the same time. For the active data warehouse, it will be critical to keep more resource-hungry complex queries from dominating the resources in the system, starving out the shorter tactical work. Teradata’s Dynamic Workload Manager will play a big role in enabling favored work to be as near to real time as it needs to be.
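The idea of tactical work "cutting into line" can be illustrated with a plain priority queue. This is a simplified sketch, not a model of Teradata's Priority Scheduler, which the notes describe as dividing CPU and I/O shares among running work; the class `PriorityDispatcher` and the numeric priorities are hypothetical.

```python
import heapq
import itertools


class PriorityDispatcher:
    """Dispatch queued queries by priority; ties go to the earliest arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def submit(self, name, priority):
        # Lower number = higher priority (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def next_query(self):
        return heapq.heappop(self._heap)[2]


dispatcher = PriorityDispatcher()
dispatcher.submit("inventory adjustment A", priority=5)
dispatcher.submit("inventory adjustment B", priority=5)
dispatcher.submit("web promotion lookup", priority=1)  # tactical, arrives last
order = [dispatcher.next_query() for _ in range(3)]
print(order)
```

The tactical lookup is dispatched first even though it arrived last, while the two complex queries keep their arrival order among themselves.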
While no two-dimensional drawing can accurately portray such complex issues, this graphic frames the discussion around when to move to mission critical and dual active solutions. In general, the type of users often correlates with the population of users. For example, we know that the consumer population for many industries can mean tens of thousands to millions of possible users via the internet. Similarly, for some industries, the population of supplier employees who access your data warehouse can be enormous, maybe not always in concurrent users but certainly in potential users. At the other end of the spectrum, planning, analysis, and power users tend to be a small community, albeit an influential one. In the middle of the graphic we see overlaps of many kinds because line managers (category managers, sales managers, service managers, etc.) often bounce between strategic decisions and operational decisions, with probably more time spent on the operational tasks.
Business critical is not a well defined term in our industry. It tends to mean anything less than mission critical. These users can often tolerate downtime, from a few hours to perhaps an entire day. But many data warehouse sites have become so dependent on the EDW that they have “hardened” the server, software, and procedures to a mission critical level. This means the executives realize that so many decisions are made daily from BI-tool reporting that they are willing to fund the project to increase system availability.
Mission critical can begin in the EDW and certainly extends all the way to the end of the graphic. These clients understand that large populations of front line users will demand 24x7 data availability. With operational employees you MIGHT be able to tolerate a 10-20 minute outage every month. It depends very much on the business use of the EDW. As the EDW evolves to larger populations and more operational ACTIVE tasks, outages become increasingly expensive, so additional investments in availability become mandatory. In some cases, an active data warehouse becomes so critical to the operational employee that it becomes necessary to step up to a dual active configuration. This is particularly true in retail with 100s of concurrent employees and suppliers using the data, but it may also occur with large call centers or sales staff.
Finally, we hope it is obvious that when consumers gain access to the data warehouse, it is typically for eCommerce purchasing. No downtime is tolerated in this case because the loss of revenue cannot be tolerated.
Problem:
Lack of ability to track customer gaming behavior and Comp redemption.
No mechanism to communicate or react to specific behaviors and trends
Solution:
Player Contact System - when a patron swipes his/her card at a casino that information is sent to Teradata.
The player profile is accessed and it is determined if the casino should make personal contact with that player.
Allows Harrah’s to provide real-time offers to customers at each gaming point
Enables Harrah’s to track the redemption of any comp provided to a guest as the comp is redeemed or partially redeemed. Allows them not to “over-comp” guests.
Future:
“Marketing At The Slots” initiative. This implementation has a BusinessWorks process receiving inbound card-swipes from the Slot Data System and building an EDW query. It then makes a Request/Reply call to Teradata to solicit and compile an XML message which is then published back out on the TIB for consumption by other applications.
This will drive CRM to a new “real-time” level allowing interaction with the customer while they are gaming.