The Power of Big Data
Tim Wiles, Iain Batty and David Turnbull
31 January 2014
Outline
• Big data (Iain Batty)
• NoSQL: The future of data storage? (Tim Wiles)
• Data security (David Turnbull)
Big Data
What is Data?
• Data is everywhere
• You have more than you think
• It’s your biggest asset
So What Is “Big” Data?
• Many Definitions
• Study by Ward & Barker of St Andrews
• “Big data is a term describing the storage
and analysis of large and or complex data
sets using a series of techniques
including, but not limited to: NoSQL, Map
Reduce and machine learning.”
So What Is “Big” Data?
• We have a huge amount of data:
– 90% of data was created in the last two years
– 2.5 exabytes (2.5×10^18 bytes) of data created every day

• Data Analysis on a huge scale
THANKS TO BIG DATA…
How and Why Big Data is Used
• Healthcare
• Scientific Research (Folding@Home, SETI@Home)
• Market Research
• Business Operation Optimisation
Why Use Big Data
• Investigative and Predictive
• Increasing amount of public data access
• Enables high level understanding of previously
unfathomable datasets
Why Not to Use Big Data
• Expensive
• Limited Pool of talent
• Not always applicable
• Must be used correctly: correlation does not imply causation

...yaaaarrrr?!
Conclusion
• Big Data technologies may or may not be
right for you
• But the principles are universal:
– Gather your data
– Use novel sources such as social media and public data initiatives
– Analyse it intelligently
NoSQL: The future of data storage?
20 years ago…

Hard drives ~ 500 MB

Floppy disks ~ 1.44 MB

Modems ~ 28-56 Kbps

Digital cameras emerging

BBC front page (1996): bit.ly/Kc6ojz
Today…

BBC front page (today): bbc.in/18lsxlx
Data storage

Relational (SQL)     NoSQL
Highly structured    Flexible structure
Single type          Many types
£                    ££££
Which horse do you back?
VHS vs Betamax
HD-DVD vs Blu-ray
Flavours of NoSQL
Key-value: Amazon Dynamo, Apache Cassandra
Column: HBase, Google BigTable
Document: CouchDB, MongoDB
Graph: AllegroGraph, Neo4j
Comparing the options
• Right tool for the job?
• Relational database → Can be adapted.
• NoSQL database → Specialised problem solving.
Relational Database

EmployeeID   Employee
1            Tim Wiles
2            Iain Batty
3            David Turnbull

PayID   Payment Method
1       Salaried
2       Ad Hoc
3       Digestive Biscuits

EmployeeID   PayID
1            3
2            1
3            1
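The three tables above only become useful once they are joined through the EmployeeID/PayID link table. A minimal sketch of that join, using Python's built-in sqlite3 module; the table and column names (employee, payment_method, employee_payment) are illustrative assumptions, not taken from the slides:

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payment_method (pay_id INTEGER PRIMARY KEY, method TEXT);
    CREATE TABLE employee_payment (
        employee_id INTEGER REFERENCES employee(employee_id),
        pay_id INTEGER REFERENCES payment_method(pay_id)
    );
    INSERT INTO employee VALUES
        (1, 'Tim Wiles'), (2, 'Iain Batty'), (3, 'David Turnbull');
    INSERT INTO payment_method VALUES
        (1, 'Salaried'), (2, 'Ad Hoc'), (3, 'Digestive Biscuits');
    INSERT INTO employee_payment VALUES (1, 3), (2, 1), (3, 1);
""")

# Resolve each employee's payment method through the link table.
rows = conn.execute("""
    SELECT e.name, p.method
    FROM employee AS e
    JOIN employee_payment AS ep ON ep.employee_id = e.employee_id
    JOIN payment_method AS p ON p.pay_id = ep.pay_id
    ORDER BY e.employee_id
""").fetchall()

for name, method in rows:
    print(f"{name}: {method}")
```

The rigid schema is what makes the join possible: every row has the same shape, and the link table is the only place the relationship lives.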
Key-value stores

Key      Value
teh      the
hlelo    hello
edn      end
tol      tool
…
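The key/value table above can be sketched with a plain Python dict standing in for the store. Reading the table as a spelling-correction lookup is an assumption, and the function names are hypothetical:

```python
# A dict stands in for the key-value store: each unique key maps to one value.
corrections = {
    "teh": "the",
    "hlelo": "hello",
    "edn": "end",
    "tol": "tool",
}


def autocorrect(word: str) -> str:
    """Return the stored value for a key, or the word itself if the key is absent."""
    return corrections.get(word, word)


def autocorrect_text(text: str) -> str:
    """Look up every whitespace-separated word independently."""
    return " ".join(autocorrect(word) for word in text.split())


result = autocorrect_text("hlelo teh edn")
print(result)
```

Each lookup touches exactly one key and needs no knowledge of any other entry, which is why this model is straightforward to spread over many machines.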
Column stores

Item Name         Number Of Sales   Total Cost (£)   Total Revenue (£)   Origin
Orange Juice      152,000           76,000           152,000             Spain
Apple Juice       137,000           54,800           123,300             UK
Pineapple Juice   63,000            37,800           78,750              Brazil
Grape Juice       84,000            46,200           92,400              Spain

Sum of the Number Of Sales column = 436,000

Profit = sum of Total Revenue (£) − sum of Total Cost (£) = £446,450 − £214,800 = £231,650
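The two aggregates on the column-store slides can be reproduced with a column-oriented layout, where each column is stored as its own list. This is a minimal sketch of the idea, not how any particular column store is implemented; the dictionary keys are illustrative:

```python
# Column-oriented layout: one list per column instead of one record per row.
columns = {
    "item_name": ["Orange Juice", "Apple Juice", "Pineapple Juice", "Grape Juice"],
    "number_of_sales": [152_000, 137_000, 63_000, 84_000],
    "total_cost": [76_000, 54_800, 37_800, 46_200],
    "total_revenue": [152_000, 123_300, 78_750, 92_400],
    "origin": ["Spain", "UK", "Brazil", "Spain"],
}

# An aggregate reads only the columns it needs and never touches the others.
total_sales = sum(columns["number_of_sales"])
profit = sum(columns["total_revenue"]) - sum(columns["total_cost"])

print(f"Total sales: {total_sales:,}")
print(f"Profit: £{profit:,}")
```

Summing sales reads one list and profit reads two; the item names and origins are never loaded, which is the property that makes aggregates cheap in this layout.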
Document stores
Company
  Location 1
    City: Durham
    Employee List
      Employee 1
        Name: Tim Wiles
        Age: 26
  Location 2
    City: London
    Employee 2
      Start Date: 31/03/2013
      Name: David Turnbull
      Age: 27
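The company document above can be sketched as a nested structure that serialises to JSON, the format several document stores use. The exact nesting is an assumption, because the slide's layout is only partly recoverable, and the field names are illustrative. Note that only one employee carries a start date; no schema forces every record to have the same fields.

```python
import json

# One self-contained document; nesting and field names are assumptions.
company = {
    "locations": [
        {
            "city": "Durham",
            "employees": [{"name": "Tim Wiles", "age": 26}],
        },
        {
            "city": "London",
            "employees": [
                {"name": "David Turnbull", "age": 27, "start_date": "31/03/2013"}
            ],
        },
    ]
}


def employees(doc):
    """Yield (city, employee) pairs from every location in the document."""
    for location in doc["locations"]:
        for employee in location.get("employees", []):
            yield location["city"], employee


# Optional fields are read with .get(), so a missing start date is not an error.
summary = [(city, e["name"], e.get("start_date")) for city, e in employees(company)]

# The document survives a round trip through its JSON text form unchanged.
round_tripped = json.loads(json.dumps(company))
print(summary)
```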
Graph stores
Enemy

“Friend”
Case Study: Middle Earth University
• Introduction to Alchemy, Wed 11AM
• Advanced Alchemy, Wed 1PM
• World Domination, Wed 9AM
• Introduction to Magic, Wed 11AM
• Advanced Magical Techniques, Wed 9AM
Case Study: Middle Earth University
• Advanced Magical Techniques, Wed 9AM
Case Study: Middle Earth University
All courses running at 11AM on Wednesday:
• Introduction to Alchemy, Wed 11AM
• Introduction to Magic, Wed 11AM
Case Study: Middle Earth University
• Introduction to Alchemy, Wed 11AM
• Advanced Alchemy, Wed 1PM
• World Domination, Wed 9AM
• Introduction to Magic, Wed 11AM
• Advanced Magical Techniques, Wed 9AM
BMag   MMag   DMag
Case Study: Middle Earth University
• Advanced Alchemy, Wed 1PM
• Advanced Magical Techniques, Wed 9AM
DMag
Case Study: Middle Earth University
• Introduction to Alchemy, Wed 11AM
• Advanced Alchemy, Wed 1PM
• World Domination, Wed 9AM
• Introduction to Magic, Wed 11AM
• Advanced Magical Techniques, Wed 9AM
BMag   MMag   DMag
Shire Lecture Hall   Mordor Seminar Room
Case Study: Middle Earth University
• Introduction to Alchemy, Wed 11AM
• Advanced Alchemy, Wed 1PM
Mordor Seminar Room
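The "all courses running at 11AM on Wednesday" query from the case study can be sketched as a neighbour lookup in a small graph. Modelling each time slot as its own node is an assumption made for illustration. The links between courses, degrees (BMag, MMag, DMag) and rooms are not recoverable from the slides, so they are left out rather than guessed.

```python
from collections import defaultdict

# Edges between course nodes and time-slot nodes (time slots as nodes is an assumption).
edges = [
    ("Introduction to Alchemy", "Wed 11AM"),
    ("Advanced Alchemy", "Wed 1PM"),
    ("World Domination", "Wed 9AM"),
    ("Introduction to Magic", "Wed 11AM"),
    ("Advanced Magical Techniques", "Wed 9AM"),
]

# Undirected adjacency sets: every edge is stored from both ends.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def neighbours(node):
    """Return every node directly connected to the given node, sorted by name."""
    return sorted(graph.get(node, set()))


courses_at_11 = neighbours("Wed 11AM")
print(courses_at_11)
```

The query starts at one node and follows its edges, so its cost depends on how many courses share that slot rather than on the size of the whole timetable.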
Is NoSQL for everyone?
• Most businesses function effectively using
only relational databases.
• Not the grand solution to all data storage
problems.
• Requires training staff or employing people with NoSQL knowledge.
However…
NoSQL is showing significant promise for
certain aspects of almost any business.
Reasons to use NoSQL in your
business
• Potential significant financial savings.

• Easy to adapt stored data as your business
grows and your priorities change.
• Can exceed the performance of popular
commercial relational databases.
Reasons to use NoSQL in your
business

Effective tool for a holistic approach to
analysing the growth/status of your business.
Reasons to use NoSQL in your
business

Relational databases are
not the only solution to
your data storage
problems.
Data Security
Why Is Data Security Important
• The cost of a data breach is continuing to rise
• Fewer customers remain loyal after a data
breach
• Reputation losses and diminished goodwill – lost
business cost has steadily increased over the last
6 years (£500 thousand in 2007)
• Malicious or criminal attacks are the most costly
The Cost Of a Data Breach

2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
The Causes Of a Data Breach

2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
Current Methods Of
Authentication
1. Basic username and password

2. Biometrics
• Fingerprint scanners
• Voice recognition
• Face scanning and recognition
• Retina and iris scans

3. Multi-Factor Authentication
• Something possessed, as in a physical token or telephone
• Something known, such as a password or mother’s maiden name
• Something inherent, like a biometric trait
Pros and Cons Of These
Methods
1. Standard username and password
authentication is extremely vulnerable to
rainbow table attacks when password hashes are stored unsalted

2. It relies on the ability of the system's users
to pick secure passwords
Adobe Crossword
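The weakness named above can be sketched in a few lines. A plain dictionary of precomputed hashes stands in for a rainbow table here (a real rainbow table is a more compact structure serving the same purpose), and the password list, salt size and iteration count are illustrative assumptions:

```python
import hashlib
import hmac
import os

# Hypothetical list of common passwords an attacker hashes in advance.
common_passwords = ["123456", "password", "qwerty"]

# Precomputed lookup from unsalted hash to password (simplified stand-in for a rainbow table).
precomputed = {hashlib.sha256(p.encode()).hexdigest(): p for p in common_passwords}


def unsalted_hash(password: str) -> str:
    """Hash with no salt: the same password always yields the same hash."""
    return hashlib.sha256(password.encode()).hexdigest()


def salted_hash(password: str, salt: bytes) -> bytes:
    """Salted, deliberately slow hash using PBKDF2 from the standard library."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the salted hash and compare in constant time."""
    return hmac.compare_digest(salted_hash(password, salt), expected)


# A leaked unsalted hash is reversed by a single table lookup.
cracked = precomputed.get(unsalted_hash("password"))

# A salted hash of the same password does not appear in the precomputed table.
salt = os.urandom(16)
stored = salted_hash("password", salt)
found_in_table = stored.hex() in precomputed

print(cracked, found_in_table)
```

A random salt per user means a precomputed table has to be rebuilt for every account, but a weak password can still be guessed directly, which is the second point on the slide.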
Pros and Cons Of These
Methods
• In theory, biometrics is a great way to authenticate a user. It's impossible
to lose your fingerprints, unless you have both your hands chopped off.
The Best Solution
• Multi-factor Authentication. A security measure that requires two or more
kinds of evidence that you are who you say you are.
• Authentication requires a combination of these bits of evidence rather
than simply using one or the other.
• Something you know – Username, Password
• Something you have – An RSA Key, Credit Card

• Something inherent – A fingerprint, retina scan
• Multi-factor Authentication is very secure, but it is hard to implement
everywhere.
• Requires users to remember to carry their RSA keys with them.
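The requirement for a combination of kinds of evidence can be sketched as a check on distinct factor categories. The names Factor and is_multi_factor are hypothetical, and the sketch assumes each credential has already been verified elsewhere:

```python
from enum import Enum
from typing import Iterable


class Factor(Enum):
    KNOWLEDGE = "something you know"
    POSSESSION = "something you have"
    INHERENCE = "something inherent"


def is_multi_factor(verified: Iterable[Factor], required_kinds: int = 2) -> bool:
    """True only if the verified credentials span enough distinct kinds of evidence."""
    return len(set(verified)) >= required_kinds


# Two credentials of the same kind (e.g. a password and a maiden name) are still one factor.
password_and_maiden_name = is_multi_factor([Factor.KNOWLEDGE, Factor.KNOWLEDGE])

# A password plus a physical token spans two kinds of evidence.
password_and_token = is_multi_factor([Factor.KNOWLEDGE, Factor.POSSESSION])

print(password_and_maiden_name, password_and_token)
```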
Emerging Methods Of
Authentication
• YubiKey – Authentication method based on a unique physical token which
cannot be duplicated or recorded, providing a credential based on
something only an authorised user possesses.
• Can also be used with password managers such as LastPass
How Does YubiKey Work?
Quantum Cryptography
What is Quantum Cryptography?
The use of quantum mechanical effects to perform cryptographic tasks
or to break cryptographic systems

What does that mean exactly?
• Using physics rather than mathematics to perform cryptographic
tasks, such as generating cryptographic keys
• Moreover, quantum cryptography addresses the problem of key
distribution
Quantum Cryptography
How does it work?
• It works by using a technique called Quantum Key Distribution (QKD).
QKD enables two parties to produce a shared random secret key which is
only known to them. They can then use this key to encrypt and decrypt
messages passed between those parties.

• Keys are generated using photons, which are produced using LEDs.
These photons are then polarised using polarising filters and then
transmitted
• The two parties decide on what filters are going to be used, and also
assign a value, usually a binary value to each photon which has a certain
polarisation
• When the whole transmission has happened a unique key has been
produced
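The key-generation steps above can be sketched as a classical simulation. The slides do not name a protocol, so this assumes a BB84-style scheme with two filter orientations, written "+" and "x", and it models a mismatched filter as a coin flip. The function name and parameters are illustrative.

```python
import random


def sift_key(n_photons: int, seed: int = 0):
    """Simulate sender (Alice) and receiver (Bob) agreeing a key over n_photons."""
    rng = random.Random(seed)

    # Alice picks a random bit and a random filter for every photon.
    alice_bits = [rng.randint(0, 1) for _ in range(n_photons)]
    alice_filters = [rng.choice("+x") for _ in range(n_photons)]

    # Bob independently picks a random filter to measure each photon with.
    bob_filters = [rng.choice("+x") for _ in range(n_photons)]

    # Matching filter: Bob reads Alice's bit. Mismatched filter: his result is random.
    bob_bits = [
        bit if a == b else rng.randint(0, 1)
        for bit, a, b in zip(alice_bits, alice_filters, bob_filters)
    ]

    # The parties compare filters (not bits) and keep only the matching positions.
    keep = [i for i in range(n_photons) if alice_filters[i] == bob_filters[i]]
    alice_key = [alice_bits[i] for i in keep]
    bob_key = [bob_bits[i] for i in keep]
    return alice_key, bob_key


alice_key, bob_key = sift_key(64)
print(len(alice_key), alice_key == bob_key)
```

Roughly half the photons are discarded because the filters differ; what remains is a key both parties hold without the bit values ever having been announced.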
Quantum Cryptography
What are the benefits of using Quantum Cryptography?
• An important property of quantum cryptography is the ability to detect the
presence of a third party attempting to eavesdrop on the transmission of
the secret key
• This is achieved because of a fundamental principle of quantum
mechanics – the process of measuring a quantum system in general
disturbs the system.
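The detection property above can be illustrated with a simulation in which an eavesdropper measures each photon and re-sends it. This intercept-and-resend attacker and the two-filter ("+" and "x") model are assumptions made for the sketch; the slides describe neither in detail.

```python
import random


def measure(bit, prepared_filter, measuring_filter, rng):
    """Same filter: the bit is read correctly. Different filter: the result is random."""
    return bit if prepared_filter == measuring_filter else rng.randint(0, 1)


def error_rate(n_photons, eavesdrop, seed=0):
    """Fraction of kept key bits on which sender and receiver disagree."""
    rng = random.Random(seed)
    errors = kept = 0
    for _ in range(n_photons):
        bit = rng.randint(0, 1)
        alice_filter = rng.choice("+x")
        bob_filter = rng.choice("+x")
        sent_bit, sent_filter = bit, alice_filter
        if eavesdrop:
            # The eavesdropper measures with a random filter and re-sends what she saw,
            # which disturbs the photon whenever her filter differs from the sender's.
            eve_filter = rng.choice("+x")
            sent_bit = measure(bit, alice_filter, eve_filter, rng)
            sent_filter = eve_filter
        bob_bit = measure(sent_bit, sent_filter, bob_filter, rng)
        if alice_filter == bob_filter:  # only matching-filter positions are kept
            kept += 1
            errors += bob_bit != bit
    return errors / kept


quiet = error_rate(4000, eavesdrop=False)
tapped = error_rate(4000, eavesdrop=True)
print(quiet, round(tapped, 3))
```

Without an eavesdropper the kept bits always agree. With one, about a quarter of them are expected to differ under this model, so comparing a sample of the key reveals the intrusion.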
Questions
References
1. http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/
2. http://en.wikipedia.org/wiki/Big_data#cite_note-15
3. 2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
4. http://www.yubico.com/products/yubikeyhardware/yubikey/technical-description/
5. https://wiki.archlinux.org/index.php/yubikey#How_does_it_work
Upcoming Seminars
• Capturing the Real Value of IT Service Management: Friday 14th February
• Preparing for BYOD & Mobile Device Management: Friday 28th February

Editor's notes

  1. - There’s been a data revolution: it's everywhere
     - Huge variety of sources
       - News aggregators, social media, search engines, internet
       - New government initiatives (data.gov.uk)
       - Public data: geographical, weather, transport, literature, historical records
     - You have more than you think:
       - Customer information
       - Sales and financial data
       - Employee data
       - Stock
       - Intellectual property
       - Logistics
       - Sensors (Internet of Things)
     - Biggest asset
       - Has monetary value
       - Can be used in a huge variety of ways to improve business
  2. - Good question
     - As many definitions as people you ask
     - Jonathan Ward and Adam Barker of St Andrews University did a study
     - Asked big vendors
       - Oracle: “relational + unstructured data combined for Business Intelligence”
       - Microsoft: “applying Artificial Intelligence and distributed computing to large datasets”
     - Most definitions are technology focused
     - Vague
     - Study concluded with *click for quote
  3. - Huge amounts of data nowadays
     - Need new techniques to analyse and store it
     Did you know:
     - 90% of data was generated in the last 2 years
     - 2.5 exabytes (more than 2.7 trillion megabytes, or 17,179,869 iPod classics at 160 GB each)
     - Data analysis on a huge scale
  4. - Healthcare: predict trends in diseases and effectiveness of treatment, e.g. UK Biobank collected medical, lifestyle and geographical data of 500,000 people to find what causes the development of major diseases, and the effectiveness of different treatments on them
     - Scientific Research: Folding@Home and SETI@Home
     - Market Research:
       - Billion Prices Project (MIT)
       - Twitter sentiment
       - Google Analytics
     - Business Operation Analysis and Optimisation:
       - Tesco predict stock
       - FedEx package tracking and logistics optimisation
       - Amazon stock layout optimisation
     - Advertising: Google AdWords and Facebook Ad Audiences
     Hadoop/MapReduce
     NoSQL
     Distributed Computing and Virtualisation
  5. - Hardware/expertise expensive
     - Few Big Data specialists (but growing)
     - Not always the right tool (do you have “BIG DATA”?)
     - Causation: remember the pirates?...
  6. Key points:
     - You have more data than you think
     - You can do a lot more with it than you think
     So:
     - Gather data on you and your customers
     - Use the analytical approach of Big Data to make informed business decisions
  7. Bill Clinton was in office, Ayrton Senna died in an accident during the San Marino Grand Prix, China got its first connection to the internet and The Lion King was released into the cinema.
  8. To guarantee reliability.
     ACID:
     - Atomicity requires that each transaction is “all or nothing”.
     - The consistency property ensures that any transaction will bring the database from one valid state to another.
     - The isolation property ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially, i.e. one after the other.
     - Durability means that once a transaction has been committed, it will remain so.
     BASE (Basically Available, Soft state, Eventual consistency)
  9. - Twitter (Apache Cassandra): Our geo team uses it to store and query their database of places of interest. The research team uses it to store the results of data mining done over our entire user base. Those results then feed into things like @toptweets and local trends. Our analytics, operations and infrastructure teams are working on a system that uses Cassandra for large-scale real time analytics for use both internally and externally.
     - Facebook (HBase): Messaging platform introduced at the end of 2010. Can deal with very high throughputs.
     - BBC (CouchDB): The BBC is building a new environment that allows cost-effective building of dynamic content platforms.
     - The Guardian (MongoDB): Storage of articles.
     - NASA (AllegroGraph): Storing assets and being able to provide a meaningful search through links between a variety of different kinds of assets from software to drawings to documents to employee skills.
  10. This is a very simple method of storing data: a unique key and the data associated with that key. Inherent expectation of being distributed over many machines -> highly available data stores, minimal downtime.
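As a rough sketch of the key-value idea, the class below spreads keys over several in-memory “nodes” by hashing the key. The class name and the simple modulo placement are assumptions for illustration; real stores such as Dynamo add replication and more sophisticated partitioning, which are not shown here.

```python
import hashlib

class ShardedKeyValueStore:
    """Toy key-value store: each 'node' is just a dictionary in memory."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so that placement is stable across runs and machines.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self._node_for(key)].get(key, default)
```

Because the only lookup is by key, any node can be found from the key alone, which is what makes this model easy to distribute.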
  11-14. Data is stored by column, rather than by row. Ideal for sparsely populated databases. Large reductions in storage requirements. Very good for finding aggregate values. PIVOT TABLE!!! Use for your business: a single database containing anything purchased by your company over the past year. Quickly do analysis, such as average spend per employee or total amount spent by a particular department. Rather than pulling selected data from all over your business into a data warehouse in order to do analysis, you could store all the varied information in a single data store and run all kinds of analysis.
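The purchasing example can be sketched in a few lines. The sample records and function names below are invented for illustration, and the conversion assumes every record has the same fields; the point is only that an aggregate touches the two columns it needs rather than every whole row.

```python
from collections import defaultdict

# Hypothetical purchase records, stored row by row.
rows = [
    {"employee": "Ann", "department": "IT", "amount": 120.0},
    {"employee": "Bob", "department": "IT", "amount": 80.0},
    {"employee": "Ann", "department": "Sales", "amount": 50.0},
]

def to_columns(rows):
    """Re-arrange row records into one list of values per column."""
    columns = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return dict(columns)

def total_by(columns, group_column, value_column):
    """Sum value_column grouped by group_column, reading only those two columns."""
    totals = defaultdict(float)
    for group, value in zip(columns[group_column], columns[value_column]):
        totals[group] += value
    return dict(totals)
```

For example, `total_by(to_columns(rows), "department", "amount")` gives the total spent by each department.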
  15-17. A more holistic approach to storing data. Documents with varying data and structures can be kept together. You are not penalised as your business grows and your data model changes. Great for storing your business assets. FILING CABINET. Get a picture of a medical store.
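A minimal sketch of the filing-cabinet idea follows. The `DocumentStore` class and its methods are hypothetical and far simpler than CouchDB or MongoDB; it only shows that documents with different fields can sit side by side and still be searched.

```python
class DocumentStore:
    """Toy document store: documents are dictionaries with no fixed schema."""

    def __init__(self):
        self._documents = {}
        self._next_id = 1

    def insert(self, document):
        """Store a copy of the document and return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._documents[doc_id] = dict(document)
        return doc_id

    def find(self, **criteria):
        """Return every document whose fields match all the given criteria."""
        return [doc for doc in self._documents.values()
                if all(doc.get(key) == value for key, value in criteria.items())]
```

Adding a new kind of document later needs no change to the store, which is the sense in which a changing data model is not penalised.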
  18. Where the links between data (edges) become as important as the data itself (nodes). Specialised data stores, particularly suited to social networks. If it is important to know exactly the relationship between one piece of data and another, this may be the solution to your problem. Inherent value in the links; state can change. Unknown link.
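To illustrate why the links matter, here is a small sketch of a graph with labelled edges. The `Graph` class, the relationship names and the friends-of-friends query are assumptions chosen for the example, not the interface of any particular graph database.

```python
from collections import defaultdict

class Graph:
    """Toy graph: nodes are names, edges are (relationship, target) pairs."""

    def __init__(self):
        self._edges = defaultdict(list)

    def link(self, source, relationship, target):
        self._edges[source].append((relationship, target))

    def related(self, source, relationship):
        """Nodes reachable from source over one edge of the given kind."""
        return [target for kind, target in self._edges.get(source, [])
                if kind == relationship]

    def friends_of_friends(self, person):
        """People two 'friend' links away who are not already direct friends."""
        direct = set(self.related(person, "friend"))
        second = set()
        for friend in direct:
            second.update(self.related(friend, "friend"))
        return second - direct - {person}
```

The query is answered by walking edges from one node, rather than by joining whole tables, which is why this model suits social networks.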
  19. Distribute your database over a number of (lower-cost) machines -> ‘always on’ solution. Reduces downtime and, hence, risk.
  20. ID Quantique, a Swiss company, developed a machine that was used in the Swiss parliamentary election to pass the election results securely, using quantum cryptography. It works by using a technique called Quantum Key Distribution (QKD). QKD enables two parties to produce a shared random secret key which is known only to them. They can then use this key to encrypt and decrypt messages passed between them. Keys are generated using photons, which are produced using LEDs. These photons are then polarised using polarising filters. An important property of quantum cryptography is the ability to detect the presence of a third party who is attempting to eavesdrop on the transmission of the secret key, and who would thus be able to encrypt and decrypt messages themselves. Detection relies on a fundamental principle of quantum mechanics: the process of measuring a quantum system in general disturbs the system.
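The eavesdropper-detection idea can be imitated with a purely classical simulation. The sketch below assumes the well-known BB84 scheme, which these notes do not name, and models a photon as a bit plus a polarisation basis ("+" or "x"); measuring in the wrong basis gives a random result. It is an illustration of the principle, not a description of ID Quantique's product.

```python
import random

def measure(bit, prepared_basis, measuring_basis, rng):
    """Matching bases recover the bit; mismatched bases give a random bit."""
    return bit if prepared_basis == measuring_basis else rng.randint(0, 1)

def bb84_error_rate(n_photons, eavesdrop, seed=0):
    """Fraction of mismatched bits in the sifted key shared by sender and receiver."""
    rng = random.Random(seed)
    sender_key, receiver_key = [], []
    for _ in range(n_photons):
        bit = rng.randint(0, 1)
        sender_basis = rng.choice("+x")
        photon_bit, photon_basis = bit, sender_basis
        if eavesdrop:
            # The eavesdropper must measure, which re-prepares the photon
            # in her own randomly chosen basis and so disturbs it.
            eve_basis = rng.choice("+x")
            photon_bit = measure(photon_bit, photon_basis, eve_basis, rng)
            photon_basis = eve_basis
        receiver_basis = rng.choice("+x")
        received = measure(photon_bit, photon_basis, receiver_basis, rng)
        if receiver_basis == sender_basis:  # keep only matching-basis rounds
            sender_key.append(bit)
            receiver_key.append(received)
    if not sender_key:
        return 0.0
    errors = sum(a != b for a, b in zip(sender_key, receiver_key))
    return errors / len(sender_key)
```

With no eavesdropper the sifted keys agree exactly; with one, roughly a quarter of the compared bits disagree in this model, which is how the two parties can tell that the key exchange was observed.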