Structuring Big Data

Mark Wilson's lightning talk at the London Cloud Camp on 25 January 2012 about using linked data to integrate data silos in the world of big data

  1. Structuring big data. Mark Wilson, January 2012. #CloudCamp. UNCLASSIFIED. © Copyright 2012 Fujitsu Services Limited
  2. The problem with big data, and a solution
     The problem:
       • “New reference architectures will include both big data and enterprise data warehouses” [IDC, 19 January 2012]
       • Two worlds: structured and unstructured data (plus external data sources, documents stored in structured databases, etc.)
       • Silos create issues with management, integration, etc.
     The solution:
       • Linked data: a single reference point for all data in the enterprise
  3. Some history
     • Fixed structure
       • Difficult to change schema
     • Simple reporting capabilities
       • Complex to create new reports
  4. Some history
     • Completed transactions transferred to separate database for analysis
       • “Data warehouse”
     • Better reporting, data mining, etc.
       • Still highly structured
     • Data is historical
       • May be aggregated
  5. The smart guys
     • Real-time update of completed transactions
       • Transactions moved to data warehouse upon completion
       • Smaller transactional database
     • Allows alerts to be generated when specific conditions are met, and action to be taken
  6. A third “data silo”
     • Masses of unstructured/semi-structured data being processed in NoSQL databases
     • May or may not be transferred to/from structured databases
       • Time-consuming and inefficient
     • Three types of data, each with its own limitations and management considerations
  7. Data everywhere!
  8. Linked Data
     • Tie records together, even from separate data sets
     • We can express them as triples with a specific grammar
     • Build up a graph to show machine-readable data in human form
  9. Then add lots more data…
     • Source: http://lod-cloud.net/
     • Each node is itself another graph (zoom in)
  10. Aren’t we missing a trick?
      • Use linked data as the optimal reference source
        • Broker of all data sources
      • Single view on structured and unstructured data
        • Bring in external sources too
      • Mapping, interconnecting, indexing and feeding
        • In real time
      • Query linked data to derive new value from old
        • Infer relationships
        • Gain new insights
  11. About the author
      Mark Wilson, Strategy Manager, Fujitsu
      Mark is an analyst working within Fujitsu’s UK and Ireland Office of the CTO, providing thought leadership both internally and to customers, shaping business and technology strategy. He has 17 years' experience of working in the IT industry, 12 of which have been with Fujitsu. Mark has a background in leading large IT infrastructure projects with customers in the UK, mainland Europe and Australia. He has a degree in Computer Studies from the University of Glamorgan.
      Mark is also active in social media and won the Individual IT Professional (Male) award in the 2010 Computer Weekly IT Blog Awards. Mark may be found on Twitter @markwilsonit.
      If you would like to comment on the topics in this presentation, Mark would welcome your feedback, by email to mark.a.wilson@uk.fujitsu.com.
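
The linked data slide talks about tying records from separate data sets together as triples and building them up into a graph. The following is a minimal sketch of that idea in plain Python, not something taken from the talk: the two record sets, the identifiers (`cust42`, `ord7`) and the predicate names (`hasName`, `placedBy`, and so on) are all invented for illustration, and a real deployment would more likely use RDF with proper URIs.

```python
# Hypothetical records from two separate data sets (all values are invented).
crm_rows = [{"id": "cust42", "name": "Alice Example", "city": "London"}]
order_rows = [{"order": "ord7", "customer": "cust42", "product": "laptop"}]


def crm_to_triples(row):
    """Turn one customer record into (subject, predicate, object) triples."""
    subject = f"customer:{row['id']}"
    return [
        (subject, "hasName", row["name"]),
        (subject, "livesIn", f"place:{row['city']}"),
    ]


def order_to_triples(row):
    """Turn one order record into triples that point back at the customer."""
    subject = f"order:{row['order']}"
    return [
        (subject, "placedBy", f"customer:{row['customer']}"),
        (subject, "contains", f"product:{row['product']}"),
    ]


# Both data sets end up in one list of triples; the shared identifier
# "customer:cust42" is what ties the records together.
triples = []
for row in crm_rows:
    triples.extend(crm_to_triples(row))
for row in order_rows:
    triples.extend(order_to_triples(row))


def build_graph(triples):
    """Index triples by subject: each subject maps to its outgoing edges."""
    graph = {}
    for subject, predicate, obj in triples:
        graph.setdefault(subject, []).append((predicate, obj))
    return graph


graph = build_graph(triples)
```

Here `graph["order:ord7"]` lists an edge to `customer:cust42`, which in turn has its own edges, so following edges walks from the order data set into the customer data set.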

Notes

  • Everyone’s talking about big data, but the bulk of the conversation seems to focus on a new level of business intelligence and an ever-increasing volume of data organised into OLTP, OLAP and NoSQL silos. In this talk, Mark Wilson puts forward a view that the real value comes not from the big data itself but from how we can employ linked data concepts to integrate structured, unstructured and semi-structured data sets, and then use this unified data source to derive new value.
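
To make the "unified data source" idea concrete, here is a rough sketch, under stated assumptions, of merging triples from three silos and then querying and inferring over the result. It is not the author's implementation: the silo contents, the `bought` inference rule and the helper names (`match`, `infer_purchases`, `reviewed_own_purchase`) are hypothetical, and a production system would typically use a triple store with a query language such as SPARQL.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical triples exported from three silos (all values are invented).
oltp = [
    ("order:ord7", "placedBy", "customer:cust42"),
    ("order:ord7", "contains", "product:laptop"),
]
warehouse = [("product:laptop", "inCategory", "category:computing")]
nosql = [
    ("post:991", "writtenBy", "customer:cust42"),
    ("post:991", "mentions", "product:laptop"),
]

# A single view across structured and unstructured sources.
store = oltp + warehouse + nosql


def infer_purchases(triples):
    """Infer (customer, bought, product) from order placement and contents."""
    inferred = []
    for order, _, customer in match(triples, p="placedBy"):
        for _, _, product in match(triples, s=order, p="contains"):
            inferred.append((customer, "bought", product))
    return inferred


# Add the inferred relationships to the store.
store = store + infer_purchases(store)


def reviewed_own_purchase(triples):
    """Find customers who wrote about a product they also bought."""
    results = []
    for post, _, customer in match(triples, p="writtenBy"):
        for _, _, product in match(triples, s=post, p="mentions"):
            if match(triples, s=customer, p="bought", o=product):
                results.append((customer, product, post))
    return results


print(reviewed_own_purchase(store))
```

The final query spans the transactional and NoSQL silos at once, which is the kind of cross-silo question that is awkward to answer while the data sets stay separate.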