healthDB: A Primer
                                  Parag Patel, Shahid Shah



                                        Overview

healthDB is an incrementally scalable, fault-tolerant, ACID-compliant, key/value
document-based database designed to hold huge amounts of data with high read/write
throughput and high availability. It is based on an open source project called couchDB.
It is designed to be a data warehouse for the disparate systems that might be part of a
healthcare practice or hospital. Because its storage is semi-structured, it can hold
data of any type. The end user need not worry about structuring the data in the data
warehouse; the data will be stored in the warehouse for future extraction and
structuring as the user sees fit. Future versions of healthDB will help the end user
structure data from the semi-structured state it is in. Conceptually, one can think of
lazy evaluation in Scheme, Lisp, or Haskell: the work of structuring is deferred until
it is needed. Once the user knows the structure they want to put the data in, it will
be straightforward to implement that structure in healthDB.
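
The deferred-structuring idea can be illustrated with a minimal Python sketch. It is not
healthDB code: the in-memory list `raw_store` and the functions `store`,
`structure_later`, and `split_payload` are hypothetical names chosen for illustration,
and the sample records are made up.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical in-memory stand-in for the warehouse: records are kept as-is.
raw_store: List[Dict[str, str]] = []

def store(record: Dict[str, str]) -> None:
    """Store a record without imposing any structure on it."""
    raw_store.append(record)

def structure_later(
    shape: Callable[[Dict[str, str]], Dict[str, str]]
) -> Iterator[Dict[str, str]]:
    """Return a lazy generator: no record is reshaped until it is requested."""
    return (shape(record) for record in raw_store)

store({"src": "lab", "payload": "glucose=5.4"})
store({"src": "billing", "payload": "amount=120"})

def split_payload(record: Dict[str, str]) -> Dict[str, str]:
    """A structure chosen only after the data was already stored."""
    key, value = record["payload"].split("=", 1)
    return {"source": record["src"], key: value}

structured = list(structure_later(split_payload))
```

The records are written untouched, and a structure is applied only when a reader asks
for one, so a different `shape` function can be used later without rewriting the store.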

The design of the database embodies a “just works” philosophy. The database should
work as advertised. The end user should only have to worry about building their
application or service, not about the storage of their data or its performance. Most of
the work traditionally done by a DBA will be done by healthDB. All the end user has to
do is start it up initially and add servers as healthDB dictates in order to scale.
HealthDB will have a connector engine that connects to common interfaces such as HL7,
JMS, ODBC, and various delimited file formats, and that allows custom connectors to be
developed for unusual interfaces. In the future, HealthDB will support a health query
language (HQL), as either an external or an internal component (to be determined),
which will allow users to search all their structured and semi-structured data to find
the knowledge they seek in a health domain. HealthDB will come with some sample
applications to show end users the power it holds.

                                      Architecture

healthDB uses couchDB primarily to take care of the low-level storage. It communicates
with couchDB over encrypted REST (couchDB might need to be modified to support
encryption). The diagram below shows the basic outline of healthDB.

[Diagram: healthDB, made up of the healthDB engine and couchDB]
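
As a rough illustration of the engine talking to couchDB over REST, the sketch below
builds (but does not send) an HTTP PUT request for one JSON document, following
couchDB's `PUT /{db}/{docid}` convention. It assumes "encrypted REST" means HTTPS; the
host `couch.example.org`, the port, the database name `healthdb`, and the function
`build_put_document` are hypothetical.

```python
import json
from urllib.request import Request

def build_put_document(base_url: str, database: str, doc_id: str, body: dict) -> Request:
    """Build (but do not send) a PUT request that would store one JSON document."""
    url = f"{base_url}/{database}/{doc_id}"
    data = json.dumps(body).encode("utf-8")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Hypothetical server address and document; HTTPS is assumed for encryption.
request = build_put_document(
    "https://couch.example.org:6984",
    "healthdb",
    "document_lab_42",
    {"data": {"glucose": "5.4"}},
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running server; that
step is left out so the sketch runs without a network.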




The healthDB engine is the main control unit of healthDB. Its job is to ensure that the
user can store data seamlessly. It takes care of tasks such as automatic partitioning,
replication, encryption of the data, automatic load balancing, automatic system backup,
and error logging.



                                   healthDB engine

The healthDB engine is made up of various components: the partitioner, replicator,
connector engine, healthCPU, security, and healthDB API. (healthSearch will be an
additional component; it is undetermined whether it should sit in the healthDB engine
or in couchDB.) We shall look briefly at each component of the engine. Note: additional
components may be added, and components may be merged or deleted.

healthDB API

Provides the healthDB interface to the outside world. It will be the only way to
communicate with the database. Multiple APIs should be developed, such as Python,
Ruby, Java, C#, and REST.

connector engine

The connector engine converts data from a variety of formats into a format that
healthDB can understand, while preserving the data's integrity.
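
A minimal sketch of what a connector for a delimited file format might do is shown
below. The document layout (`fields` plus `checksum`), the function
`convert_delimited`, and the sample data are assumptions made for illustration; using a
SHA-256 checksum of each original line is one possible way to make integrity checkable,
not something the design specifies.

```python
import csv
import hashlib
import io
from typing import Dict, List

def convert_delimited(text: str, delimiter: str = "|") -> List[Dict[str, object]]:
    """Convert delimited text with a header row into documents.

    Each document keeps the parsed fields plus a SHA-256 checksum of the
    original line, so a later reader can check that nothing was altered.
    Assumes one record per line (no blank lines, no quoted line breaks).
    """
    lines = text.splitlines()
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    documents = []
    for raw_line, row in zip(lines[1:], reader):
        documents.append({
            "fields": dict(row),
            "checksum": hashlib.sha256(raw_line.encode("utf-8")).hexdigest(),
        })
    return documents

# Made-up sample input in a pipe-delimited format.
sample = "patient_id|test|result\n17|glucose|5.4\n18|sodium|140"
docs = convert_delimited(sample)
```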

healthCPU

This is the brain of the healthDB database. It controls when healthDB should replicate
data and when it should partition data. It looks up data in the datastore (couchDB)
and formats, structures, and semi-structures data that will be stored in the datastore.
It ensures that data is HIPAA compliant by having the security component encrypt it.
The healthCPU also tracks which nodes are alive and what their status is, performs
load balancing, and filters out data based on the user's permissions.
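
The node-tracking and load-balancing duties could be sketched as follows. This is an
illustration under stated assumptions, not the healthCPU design: the `NodeRegistry`
class, the heartbeat mechanism, the 30-second timeout, and the least-loaded selection
rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class NodeRegistry:
    """Hypothetical registry that tracks heartbeats and load for each node."""
    timeout: float = 30.0  # seconds without a heartbeat before a node counts as down
    last_seen: Dict[str, float] = field(default_factory=dict)
    load: Dict[str, int] = field(default_factory=dict)

    def heartbeat(self, node: str, now: float, load: int) -> None:
        """Record that a node reported in at time `now` with the given load."""
        self.last_seen[node] = now
        self.load[node] = load

    def alive(self, now: float) -> List[str]:
        """Names of nodes whose last heartbeat is within the timeout, sorted."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )

    def pick_node(self, now: float) -> str:
        """Least-loaded live node; ties are broken by name."""
        candidates = self.alive(now)
        if not candidates:
            raise RuntimeError("no live nodes")
        return min(candidates, key=lambda node: (self.load[node], node))

registry = NodeRegistry()
registry.heartbeat("node-a", now=0.0, load=5)
registry.heartbeat("node-b", now=20.0, load=2)
registry.heartbeat("node-c", now=25.0, load=9)
```

At time 40.0 the node `node-a` has been silent for longer than the timeout, so only the
other two nodes are candidates for new work.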

security

This performs encryption and authentication, and tells the healthCPU whether or not the
user has permission to access certain data.
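
The division of labour between the two components might look like the sketch below: the
security side answers a yes/no permission question, and the healthCPU side uses that
answer to filter results. The permission table, the user names, the `source` field, and
both function names are hypothetical; encryption and authentication are left out.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical permission table: user -> source systems that user may read.
PERMISSIONS: Dict[str, Set[str]] = {
    "dr_lee": {"lab", "radiology"},
    "billing_clerk": {"billing"},
}

def is_permitted(user: str, document: Dict[str, str]) -> bool:
    """Security side: say whether the user may see this document."""
    return document["source"] in PERMISSIONS.get(user, set())

def filter_for_user(user: str, documents: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """healthCPU side: drop every document the security check rejects."""
    return [doc for doc in documents if is_permitted(user, doc)]

# Made-up documents from three source systems.
documents = [
    {"source": "lab", "value": "glucose=5.4"},
    {"source": "billing", "value": "amount=120"},
    {"source": "radiology", "value": "chest x-ray"},
]
visible = filter_for_user("dr_lee", documents)
```

An unknown user has an empty permission set and therefore sees nothing, which keeps the
default on the safe side.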

replicator

Creates a new database replica as directed by the healthCPU.

partitioner

Creates new partitions of the data and places the data on the server(s) the healthCPU
specifies.
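
How the healthCPU decides on a server is not specified here. The sketch below assumes,
purely for illustration, a stable hash of the document name taken modulo the number of
servers; `choose_server`, `partition`, and the server names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List

def choose_server(doc_id: str, servers: List[str]) -> str:
    """Map a document name to a server with a stable hash (assumed policy)."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

def partition(doc_ids: Iterable[str], servers: List[str]) -> Dict[str, List[str]]:
    """Group document names by the server chosen for each of them."""
    placement: Dict[str, List[str]] = {server: [] for server in servers}
    for doc_id in doc_ids:
        placement[choose_server(doc_id, servers)].append(doc_id)
    return placement

servers = ["server-1", "server-2", "server-3"]
doc_ids = [f"document_lab_{n}" for n in range(10)]
placement = partition(doc_ids, servers)
```

Modulo placement is simple, but it moves most documents whenever a server is added; a
scheme such as consistent hashing reduces that movement and may suit incremental
scaling better.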

The diagram below shows the healthDB engine.

[Diagram: healthDB engine, containing the healthDB API, connector engine, healthCPU,
security, replicator, and partitioner]

Storage Structure

The healthCPU will store unstructured data as follows. It will have a series of
documents that keep track of data from various sources. Each source will have its own
document(s). Each such document will contain (key, value) pairs of the form
(hash(document_sourcesystem_objectID), document_sourcesystem_objectID). A record
from a source system will be stored in its own separate document, which will hold system
values such as the last modified date along with the actual data itself. The record will
be called a DBobject. The document name will be used to identify the DBobject.
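
The layout described above can be sketched with plain dictionaries. The hash function is
not specified in the text, so SHA-256 is an assumption; the field names `system`,
`last_modified`, and `data`, the function names, and the sample record are hypothetical
as well.

```python
import hashlib
from typing import Dict

def document_name(source_system: str, object_id: str) -> str:
    """Name of the form document_sourcesystem_objectID."""
    return f"document_{source_system}_{object_id}"

def name_hash(name: str) -> str:
    """Hash of a document name; SHA-256 is an assumed choice."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()

# document name -> DBobject (system values plus the actual data)
dbobjects: Dict[str, dict] = {}
# source system -> tracking document of {hash(name): name} pairs
source_index: Dict[str, Dict[str, str]] = {}

def put_record(source_system: str, object_id: str, data: dict, last_modified: str) -> str:
    """Store one source record as a DBobject and list it in its source's index."""
    name = document_name(source_system, object_id)
    dbobjects[name] = {"system": {"last_modified": last_modified}, "data": data}
    source_index.setdefault(source_system, {})[name_hash(name)] = name
    return name

# Made-up record from a hypothetical "lab" source system.
name = put_record("lab", "42", {"glucose": "5.4"}, "2009-01-15T10:30:00Z")
```

Looking up a record then takes two steps: find its name in the source's tracking
document, and fetch the DBobject stored under that name.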

Other entities like DBobject can be created. We might have a person entity, which
would be identified by document_person_personID. This is very similar to the DBobject
concept, in which a series of documents contain references or indexes to the actual
records.
