SlideShare una empresa de Scribd logo
1 de 3
Descargar para leer sin conexión
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 32 | P a g e
Privacy Preservation of Data in Data Mining
Prachi Kohale*, Sheetal Girase**
*(Department of Inforamation Technology, MIT, PUNE PUNE University)
** (Department of Information Technology,MIT PUNE, PUNE University)
ABSTRACT
For data privacy against an un-trusted party, Anonymization is a widely used technique capable of preserving
attribute values and supporting data mining algorithms. The technique deals with Anonymization methods for
users in a domain-driven data mining outsourcing.
Several issues emerge when anonymization is applied in a real world outsourcing scenario. The majority of
methods have focused on the traditional data mining approach; therefore they do not implement domain
knowledge nor optimize data for domain-driven usage. So, existing techniques are mostly non-interactive in
nature, providing little control to users.
To successfully obtain optimal data privacy and actionable patterns in a real world setting, these concerns need
to be addressed. The technique is based on anonymization framework for users in a domain-driven data mining
outsourcing scenario. The framework involves several components designed to anonymize data while preserving
meaningful or actionable patterns that can be discovered after mining.
In the medical field, for enhancing the quality and importance of healthcare, data mining plays an important
role. For example classification analysis on patients data to determine their probability of having heart disease,
so, appropriate actions can be taken, thus enhancing treatment, saving time and cost. It is helpful to anonymize
data while preserving meaningful of actionable patterns that can be found after mining.
Keywords: DDDM, GT, BT
I. INTRODUCTION
In the real world data mining is constraint based
as opposed to traditional data mining which is data
driven trial and error process. In traditional data
mining knowledge discovery usually performed
without domain intelligence, thus affecting model or
actionability for real business needs [1]. So there
exists gap between objective and business goals as
well as outputs and business expectations. Domain
driven data mining aims to bridge the gap between
these by involving domain experts and knowledge to
obtain actionable patterns or models applicable to
real world requirements.[7] Data anonymization in
domain driven data mining uses methods like k
anonymity and recent dynamic anonymization
methods.
II. ANONYMIZATION
The fig.1 shows anonymization in domain driven
data mining outsourcing. It involves several
components designed to anonymize data while
preserving meaningful and actionable patterns that
can be discovered after mining [8]. In contrast to
traditional data mining this framework integrates the
knowledge to retain values after anonymization.
Domain driven data mining DDDM is to make
system deliver business friendly and decision making
rules and actions that are of solid technical
significance as well [2]. Its having effective
involvement of domain intelligence, network
intelligence, Data intelligence, human intelligence,
social intelligence[9].
fig 1 anonymization in domain driven data mining
outsourcing.
III. ANGEL TECHNIQUE
Generalization is well known method for privacy
preservation of data. It has several drawbacks such as
heavy information loss, difficulty of supporting
marginal publication. [3]To overcome this Angel, a
new anonymization technique that is effective as
generalization in privacy protection but able to retain
significantly more information in microdata. It is
applicable to any principle l-diversity, t-closeness and
to overcome the drawback of several methods.
Compared to traditional generalization it ensures
same privacy guarantee preserve significantly more
information in microdata.
Table 1 ANGEL publication 1
Anonymi
zation
(DDDM)
Anonymi
zation
(Data
Protectio
n)
Data
Privacy
(Patient
data)
Data
Mining
(outsour
cing)
RESEARCH ARTICLE OPEN ACCESS
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 33 | P a g e
Suppose that we want to publish the microdata of
Table 1, conforming to 2-diversity. ANGEL first
divides the table into batches:
Batch 1: {Alan, Carrie},
Batch 2: {Bob, Daisy},
Batch 3: {Eddy, Gloria},
Batch 4: {Frank, Helena}.
Observe that each batch obeys 2-diversity: it
contains one pneumonia- and one bronchitis-tuple.
ANGEL creates a batch table (BT), as in Table 3.1a,
summarizing the Disease-statistics of each batch. For
example, the first row of Table 3.1a states that
exactly one tuple in Batch 1 carries pneumonia.
Then, ANGEL creates another partitioning into
buckets (which do not have to be 2-diverse).
Bucket 1: {Alan, Bob},
Bucket 2: {Carrie, Daisy},
Bucket 3: {Eddy, Frank},
Bucket 4: {Gloria, Helena}
Table 2 Microdata
Name Age Sex Zip Disease
Allan 21 M 10k Pneumonia
Bob 23 M 58k Flu
Carrie 58 F 12k Bronchitis
Daisy 60 F 60k Pneumonia
Eddy 70 M 78k Flu
Frank 72 M 80k Bronchitis
Table 3 2-diverse generalization
Age Sex Zipcod
e
Disease
[21,
23]
M [10k,
58k]
Pneumon
ia
[21,
23]
M [10k,
58k]
Flu
[58,
60]
F [12k,
60k]
Bronchiti
s
[58,
60]
F [12k,
60k]
Pneumon
ia
[70,
72]
M [78k,
80k]
Flu
[70,
72]
M [78k,
80k]
Bronchiti
s
Zipcode Disease
[10k,
12k]
Pneumon
ia
[10k, Bronchiti
12k] s
[58k,
60k]
Flu
[58k,
60k]
Pneumon
ia
[78k,
80k]
Flu
[78k,
80k]
Bronchiti
s
Finally, ANGEL generalizes the tuples of each
bucket into the same form, producing a generalized
table (GT). Table 3 demonstrates the GT. Note that
GT does not include the Disease attribute, but stores,
for each tuple of the microdata, the ID of the batch
containing it. For instance, the first tuple of Table has
a Batch-ID 1, because its owner Alan belongs to
Batch 1. Tables 3.a and 3.b are the final relations
released by ANGEL[10].
IV. CONCLUSION
Angelization is new anonymization technic for
privacy preserving publication which is applicable to
any monotonic anonymization principle. It produces
anonymized relation that achieve privacy guarantee
as conventional generalization but permits much
more accurate reconstruction of original data
distribution. It offers simple and rigorous solution
which was difficult issue.
V. ACKNOWLEDGEMENTS
I would like to take opportunity to express my
humble gratitude to my guide Prof. Shital P. Girase,
Asst. Prof. Maharashtra Academy of Engineering &
Educational Research’s, Maharashtra Institute of
Technology, Pune, who gave me assistance with their
experienced knowledge and whatever required at all
stages of my project.
References
[1] Longbing Cao “DDDM challenges and
prospectus “ june 2010,755-769
[2] Longbing Cao, Philip S. Yu, Chengqi
Zhang, Yanchang Zhao “Domain Driven
Data Mining”
[3] G. Aggarwal, T. Feder, K. Kenthapadi, S.
Khuller, R. Panigrahy, D. Thomas, and A.
Zhu. “Achieving anonymity via clustering”.
In PODS, 2006.
[4] Josep Domingo-Ferrer and Vicenc¸ Torra “A
Critique of k-Anonymity and Some of Its
Enhancements”. 3 International
Conference on Availability, Reliability and
Security
[5] V.Narmada “An enhanced security
algorithm for distributed databases in
privacy preserving data ases” (IJAEST)
INTERNATIONAL JOURNAL OF
Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com
ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34
www.ijera.com 34 | P a g e
ADVANCED ENGINEERING SCIENCES
AND TECHNOLOGIES”.
[6] R. Agrawal and R. Srikant. “Privacy-
preserving data mining”.
[7] J. Han and M. Kamber Morgan Kaufmann
“Data Mining: Concepts and
Techniques”.2000.
[8] Brian, C.S. Loh and Patrick, H.H. Then l.
“Ontology-Enhanced Interactive
Anonymization in Domain-Driven Data
Mining Outsourcing” 2010 Second
International Symposium on Data, Privacy,
and E-Commerce
[9] CHARU C. AGGARWAL,PHILIP S. YU
“PRIVACY-PRESERVING DATA
MINING:MODELS AND ALGORITHM”
IBM T. J. Watson Research Center,
Hawthorne, NY 10532
[10] Yufei Tao1 Hekang Chen2 Xiaokui ”
Enhancing the Utility of Generalization for
Privacy Preserving data publishing”

Más contenido relacionado

La actualidad más candente

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Kato Mivule
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Miningidescitation
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining14894
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109IJRAT
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewKato Mivule
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...IJDKP
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesEditor IJMTER
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsIJERA Editor
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMIJNSA Journal
 
78201919
7820191978201919
78201919IJRAT
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data miningeSAT Publishing House
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data MiningVrushali Malvadkar
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_pptSagar Verma
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 

La actualidad más candente (19)

Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...Implementation of Data Privacy and Security in an Online Student Health Recor...
Implementation of Data Privacy and Security in an Online Student Health Recor...
 
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data MiningPerformance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
Performance Analysis of Hybrid Approach for Privacy Preserving in Data Mining
 
Using Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data MiningUsing Randomized Response Techniques for Privacy-Preserving Data Mining
Using Randomized Response Techniques for Privacy-Preserving Data Mining
 
Paper id 212014109
Paper id 212014109Paper id 212014109
Paper id 212014109
 
Utilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an OverviewUtilizing Noise Addition For Data Privacy, an Overview
Utilizing Noise Addition For Data Privacy, an Overview
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
1699 1704
1699 17041699 1704
1699 1704
 
Cluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for DatabasesCluster Based Access Privilege Management Scheme for Databases
Cluster Based Access Privilege Management Scheme for Databases
 
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data SetsPrivacy Preservation and Restoration of Data Using Unrealized Data Sets
Privacy Preservation and Restoration of Data Using Unrealized Data Sets
 
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREMPRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
PRIVACY PRESERVING DATA MINING BY USING IMPLICIT FUNCTION THEOREM
 
78201919
7820191978201919
78201919
 
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and ApproachesA Review Study on the Privacy Preserving Data Mining Techniques and Approaches
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches
 
Hy3414631468
Hy3414631468Hy3414631468
Hy3414631468
 
Privacy preservation techniques in data mining
Privacy preservation techniques in data miningPrivacy preservation techniques in data mining
Privacy preservation techniques in data mining
 
Privacy Preserving Data Mining
Privacy Preserving Data MiningPrivacy Preserving Data Mining
Privacy Preserving Data Mining
 
Privacy preserving dm_ppt
Privacy preserving dm_pptPrivacy preserving dm_ppt
Privacy preserving dm_ppt
 
A review on privacy preservation in data mining
A review on privacy preservation in data miningA review on privacy preservation in data mining
A review on privacy preservation in data mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
130509
130509130509
130509
 

Destacado

Introduccion a la Gerencia
Introduccion a la Gerencia Introduccion a la Gerencia
Introduccion a la Gerencia betdana2010
 
Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Allan Fraga
 
7 สามัญ อังกฤษ
7 สามัญ อังกฤษ7 สามัญ อังกฤษ
7 สามัญ อังกฤษฝฝ' ฝน
 
Blago sa tavana
Blago sa tavanaBlago sa tavana
Blago sa tavanaLimundo
 
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Gianluigi Spagnoli
 
Peixes gigantes e perigosos
Peixes gigantes e perigososPeixes gigantes e perigosos
Peixes gigantes e perigososvaltermn
 
Centro Comercial Virtual
Centro Comercial VirtualCentro Comercial Virtual
Centro Comercial Virtualjggc
 
Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)gorodche
 

Destacado (20)

Tigana trecho
Tigana trechoTigana trecho
Tigana trecho
 
Taller en Juan Cacharpa
Taller en Juan CacharpaTaller en Juan Cacharpa
Taller en Juan Cacharpa
 
Introduccion a la Gerencia
Introduccion a la Gerencia Introduccion a la Gerencia
Introduccion a la Gerencia
 
Aj04602248254
Aj04602248254Aj04602248254
Aj04602248254
 
Trenajer 8 9
Trenajer 8 9Trenajer 8 9
Trenajer 8 9
 
S13 modulo3 s8- primaria matematica
S13 modulo3 s8- primaria matematicaS13 modulo3 s8- primaria matematica
S13 modulo3 s8- primaria matematica
 
Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014Índice FIPE ZAP divulgação Março de 2014
Índice FIPE ZAP divulgação Março de 2014
 
7 สามัญ อังกฤษ
7 สามัญ อังกฤษ7 สามัญ อังกฤษ
7 สามัญ อังกฤษ
 
Blago sa tavana
Blago sa tavanaBlago sa tavana
Blago sa tavana
 
Ap04605296305
Ap04605296305Ap04605296305
Ap04605296305
 
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
Multicanalita e-social-media-maurizio alberti-ecircle-20feb2013
 
Am04606239243
Am04606239243Am04606239243
Am04606239243
 
Peixes gigantes e perigosos
Peixes gigantes e perigososPeixes gigantes e perigosos
Peixes gigantes e perigosos
 
Erroma
ErromaErroma
Erroma
 
Centro Comercial Virtual
Centro Comercial VirtualCentro Comercial Virtual
Centro Comercial Virtual
 
Aj04606216221
Aj04606216221Aj04606216221
Aj04606216221
 
Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)Глянец №14 (декабрь-январь 2012-13)
Глянец №14 (декабрь-январь 2012-13)
 
Ac04605194204
Ac04605194204Ac04605194204
Ac04605194204
 
Nghi ve me
Nghi ve meNghi ve me
Nghi ve me
 
H046064750
H046064750H046064750
H046064750
 

Similar a F046043234

Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachPrivacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachIRJET Journal
 
78201919
7820191978201919
78201919IJRAT
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining TechniquesIJMER
 
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...IJSRD
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET Journal
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishingijcisjournal
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...acijjournal
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Miningijujournal
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacydbpublications
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope IJCSEIT Journal
 
Big Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsBig Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsIRJET Journal
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningIJNSA Journal
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...IJECEIAES
 
Privacy preserving on
Privacy preserving onPrivacy preserving on
Privacy preserving onPoonam Hajare
 
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS cscpconf
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionIOSR Journals
 

Similar a F046043234 (20)

Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining ApproachPrivacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
Privacy Preserving Data Mining Using Inverse Frequent ItemSet Mining Approach
 
78201919
7820191978201919
78201919
 
A Comparative Study on Privacy Preserving Datamining Techniques
A Comparative Study on Privacy Preserving Datamining  TechniquesA Comparative Study on Privacy Preserving Datamining  Techniques
A Comparative Study on Privacy Preserving Datamining Techniques
 
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
A Survey Paper on an Integrated Approach for Privacy Preserving In High Dimen...
 
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...IRJET-  	  Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
IRJET- Study Paper on: Ontology-based Privacy Data Chain Disclosure Disco...
 
A survey on privacy preserving data publishing
A survey on privacy preserving data publishingA survey on privacy preserving data publishing
A survey on privacy preserving data publishing
 
Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...Data Transformation Technique for Protecting Private Information in Privacy P...
Data Transformation Technique for Protecting Private Information in Privacy P...
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
A Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data MiningA Review on Privacy Preservation in Data Mining
A Review on Privacy Preservation in Data Mining
 
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata PrivacyTwo-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
Two-Phase TDS Approach for Data Anonymization To Preserving Bigdata Privacy
 
The Survey of Data Mining Applications And Feature Scope
The Survey of Data Mining Applications  And Feature Scope The Survey of Data Mining Applications  And Feature Scope
The Survey of Data Mining Applications And Feature Scope
 
Data Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope SurveyData Mining Applications And Feature Scope Survey
Data Mining Applications And Feature Scope Survey
 
Big Data: Privacy and Security Aspects
Big Data: Privacy and Security AspectsBig Data: Privacy and Security Aspects
Big Data: Privacy and Security Aspects
 
130509
130509130509
130509
 
A Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved MiningA Frame Work for Ontological Privacy Preserved Mining
A Frame Work for Ontological Privacy Preserved Mining
 
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
Misusability Measure Based Sanitization of Big Data for Privacy Preserving Ma...
 
Privacy preserving on
Privacy preserving onPrivacy preserving on
Privacy preserving on
 
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
DATA SCIENCE METHODOLOGY FOR CYBERSECURITY PROJECTS
 
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data DistortionMultilevel Privacy Preserving by Linear and Non Linear Data Distortion
Multilevel Privacy Preserving by Linear and Non Linear Data Distortion
 

Último

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Último (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

F046043234

  • 1. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 32 | P a g e Privacy Preservation of Data in Data Mining Prachi Kohale*, Sheetal Girase** *(Department of Inforamation Technology, MIT, PUNE PUNE University) ** (Department of Information Technology,MIT PUNE, PUNE University) ABSTRACT For data privacy against an un-trusted party, Anonymization is a widely used technique capable of preserving attribute values and supporting data mining algorithms. The technique deals with Anonymization methods for users in a domain-driven data mining outsourcing. Several issues emerge when anonymization is applied in a real world outsourcing scenario. The majority of methods have focused on the traditional data mining approach; therefore they do not implement domain knowledge nor optimize data for domain-driven usage. So, existing techniques are mostly non-interactive in nature, providing little control to users. To successfully obtain optimal data privacy and actionable patterns in a real world setting, these concerns need to be addressed. The technique is based on anonymization framework for users in a domain-driven data mining outsourcing scenario. The framework involves several components designed to anonymize data while preserving meaningful or actionable patterns that can be discovered after mining. In the medical field, for enhancing the quality and importance of healthcare, data mining plays an important role. For example classification analysis on patients data to determine their probability of having heart disease, so, appropriate actions can be taken, thus enhancing treatment, saving time and cost. It is helpful to anonymize data while preserving meaningful of actionable patterns that can be found after mining. Keywords: DDDM, GT, BT I. INTRODUCTION In the real world data mining is constraint based as opposed to traditional data mining which is data driven trial and error process. In traditional data mining knowledge discovery usually performed without domain intelligence, thus affecting model or actionability for real business needs [1]. So there exists gap between objective and business goals as well as outputs and business expectations. Domain driven data mining aims to bridge the gap between these by involving domain experts and knowledge to obtain actionable patterns or models applicable to real world requirements.[7] Data anonymization in domain driven data mining uses methods like k anonymity and recent dynamic anonymization methods. II. ANONYMIZATION The fig.1 shows anonymization in domain driven data mining outsourcing. It involves several components designed to anonymize data while preserving meaningful and actionable patterns that can be discovered after mining [8]. In contrast to traditional data mining this framework integrates the knowledge to retain values after anonymization. Domain driven data mining DDDM is to make system deliver business friendly and decision making rules and actions that are of solid technical significance as well [2]. Its having effective involvement of domain intelligence, network intelligence, Data intelligence, human intelligence, social intelligence[9]. fig 1 anonymization in domain driven data mining outsourcing. III. ANGEL TECHNIQUE Generalization is well known method for privacy preservation of data. It has several drawbacks such as heavy information loss, difficulty of supporting marginal publication. [3]To overcome this Angel, a new anonymization technique that is effective as generalization in privacy protection but able to retain significantly more information in microdata. It is applicable to any principle l-diversity, t-closeness and to overcome the drawback of several methods. Compared to traditional generalization it ensures same privacy guarantee preserve significantly more information in microdata. Table 1 ANGEL publication 1 Anonymi zation (DDDM) Anonymi zation (Data Protectio n) Data Privacy (Patient data) Data Mining (outsour cing) RESEARCH ARTICLE OPEN ACCESS
  • 2. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 33 | P a g e Suppose that we want to publish the microdata of Table 1, conforming to 2-diversity. ANGEL first divides the table into batches: Batch 1: {Alan, Carrie}, Batch 2: {Bob, Daisy}, Batch 3: {Eddy, Gloria}, Batch 4: {Frank, Helena}. Observe that each batch obeys 2-diversity: it contains one pneumonia- and one bronchitis-tuple. ANGEL creates a batch table (BT), as in Table 3.1a, summarizing the Disease-statistics of each batch. For example, the first row of Table 3.1a states that exactly one tuple in Batch 1 carries pneumonia. Then, ANGEL creates another partitioning into buckets (which do not have to be 2-diverse). Bucket 1: {Alan, Bob}, Bucket 2: {Carrie, Daisy}, Bucket 3: {Eddy, Frank}, Bucket 4: {Gloria, Helena} Table 2 Microdata Name Age Sex Zip Disease Allan 21 M 10k Pneumonia Bob 23 M 58k Flu Carrie 58 F 12k Bronchitis Daisy 60 F 60k Pneumonia Eddy 70 M 78k Flu Frank 72 M 80k Bronchitis Table 3 2-diverse generalization Age Sex Zipcod e Disease [21, 23] M [10k, 58k] Pneumon ia [21, 23] M [10k, 58k] Flu [58, 60] F [12k, 60k] Bronchiti s [58, 60] F [12k, 60k] Pneumon ia [70, 72] M [78k, 80k] Flu [70, 72] M [78k, 80k] Bronchiti s Zipcode Disease [10k, 12k] Pneumon ia [10k, Bronchiti 12k] s [58k, 60k] Flu [58k, 60k] Pneumon ia [78k, 80k] Flu [78k, 80k] Bronchiti s Finally, ANGEL generalizes the tuples of each bucket into the same form, producing a generalized table (GT). Table 3 demonstrates the GT. Note that GT does not include the Disease attribute, but stores, for each tuple of the microdata, the ID of the batch containing it. For instance, the first tuple of Table has a Batch-ID 1, because its owner Alan belongs to Batch 1. Tables 3.a and 3.b are the final relations released by ANGEL[10]. IV. CONCLUSION Angelization is new anonymization technic for privacy preserving publication which is applicable to any monotonic anonymization principle. It produces anonymized relation that achieve privacy guarantee as conventional generalization but permits much more accurate reconstruction of original data distribution. It offers simple and rigorous solution which was difficult issue. V. ACKNOWLEDGEMENTS I would like to take opportunity to express my humble gratitude to my guide Prof. Shital P. Girase, Asst. Prof. Maharashtra Academy of Engineering & Educational Research’s, Maharashtra Institute of Technology, Pune, who gave me assistance with their experienced knowledge and whatever required at all stages of my project. References [1] Longbing Cao “DDDM challenges and prospectus “ june 2010,755-769 [2] Longbing Cao, Philip S. Yu, Chengqi Zhang, Yanchang Zhao “Domain Driven Data Mining” [3] G. Aggarwal, T. Feder, K. Kenthapadi, S. Khuller, R. Panigrahy, D. Thomas, and A. Zhu. “Achieving anonymity via clustering”. In PODS, 2006. [4] Josep Domingo-Ferrer and Vicenc¸ Torra “A Critique of k-Anonymity and Some of Its Enhancements”. 3 International Conference on Availability, Reliability and Security [5] V.Narmada “An enhanced security algorithm for distributed databases in privacy preserving data ases” (IJAEST) INTERNATIONAL JOURNAL OF
  • 3. Prachi Kohale Int. Journal of Engineering Research and Applications www.ijera.com ISSN : 2248-9622, Vol. 4, Issue 6( Version 4), June 2014, pp.32-34 www.ijera.com 34 | P a g e ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES”. [6] R. Agrawal and R. Srikant. “Privacy- preserving data mining”. [7] J. Han and M. Kamber Morgan Kaufmann “Data Mining: Concepts and Techniques”.2000. [8] Brian, C.S. Loh and Patrick, H.H. Then l. “Ontology-Enhanced Interactive Anonymization in Domain-Driven Data Mining Outsourcing” 2010 Second International Symposium on Data, Privacy, and E-Commerce [9] CHARU C. AGGARWAL,PHILIP S. YU “PRIVACY-PRESERVING DATA MINING:MODELS AND ALGORITHM” IBM T. J. Watson Research Center, Hawthorne, NY 10532 [10] Yufei Tao1 Hekang Chen2 Xiaokui ” Enhancing the Utility of Generalization for Privacy Preserving data publishing”