Missouri University of Science and Technology
Ethical Issues with Customer Data Collection
Submitted To:
Dr. David Spurlock
Submitted By:
Tatiana Cardona
Sulagna Mandal
Hari Nadathur
Pranav Godse
ABSTRACT
The paper discusses topics related to:
• Data mining: the collection and analysis of large amounts of data.
Data mining is a branch of computer science that processes large-scale data to extract
previously unknown, interesting patterns. The objective is to obtain the relevant
information from large volumes of data.
• Ethical issues related to data mining and how they affect web miners and web users.
Data mining poses a threat to information privacy when an individual's or group's personal
information is freely available. Individuals should be told the purpose of the
data collection, who receives the data, and what its implications are. Ethical
data mining is, however, acceptable: it refers to the use of individual data in accordance
with privacy rules and set standards.
• The fine line between ethical and unethical uses of data mining.
Although the impact of web-data mining should concern every web user, there is no reason
for people to panic. The technique is not yet being used to its full potential, and
there is no clear indication that web data is being misused to an extent that offends
people.
DATA MINING
Data mining involves six stages:
1. Detection: In this stage, any noticeable difference in the data patterns is detected. This stage is
crucial, since the quality of the data collected affects the output.
2. Dependency modelling: Relationships between variables are found, such as the buying trends
of a particular age group, the effect of tax reductions on savings, or the effect of discounts on the
sale of goods.
3. Clustering: Clustering is the process of partitioning a set of data (or objects) into a set of meaningful
sub-classes, called clusters. This helps in understanding the natural grouping or structure in a data
set.
4. Classification: Classification assigns each data item to one of a set of predefined classes. In data
security, "data classification" also refers to grouping data by its level of sensitivity, which helps
determine what baseline security controls are appropriate for safeguarding that data.
[Figure: DATA MINING]
• Anonymous data collected (generally used to recognize group patterns).
• Data collected from a user (generally used to characterize his or her behavior).
• CONTENT MINING: Information collected through web navigation history (e.g. through cookies).
• STRUCTURE MINING: Information collected by identifying the IP address and relating it to the
provider company (ISP) in order to obtain more specific data (e.g. names, address, phone number).
• USAGE MINING: Information collected when the user gives information in order to access certain
benefits (e.g. login information).
5. Regression: A statistical approach to forecasting change in a dependent variable (sales revenue,
for example) on the basis of change in one or more independent variables (such as population and income).
6. Summarization: Summarization is a key data mining concept that involves techniques for
finding a compact description of a dataset. Data summarization gives data
consumers a generalized view of disparate bulks of data.
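As a rough illustration of stages 3, 5 and 6 above, the following is a minimal Python sketch that uses only the standard library. The function names (kmeans_1d, fit_line, summarize) and all of the data are hypothetical and are not taken from the paper. The clustering routine is a simple one-dimensional k-means, which is only one of many possible clustering methods.

```python
from statistics import mean, median


def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (1-D k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to the nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one independent variable."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def summarize(values):
    """Compact description of a list of numbers."""
    return {"count": len(values), "min": min(values), "max": max(values),
            "mean": mean(values), "median": median(values)}


# Hypothetical customer ages and two starting centers (clustering).
ages = [18, 20, 22, 45, 47, 50]
centers, clusters = kmeans_1d(ages, centers=[18, 50])

# Hypothetical income (independent) and sales revenue (dependent), in thousands.
income = [30, 40, 50, 60, 70]
revenue = [110, 130, 150, 170, 190]
intercept, slope = fit_line(income, revenue)
forecast = intercept + slope * 80  # forecast revenue for an income of 80

# Summarization of the revenue figures.
summary = summarize(revenue)
print(clusters, round(forecast, 1), summary)
```

The data is deliberately tiny so that the grouping and the fitted line can be checked by hand; real data mining tools apply the same ideas to far larger and noisier data.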
METHODS OF DATA MINING
1. WEB TRACKING
Web tracking refers to companies tracking consumers' behavior across the Web without their
consent and without providing them any recognizable value.
Behavioral audience targeting, like content targeting, sponsored advertorials, pre-rolls and every other
ad product available in digital environments, serves content creators. Keeping content creators in
business serves consumers, giving them a myriad of digital environments to explore.
In many cases, however, advertisers misuse behavioral data, which is unethical. Third-party
companies with no direct relationship to consumers track them across numerous
websites, create profiles of their behavior and profit from information that they have not asked
permission to collect. This is what we call an ethically problematic issue.
Reasons for web tracking:
• To boost marketing capabilities
• Law enforcement and intelligence
• Web analytics
2. SURVEYING
A survey is a research method for collecting information from a selected group of people using
standardized questionnaires or interviews. There are numerous survey research methods to obtain
customer preferences and likes, such as:
• In-person interviews
• Telephone interviews
• Online questionnaires
BIG DATA PERSPECTIVES
Large collections of data can be viewed from several different perspectives.
As a technological innovation: To accomplish its purpose, data mining must effectively address
three concerns. First, Volume: progress in storage alternatives. Second, Velocity: easy access in
real time. Third, Variety: current data is mostly unstructured, so its exact use is difficult to establish
given the large amount of data and the many possibilities for analysis.
As commercial value: The use of data generates value through the identification of complex patterns in
real time (the foundation of market research) and the prediction of quality issues.
As a matter of privacy: Protecting privacy requires a balance between the use of data and the
following factors. Collection: sets of data analyzed independently may carry no privacy implications,
but combined they can threaten privacy. Security: personal data can be hacked and stolen. High volume and
velocity: data has to be analyzed autonomously, leaving no time to wait for consent. Significance:
organisations are far from being able to use all the data they collect.
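The first factor above, that sets of data which are harmless on their own can threaten privacy once combined, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the records, the field names (zip, birth_year, gender) and the link_records helper are invented for illustration and do not come from the paper.

```python
def link_records(anonymous_rows, public_rows, keys):
    """Attach names to 'anonymous' rows whose key fields match exactly one public row."""
    names_by_key = {}
    for row in public_rows:
        key = tuple(row[k] for k in keys)
        names_by_key.setdefault(key, []).append(row["name"])

    linked = []
    for row in anonymous_rows:
        candidates = names_by_key.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the person
            linked.append((candidates[0], row["purchase"]))
    return linked


# Hypothetical purchase log with names removed.
purchases = [
    {"zip": "65401", "birth_year": 1990, "gender": "F", "purchase": "prenatal vitamins"},
    {"zip": "65401", "birth_year": 1985, "gender": "M", "purchase": "garden hose"},
]
# Hypothetical public directory that contains no purchase data.
directory = [
    {"name": "A. Example", "zip": "65401", "birth_year": 1990, "gender": "F"},
    {"name": "B. Sample", "zip": "65401", "birth_year": 1985, "gender": "M"},
    {"name": "C. Sample", "zip": "65401", "birth_year": 1985, "gender": "M"},
]

linked = link_records(purchases, directory, keys=("zip", "birth_year", "gender"))
print(linked)
```

Neither list reveals much alone, but joining them on three ordinary attributes attaches a name to the first purchase. The second purchase stays unlinked only because two directory entries share the same attributes.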
These perspectives outline the direction in which data mining is evolving, and it is safe
to assume that there is no going back in the way information is being used.
Considering the three points of view together, it is reasonable to assume that progress in the first two
(technological innovation and commercial value) is linked to the use of big data as a matter of privacy,
based on how personal information is analyzed and how consumer relationships are built, bearing in mind
the security implications of individuals' social interaction through the use of personal technological
devices.
ETHICAL CHALLENGES IN CUSTOMER DATA HANDLING
Information privacy is defined as the relationship between collection and dissemination of data, technology,
the public expectation of privacy, and the legal and political issues surrounding them.
Data mining poses a threat to information privacy when an individual's or group's personal information is
freely available. Individuals should be told the purpose of the data collection, who
receives the data, and what its implications are.
The following are some issues in the application of data mining for commercial value, together with their
ethical challenges:
The social graph: The picture of group-level interactions, and of the nature of the bonds that bring
these people together, that is deduced from social networking (information given voluntarily).
• Ethical challenge: Ambiguity. The group picture is uncertain because people with weak social ties
may be labeled as friends, which is not representative of physical-world life.
Ownership of data: Instead of being collected by government entities or traditional large companies,
data is collected by high-technology companies such as Facebook, Google, and Twitter, among others.
• Ethical challenge: Some owners of the data have promised not to sell it now, but as data mining
evolves into a valuable technology this may change in the future through
changes in data-use policies.
Data memory: Data collected and stored can be recalled and analyzed in the future.
• Ethical challenge: Stored information about an individual's life can bring back past behaviors (e.g. a
Facebook timeline can be a disadvantage for a person who used to party very frequently and is now
searching for a job). Data memory "may remove the ability for individuals to forget and be forgotten".
Passive data collection: Automatic data collection through passive technologies (e.g. mobile location
information).
• Ethical challenge: It increases the amount of data collected and the number of variables to take into
account in the analysis. However, individuals are not aware of it, and even if they authorized the data
collection at the outset, systems do not ask again each time data is collected.
Respecting privacy in a public world: The use of technology has become necessary nowadays and it is
easily accessible, offering benefits at low cost (e.g. free apps). However, the use of certain technological
devices implies that information is collected by servers.
• Ethical challenge: Individuals can refrain from giving information; however, because the use of the
technology has become a necessity and an important factor in social interaction, the paradox is that
deciding not to give information can mean being excluded from the community.
Although these ethical issues are currently challenges in the application of this technology, laws and
regulations are gradually being updated in response to concerns about individuals' privacy. It is therefore
important to highlight that data mining is an emergent practice and is still in an adjustment phase. For its
current application, self-regulation is a very important aspect for companies to take into consideration
when dealing with big data.
Below are some recommendations in support of ethical data mining:
1. Verify the data source for authenticity.
2. Consider and respect customers' expectations.
3. Develop better customer relations.
4. Emphasize ethical data mining.
5. Control unregulated data access and software.
6. Take corrective action against offenders.
CASE STUDIES - CONS
Target Corporation Case:
Target Corporation, a large-scale retailer of consumer goods, assigns every customer a Guest ID number
tied to their credit card, name, or email address, and stores the history of that customer's purchases along
with other demographic information it has collected from them or obtained from other sources.
Lots of people buy lotions, but one of Target’s employees noticed that women on the baby registry were
buying larger quantities of unscented lotion around the beginning of their second trimester.
An angry man went into a Target store outside of Minneapolis, demanding to talk to a manager: “My
daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby
clothes and cribs? Are you trying to encourage her to get pregnant?”
The manager, having no idea about the issue, looked at the mailer, which was addressed to the man’s
daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling
infants.
The manager apologized and then called a few days later to apologize again. This time, however, the man
said, “I had a talk with my daughter. It turns out there’s been some activities in my house I haven’t been
completely aware of. She’s due in August. I owe you an apology.”
Despite the accuracy of Target Corporation's data analysis, the teenage girl’s privacy regarding her personal
life was exposed, which makes this an unethical use of customer behavior data.
LinkedIn Lawsuit:
LinkedIn CEO Jeff Weiner recently admitted that the social networking site was guilty of sending too
many emails to some users.
The "Add Connections" service in LinkedIn lets users import contacts from their email accounts and send
invitations to connect on the site. LinkedIn sends an email invitation to the contact, and if the person
does not respond to the invitation within a certain amount of time, LinkedIn follows up with two more
reminder emails.
The suit claims that LinkedIn repeatedly “spammed” those contacts with unwanted emails despite LinkedIn
members not providing their consent to send the additional emails.
LinkedIn said in an email to its users that anyone who used the service between Sept. 17, 2011, and Oct.
31, 2014, is eligible to file a claim.
The amount that each user receives will depend on how many people come forward, but LinkedIn said
each person could receive up to $1,500.
LinkedIn says it has revised its disclosures to clarify that two reminder emails will be sent as part of its
"Add Connections" feature. The company says it will, by year's end, also offer an option to users to cancel
a connection invitation, thereby halting any additional reminder emails from being sent out.
This case is a classic example of the data ownership and passive data collection issues, which pose
ethical challenges to customers' privacy on the web.
ARGUMENTS TO SUPPORT DATA MINING - PROS
The following arguments defend data mining against the ethical issues discussed above. They are based on
an experiment conducted with professionals who apply web-data mining practices in a business context.
Their views are as follows:
• Web-data mining itself does not give rise to new ethical issues.
Professionals argue that there is nothing new about web-data mining practices, as they are just an
extension of old situations to new situations created by computer and information technology. One
first has to clear up the uncertainties, which have to do with understanding what data mining is.
Most of the possible dangers come from group profiling, and since group profiling was done
before data mining techniques were known, the issues could be considered old news.
• There are laws to protect private information.
This argument cannot be made with conviction, as the law is never fully sufficient with respect to
privacy problems. For instance, current privacy laws only offer protection against the misuse of
identifiable personal data, but there is no legal protection against the misuse of anonymized data used
as if it were personal data. The growing number of online privacy policies is an example of
self-regulating efforts. Such policies, however, are not found on every site. Thus, there are still many
sites that a person who is concerned about his or her online privacy should not visit. In addition, it is not
always an easy task for a web user to thoroughly read the privacy statements on every site he or she
visits.
• Many individuals simply choose to give up their privacy, so why not use this data?
As people can refuse to give out information about themselves, they possess some power to control
their relationship with organizations. Many individuals simply choose to give up their privacy, and
what can be wrong with collecting public data from the web that is voluntarily given? It is there
for the taking.
• Most collected data is not of a personal nature, or is used for anonymous profiles.
So why should there be a privacy problem? An argument often heard is: “Our software is used to
identify crowd behavior of visitors to web sites. Therefore, if we don’t know who you are, how can
we be invading your privacy?”
• Web-data mining leads to fewer unsolicited marketing approaches.
Data mining techniques will provide more accurate and more detailed information, which can lead
to better and fairer judgements. So, web-data mining leads to fewer unwanted marketing approaches.
Therefore, why would people complain?
• Personalization leads to individualization instead of de-individualization.
Most customers like to be recognized and treated as special customers. So it is not considered a
violation of privacy to analyze usage interaction.
CONCLUSION
Although many ethical challenges surround data mining, they can be attributed to the
fact that data mining is an emerging technology and the market is still adjusting to its capabilities; there is
no immediate threat to users. It is by no means clear that companies are using unexpected and
non-obvious associations, classifications, clusters, and profiles based on web data as grounds for
decision-making.
• The solutions discussed previously can contribute to the responsible and well-considered
development and application of web-data mining.
• The laws and regulations associated with it are bound to evolve depending on how it is perceived.
• There are things that can be done to guide this technique in a socially acceptable direction.
• As ethical issues will grow as rapidly as the technology, ethical considerations should be an
integrated and essential part of this development process instead of something at its side.
• This is a joint responsibility of both web miners and web users.
Some methods to avoid web tracking:
1. Ensure that the website is safe before sharing any information or filling out any registration forms
(by checking the website’s privacy policy and user comments).
2. Ensure that your online accounts on different websites are configured to provide optimal
privacy levels.
3. Use an email provider that has a reliable dedication to the protection of customer privacy.
4. Enhance the privacy of your browser through various add-ons and extensions.
REFERENCES
Earley, S. (2014). Big Data and Predictive Analytics: What's New? IT Professional IT Prof., 13-15.
Reteived November 16, 2015.
http://ieeexplore.ieee.org.libproxy.mst.edu/stamp/stamp.jsp?tp=&arnumber=6756866
Wel, L., & Royakkers, L. (2004). Ethical issues in web data mining. Ethics and Information Technology,
6(2), 129-140. Retrieved November 17, 2015, from
http://link.springer.com/article/10.1023/B:ETIN.0000047476.05912.3d
Nunan, D., & Domenico, M. (2013). Market research and the ethics of big data. International Journal of
Market Research Int. J. Market Res. http://um9mh3ku7s.search.serialssolutions.com/?ctx_ver=Z39.88-
2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-
8&rfr_id=info:sid/summon.serialssolutions.com&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=art
icle&rft.atitle=Market+research+and+the+ethics+of+big+data&rft.jtitle=INTERNATIONAL+JOURNAL
+OF+MARKET+RESEARCH&rft.au=Nunan%2C+D&rft.au=Di+Domenico%2C+M&rft.date=2013&rft
.pub=MARKET+RESEARCH+SOC&rft.issn=1470-
7853&rft.volume=55&rft.issue=4&rft.spage=505&rft.epage=520&rft_id=info:doi/10.2501%2FIJMR-
2013-015&rft.externalDBID=n%2Fa&rft.externalDocID=000340017200005&paramdict=en-US
GENERAL REFERENCES
Moftakhari, M., Ethical issues in data Mining. 23 pages. http://ickm2014.bilgiyonetimi.net/wp-
content/uploads/2015/01/mandana.pdf
Carr, N., (2010). Tracking is an assault on liberty, with real dangers. The wall street journal.
http://www.wsj.com/articles/SB10001424052748703748904575411682714389888
Harper, J., (2010). It’s modern trade: Web users get as much as they give. The wall street journal.
http://www.wsj.com/articles/SB10001424052748703748904575411530096840958.
CASE STUDIES:
Hill, K., (2012). How Target Figured out a teen girl was pregnant before her father did. Forbes Tech.
http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-
before-her-father-did/
Roberts, J., (2015).LinkedIn will pay $13M for sending those awful mails. Fortune.
http://fortune.com/2015/10/05/linkedin-class-action/

  • 4. 4 5. Regression : It is a statistical approach to forecast change in a dependent variable (sales revenue, for example) on the basis of change in one or more independent variables (population and income). 6. Summarization: Summarization is a key data mining concept which involves techniques for finding a compact description of a dataset. Data summarization provides the capacity to give data consumers generalize view of disparate bulks of data. METHODS OF DATA MINING 1. WEB TRACKING Web tracking is all about the Companies that track consumers’ behavior across the Web without their consent, and without providing them any recognizable value. Behavioral audience targeting, like content targeting, sponsored advertorials, pre-rolls and every other ad-product available in digital environments, serves content creators. Keeping content creators in business serves consumers, giving them a myriad of digital environments to explore. But in many cases advertisers misuse behavioral data and this is something against ethics. Third Party companies with no direct relationship to the consumer begin tracking those consumers across numerous websites, create profiles of that behavior and profit off that information that they haven’t asked permission to collect. This is what we call ethically problematic issue. Reasons for Web tracking:-  To Boost Marketing Capabilities  Law Enforcement and Intelligence  Web Analytics
  • 5. 5 2. SURVEYING A survey is a research method for collecting information from a selected group of people using standardized questionnaires or interviews. There are numerous survey research methods to obtain customer preferences and likeness such as:  In-Person Interviews  Telephone interviews  Online Questionnaires BIG DATA PERSPECTIVES Large collections of data have addressed the focus on different perspectives as it can be seeing. As a Technology innovation In order to accomplish its purpose data mining must be developed to answer effectively the concerns about: First, the progress of storage alternatives or Volume. Second, easy acces in real time Velocity. Third, the current data is mostly unstructured (difficult to stablish its exact use due to the large amount and possibilities of anlysis) Variety. As a Commercial Value: The use of data generates value trough the identification of complex patterns in real time (foundation of market research) and the prediction of quality issues. As a matter of privacy: In the challenge of protect the privacy, it must exits a balance in its use and the following factors. Recolection:- sets of data analyzed independently do not represent privacy implications but combined can threaten the privacy. Security:- Personal data can be hacked and stolen. High volume and velocity:- Data should be autonomously analyzed (No time to wait for consents). Significance:- Organisations are far from have the ability to use all the collected data. These perspectives can give a scheme about the direction in which data mining is evolving, and surely is possible to assume that there is not a coming back in the way information is being used.
  • 6.
Considering the three points of view together, it is likely that progress on the first two (technology innovation and commercial value) is linked to the treatment of big data as a matter of privacy: it depends on how personal information is analyzed and how consumer relationships are built, bearing in mind the security implications of individuals’ social interaction through personal technological devices.
ETHICAL CHALLENGES IN CUSTOMER DATA HANDLING
Information privacy is defined as the relationship between the collection and dissemination of data, technology, the public expectation of privacy, and the legal and political issues surrounding them. Data mining poses a threat to information privacy because an individual’s or group’s personal information is freely available. The individual must be informed of the purpose of the data collection, the recipient of the data, its implications, and related information.
The following are some issues in the application of data mining as a commercial value, with their ethical challenges:
The social graph: Derived from social networking (information given voluntarily), this is the picture built of group-level interactions and the nature of the bonds that bring people together.
• Ethical challenge: Ambiguity. The group picture is uncertain because it may label as friends people with weak social ties that are not representative of physical-world life.
Ownership of data: Instead of being collected by government entities or traditional large companies, data is collected by high-technology companies such as Facebook, Google, and Twitter, among others.
• Ethical challenge: Some owners of the data promise not to sell it now, but as data mining evolves into a valuable technology, this can change in the future through changes in data use policies.
Data memory: Data collected and stored can be recalled and analyzed in the future.
  • 7.
• Ethical challenge: Stored information about an individual’s life can bring back past behavior (e.g., a Facebook timeline can be a disadvantage for a person who used to party very frequently and is now searching for a job). Data memory "may remove the ability for individuals to forget and be forgotten".
Passive data collection: Automatic data collection through passive technologies (e.g., mobile location information).
• Ethical challenge: It increases the amount of data collected and the number of variables to take into account in the analysis. But individuals are not aware of it, and even if they authorized the data collection at first, systems do not ask again each time they collect data.
Respecting privacy in a public world: The use of technology has become necessary, and it is easily accessible, offering benefits at low cost (e.g., free apps). However, the use of certain technological devices involves the collection of information by servers.
• Ethical challenge: Individuals can refrain from giving information; however, the use of technology has become a necessity and an important factor in social interaction. The paradox is that deciding to withhold information can mean being excluded from the community.
Although these ethical issues are currently challenges in the application of this technology, laws and regulations are gradually being updated in response to concerns about individuals’ privacy. It is therefore important to highlight that data mining is an emerging practice and is still in an adjustment phase. For its current application, self-regulation is a very important aspect for companies to take into consideration when dealing with big data.
Below are some recommendations in support of ethical data mining:
1. Verify the data source for authenticity.
2. Consider and respect the expectations of customers.
3. Develop better customer relations.
4. Emphasize ethical data mining.
  • 8.
5. Control unregulated data access and software.
6. Take corrective action against offenders.
CASE STUDIES: CONS
Target Corporation Case: Target Corporation, a large-scale retailer of consumer goods, assigns every customer a Guest ID number tied to their credit card, name, or email address, and stores the history of that customer’s purchases along with other demographic information collected from them or obtained from other sources. Lots of people buy lotion, but one of Target’s employees noticed that women on the baby registry were buying larger quantities of unscented lotion around the beginning of their second trimester. Target used purchase patterns of this kind to identify shoppers who were likely to be pregnant and to send them baby-related coupons.
An angry man went into a Target store outside of Minneapolis, demanding to talk to a manager: “My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?” The manager, having no idea about the issue, looked at the mailer, which was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again. This time, however, the man said, “I had a talk with my daughter. It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”
Despite the accuracy of Target’s data analysis, the teenage girl’s private life was exposed, which makes this an unethical use of customer behavior data.
  • 9.
LinkedIn Lawsuit: Recently, LinkedIn CEO Jeff Weiner admitted that the social networking site was guilty of sending too many emails to some users. The "Add Connections" service lets users import contacts from their email accounts and send invitations to connect on the site. LinkedIn sends an email invitation to the contact, and if the person does not respond within a certain amount of time, LinkedIn follows up with two more reminder emails. The suit claims that LinkedIn repeatedly “spammed” those contacts with unwanted emails even though LinkedIn members had not consented to the additional emails.
LinkedIn said in an email to its users that anyone who used the service between Sept. 17, 2011, and Oct. 31, 2014, is eligible to file a claim. The amount each user receives will depend on how many people come forward, but LinkedIn said each person could receive up to $1,500. LinkedIn says it has revised its disclosures to clarify that two reminder emails will be sent as part of its "Add Connections" feature. The company says it will, by year's end, also offer users an option to cancel a connection invitation, thereby stopping any further reminder emails.
This case is a classic example of how ownership of data and passive data collection pose ethical challenges to customers’ privacy on the web.
  • 10.
ARGUMENTS TO SUPPORT DATA MINING: PROS
The following arguments respond to the ethical issues discussed above. They are based on an experiment conducted with professionals who apply web-data mining practices in a business context. Their views are as follows:
• Web-data mining itself does not give rise to new ethical issues. Professionals argue that there is nothing new about web-data mining practices, since they merely extend old situations to the new ones created by computer and information technology. One first has to clear up the uncertainties that come from not understanding what data mining is. Most of the possible dangers come from group profiling, and since group profiling was done before data mining techniques were known, the issues could be considered old news.
• There are laws to protect private information. This argument cannot be made with conviction, as the law is never fully sufficient with respect to privacy problems. For instance, current privacy laws only offer protection against the misuse of identifiable personal data; there is no legal protection against the misuse of anonymized data used as if it were personal data. The growing number of online privacy policies is an example of self-regulating efforts. Such policies, however, are not found on every site. Thus, there are still many sites that a person concerned about online privacy should not visit. In addition, it is not always easy for a web user to thoroughly read the privacy statement on every site he or she visits.
  • 11.
• Many individuals simply choose to give up their privacy, so why not use this data? Because people can refuse to give out information about themselves, they have some power to control their relationship with organizations. Many individuals simply choose to give up their privacy, and what can be wrong with collecting public data from the web that is voluntarily given? It is there for the taking.
• Most collected data is not of a personal nature, or is used for anonymous profiles. So why should there be a privacy problem? An argument often heard is: “Our software is used to identify crowd behavior of visitors to web sites. Therefore, if we don’t know who you are, how can we be invading your privacy?”
• Web-data mining leads to fewer unsolicited marketing approaches. Data mining techniques provide more accurate and more detailed information, which can lead to better and fairer judgements. So web-data mining leads to fewer unwanted marketing approaches. Why, then, would people complain?
• Personalization leads to individualization instead of de-individualization. Most customers like to be recognized and treated as special customers. So analyzing usage interaction is not considered a violation of privacy.
  • 12.
CONCLUSION
Although many ethical challenges surround data mining, they can be attributed to the fact that data mining is an emerging technology and the market is still adjusting to its capabilities; there is no immediate threat to users. It is by no means clear that companies are using unexpected and non-obvious associations, classifications, clusters, and profiles based on web data as grounds for decision-making.
• The solutions discussed previously can contribute to the responsible and well-considered development and application of web-data mining.
• The laws and regulations associated with it are bound to evolve depending on how it is perceived.
• There are things that can be done to guide this technique in a socially acceptable direction.
• As ethical issues will grow as rapidly as the technology, ethical considerations should be an integral and essential part of this development process instead of something on the side.
• This is a joint responsibility of both web miners and web users.
Some methods to avoid web tracking:
1. Ensure that the website is safe before sharing any information or filling out any registration forms (by checking the website’s privacy policy and user comments).
2. Ensure that your online accounts on different websites are configured to provide optimal privacy levels.
3. Use an email provider with a reliable commitment to protecting customer privacy.
4. Enhance the privacy of your browser through add-ons and extensions.
  • 13.
REFERENCES
Earley, S. (2014). Big Data and Predictive Analytics: What's New? IT Professional, 13-15. Retrieved November 16, 2015, from http://ieeexplore.ieee.org.libproxy.mst.edu/stamp/stamp.jsp?tp=&arnumber=6756866
van Wel, L., & Royakkers, L. (2004). Ethical issues in web data mining. Ethics and Information Technology, 6(2), 129-140. Retrieved November 17, 2015, from http://link.springer.com/article/10.1023/B:ETIN.0000047476.05912.3d
Nunan, D., & Di Domenico, M. (2013). Market research and the ethics of big data. International Journal of Market Research, 55(4), 505-520. http://um9mh3ku7s.search.serialssolutions.com/?ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info:sid/summon.serialssolutions.com&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Market+research+and+the+ethics+of+big+data&rft.jtitle=INTERNATIONAL+JOURNAL+OF+MARKET+RESEARCH&rft.au=Nunan%2C+D&rft.au=Di+Domenico%2C+M&rft.date=2013&rft.pub=MARKET+RESEARCH+SOC&rft.issn=1470-7853&rft.volume=55&rft.issue=4&rft.spage=505&rft.epage=520&rft_id=info:doi/10.2501%2FIJMR-2013-015&rft.externalDBID=n%2Fa&rft.externalDocID=000340017200005&paramdict=en-US
GENERAL REFERENCES
Moftakhari, M. Ethical issues in data mining. 23 pages. http://ickm2014.bilgiyonetimi.net/wp-content/uploads/2015/01/mandana.pdf
Carr, N. (2010). Tracking is an assault on liberty, with real dangers. The Wall Street Journal. http://www.wsj.com/articles/SB10001424052748703748904575411682714389888
Harper, J. (2010). It’s modern trade: Web users get as much as they give. The Wall Street Journal. http://www.wsj.com/articles/SB10001424052748703748904575411530096840958
CASE STUDIES:
Hill, K. (2012). How Target figured out a teen girl was pregnant before her father did. Forbes Tech. http://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
Roberts, J. (2015). LinkedIn will pay $13M for sending those awful mails. Fortune. http://fortune.com/2015/10/05/linkedin-class-action/