Data randomness, variation, coincidences, populations and
estimation and the use and abuse of statistics

www.linkedin.com/in/sureshsood

http://www.slideshare.net/ssood/randomness-28944785
http://datafication.com.au/instagram/
September 9/11 Coincidences
• 911 is the emergency number
• The twin towers looked like the number 11, so perhaps all 9/11 things relate to 11
• 9 + 1 + 1 = 11, and the first flight to hit the twin towers was Flight 11
• Flight 11 had 92 people on board, 9 + 2 = 11
• September 11 is the 254th day of the year (2 + 5 + 4 = 11 and 365 – 254 = 111)
• 11 letters each in “New York City”, “Afghanistan”, “the Pentagon”, and “George W. Bush”
• New York was the 11th state admitted to the union
• 119 (1 + 1 + 9 = 11) used to be the area code to both Iraq and Iran
• Flight 77 that crashed in Pennsylvania had 65 people on board, 6 + 5 = 11
• March 11 (2004) attack in Spain. There are exactly 911 days between this and the September 11 (2001) attack.
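
A few of the coincidences above are plain arithmetic and can be checked directly. The sketch below is only an illustration (the helper names `digit_sum` and `letter_count` are made up for this example). It suggests how much depends on the counting rule: subtracting the two dates gives 912 days, and the figure 911 appears only if neither end date is counted.

```python
from datetime import date

sept11 = date(2001, 9, 11)
madrid = date(2004, 3, 11)

def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(n))

def letter_count(text):
    """Count letters only, ignoring spaces and punctuation."""
    return sum(ch.isalpha() for ch in text)

day_of_year = sept11.timetuple().tm_yday          # position of 11 September in 2001
days_left = (date(2001, 12, 31) - sept11).days    # days remaining in the year
gap = (madrid - sept11).days                      # elapsed days between the two attacks
days_strictly_between = gap - 1                   # excludes both end dates

print(day_of_year, digit_sum(day_of_year), days_left)
print(gap, days_strictly_between)
for phrase in ["New York City", "Afghanistan", "the Pentagon", "George W. Bush"]:
    print(phrase, letter_count(phrase))
```

With enough freedom to choose what to count, and how, some match to a target number is likely to turn up.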
The Strange Coincidence of the Girl from Petrovka

http://listverse.com/2007/11/12/top-15-amazing-coincidences/
Key Research Finding - Serendipity is essential

"I knew before coming to university that I wanted to do something different, I wanted to take advantage of all opportunities"

"I randomly received an email for the Young Australian Entrepreneurship competition event and decided to enter it"

"You make your luck happen"

"An executive from the association addressed my class for volunteers"

"You want to be lucky, not right"

"I found the idea of my business while doing an assignment"

(Sood and Marchand 2012)
High Calibre Analytics Graduates
Data Scientist Job Roles
(LinkedIn 16 September 2012)

Notes: Word count shown next to each word
Exclusion words: ability area bay com experience francisco job linkedin preferred san
Adapted from and source: paul.kennedy@uts.edu.au

Statistics as Hypothesis Driven Process
Leading Questions: Yes Prime Minister
(video:bit.ly/yes_stats)

• The risk with data mining is the discovery of meaningless patterns: given enough data and time you can support almost anything
• Sir Humphrey Appleby demonstrates the use of leading questions to skew an opinion survey to support or oppose National Service (Military Conscription)
• Taken from the 1st Season of Yes Prime Minister - Episode 2, The Ministerial Broadcast.
• Yes Prime Minister is a British political satire/comedy aired in the 1980s
http://www.kdnuggets.com/2013/07/kdnuggets-cartoon-nsa-cat-videos-ufo-reports-pizza-connection.html
ATLAS: The observed (full line) and expected (dashed line) 95%
CL combined upper limits on the SM Higgs boson production
cross section divided by the Standard Model expectation as a
function of mH in the full mass range considered in this analysis
(a) and in the low mass range (b). The dashed curves show the
median expected limit in the absence of a signal and the green
and yellow bands indicate the corresponding 68% and 95%
intervals.
Statistics of the Higgs Boson

Particle physics has an accepted definition for a “discovery”: a five-sigma level of certainty
The number of standard deviations, or sigmas, is a measure of how unlikely it is that an experimental result
is simply down to chance rather than a real effect

Similarly, tossing a coin and getting a number of heads in a row may just be chance, rather than a sign of a
“loaded” coin
The “three sigma” level represents about the same likelihood of tossing more than eight heads in a row
Five sigma, on the other hand, would correspond to tossing more than 20 in a row

For normally distributed data, about 68% of values lie within one standard deviation of the centre (~ 1 in 3 fall outside)
About 95.5% of the data will be inside two standard deviations (~ 1 in 22 fall outside)
About 99.7% lie within three standard deviations (~ 1 in 370 fall outside)
Four standard deviation events occur 1 in 15,787 times
Five standard deviation events occur 1 in every 1,744,278 times.
So a five sigma effect, which two experiments now have, means that such a thing would be observed by
chance with a probability of 1/1,744,278 = 5.7 x 10^-7.

This is so unlikely that this is the criterion for accepting an effect as real in particle physics, when it is
corroborated by another experiment as in this case.
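
The "1 in N" figures above can be reproduced from the normal distribution. The sketch below is a minimal check, assuming a two-sided tail (values beyond k standard deviations on either side of the mean), which is the convention the slide's numbers correspond to. The function names are illustrative only.

```python
import math

def two_sided_tail(k):
    """Probability that a normal value lies more than k standard deviations from the mean."""
    return math.erfc(k / math.sqrt(2))

def one_in(k):
    """Express the two-sided tail probability as '1 in N'."""
    return 1 / two_sided_tail(k)

def heads_in_a_row(n):
    """Probability that a fair coin gives n heads in n tosses."""
    return 0.5 ** n

for k in range(1, 6):
    print(f"{k} sigma: inside {1 - two_sided_tail(k):.4%}, outside about 1 in {one_in(k):,.0f}")

# Coin-toss comparison from the slide: runs of heads as a yardstick for rarity.
for n in (9, 21):
    print(f"{n} heads in a row: about 1 in {1 / heads_in_a_row(n):,.0f}")
```

The coin-toss comparison is only an order-of-magnitude analogy, so the printed odds are not expected to line up exactly with the sigma levels.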
24104 Emerging Marketing Issues and Social Media
Assessment item 1: Project (Group)
Objective(s): This addresses Subject Learning Objective/s 1-4 Weighting: 30%
Due: The group report is due by start of lecture in Week 14.
Length: The final report must be of sufficient length to document:

1. The acquisition of the social data and supporting process
2. Visualisation of the network data and key measures
3. Description of models built from social data
4. Conclusion highlighting any useful insights
Task: Groups of students (4-5) participate in a practical project to data mine social media data.
Completion of this task requires the group to provide a report documenting the experience in acquiring and
discovering the social data using visualisation, setting up the data mining environment, describing the
findings with regard to the models built from the data and concluding insights. The approach to mine the
data is in 2 stages:
1. Visualise a social network of data freely available to the group, e.g. LinkedIn, Twitter, YouTube, Facebook, email, Flickr. Identify and describe key network measures
2. Mine the data to build models from the social data
This project uses the sophisticated REVOLUTION R ENTERPRISE software as a platform for data mining.
The software is free for academic use. The Rattle (R Analytical Tool To Learn Easily) package provides a
graphical user interface specifically for data mining using R and overcomes the need to use heavy
programming.
The following resources help to bootstrap the project and amuse the group project members:
Kaggle – Kaggle.com/competitions
AnalyticBridge - a social network for analytics professionals - analyticbridge.com
The R Inferno - “If you are using R and you think you're in hell, this is a map for you”
http://www.burns-stat.com/documents/books/the-r-inferno/
Furnas, Alexander (2012) Everything You Wanted to Know About Data Mining but Were Afraid to Ask, The Atlantic, 3 April
http://www.theatlantic.com/technology/archive/2012/04/everything-you-wanted-to-know-about-data-mining-but-were-afraid-to-ask/255388/
How?
Train of Thought Analysis
• A bottom-up approach
• Perceptual process of discovery to uncover structure
• Distinguish patterns, structure, relationships and anomalies
• Reveals indirect links
• Knowledge is colour coded
• Marketing Analyst can spot irregularities
• Not sure why but where does this lead
• Harnesses the power of the human mind

Data

Information

Knowledge
How to Find a Killer using Visualisation
• In the 1990s Ivan Milat killed 7 backpackers, making him Australia's most notorious serial killer
• Everyone in Australia was a suspect
• Enormous volumes of data from multiple sources
– RTA vehicle records
– Gym memberships
– Gun licensing records
– Internal police records
• Police applied visualisation techniques (NetMap) to the data
• Reduced the suspect list from 18 million to 230
• Further analysis with the use of additional information reduced this to 32
Key Network Measures
krackkite.##h (modified labels)

• Degree Centrality
• Betweenness Centrality
• Closeness Centrality
• Eigenvector Centrality

Diagram labels: Diana’s Clique, Connector (hub), Vendor, Contractor ?, Broker, Boundary spanners
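
As a rough illustration of three of these measures, the sketch below computes degree, closeness and betweenness centrality on the ten-node Krackhardt kite using only the Python standard library. The integer node ids and the edge list are assumptions (the commonly published kite layout), not the modified labels on the slide, and eigenvector centrality is left out to keep the example short.

```python
from collections import deque

# Assumed adjacency list for the standard Krackhardt kite (nodes 0..9).
KITE = {
    0: [1, 2, 3, 5], 1: [0, 3, 4, 6], 2: [0, 3, 5], 3: [0, 1, 2, 4, 5, 6],
    4: [1, 3, 6], 5: [0, 2, 3, 6, 7], 6: [1, 3, 4, 5, 7], 7: [5, 6, 8],
    8: [7, 9], 9: [8],
}

def degree(adj):
    """Number of direct ties per node."""
    return {v: len(neighbours) for v, neighbours in adj.items()}

def bfs_distances(adj, source):
    """Shortest path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def closeness(adj):
    """(n - 1) divided by the total distance to all other nodes; assumes a connected graph."""
    n = len(adj)
    return {v: (n - 1) / sum(bfs_distances(adj, v).values()) for v in adj}

def betweenness(adj):
    """Unnormalised shortest-path betweenness for an undirected graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2 for v, b in score.items()}  # each pair was counted from both ends

deg, clo, btw = degree(KITE), closeness(KITE), betweenness(KITE)
for v in KITE:
    print(v, deg[v], round(clo[v], 3), round(btw[v], 2))
```

On this graph the three measures single out different nodes, which is why the kite is a popular teaching example: the best-connected node is neither the closest to everyone else nor the main bridge.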
NodeXL - Excel 2007/10/13 workbook template for viewing and analyzing network graphs

http://nodexl.codeplex.com/releases/view/108288
Import ego, Fan page and groups networks from Facebook using
Social Network Importer for NodeXL

http://socialnetimporter.codeplex.com/
Aquarius,Aries,Cancer,Capricorn,Gemini,Leo,Libra,
Pisces, Sagittarius,Scorpio,Taurus,Virgo
An-Verb,An-Vis,Hol-Verb,Hol-Vis
A&F,Beijing ,Gucci,LVMH,New York,Old Navy,
,Paris, Sydney, Tiffany, Tokyo, Tommy, Versace

Depriv/Enhance,Enhance/Depriv
Africa,Argentina,Australia,Australia/Hong Kong,
Austria, California, Canada, China, Egypt, England,
Finland, France Germany, Guernsey, Holland, India,
Indonesia, Ireland , Israel, Italy , Japan, Kuwait,
Malaysia, Nepal,Paraguay , Philippines, Phillipines,
Portugual, Saudi Arabia, Singapore South Africa,
Spain, Sweden, Taiwan, Thailand,UK ,USA

Ambivalent, Employee, Opposer, Reporter, Supporter
11. Committed Partnerships, 12. Compartmentalised
Friendship,13. Childhood friendship,14. Courtship,15. Fling, 16.
Secret-Affair, 17. Enslavement , 2. Marriages of Convenience,3.
Best Friendships,4. Kinships, 5. Rebounds/ Avoidance-Driven,6.
Courtships,7.Dependencies 8. Enmities, 9. Love-Hate (Sweeney and
Chew)
Model Comparison By Variables/Predictors
Elaboration of Trip to Paris Blog Story (Means-End & Heider)
Woodside,Sood & Miller 2008 When Consumers and Brands Talk Psychology & Marketing
[Figure: means-end map linking the numbered nodes below; links are marked "+"]

1. Gayle

2. Paige

3. Paris

4. “The occasion was my cousin Paige’s 16th”

5. “I am a Canadian and get by in French.”

6. "All I can say is WOW! We rented a 2 bedroom, 1 ½ bath apartment (two showers), "Merlot" from ParisPerfect http://www.parisperfect.com/ and boy was it ever perfect!"

7. “We had a full view of the Eiffel from our charming little terrace. ....We were within walking distance to two metro stops (Pont d'Alma or Ecole Militaire)"

8. "We were walkable to many good bistros, cafes and bakeries and only a few blocks from the wonderful market street Rue Cler."

9. "I bought a Paris Pratique pocket-sized book at a Metro station. This handy guide has detailed maps of each arrondisement, as well as the metro lines, the bus lines, the RER and the SCNF (trains). I'll never be without this again."

10. "Six months before our trip, I gave Paige a couple of good guide books on Paris and suggested she let me know what her interests were since after all, this was to be her trip."

11. Sites
•The Marais
•Notre Dame
•L'Arc de Triomphe - 248 steps up and 248 steps down...
•Champs Elysee
•Jacquemart Museum
•Louvre Lite
•Musee D'Orsay
•Les Invalides, Napoleon's Tomb and the Napoleon Museum
•Sacre Coeur
•Monmartre
•Rodin Museum
•Pompidou Museum
•Train to Vernon, bike to Giverny with Fat Tire Bike Tours
•http://www.fattirebiketoursparis.com/
•Eiffel Tower

12. Unforgettable Memories
"This trip had so many memories, but here are a few choice highlights........On our very first night, knowing that the Eiffel Tower light show started at 10:00 p.m.... she [Paige] dropped her camera…down 6 flights…we were stunned…Spanish Family below standing below [with pieces of the camera]”

13. "The father stretched out his cupped hands which held all of the pieces they were able to recover, including the memory stick and he very solemnly said, "El muerto...".

14. "They had decide to come to Paris to find the Harley Davidson store so they could buy Harley Paris t-shirts."

15. "Michael Osman is an American artists living in Paris." "He supplements his income by being a tour guide." "I found out about him on Fodors" "So I engaged Michael for two days."

16. "On our trip to Giverny, we met a young woman from Brisbane, Australia who was traveling on her own and we invited her to join us. Three of us enjoyed delicious and innovative soufflés, while Paige had the rack of lamb. We shared two dessert soufflés, one chocolate and the other cherry/almond. Yum"

17. "I wanted Paige to get a feel for shopping experiences that she would not have at home (aka the ubiquitous mall)."

18. "We went on Fat Tire's day trip to Monet's gardens and house in Giverny, about an hour outside Paris."

19. ....."I know Paige will treasure the memory of this girl's trip for many years to come."

Tag Cloud of Paige’s Story About Travel to Paris

Created from Daniel Steinbock’s TagCrowd under Creative Commons ©

Linguistic Inquiry and Word Count (LIWC)
Text Analysis : The Psychological Power of Words

LIWC dimension                 “I love Paris” (Paige’s Story)   Personal texts   Formal texts
Self-references (I, me, my)    6.12                             11.4             4.2
Social words                   10.55                            9.5              8.0
Positive emotions              3.04                             2.7              2.6
Negative emotions              0.54                             2.6              1.6
Overall cognitive words        4.12                             7.8              5.4
Articles (a, an, the)          7.74                             5.0              7.2
Big words (> 6 letters)        18.40                            13.1             19.6

Pennebaker, J. W., Francis ME, Booth RJ. (2001). Linguistic Inquiry and Word Count (LIWC):
LIWC2001. Mahwah: Lawrence Erlbaum Associates.
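
The scores in the table are percentages of all words in a text that fall into each category. The sketch below shows that style of calculation for three of the simpler dimensions. It is not the LIWC software: the tiny word lists are taken from the row labels above, and the function name and sample sentence are invented for illustration.

```python
import re

# Illustrative word lists only; the real LIWC dictionaries are far larger.
SELF_REFERENCES = {"i", "me", "my"}
ARTICLES = {"a", "an", "the"}

def category_percentages(text):
    """Return the share of words (in percent) falling into each category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}

    def pct(count):
        return 100.0 * count / total

    return {
        "self_references": pct(sum(w in SELF_REFERENCES for w in words)),
        "articles": pct(sum(w in ARTICLES for w in words)),
        # "Big words" are those longer than six letters (apostrophes ignored).
        "big_words": pct(sum(len(w.replace("'", "")) > 6 for w in words)),
    }

sample = "I love Paris and my cousin loves the museums"
scores = category_percentages(sample)
for name, value in scores.items():
    print(f"{name}: {value:.2f}%")
```

Comparing such percentages against reference values for personal and formal texts is what the table above does for Paige's story.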

Which Pattern is Random ?

http://www.wired.com/wiredscience/2012/12/what-does-randomness-look-like/
Ceiling of the Waitomo cave in New Zealand.
http://www.waitomo.com/SiteCollectionImages/glowworms/Waitomo-Glowworm-Caves-New-Zealand-boat-group.jpg
Which Pattern is Random ?

THHHTHTTTTHTTHTTTHHTHTTHT

HTTHTTHTHHTTHTHTHTTHHTHTT

HHHTHTHHTHTTHHTTTTHTTTHTH

HTTHHHTTHTTHTHTHTHHTTHTTH

TTHHTTTTTTTTHTHHHHHTHTHTH

THTHTHTHHHTTHTHTHTHHTHTTT

THTHTHHHHHTHHTTTTTHTTHHTH

HTHHTHTHTHTHHTTHTHTHTTHHT

http://www.wired.com/wiredscience/2012/12/what-does-randomness-look-like/
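
One hedged way to probe the question is to measure streaks. People inventing "random" tosses tend to avoid long runs and to switch between heads and tails too often, whereas a fair coin switches on about half of its tosses. The sketch below computes the longest run and the switch rate for each row shown above. How the rows group into the two original sequences is not recoverable from the slide text, so each row is treated separately.

```python
from itertools import groupby

def longest_run(tosses):
    """Length of the longest block of identical consecutive outcomes."""
    return max((len(list(group)) for _, group in groupby(tosses)), default=0)

def alternation_rate(tosses):
    """Fraction of consecutive pairs where the outcome changes."""
    if len(tosses) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(tosses, tosses[1:]))
    return switches / (len(tosses) - 1)

rows = [
    "THHHTHTTTTHTTHTTTHHTHTTHT",
    "HTTHTTHTHHTTHTHTHTTHHTHTT",
    "HHHTHTHHTHTTHHTTTTHTTTHTH",
    "HTTHHHTTHTTHTHTHTHHTTHTTH",
    "TTHHTTTTTTTTHTHHHHHTHTHTH",
    "THTHTHTHHHTTHTHTHTHHTHTTT",
    "THTHTHHHHHTHHTTTTTHTTHHTH",
    "HTHHTHTHTHTHHTTHTHTHTTHHT",
]

for row in rows:
    print(row, longest_run(row), round(alternation_rate(row), 2))
```

Rows with short runs and a high switch rate look "more random" to the eye, which is exactly the intuition the slide sets out to challenge.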
http://madvis.blogspot.com.au/2010/09/flying-bombs-on-london-summer-of-1944.html

Journal of the Institute of Actuaries 72 (1946) 481
Newcomb Discovery (1881)
• American mathematician and astronomer Simon Newcomb noticed that the first few pages of a logarithm table, corresponding to the lower leading digits (typically those below 5), were noticeably dirtier than the later pages corresponding to the higher leading digits (typically those above 5)
• Newcomb attributed the extra wear to users looking up numbers that start with digit 1 more often than numbers starting with, say, digit 5
• It follows that the probability of a user consulting any given page at any given time is skewed in favour of the earlier pages, corresponding to the lower leading digits!
• This contrasts with the naive expectation that each leading digit between one and nine is equally likely, with probability 1/9 or roughly 11.11%
Number    Leading (first) digit
350       3
42057     4
0.64      6

If the leading (first) digit is d, then the frequency of occurrence (probability) of the leading digit is
P(d) = log10(1 + 1/d)
Leading digit (d)            1     2     3     4     5    6    7    8    9
Probability of occurrence    30%   18%   12%   10%   8%   7%   6%   5%   < 5%
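
The table values follow from the formula log10(1 + 1/d). A minimal sketch, with illustrative function names, computes the expected share for each leading digit and compares it with the observed shares in a sample list. The first 100 powers of 2 are used here only as a convenient sequence that is known to track the law closely.

```python
import math
from collections import Counter

def benford_probability(d):
    """Expected share of numbers whose leading digit is d (1..9)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    """First non-zero digit of a non-zero number, e.g. 350 -> 3, 0.64 -> 6."""
    # Scientific notation puts the leading digit first: 0.64 -> '6.4...e-01'.
    return int(f"{abs(x):.17e}"[0])

def leading_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    if total == 0:
        return {d: 0.0 for d in range(1, 10)}
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

sample = [2 ** n for n in range(100)]   # illustrative data set, not from the slides
observed = leading_digit_shares(sample)
for d in range(1, 10):
    print(d, f"expected {benford_probability(d):.1%}", f"observed {observed[d]:.1%}")
```

For fraud screening, the same comparison would be run on, say, invoice amounts, with large gaps between observed and expected shares flagged for a closer look.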
Benford Stumbles Over Newcomb Finding
• In 1938, more than half a century after Newcomb, Frank Benford was going through a large collection of numerical data from disparate sources when he stumbled upon a similar finding
• Benford used a huge volume of data to empirically support his finding, including areas of rivers, street addresses of “American men of Science” and numbers appearing in front-page newspaper stories. He went on to publish his findings in a number of papers, including the 1938 “The Law of Anomalous Numbers”. Thus the ‘principle’ came to be known as “Benford’s Law”
Benford Utility
• Human choices are not random, so invented numbers are unlikely to follow Benford’s Law
• Only works with natural numbers (those numbers that are not ordered in a particular numbering scheme)
• When people invent numbers, their digit patterns (which have been artificially added to a list of true numbers) will cause the data set to appear unnatural
– See Durtschi, Hillison and Pacini (2004), The Effective Use of Benford’s Law to Assist in Detecting Fraud in Accounting Data
• Does not work with the lottery!
• Formally proven in 1996
• Corpus of over 650 papers available at
– http://www.benfordonline.net/list/chronological
• A Benford Law plug-in is available for Kirix Strata; alternatively use the R package “BenfordTests” or visualise in Tableau
Smartphone, Google Glass or Apple Watch will
Know What You Want Before You Do
“…from 2014 your phone [glasses or watch] will
anticipate your needs, do the research, tell you
what you want to know – sometimes
before the question even occurs to you…”
Chapman, Jake (2013), The Wired World in 2014
Useful References Informing our Thinking
There is a potential 93% average predictability in user mobility, an exceptionally high
value rooted in the inherent regularity of human behavior. Yet it is not the 93%
predictability that we find the most surprising. Rather, it is the lack of variability in
predictability across the population.
Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for
Pervasive Systems. Proceedings of the 9th International Conference on Pervasive
Computing (Pervasive'11)
Daily and weekly routines => Few significant places every day => Regularity in human
activities => Regularity leads to predictability
Useful References Informing our Thinking
De Domenico, M., Lima, A. and Musolesi, M. (2012) Interdependence and Predictability of Human
Mobility and Social Interactions. Proceedings of the Nokia Mobile Data Challenge
Workshop.
We have shown that it is possible to exploit the correlation between movement data and
social interactions in order to improve the accuracy of forecasting of the future geographic
position of a user. In particular, mobility correlation, measured by means of mutual
information, and the presence of social ties can be used to improve movement forecasting
by exploiting mobility data of friends. Moreover, this correlation can be used as an indicator of
the potential existence of physical or distant social interactions and vice versa.
Sadilek, A and Krumm, J. (2012) Far Out: Predicting Long-Term Human Mobility
Where are you going to be 285 days from now at 2pm …we show that it is possible to
predict location of a wide variety of hundreds of subjects even years into the future and
with high accuracy.
Caution!
“Children never put off till
tomorrow what will keep
them from going to bed
tonight”
ADVERTISING AGE

42

Más contenido relacionado

Similar a Randomness

Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docxModule 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
roushhsiu
 
Victoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division FermilabVictoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division Fermilab
Videoguy
 
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
NAVER Engineering
 
Going global 2013 key note
Going global 2013 key noteGoing global 2013 key note
Going global 2013 key note
joannefbeale
 
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
Dachis Group
 

Similar a Randomness (20)

HKU Data Curation MLIM7350 Class 7
HKU Data Curation MLIM7350 Class 7HKU Data Curation MLIM7350 Class 7
HKU Data Curation MLIM7350 Class 7
 
Major project.pptx
Major project.pptxMajor project.pptx
Major project.pptx
 
Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.
 
Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docxModule 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
Module 1 - CaseFRAMEWORKS OF INFORMATION SECURITY MANAGEMENT.docx
 
Gainsville v1.1
Gainsville v1.1Gainsville v1.1
Gainsville v1.1
 
Introduction to LLMs
Introduction to LLMsIntroduction to LLMs
Introduction to LLMs
 
Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...
 
Victoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division FermilabVictoria A. White Head, Computing Division Fermilab
Victoria A. White Head, Computing Division Fermilab
 
Classifying Crisis Information Relevancy with Semantics (ESWC 2018)
Classifying Crisis Information Relevancy with Semantics (ESWC 2018)Classifying Crisis Information Relevancy with Semantics (ESWC 2018)
Classifying Crisis Information Relevancy with Semantics (ESWC 2018)
 
Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)Towards Explainable Fact Checking (DIKU Business Club presentation)
Towards Explainable Fact Checking (DIKU Business Club presentation)
 
Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?Open Data in a Big Data World: easy to say, but hard to do?
Open Data in a Big Data World: easy to say, but hard to do?
 
Owl Research - Fact Books Owl Theme Classroom, Ow
Owl Research - Fact Books Owl Theme Classroom, OwOwl Research - Fact Books Owl Theme Classroom, Ow
Owl Research - Fact Books Owl Theme Classroom, Ow
 
On Semantics and Deep Learning for Event Detection in Crisis Situations
On Semantics and Deep Learning for Event Detection in Crisis SituationsOn Semantics and Deep Learning for Event Detection in Crisis Situations
On Semantics and Deep Learning for Event Detection in Crisis Situations
 
Data, data, data
Data, data, dataData, data, data
Data, data, data
 
Data and science
Data and scienceData and science
Data and science
 
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
군중정보를 이용한 가짜뉴스의 탐지와 확산방지 알고리즘
 
Suggested Annotative Bibliography Essay
Suggested Annotative Bibliography EssaySuggested Annotative Bibliography Essay
Suggested Annotative Bibliography Essay
 
Going global 2013 key note
Going global 2013 key noteGoing global 2013 key note
Going global 2013 key note
 
Machine Learning for Societal Applications
Machine Learning for Societal ApplicationsMachine Learning for Societal Applications
Machine Learning for Societal Applications
 
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
2011 SBS Singapore | Nicholas Gruen, The Coming Revolution in Data
 

Más de suresh sood

Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016
suresh sood
 

Más de suresh sood (20)

Getting to the Edge of the Future - Tools & Trends of Foresight to Nowcasting
Getting to the Edge of the Future - Tools & Trends of Foresight to NowcastingGetting to the Edge of the Future - Tools & Trends of Foresight to Nowcasting
Getting to the Edge of the Future - Tools & Trends of Foresight to Nowcasting
 
Bigdata AI
Bigdata AI Bigdata AI
Bigdata AI
 
Bigdata ai
Bigdata aiBigdata ai
Bigdata ai
 
Data Science Innovations
Data Science InnovationsData Science Innovations
Data Science Innovations
 
Foresight conversation
Foresight conversationForesight conversation
Foresight conversation
 
Data science Innovations January 2018
Data science Innovations January 2018Data science Innovations January 2018
Data science Innovations January 2018
 
future2020
future2020future2020
future2020
 
Data science innovations
Data science innovations Data science innovations
Data science innovations
 
Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science  Data Science Innovations : Democratisation of Data and Data Science
Data Science Innovations : Democratisation of Data and Data Science
 
Swarm jobs
Swarm jobsSwarm jobs
Swarm jobs
 
Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016
 
Beyond dashboards
Beyond dashboardsBeyond dashboards
Beyond dashboards
 
Foresight Analytics
Foresight AnalyticsForesight Analytics
Foresight Analytics
 
Systemof insight
Systemof insightSystemof insight
Systemof insight
 
TPA
TPATPA
TPA
 
Datapreneurs
DatapreneursDatapreneurs
Datapreneurs
 
Future of jobs, big data & innovation
Future of jobs, big data & innovation Future of jobs, big data & innovation
Future of jobs, big data & innovation
 
Jobs Complexity
Jobs ComplexityJobs Complexity
Jobs Complexity
 
Spark Social Media
Spark Social Media Spark Social Media
Spark Social Media
 
Bigdatacooltools
BigdatacooltoolsBigdatacooltools
Bigdatacooltools
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Randomness

  • 1. Data randomness, variation, coincidences, populations and estimation and the use and abuse of statistics www.linkedin.com/in/sureshsood http://www.slideshare.net/ssood/randomness-28944785
  • 2.
  • 4. September 9/11 Coincidences • 911 is the emergency number • The twin towers looked like the number 11 so perhaps all 9/11 things relate to 11 • 9 + 1 + 1 = 11, the first flight to hit the twin towers was flight 11 • On board flight 11 was 92 people on board, 9 + 2 = 11 • September 11 is the 254th day of the year 2 + 5 + 4 = 11 and 365 – 254 = 111) • 11 letters each in “New York City”, “Afghanistan”, “the Pentagon”, and “George W. Bush” • New York was the 11th state admitted to the union • 119 (1 + 1 + 9 = 11) used to be the area code to both Iraq and Iran • Flight 77 that crashed in Pennsylvania had 65 people on board, 6 + 5 = 11 • March 11 (2004) attack in Spain. There are exactly 911 days between this and the September 11 (2001) attack.
  • 5. The Strange Coincidence of the Girl from Petrovka http://listverse.com/2007/11/12/top-15-amazing-coincidences/
  • 6. Key Research Finding - Serendipity is essential "I knew before coming to university that I wanted to do something different, I wanted to take advantage of all opportunities" "I randomly received an email for the Young Australian Entrepreneurship competition event and decided to enter it" "You make your luck happen” "An executive from the association addressed my class for volunteers" "You want to be lucky, not right" "I found the idea of my business while doing an assignment" (Sood and Marchand 2012)
  • 8. Data Scientist Job Roles (LinkedIn 16 September 2012) Notes: Word count shown next to each word Exclusion words: ability area bay com experience francisco job linkedin preferred san
  • 9. Adapted from and source: paul.kennedy@uts.edu.au Statistics as Hypothesis Driven Process
  • 10. Leading Questions: Yes Prime Minister (video: bit.ly/yes_stats) • The risk with data mining is the discovery of meaningless patterns: given enough data and time you can support almost anything • Sir Humphrey Appleby demonstrates the use of leading questions to skew an opinion survey to support or oppose National Service (military conscription) • Taken from the 1st season of Yes Prime Minister, Episode 2, The Ministerial Broadcast • Yes Prime Minister is a British political satire/comedy aired in the 1980s
  • 12.
  • 13. ATLAS: The observed (full line) and expected (dashed line) 95% CL combined upper limits on the SM Higgs boson production cross section divided by the Standard Model expectation as a function of mH in the full mass range considered in this analysis (a) and in the low mass range (b). The dashed curves show the median expected limit in the absence of a signal and the green and yellow bands indicate the corresponding 68% and 95% intervals.
  • 14. 5 Statistics of the Higgs Boson Particle physics has an accepted definition for a “discovery”: a five-sigma level of certainty. The number of standard deviations, or sigmas, is a measure of how unlikely it is that an experimental result is simply down to chance rather than a real effect. Similarly, tossing a coin and getting a number of heads in a row may just be chance, rather than a sign of a “loaded” coin. The “three sigma” level represents about the same likelihood as tossing more than eight heads in a row. Five sigma, on the other hand, would correspond to tossing more than 20 in a row. About 68% of all data lie within one standard deviation of the centre (a result outside occurs ~ 1 in 3 times). About 95.5% of the data lie within two standard deviations (~ 1 in 22). About 99.7% lie within three standard deviations (~ 1 in 370). Four standard deviation events occur 1 in 15,787 times. Five standard deviation events occur 1 in every 1,744,278 times. So a five sigma effect, which two experiments now have, means that such a thing would be observed by chance with a probability of 1/1,744,278 = 5.7 x 10^-7. This is so unlikely that it is the criterion for accepting an effect as real in particle physics, when it is corroborated by another experiment as in this case.
  • 15. 24104 Emerging Marketing Issues and Social Media Assessment item 1: Project (Group) Objective(s): This addresses Subject Learning Objective/s 1-4 Weighting: 30% Due: The group report is due by start of lecture in Week 14. Length: The final report must be of sufficient length to document: 1. The acquisition of the social data and supporting process 2. Visualisation of the network data and key measures 3. Description of models built from social data 4. Conclusion highlighting any useful insights
  • 16. 24104 Emerging Marketing Issues and Social Media Task: Groups of students (4-5) participate in a practical project to data mine social media data. Completion of this task requires the group to provide a report documenting the experience in acquiring and discovering the social data using visualisation, setting up the data mining environment, describing the findings with regard to the models built from the data and concluding insights. The approach to mine the data is in 2 stages: 1. Visualise a social network of data freely available to the group e.g. LinkedIn, Twitter, YouTube, Facebook, email, Flickr. Identify and describe key network measures 2. Mine the data to build models from the social data. This project uses the sophisticated REVOLUTION R ENTERPRISE software as a platform for data mining. The software is free for academic use. The Rattle (R Analytical Tool To Learn Easily) package provides a graphical user interface specifically for data mining using R and overcomes the need to use heavy programming. The following resources help to bootstrap the project and amuse the group project members: • Kaggle – Kaggle.com/competitions • AnalyticsBridge, a social network for analytics professionals - analyticbridge.com • The R Inferno, “If you are using R and you think you’re in hell, this is a map for you” http://www.burns-stat.com/documents/books/the-r-inferno/ • Furnas, Alexander (2012) Everything You Wanted to Know About Data Mining but Were Afraid to Ask, The Atlantic, 3 April http://www.theatlantic.com/technology/archive/2012/04/everything-you-wanted-to-know-about-data-mining-but-were-afraid-to-ask/255388/
  • 17. How? Train of Thought Analysis • A bottom-up approach • Perceptual process of discovery to uncover structure • Distinguish patterns, structure, relationships and anomalies • Reveals indirect links • Knowledge is colour coded • Marketing Analyst can spot irregularities • Not sure why but where does this lead • Harnesses the power of the human mind Data → Information → Knowledge
  • 18. How to Find a Killer using Visualisation • In the 1990s Ivan Milat killed 7 backpackers, making him Australia's most notorious serial killer • Everyone in Australia was a suspect • Enormous volumes of data from multiple sources: RTA vehicle records, gym memberships, gun licensing records, internal police records • Police applied visualisation techniques (NetMap) to the data • Reduced the suspect list from 18 million to 230 • Further analysis with the use of additional information reduced this to 32
  • 19. Key Network Measures krackkite.##h (modified labels) • Degree Centrality • Betweenness Centrality • Closeness Centrality • Eigenvector Centrality Diagram labels: Diana’s Clique; Connector (hub); Vendor; Contractor ?; Broker; Boundary spanners
  • 20. NodeXL - Excel 2007/10/13 workbook template for viewing and analyzing network graphs http://nodexl.codeplex.com/releases/view/108288
  • 21. Import ego, Fan page and groups networks from Facebook using Social Network Importer for NodeXL http://socialnetimporter.codeplex.com/
  • 22. Aquarius,Aries,Cancer,Capricorn,Gemini,Leo,Libra, Pisces, Sagittarius,Scorpio,Taurus,Virgo An-Verb,An-Vis,Hol-Verb,Hol-Vis A&F,Beijing ,Gucci,LVMH,New York,Old Navy, ,Paris, Sydney, Tiffany, Tokyo, Tommy, Versace Depriv/Enhance,Enhance/Depriv Africa,Argentina,Australia,Australia/Hong Kong, Austria, California, Canada, China, Egypt, England, Finland, France Germany, Guernsey, Holland, India, Indonesia, Ireland , Israel, Italy , Japan, Kuwait, Malaysia, Nepal,Paraguay , Philippines, Phillipines, Portugual, Saudi Arabia, Singapore South Africa, Spain, Sweden, Taiwan, Thailand,UK ,USA Ambivalent, Employee, Opposer, Reporter, Supporter 11. Committed Partnerships, 12. Compartmentalised Friendship,13. Childhood friendship,14. Courtship,15. Fling, 16. Secret-Affair, 17. Enslavement , 2. Marriages of Convenience,3. Best Friendships,4. Kinships, 5. Rebounds/ Avoidance-Driven,6. Courtships,7.Dependencies 8. Enmities, 9. Love-Hate (Sweeney and Chew)
  • 23.
  • 24. Model Comparison By Variables/Predictors
  • 25. Elaboration of Trip to Paris Blog Story (Means-End & Heider) Woodside, Sood & Miller (2008), When Consumers and Brands Talk, Psychology & Marketing 17. "I wanted Paige to get a feel for shopping experiences that she would not have at home (aka the ubiquitous mall)." 3. Paris + 16. "On our trip to Giverny, we met a young woman from Brisbane, Australia who was traveling on her own and we invited her to join us. Three of us enjoyed delicious and innovative soufflés, while Paige had the rack of lamb. We shared two dessert soufflés, one chocolate and the other cherry/almond. Yum" + 1. Gayle + 2. Paige 14. "They had decide to come to Paris to find the Harley Davidson store so they could buy Harley Paris t-shirts." + 4. "The occasion was my cousin Paige’s 16th" 15. "Michael Osman is an American artists living in Paris." "He supplements his income by being a tour guide." "I found out about him on Fodors" "So I engaged Michael for two days." 5. "I am a Canadian and get by in French." 6. "All I can say is WOW! We rented a 2 bedroom, 1 ½ bath apartment (two showers), "Merlot" from ParisPerfect http://www.parisperfect.com/ and boy was it ever perfect!" 7. "We had a full view of the Eiffel from our charming little terrace. ....We were within walking distance to two metro stops (Pont d'Alma or Ecole Militaire)" 13. "The father stretched out his cupped hands which held all of the pieces they were able to recover, including the memory stick and he very solemnly said, "El muerto..."." 12. Unforgettable Memories "This trip had so many memories, but here are a few choice highlights........On our very first night, knowing that the Eiffel Tower light show started at 10:00 p.m.... she [Paige] dropped her camera…down 6 flights…we were stunned…Spanish Family below standing below [with pieces of the camera]" 8. "We were walkable to many good bistros, cafes and bakeries and only a few blocks from the wonderful market street Rue Cler."
18. "We went on Fat Tire's day trip to + Monet's gardens and house in Giverny, about an hour outside Paris." + 19. ....."I know Paige will treasure the memory of this girl's trip for many years to come." 11. Sites • The Marais • Notre Dame • L'Arc de Triomphe - 248 steps up and 248 steps down... • Champs Elysee • Jacquemart Museum • Louvre Lite • Musee D'Orsay • Les Invalides, Napoleon's Tomb and the Napoleon Museum • Sacre Coeur • Monmartre • Rodin Museum • Pompidou Museum • Train to Vernon, bike to Giverny with Fat Tire Bike Tours • http://www.fattirebiketoursparis.com/ • Eiffel Tower 9. "I bought a Paris Pratique pocket-sized book at a Metro station. This handy guide has detailed maps of each arrondisement, as well as the metro lines, the bus lines, the RER and the SCNF (trains). I'll never be without this again." 10. "Six months before our trip, I gave Paige a couple of good guide books on Paris and suggested she let me know what her interests were since after all, this was to be her trip."
  • 26. Tag Cloud of Paige’s Story About Travel to Paris Created from Daniel Steinbock’s TagCrowd under Creative Commons ©
  • 27. Linguistic Inquiry and Word Count (LIWC) Text Analysis: The Psychological Power of Words
      LIWC dimension               | “I love Paris” Paige’s Story | Personal texts | Formal texts
      Self-references (I, me, my)  | 6.12                         | 11.4           | 4.2
      Social words                 | 10.55                        | 9.5            | 8.0
      Positive emotions            | 3.04                         | 2.7            | 2.6
      Negative emotions            | 0.54                         | 2.6            | 1.6
      Overall cognitive words      | 4.12                         | 7.8            | 5.4
      Articles (a, an, the)        | 7.74                         | 5.0            | 7.2
      Big words (> 6 letters)      | 18.40                        | 13.1           | 19.6
      Pennebaker, J. W., Francis, M. E. and Booth, R. J. (2001). Linguistic Inquiry and Word Count (LIWC): LIWC2001. Mahwah: Lawrence Erlbaum Associates.
  • 28.
  • 29.
  • 30.
  • 31. Which Pattern is Random ? http://www.wired.com/wiredscience/2012/12/what-does-randomness-look-like/
  • 32. Ceiling of the Waitomo cave in New Zealand. http://www.waitomo.com/SiteCollectionImages/glowworms/Waitomo-Glowworm-Caves-New-Zealand-boat-group.jpg
  • 33. Which Pattern is Random ? THHHTHTTTTHTTHTTTHHTHTTHT HTTHTTHTHHTTHTHTHTTHHTHTT HHHTHTHHTHTTHHTTTTHTTTHTH HTTHHHTTHTTHTHTHTHHTTHTTH TTHHTTTTTTTTHTHHHHHTHTHTH THTHTHTHHHTTHTHTHTHHTHTTT THTHTHHHHHTHHTTTTTHTTHHTH HTHHTHTHTHTHHTTHTHTHTTHHT http://www.wired.com/wiredscience/2012/12/what-does-randomness-look-like/
  • 34. http://madvis.blogspot.com.au/2010/09/flying-bombs-on-london-summer-of-1944.html Journal of the Institute of Actuaries 72 (1946) 0481
  • 35. Newcomb Discovery (1881) • American mathematician/astronomer Simon Newcomb noticed that the first few pages of a logarithmic table, corresponding to the lower leading digits (typically those below 5), were comparatively dirtier than the later pages corresponding to the higher leading digits (typically those above 5) • Newcomb attributed the greater wear to users looking up numbers that started with digit 1 more often than numbers starting with, say, digit 5 • It follows that the probability distribution of a user accessing any of the pages at any given time was skewed in favour of the earlier pages corresponding to the lower leading digits! • This was directly in contrast with the usual assumption that the probability of randomly picking any leading digit between one and nine should be equal, at 1/9 or roughly 11.11%
  • 36. Number → leading (first) digit: 350 → 3; 42057 → 4; 0.64 → 6. If the leading (first) digit is d, then the frequency of occurrence (probability) of the leading digit is log10(1 + 1/d).
      Leading digit (d)         | 1   | 2   | 3   | 4   | 5  | 6  | 7  | 8  | 9
      Probability of occurrence | 30% | 18% | 12% | 10% | 8% | 7% | 6% | 5% | < 5%
  • 37. Benford Stumbles Over Newcomb Finding • In 1938, more than half a century after Newcomb, Frank Benford was going through a large collection of numerical data from disparate sources when he stumbled upon a similar finding • Benford used a huge volume of data to empirically support his finding, including areas of rivers, street addresses of “American Men of Science” and numbers appearing in front-page newspaper stories. He went on to publish his findings in a number of papers including the 1938 “The Law of Anomalous Numbers”. Thus the “principle” came to be known as “Benford’s Law”
  • 38. Benford Utility • Human choices are not random; invented numbers are unlikely to follow Benford’s Law • Only works with natural numbers (those numbers that are not ordered in a particular numbering scheme) • When people invent numbers, their digit patterns (which have been artificially added to a list of true numbers) will cause the data set to appear unnatural – see Durtschi, Hillison and Pacini (2004), The Effective Use of Benford’s Law to Assist in Detecting Fraud in Accounting Data • Does not work with Lottery! • Formally proven in 1996 • Corpus of over 650 papers available at http://www.benfordonline.net/list/chronological • Tools: Benford Law plug-in for Kirix Strata, R package “BenfordTests”, or visualise in Tableau
  • 39. Smartphone, Google Glass or Apple Watch will Know What you Want before you do “…from 2014 your phone [glasses or watch] will anticipate your needs, do the research, tell you what you want to know – sometimes before the question even occurs to you…” Chapman, Jake (2013), The Wired World in 2014
  • 40. Useful References Informing our Thinking There is a potential 93% average predictability in user mobility, an exceptionally high value rooted in the inherent regularity of human behavior. Yet it is not the 93% predictability that we find the most surprising. Rather, it is the lack of variability in predictability across the population. Scellato et al. (2011), NextPlace: A Spatio-temporal Prediction Framework for Pervasive Systems. Proceedings of the 9th International Conference on Pervasive Computing (Pervasive'11) Daily and weekly routines => Few significant places every day => Regularity in human activities => Regularity leads to predictability
  • 41. Useful References Informing our Thinking De Domenico, M., Lima, A. and Musolesi, M. (2012) Interdependence and Predictability of Human Mobility and Social Interactions. Proceedings of the Nokia Mobile Data Challenge Workshop. We have shown that it is possible to exploit the correlation between movement data and social interactions in order to improve the accuracy of forecasting of the future geographic position of a user. In particular, mobility correlation, measured by means of mutual information, and the presence of social ties can be used to improve movement forecasting by exploiting mobility data of friends. Moreover, this correlation can be used as indicator of potential existence of physical or distant social interactions and vice versa. Sadilek, A. and Krumm, J. (2012) Far Out: Predicting Long-Term Human Mobility. Where are you going to be 285 days from now at 2pm? …we show that it is possible to predict location of a wide variety of hundreds of subjects even years into the future and with high accuracy.
  • 42. Caution! “Children never put off till tomorrow what will keep them from going to bed tonight” ADVERTISING AGE
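
The sigma levels quoted on slide 14 can be checked with a few lines of Python. This is a minimal sketch, assuming a two-sided normal tail (the convention that reproduces the slide's "1 in 370" and "1 in 1,744,278" figures); the function names are illustrative and do not come from the deck.

```python
import math


def two_sided_tail(sigmas: float) -> float:
    """Chance that a normal variable falls more than `sigmas` standard
    deviations away from its mean, counting both tails (assumed convention)."""
    return math.erfc(sigmas / math.sqrt(2))


def equivalent_heads_run(p: float) -> float:
    """Length of a run of heads from a fair coin that is as unlikely as p."""
    return math.log2(1 / p)


for k in range(1, 6):
    p = two_sided_tail(k)
    print(
        f"{k} sigma: {1 - p:.5%} inside, 1 in {1 / p:,.0f} outside, "
        f"comparable to about {equivalent_heads_run(p):.1f} heads in a row"
    )
```

Under this convention three sigma sits between eight and nine heads in a row and five sigma between 20 and 21, which is where the slide's "more than eight" and "more than 20" come from.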

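The Benford formula on slide 36, log10(1 + 1/d), is easy to try out. The sketch below is an illustration only: the helper names are made up for this example, and the test data (the first 1,000 powers of two) is an assumption chosen because that sequence is known to follow the law; it is not data from the deck.

```python
import math
from collections import Counter


def benford_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d (1..9)."""
    return math.log10(1 + 1 / d)


def leading_digit(x: float) -> int:
    """First non-zero digit of a non-zero number: 350 -> 3, 0.64 -> 6."""
    # Scientific notation puts the leading digit first, e.g. '3.50...e+02'.
    return int(f"{abs(x):.15e}"[0])


# Illustrative data only (an assumption, not from the slides): powers of two.
sample = [2 ** n for n in range(1, 1001)]
counts = Counter(leading_digit(x) for x in sample)

for d in range(1, 10):
    observed = counts[d] / len(sample)
    print(f"digit {d}: expected {benford_probability(d):.1%}, observed {observed:.1%}")
```

To screen a real data set, replace `sample` with the figures under review; large gaps between the expected and observed columns are a prompt for closer inspection, not proof of fraud.
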
Editor's notes

  1. Social graph in the following order: you, your social network friends, friends-of-friends, your followers, and the overall community. Wall Street feed – simple way to navigate social network of friends social gestures and your – efficient, increased engagement, increases importance of attention info c.f. banking – remember fuss around news feed. Google Open Social Attention Streams (already included in Plaxo Pulse) - MySpace Friends Updates - Netvibes Activities - LinkedIn Network Updates. High social engagement vs traditional media (radio, tv, print, outdoor) with low engagement. This is about dialogue, interactivity, informality, people + technology & niche NOT Tradigital for mass using push, automation & technology only. Social Media Marketing practice centres around networks, communities, blogs and microblogging. Traditional business functions can be socialised e.g. legal, supply chain, R&D, HR… Social Strategy (Media) - through sharing, engaging, building relationships and influencing: increase our reach, influence and relevance; create ambassadors to support and promote what we do; personalise interactions; encourage and grow communities through a critical mass of active cultural and scientific participants; maximise revenue; change our work models from one-to-one communication to many-to-many communication; move from providing information to creating shared meaning with audiences.
  2. These are all interesting, but do they mean anything? What do you think about patterns of this type and their relation to the coincidences?
  3. In 1974, Hopkins starred in 'The Girl from Petrovka', based on a book by George Feifer. Not long after signing on to the film, Hopkins went to London to try to track down a copy of the book. After canvassing several bookshops, he could not find a copy. Frustrated, Hopkins entered Leicester Square train station to board a train when he spotted a copy of 'The Girl from Petrovka' discarded on a nearby bench. Naturally he took the book. Two years later, during filming in Vienna, the author George Feifer visited the set. During a conversation with Hopkins, Feifer mentioned that he did not even have a copy of his own book: he had lent his last copy, with his own annotations, to a friend, and the book had been lost somewhere in London. Hopkins fetched his copy, which had notes in the margins. Feifer confirmed that it was indeed the same book.
  4. Serendipity is an essential pre-requisite to any entrepreneurial activities by all student entrepreneurs (Type 1 and 2) and archetypal entrepreneurs. "You make your luck happen" (#101). "I found the idea of my business while doing an assignment" (#105). "I randomly received an email for the Young Australian Entrepreneurship competition event and decided to enter it" (#107). "You want to be lucky, not right" (#105). "I knew before coming to university that I wanted to do something different, I wanted to take advantage of all opportunities" (#102). "An executive from the association addressed my class for volunteers" (#106).
  5. Combined results of searches for the standard model Higgs boson in pp collisions at √s = 7 TeV, The CMS Collaboration, CERN-PH-EP/2012-023, 2012/02/08
  6. Diana – max links (degree centrality), most connected – connector or hub – number of nodes connected – high influence of spreading info or virus. Heather – best location, powerful figure as broker to determine what flows and doesn’t – single point of failure – high betweenness = high influence – position of node as gatekeeper to exploit structural holes (gaps in network). Fernando & Garth – shortest paths = closeness – the bigger the number the less central. Eigenvector = importance of node in network ~ Google PageRank is a similar measure – being connected to the well connected, a popularity and power measure.
  7. The first student’s data has clusters – long runs of up to eight tails in a row. This might look surprising, but it’s actually what you’d expect from random coin tosses (I should know – I did a hundred coin tosses to get that data!) The second student’s data is suspiciously lacking in clusters. In fact, in a hundred coin tosses, they didn’t get a single run of four or more heads or tails in a row. This has about a 0.1% chance of ever happening, suggesting that the student fudged the data (and indeed I did).
  8. The Poisson distribution. It tells you the odds that a large number of infrequent events result in a specific outcome
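
The Poisson claim in note 8 can be made concrete with the flying-bomb study cited on slide 34. The sketch below is hedged: the grid counts are the figures usually quoted from the 1946 Journal of the Institute of Actuaries note (576 squares, 537 hits), reproduced here as an assumption because the slide shows only the citation, and the variable names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) when events are rare, independent, with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Assumed figures (commonly quoted for the 1944 London data, not in the deck):
SQUARES = 576                          # quarter-square-kilometre grid cells
HITS = 537                             # total bomb impacts
observed = [229, 211, 93, 35, 7, 1]    # cells with 0, 1, 2, 3, 4, 5+ hits

lam = HITS / SQUARES                   # average hits per cell
expected = [SQUARES * poisson_pmf(k, lam) for k in range(5)]
expected.append(SQUARES - sum(expected))  # remaining mass: 5 or more hits

for k, (obs, exp) in enumerate(zip(observed, expected)):
    label = f"{k}+" if k == 5 else f"{k} "
    print(f"{label} hits: observed {obs:3d} cells, Poisson expects {exp:6.2f}")
```

If the observed and expected columns track each other closely, the apparent clusters of hits are what random targeting would produce, which is the point of the slide.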