Geodemographics classifies and analyzes people by where they live, using a data-based classification of residential locations. Its origins include late 19th and early 20th century studies in London and Chicago that mapped neighborhoods and their demographics. Advances in computing and the growing availability of census data later allowed multivariate summaries of census zones and the clustering of similar zones. Parallel coordinates is presented as an effective method for visualizing and analyzing multivariate data, with an example using car attributes. The document discusses various visualization techniques for bivariate, trivariate and multivariate data.
3. Geodemographics
Geodemographics is ‘the analysis of people by where they live’ (Sleight, 2004) or,
more precisely, by a data based classification of residential location (although
classifications have also been produced for workplace, financial services and
CYBERSPACE). The origins of geodemographics include Charles Booth’s Poverty
Maps of London (1898-9, see http://booth.lse.ac.uk) and the 1920s-30s CHICAGO
SCHOOL of Urban Sociology. During the twentieth century, the increasing
availability of national census data and the development of computation permitted
multivariate summaries of census zones to be produced, and for those areas to be
grouped together on a like with like basis using clustering techniques (see
CLASSIFICATION AND REGIONALISATION).
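To make the "like with like" grouping concrete, here is a minimal sketch of one common clustering technique, k-means, applied to a handful of zones. The source does not say which clustering method any particular classification uses, so the choice of k-means is an assumption, and the zone values, the two variables and the function name `kmeans` are hypothetical.

```python
import random

def kmeans(rows, k, iterations=20, seed=0):
    """Group rows (lists of floats) into k clusters with plain k-means."""
    rng = random.Random(seed)
    # Start from k zones picked at random as the initial centroids.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each zone to its nearest centroid (squared Euclidean distance).
        for i, row in enumerate(rows):
            labels[i] = min(
                range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centroids[c])),
            )
        # Move each centroid to the mean of the zones assigned to it.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical zones: (share of residents over 65, share of households renting).
zones = [
    [0.30, 0.10], [0.28, 0.12], [0.32, 0.08],  # older, mostly owner-occupied
    [0.08, 0.70], [0.10, 0.65], [0.07, 0.72],  # younger, mostly renting
]
labels = kmeans(zones, k=2)
print(labels)
```

Zones that receive the same label form one group. In practice the variables would usually be standardized first so that no single variable dominates the distance.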
4. Data Table Format: cases x variables
Rows: case, instance, element
Columns: variable, factor, trait

         Variable1   Variable2   Variable3
Case1    •           •           •
Case2    •           (Values)    •
Case3    •           •           •
5. Example: Home buyer’s table
         ft²         Asking price   Year built
Home1    •           •              •
Home2    •           (Values)       •
Home3    •           •              •

Sample values: 1, 10.5, 500
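The cases x variables layout maps directly onto a simple data structure. The sketch below stores the home buyer's table with one row per case and one position per variable; all values, and the helper names `column` and `case`, are hypothetical and chosen only for illustration.

```python
# Column names, in the order used by every row below.
variables = ["ft2", "asking_price", "year_built"]

# One entry per case (home). All values are hypothetical.
cases = {
    "Home1": [1400, 250000, 1985],
    "Home2": [2100, 410000, 2003],
    "Home3": [950, 180000, 1962],
}

def column(name):
    """Return one variable across all cases (a column of the table)."""
    j = variables.index(name)
    return [row[j] for row in cases.values()]

def case(name):
    """Return one case as a variable -> value mapping (a row of the table)."""
    return dict(zip(variables, cases[name]))

print(column("asking_price"))
print(case("Home2"))
```

Reading down a column compares all cases on one variable; reading across a row describes one case on all variables.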
10. “It was in 1977 while giving a Linear Algebra
course that I was challenged by my students
to ‘show some multi-dimensional spaces.’
This was the catalyst leading to subsequent
development of the methodology: How do
multi-dimensional lines, planes, curves,
surfaces look in (two) coordinates?”
Alfred Inselberg, inventor of Parallel Coordinates*
http://www.math.tau.ac.il/~aiisreal/
*He calls them //-coords.
12. Parallel Coordinates best for multivariate analysis:
Guitar neck as metaphor...
‘Strings’ are cases (1, 2, 3), but they can ‘cross’.
Each ‘fret’ is a variable: Var1, Var2, Var3, Var4.
Think of ‘notes’ as data values here.
13. Guitar neck as Parallel Coordinates
And, yes, cases (our ‘strings’) can be strummed.
Each fret is a variable: Var1, Var2, Var3, Var4.
Guitars make music, parallel coordinates reveal patterns!
14. Car data as Parallel Coordinates
One case is one car.
Var1: MPG, Var2: Cylinders, Var3: Horsepower, Var4: Weight
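As a rough sketch of how a case becomes a line in parallel coordinates: each variable gets its own vertical axis, values are rescaled to a common 0-1 height, and each case is drawn as a polyline joining its heights across the axes. The car values and the function name `to_polylines` below are hypothetical, and min-max rescaling is one common choice rather than the only one.

```python
def to_polylines(rows):
    """Rescale each variable to [0, 1] and return one polyline per case.

    Each polyline is a list of (axis_index, height) points.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    polylines = []
    for row in rows:
        points = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # A constant variable has no spread; place it mid-axis.
            height = (value - lows[axis]) / span if span else 0.5
            points.append((axis, height))
        polylines.append(points)
    return polylines

# Hypothetical cars: (MPG, cylinders, horsepower, weight).
cars = [
    (18.0, 8, 150, 3500),
    (26.0, 4, 90, 2300),
    (34.0, 4, 70, 2000),
]
polylines = to_polylines(cars)
for line in polylines:
    print(line)
```

Passing each polyline to any line-plotting routine, with the axis index as x and the height as y, gives the familiar picture of one line per car.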
15. WTMCC
ASA Data Exposition data set. The data set contains 392
automobiles. These automobiles are described by 7 attributes:
a1: MPG, a2: Cylinders, a3: Horsepower, a4: Weight, a5:
Acceleration, a6: Year and a7: Origin. The results are shown
in Figure 4 and Table III. We can see that the number
of crosses between lines decreases much after using the
proposed algorithms.
Fig. 4. Cars data set.
TABLE III
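The quantity reported above, the number of crosses between lines, can be counted directly for any axis order. The sketch below is not the article's proposed algorithm; it only counts crossings between adjacent axes so that two orders can be compared. The car values and the function name `count_crossings` are hypothetical.

```python
from itertools import combinations

def count_crossings(rows, order):
    """Count line crossings between adjacent axes for a given axis order.

    rows  -- one tuple of values per case
    order -- column indices in the left-to-right order of the axes
    """
    total = 0
    for left, right in zip(order, order[1:]):
        for a, b in combinations(rows, 2):
            # Two lines cross when the cases swap rank between the two axes.
            # Ties (product of zero) are not counted as crossings.
            if (a[left] - b[left]) * (a[right] - b[right]) < 0:
                total += 1
    return total

# Hypothetical cars: (MPG, horsepower, weight).
cars = [
    (18.0, 150, 3500),
    (26.0, 90, 2300),
    (34.0, 70, 2000),
]
print(count_crossings(cars, [0, 1, 2]))  # MPG, horsepower, weight
print(count_crossings(cars, [1, 0, 2]))  # horsepower, MPG, weight
```

With these values, placing horsepower next to weight leaves that pair of axes free of crossings, while putting MPG between them makes every pair of lines cross twice. Raw values can be used because rescaling an axis does not change which case ranks higher on it.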
18. Conspectus credits (slide #, author/s, affiliation, source)
Slide 3, R. Harris, University of Bristol, article
Slide 4, S. Sweeney, CSISS, ESDA lecture
Slide 11, A. Inselberg, Tel-Aviv University, website photo
Slide 12, A. Inselberg, Tel-Aviv University, book cover
Slide 17, C. Yang and J. Zhou, Tsinghua University, article
Slide 18, R. Edsall, Arizona State University, article
Slides 19-20, A. MacEachren, GeoVISTA Center / Penn. State University; G. & N. Andrienko, Fraunhofer Institute AIS, article
Slide 21, E. Wegman, George Mason University, PowerPoint presentation