Cybernetics big data_abrusci_15 novembre 2013
1. Logic group seminar, Fri. 15 Nov. 2013
Cybernetics, control and
big data
Teresa Numerico
teresa.numerico@uniroma3.it
2. Outline
• The cultural biases of
cybernetics
• The influence of cybernetics on
Arpanet
• Big data, knowledge as control
and measure, AKA the dream of
reason
3. The epistemology of
closed-box as a model
• The setting up of a simple model for
a closed-box assumes that a number of
variables are only loosely coupled
with the rest of those belonging to
the system. The success of the
initial experiments depends on the
validity of that assumption.
• […] Many of these small compartments
may be deliberately left closed,
because they are considered only
functionally, but not structurally
important
Rosenblueth, Wiener 1945, p. 319
4. The closed box in
action
• […] The behavioristic method of
study omits the specific
structure and the intrinsic
organization of the object. This
omission is fundamental because
on it is based the distinction
between the behavioristic and
the alternative functional
method of study.
Rosenblueth, Wiener, Bigelow 1943, p. 1
5. The cybernetic perspective
on machines and animals
• A further comparison of living organisms
and machines leads to the following
inferences
• The methods of study for the two groups
are at present similar. Whether they
should always be the same may depend on
whether or not there are one or more
qualitatively distinct, unique
characteristics present in one group and
absent in the other. Such qualitative
differences have not appeared so far
Rosenblueth, Wiener, Bigelow, 1943, p.4
6. Behavior and purpose as
metaphors in the closed box
• By behavior is meant any change of an
entity with respect to its
surroundings[…] Any modification of an
object, detectable externally, may be
denoted as behavior
• Purposeful behavior: […] the act or
behavior may be interpreted as directed
to the attainment of a goal – i.e. to a
final condition in which the behaving
object reaches a definite correlation in
time or in space with respect to another
object or event
Rosenblueth, Wiener, Bigelow, 1943, p.1
7. Animals and machines as
information exchange agents
• The physical functioning of the
living individual and the operation
of some of the newer communication
machines are precisely parallel in
their analogous attempts to control
entropy through feedback
• The information is then turned into a
new form available for the further
stages of performance. In both the
animal and the machine this
performance is made to be effective
on the outer world
Wiener 1950, pp.26-27
8. Communication and control
• When I communicate with another person, I
impart a message to him, and when he
communicates back to me he returns a
related message which contains information
primarily accessible to him and not to me
• When I control the actions of another
person, I communicate a message to him,
and although this message is in the
imperative mood, the technique of
communication does not differ from that of
a message of fact. […]
Wiener 1950, p. 16
9. The metaphors of
cybernetics
• The association of living organisms and
machines according to the concept of
purposeful behavior
• The interpretation of their behavior as
a correlation between an input and an
output
• Input and output may be described as
transmission of messages (information)
• Transmission of messages can be
identified with communication
interpreted as negative feedback, and
servomechanisms
• The effectiveness of negative feedback
is guaranteed by data that report how
the order was executed
11. From human-machine
interaction…
• […] the future development of
these messages and
communication facilities,
messages between man and
machines, between machines and
man, and between machines and
machines are destined to play an
ever-increasing part
Wiener 1950, p. 16
12. Libraries of the future
• It is both our hypothesis
and our conviction that
people can handle the
major part of their
interaction with the fund
of knowledge better by
controlling and
monitoring the processing
of information than by
handling all the detail
directly themselves
Licklider 1965, p. 28
13. The aim of procognitive
systems
• A basic part of the over-all aim for
procognitive systems is to get the
user of the fund of knowledge into
something more nearly like an
executive's or commander's position.
He will still read and think
and, hopefully, have insights and
make discoveries, but he will not
have to do all the searching […] all
the transforming, nor all the testing
for matching or compatibility that is
involved in creative use of knowledge
Licklider 1965, p. 32
14. Needs and desires of users
• Be available when and where needed
• Handle both documents and facts
• Permit several different categories of input
• Make available a body of knowledge organized both
broadly and deeply – and foster the improvement of such
organization through use
• Provide access to the body of knowledge through
convenient procedure-oriented languages
• Converse or negotiate with the user while he formulates
his requests
• Facilitate joint contribution to and use of knowledge by
several or many co-workers
• Present flexible wide-band interface to other
systems, such as research systems, information-acquisition
systems and application systems
• Handle formal procedures (computer programs, subroutines
etc.)
• Handle heuristics coded in such a way as to facilitate
their association with situations to which they are
germane
Licklider 1965, pp. 36-39
15. Licklider's dream
• The computer will not only help
the scientist with repetitive
tasks but also write the rules in
formulating the research
hypotheses:
• "one of the main aims of man-computer
symbiosis is to bring
the computing machine effectively
into the formulative parts of
technical problems"
Licklider 1960, p. 3
16. Command and control = human-machine interaction
• In a letter to the "members of the
intergalactic computer network" (25
April 1963), Licklider, acting as
head of the IPTO, affirmed:
– Command and control must be reviewed in
terms of improved man-machine
interaction, time-sharing and computer
networks
– In the effort of the IPTO there must be
"enough evident advantage in cooperative
programming and operation to lead us to
solve the problems and, thus, to bring
into being the technology that military
needs"
17. Can we store
information?
• It is false to think that
information can be stored
without an overwhelming
depreciation of its value in a
changing world because:
Wiener 1950, p. 121
18. Bob Taylor and Vietnam
reports
• There were discrepancies in reporting that was
coming back from Vietnam to the White House about
enemy killed, […] logistics reports of various
kinds
• […] I talked to various people who were submitting
these reports back to Washington. I got a sense of
how the data was collected, how it was analyzed,
and what was done with it before it was sent back
to the White House, and I realized that there was
no uniform data collection or reporting structure
• So they built a computer center at Tonsinook and
had all of this data come in through there. After
that the White House got a single report rather
than several. That pleased them; whether the data
was any more correct or not, I don't know, but at
least it was more consistent
Taylor 1989, pp. 12-13
19. Arpanet birth
• In 1968 Bob Taylor and
Licklider wrote the
seminal paper "The
computer as a
communication device",
and a year later Bob
Taylor (head of the
IPTO at the time)
started the Arpanet
project, connecting the
first 4 nodes
21. How big is big data
• In December 2012, IDC and EMC estimated the
size of the digital universe (that is, all
the digital data created, replicated and
consumed in that year) to be 2,837 exabytes
(EB) and forecast this to grow to 40,000EB by
2020 — a doubling time of roughly two years.
• One exabyte equals a thousand petabytes
(PB), or a million terabytes (TB), or a
billion gigabytes (GB). So by 2020, according
to IDC and EMC, the digital universe will
amount to over 5,200GB per person on the
planet
Charles McLellan, "Big Data: an overview", 1 October
2013, ZDNet, http://www.zdnet.com/big-data-an-overview-7000020785/
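The figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes constant exponential growth between 2012 and 2020 and decimal units (one exabyte equal to a billion gigabytes, as the slide states). The function names are hypothetical and do not come from the IDC/EMC study.

```python
import math

# Figures quoted on the slide (IDC/EMC estimates, as reported by ZDNet).
SIZE_2012_EB = 2_837         # digital universe in 2012, in exabytes
SIZE_2020_EB = 40_000        # forecast for 2020, in exabytes
GB_PER_PERSON_2020 = 5_200   # per-person figure quoted for 2020
GB_PER_EB = 10**9            # one exabyte = a billion gigabytes


def doubling_time_years(start, end, years):
    """Doubling time implied by constant exponential growth (an assumption)."""
    return years * math.log(2) / math.log(end / start)


def implied_population(total_eb, gb_per_person):
    """Head count implied by sharing the total evenly among people."""
    return total_eb * GB_PER_EB / gb_per_person


doubling = doubling_time_years(SIZE_2012_EB, SIZE_2020_EB, 2020 - 2012)
population = implied_population(SIZE_2020_EB, GB_PER_PERSON_2020)

print(f"doubling time: {doubling:.2f} years")
print(f"implied population: {population / 1e9:.2f} billion")
```

Under these assumptions the doubling time comes out at roughly 2.1 years, consistent with the "roughly two years" in the quotation, and the per-person figure corresponds to a world population of about 7.7 billion.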
23. Why does quantity mean
quality?
• Peter Norvig, Google's research
director, offered an update to George
Box's maxim: "All models are wrong, and
increasingly you can succeed without
them."
• Out with every theory of human behavior,
from linguistics to sociology. Forget
taxonomy, ontology, and psychology. Who
knows why people do what they do? The
point is they do it, and we can track
and measure it with unprecedented
fidelity. With enough data, the numbers
speak for themselves
Chris Anderson
"The End of Theory" (Wired 2008)
24. Quantity is quality
• According to Hegel in The Science of
Logic:
– at first quantity as such thus appears
in opposition to quality; but quantity
is itself a quality, self-referring
determinateness as such, distinct from
the determinateness which is its
other, from quality as such. Except that
quantity is not only a quality, but the
truth of quality itself is quantity, and
quality has demonstrated itself as
passing over into it. (p. 279)
25. Correlations instead of
explanations
• State contenti, umana gente, al quia;
ché, se potuto aveste veder tutto,
mestier non era parturir Maria;
• Seek not the wherefore, race of human kind;
Could ye have seen the whole,
no need had been for Mary to bring forth.
Dante, Purgatorio canto III, 37-39
26. Correlation instead of
causation
• Correlation analysis […] based on
hard data are superior to most
intuited causal connections […]. But
in a growing number of contexts, such
analysis is also more useful and more
efficient than slow causal thinking
that is epitomized by carefully
controlled experiments […]
• Causality won't be discarded, but it
is being knocked off its pedestal as
the primary fountain of meaning
Mayer-Schönberger, Cukier 2013, pp.67-68
27. Even if you don't know why
• If big data teaches us anything,
it is that just acting better, making
improvements – without deeper
understanding – is often good
enough […] even if you don't know
why your efforts work as they do,
you are generating better
outcomes than you would by not
making such efforts
Mayer-Schönberger, Cukier 2013, pp.195-196
28. Machine decisions instead of
human decisions
• The biggest impact of big data
will be that data-driven
decisions are poised to augment
or overrule human judgment
Mayer-Schönberger, Cukier 2013, p.141
29. The great weakness of the
machine
• The great weakness of the machine –
the weakness that saves us so far
from being dominated by it – is that
it cannot yet take into account the
vast range of probability that
characterizes the human situation
• The dominance of the machine
presupposes a society in the last
stages of increasing entropy, where
probability is negligible and where
statistical differences among
individuals are nil
Wiener 1950, p. 181
30. The black box philosophy
• With Big-data analysis, however, this
traceability will become much harder. The
basis of an algorithm's predictions may
often be far too intricate for most people
to understand
• We can see the risk that big-data
predictions […] will become black-boxes
that offer no accountability, traceability
or confidence
Mayer-Schönberger, Cukier 2013, pp. 178-179
31. Raw data and truth
• This shared sense of starting
with data often leads to an
unnoticed assumption that data
are transparent, that
information is self-evident, the
fundamental stuff of truth
itself
Lisa Gitelman and Virginia Jackson 2013
Raw data is an oxymoron, p. 2
32. Data and power
• If you have access to the data and the means to
interpret them, then data is power
• Build online tools able to carry out cognitive
tasks directly by operating on knowledge
itself, searching for meanings and connections
hidden in our collective knowledge
• Sooner or later we will have to reorganize the
knowledge database, as we gradually discover
that our old schemas are wrong and need to be
updated
Nielsen 2012, pp. 111-112, 141
33. The change in the
nature of explanation
• No more simple explanations
• The complexity of our explanations was
constrained by the limits of our minds
• Now we can use computers to build
complex models to work with
• See statistical machine translation:
Google uses an incredibly detailed
statistical algorithm; even without
knowing the languages, the programmers
manage to achieve remarkable results
Nielsen 2012, pp. 142-145
34. Unreasonable effectiveness
of data
• A trillion-word corpus captures even
very rare aspects of human behavior.
[…] this corpus could serve as the
basis of a complete model for certain
tasks - if only we knew how to
extract the model from data
• The first lesson of web-derived corpora
of trillions of links, videos, images,
tables and user interactions is to
use available large-scale data rather
than hoping for annotated data
Halevy, Norvig, Pereira, 2009, p. 8
35. The semantic
interpretation
• Semantics in semantic interpretation
of natural languages is embodied in
human cognitive and cultural
processes whereby linguistic
expression elicits expected responses
and expected changes in cognitive
state.
• Because of a huge shared cognitive
and cultural context, linguistic
expression can be highly ambiguous
and still often be understood
correctly
Halevy, Norvig, Pereira, 2009, p.10
36. The challenges of semantic
interpretation
• We have solved the sociological problem
of building a network infrastructure for
the sharing of a trillion pages of
content
• We have solved the technological problem
of aggregating and indexing this content
• We are left with the scientific problem
of interpreting the content, which is
mainly that of learning as much as
possible about the context of the
content to correctly disambiguate it
Halevy, Norvig, Pereira, 2009, p.11
37. What do we need to succeed?
• Methods to infer relationships between
column headers or mentions of
entities in the world. These
inferences may be incorrect at
times, but if they are done well
enough we can connect disparate data
collections and thereby substantially
enhance our interaction with web
data.
• Here too Web-scale data might be an
important part of the solution
Halevy, Norvig, Pereira, 2009, p.11
38. What to do with data
interpretation
• Follow the data. Choose a representation that
can use unsupervised learning on unlabeled
data, which is so much more plentiful than
labeled data
• Represent all the data with a nonparametric
model […] because with very large data
sources, the data holds a lot of detail
• For natural language applications, trust that
human language has already evolved words for
the important concepts
• Now go out and gather some data, and see what
it can do
Halevy, Norvig, Pereira, 2009, p.12
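The advice quoted above (use unlabeled data, keep the detail instead of summarizing it) can be illustrated with a toy sketch. The example below builds a table of bigram counts from raw, unlabeled sentences; it is nonparametric only in the loose sense that the table grows with the data rather than compressing it into a fixed set of parameters. The corpus and the function names are hypothetical, and this is not presented as the authors' own method.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count word-to-next-word transitions in unlabeled text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus: no labels, just raw sentences.
corpus = [
    "big data is big",
    "big data is everywhere",
    "data is power",
]
model = train_bigrams(corpus)

print(most_likely_next(model, "data"))
print(most_likely_next(model, "unknown"))
```

Adding more sentences only adds rows and counts to the table; nothing about the model has to be redesigned, which is the practical appeal of the "follow the data" approach at web scale.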
39. Dataverse and human
understanding
• We are entering into the dataverse
• We have flattened both the social and
the natural into a single world so
that there are no human actors and
natural entities but only agents
(speaking computationally) and actants
(speaking semiotically)
• Much of our 'knowledge' today
surpasseth human understanding
Bowker 2013, pp. 167-170
40. Let's get rid of the
humans?
• The intelligent citizen cannot
read the programs that run our
data sets […] increasingly
scientific models are compared
primarily against other models
• Let's take the unnecessary human
out of the equation and talk
about the program-data-program or
data-program-data cycles
Bowker 2013, p. 170
41. The new science based on big
data (The Human Brain Project)
• The convergence between biology and ICT has
reached a point at which it can turn the goal of
understanding the human brain into a reality. It
is this realisation that motivates the Human Brain
Project – an EU Flagship initiative in which over
80 partners will work together to realise a new
"ICT-accelerated" vision for brain research and
its applications.
• One of the major obstacles to understanding the
human brain is the fragmentation of brain research
and the data it produces. Our most urgent need is
thus a concerted international effort that uses
emerging ICT technologies to integrate this data
in a unified picture of the brain as a single
multi-level system.
https://www.humanbrainproject.eu
• The funding started in mid-October and the total
funding for the 10-year project is EUR 1,190
million, of which 643 million comes from the EU
42. Big data (according to
O'Reilly 2012)
• Volume
• Velocity
• Variety
• Digital nervous system:
The challenge of data
flows, and the erosion
of hierarchies and
boundaries, will lead
us to the statistical
approaches, systems
thinking, and machine
learning we need to
cope with the future we
are inventing (pos.
372)
43. The power of the code
• The maps offered by GUI are
fundamentally mediated: as our
interfaces become more "transparent"
and visual, our machines also become
more dense and obscure. The call to
map may be the most obscuring of all:
by constantly drawing connections
between data points, we sometimes
forget that the map should be the
beginning, rather than the end, of
the analysis
Chun 2011, 176-177
44. Knowledge is action AKA
Evelyn Fox Keller's thoughts
• There is no pure science and bad
applications
• Knowledge is action not only with
respect to power in society but also
with respect to the object of research
• After the knowledge process the object
will never be the same
• Language's role in science is never
considered enough
• The evocative character of language and
its vague, ambiguous status introduces
uncontrolled leaps of meanings,
metaphors, and pre-scientific arguments
45. • Tomas did not realize at the
time that metaphors are
dangerous. Metaphors are not to
be trifled with. A single
metaphor can give birth to love
Milan Kundera The unbearable lightness of
being, p. 10
47. Bibliography
• Chun W. H. K. (2011) Programmed visions, MIT Press, Cambridge (Mass.).
• Keller E. Fox (2010) The mirage of a space between nature and nurture, Duke
University Press, Durham & London.
• Halevy A., Norvig P., Pereira F. (2009) "The unreasonable effectiveness of
data", IEEE Intelligent Systems, March/April 2009, vol. 24, n. 2, pp. 8-12,
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/35179.pdf
• Licklider J.C.R. (1960) "Man-computer symbiosis", IRE Transactions on
Human Factors in Electronics, vol. HFE-1, March, pp. 4-11,
http://memex.org/licklider.pdf.
• Licklider J.C.R. (1963) Memorandum for members and affiliates of the
Intergalactic Computer Network, http://packet.cc/files/memo.html.
• Licklider J.C.R. (1965) Libraries of the future, The MIT Press, Cambridge
(Mass.).
• Mayer-Schönberger V., Cukier K. (2013) Big Data. A revolution that will
transform how we live, work and think, Houghton Mifflin Harcourt, Boston.
• Nielsen M. (2012) Reinventing discovery: the new era of networked science,
Princeton University Press, Princeton.
• Rosenblueth A., Wiener N., Bigelow J. (1943) "Behavior, Purpose and
Teleology", Philosophy of Science, vol. 10, pp. 18-24.
• Rosenblueth A., Wiener N. (1945) "The role of models in science", Philosophy
of Science, vol. 12, pp. 316-21.
• Taylor B. (1989) Oral interview,
http://conservancy.umn.edu/bitstream/107666/1/oh154rt.pdf
• Wiener N. (1948/1961) Cybernetics: or Control and Communication in the
Animal and the Machine, MIT Press, Cambridge (Mass.).
• Wiener N. (1950) The Human Use of Human Beings, Houghton Mifflin, Boston.