Imagining TM351
From Virtual Machines to Notebooks
Tony Hirst
Computing and Communications
TM351
15J, 30 points, Level 3
“The data
course”
TM351
Two new
things
Virtual
Machines
Student’s computer
e.g. Windows
Course software I
Personal folder
Student’s computer
e.g. Windows
Course software I
Course software
II
Personal folder
Student’s computer
e.g. Windows
Course software I
Course software
II
Student’s own browser
Personal folder
Access as
web/browser
application
Download files from web
Student’s computer
e.g. Windows
VirtualBox Application
Guest Operating
System e.g. Linux
Student’s own browser
Personal folder
Download files from web
Access as
web/browser
application
Student’s computer
e.g. Windows
VirtualBox Application
Guest Operating
System e.g. Linux
Course software I
Course software
II
Student’s own browser
Personal folder
Download files from web
Access as
web/browser
application
Virtual machine
Guest Operating
System e.g. Linux
Course software I
Course software
II
Student’s own browser
Personal folder
Download files from web
Student’s computer
e.g. Windows
Cloud server
Access as
web/browser
application
Notebook
computing
Literate
programming
Reproducible
research
Literate Programming
“A literate programmer is
an essayist who writes
programs for humans to
understand.”
Knuth, Donald E. "Literate programming." CSLI Lecture Notes, Stanford, CA:
Center for the Study of Language and Information (CSLI), 1992.
Reproducible Research
“[R]esearch papers with
accompanying software tools that
allow the reader to directly
reproduce the results and employ
the methods that are presented in
the research paper.”
Gentleman, Robert and Temple Lang, Duncan, "Statistical Analyses and Reproducible Research"
(May 2004). Bioconductor Project Working Papers. Working Paper 2.
http://biostats.bepress.com/bioconductor/paper2
[Conversations
with data]
IPython Notebook
[Corollary to
spreadsheets]
Task oriented
productivity
software
Direct
manipulation,
immediate
feedback
Markdown
Cells
Markdown Cells
Code Cells
Code Cells
Code
Output
Code Output
Code Output
Code Output
VM + .ipynb ?
Browser
IPython
Notebook
IPython
Files
Virtual Machine
Browser
IPython
Notebook
IPython
Files
Any
questions?

Editor's notes

  1. TM351 – a new course, currently in production… Level 3, 30 points. First presentation slated for October 2015 (15J).
  2. It’s replacing a “traditional” databases course, but we’re planning quite a few twists… What those twists are in content terms, though, is the subject of another presentation…
  3. What I am going to talk about are two new things we’re exploring in the context of the course, and which we hope might also prove attractive to other course teams.
  4. The first thing is virtual machines. These have already been used on a couple of other OU courses – TM128 and M812 both use virtual machines – but we are taking a more fundamental view about how to use notebooks to deliver interactive teaching material as well as software application services. So what is a virtual machine?
  5. We’re all familiar with the idea that a student can run OU supplied software, either third party software or OU created software, or a combination of both in the case of open source applications where we take some open code and then modify it ourselves, on the student’s own desktop.
  6. We may even require students to install more than one piece of software, perhaps further requiring that these applications can interoperate. With a move to be “open” and agnostic towards a particular operating system, there are considerable challenges to be faced: software libraries should ideally be cross platform rather than multiple native implementations of ostensibly the same application; software versions across applications should update in synch with each other; the UI, or look and feel, should be the same across platforms – or we have more writing to do; support issues are likely to scale badly for us as we have to cope with more variations on the configuration of individual student machines (for example, different operating systems, different versions of the same operating system).
  7. One way of mitigating change is to settle on a single UI space – such as a browser. Applications can be built solely within the browser, and made available to the user requiring little more desktop (or server) application support than a web server. Application front ends written in HTML5 and JavaScript can provide an experience rich enough to rival that of a native application. Application front ends can also be created for applications running as services either on the student’s desktop or via a remote server. Applications can draw on files in a folder on the student’s desktop machine, and the browser can be used to save files (e.g. from the internet) into that folder.
  8. To get round the problem of having to install software onto multiple different possible system configurations, how much easier would it be if we knew exactly what operating system each student was running and they were all running exactly the same operating system? Virtualisation platforms such as VirtualBox and VMware are cross-platform applications that can be downloaded to a student’s own machine and that then allow an additional guest operating system to be installed in its own container running on the student’s own computer (the host) via the virtualisation platform. The guest operating system and the software that runs on the guest operating system are said to define a virtual machine or “VM”. The virtual machine can be defined by a central service and then delivered to the students in such a way that each receives a copy of exactly the same virtual machine in terms of its operating system and the applications preinstalled onto it.
  9. What this means is that we can define a VM, preinstall software onto it, and ship it to students so they can run it via a virtualisation platform installed onto their machine. The VM can run applications as services, exposing their UIs via a browser. Files can easily be shared between the host and guest machines. As far as students are concerned, all they need to do is install a virtualisation system onto their computer, and then load the same OU virtual machine into that system, irrespective of the operating system they happen to be running.
  10. It is also possible to run the VM on a remote server, with the students accessing the services running in that VM via their browser. This means that students can access services using computers that themselves may not be capable of installing or running particular applications – such as some tablet computers.
  11. Notebook computing is my great hope for the future. Notebook computing is like spreadsheet computing, a democratisation of access to and the process of practically based, task oriented computing. Spreadsheets help you get stuff done, even if you don’t consider yourself to be a programmer. My hope is that the notebook metaphor – and it’s actually quite an old one – can similarly encourage people who don’t consider themselves programmers to do and to use programmy things.
  12. Notebook computing buys us into two ways of thinking that I think are useful from a pedagogical perspective – that is, pedagogy not just as a way of teaching but also as a way of learning in the sense of learning about something through investigating it. Here, I’m thinking of an investigation as a form of problem based learning – I’m not up enough on educational or learning theory to know whether there is a body of theory, or even just a school of thought, about “investigative learning”. These two ways of thinking are literate programming and reproducible research.
  13. In case you haven’t already realised it, code is an expressive medium. Code has its poets, and artists, as well as its architects, engineers and technicians. One of the grand masters of code is Don – Donald – Knuth. Don Knuth said “A literate programmer is an essayist who writes programs for humans to understand” as part of a longer quote. Here’s that longer quote: “Literate programming is a programming methodology that combines a programming language with a documentation language, making programs more robust, more portable, and more easily maintained than programs written only in a high-level language. Computer programmers already know both kinds of languages; they need only learn a few conventions about alternating between languages to create programs that are works of literature. A literate programmer is an essayist who writes programs for humans to understand, instead of primarily writing instructions for machines to follow. When programs are written in the recommended style they can be transformed into documents by a document compiler and into efficient code by an algebraic compiler.” Notebooks are environments that encourage the writing of literate code. Notebooks encourage you to write prose and illustrate it with code – and the outputs associated with executing that code. In many cases, the code may already exist. The programming is then more a case of applying an existing bit of code to a new bit of data. That is what you do in a spreadsheet. Oftentimes the code is hidden – or automatically generated – by a menu option selected via a graphical user interface. But there is no magic going on (at least, no more magic than is associated with the ability to take electronic representations of text and do something to them that makes them responsible for what appears on a screen, keeps planes flying, and seemingly creates and destroys money on the fly in the world’s financial systems).
Code is an incantation – and when you select a menu option in your spreadsheet you are asking the computer to perform that incantation and execute some code. You can also copy and paste code and then run it and it will have the same effect as selecting that operation from a menu. That’s how it works. In literate programming, you can see a human description of what you want to achieve by executing the code, then the code, then the result of executing the code, then an interpretation of the result. Introduction. Method. Results. Conclusion. You know this four part structure, particularly if you’ve ever taught – or been taught – how to write a formal practical report. But you can apply it at an atomic level too. At the level of a particular event. Like a particular scene in a narrative chart, or a particular geotemporal location in a time map.
  14. The other idea that the notebooks buy us into is reproducible research. I love this idea and think you should too. It lets archiving make sense. Do I really have to say any more than just show that quote? Now you may say that that’s all very well for, I don’t know, physics or biology, or science, or economics. Or social science in general, where they do all sorts of inexplicable things with statistics and probably should try to keep track of what they’re doing. But not the humanities. But that’s not quite right, because in the digital humanities there are computational tools that you can use. Particularly in the areas of text analysis and visualisation. Such as some of the visualisations we saw in the first part of this presentation. But you need a tool that democratises access to this technology. You need an environment like the one that the social scientists found in the form of a spreadsheet. But better. One that helps you keep track of what you did and that produces a serialisation that can be read back in a linear way that makes sense. Even if you don’t create it in a linear way. Even if you did that bit before this bit, but the way you tell it is as this bit before that bit. Which is one reason why postgrads get the fear that their experiment is going wrong. (Don’t panic! Those published papers you read? The work as described never took place the way it was described. The write-up is a post hoc rationalisation of the bits that worked, retold in such a way that it makes it look as if it was planned that way all along.) And here’s another dirty secret – most of the published reports you read that write up one experiment or another are not replicable from that report.
  15. (I also like to think of notebooks as a place where I can have a conversation with data.)
  16. So how do notebooks help? The tool I want to describe is – are – called IPython Notebooks. IPython Notebooks let you execute code written in the Python programming language in an interactive way. But they also work with other languages – JavaScript, Ruby, R, and so on, as well as other applications. I use a notebook for drawing diagrams using Graphviz, for example. They also include words – of introduction, of analysis, of conclusion, of reflection. And they also include the things the code wants to tell us, or that the data wants to tell us via the code. The code outputs. (Or more correctly, the code+data outputs.)
  17. (I also like to think of notebooks as a place where I can have a conversation with data.)
  18. (I also like to think of notebooks as a place where I can have a conversation with data.)
  19. (I also like to think of notebooks as a place where I can have a conversation with data.)
  20. The first thing notebooks let you do is write text for the non-coding reader. Words. In English. (Or Spanish. Or French. I would say Chinese, but I haven’t checked what character sets are supported, so I can’t say that for definite until I check!) “Literate programming is a programming methodology that combines a programming language with a documentation language”. That’s what Knuth said. But we can take it further. Past code. Past documentation. To write up. To story. The medium in which we can write our human words is a simple text markup language called Markdown. If you’ve ever written HTML, it’s not that hard. If you’ve ever written an email and wrapped asterisks around a word or phrase to emphasise it, or written a list of items down by putting each new item onto a new line and preceding it with a dash, it’s that easy.
  21. Here’s a notebook, and here’s some text. There’s also some code. But note the text – we have a header, and then some “human text”. You might also notice some up and down arrows in the notebook toolbar. These allow us to rearrange the order of the cells in the notebook in a straightforward way. In a sense, we are encouraged to rearrange the sequence of cells into an order that makes more sense as a narrative for the reader of the document, or in the execution of an investigation. The upside of this is that we can author a document in a ‘non-linear’ way and then linearise it for final distribution simply by changing the order in which the cells are presented. There are constraints though – if a cell computationally depends on the result of, or state change resulting from, the execution of a prior cell, their relative ordering cannot be changed.
  22. As well as human readable text cells – markdown cells or header cells at a variety of levels – there are also code cells. Code cells allow you to write (or copy and paste in) code and then run it. Applications give you menu options that in the background copy, paste and execute the code you want to run, or apply to some particular set of data, or text. Code cells work the same way, but they’re naked. They show you the code. At this point it’s important to remember that code can call code. Thousands of lines of code that do really clever and difficult things can be called from a single line of code. Often code with a sensible function name just like a sensible menu item label. A self-describing name that calls the masses of really clever code that someone else has written behind the scenes. But you know which code because you just called it. Explicitly. Let’s see an example – not a brilliant example, but an example nonetheless.
  23. Here’s some code. It’s actually two code cells – in one, I define a function. In the second, I call it. (Already this is revisionist. I developed the function by not wrapping it in a function. It was just a series of lines of code that I wrote to perform a particular task. But it was a useful task. So I wrapped the lines of code in a function, and now I can call those lines of code just by calling the function name. I can also hide the function in another file, outside of the notebook, then just include it in any notebook I want to… …or within a notebook, I could just copy a set of lines of code and repeatedly paste them into the notebook, applying them to a different set of data each time… but that just gets messy, and that’s what being able to call a bunch of lines of code wrapped up in a function call avoids.)
  24. As far as reproducible research goes, the ability of a notebook to execute a code element and display the output from executing that code means that there is a one-to-one binding between a code fragment and the data on which it operates and the output obtained from executing just that code on just that data.
  25. The output of the code is not a human copied and pasted artefact. The output of the code – in this case, the result of executing a particular function – is only and exactly the output from executing that function on a specified dataset.
  26. The output of a code cell is not limited to the arcane outputs of a computational function. We can display data table results as data tables.
  27. We can also generate rich HTML outputs – in this case an interactive map overlaid with markers corresponding to locations specified in a dataset, and with lines connecting markers as defined by connections described in the original dataset. We can also delete the outputs of all the code cells, and then rerun the code, one step – one cell – after the other. Reproducing results becomes simply a matter of rerunning the code in the notebook against the data loaded in by the notebook – and then comparing the code cell outputs to the code cell outputs of the original document. Tools are also under development that help spot differences between those outputs, at least in cases where the outputs are text based.
  28. So can we run virtual machines and IPython notebooks together?
  29. The IPython notebooks are actually browser based front end applications being powered by an IPython server…
  30. It’s easy enough to run the IPython server on a virtual machine, either running as a guest VM on a student’s host computer, or running as an online service accessed by the student via the web using their own web browser.
  31. There is a lot more that could be said – for example: workflows around the building/provisioning of virtual machines, how we might be able to host such machines either centrally or as a self-service option, the corollary between notebook style computing and spreadsheets, the notion of conversations with data, etc. etc.
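
Notes 20 and 22 describe a notebook as a sequence of Markdown cells and code cells. As a rough illustration of what that looks like on disk, the sketch below builds a tiny notebook document as JSON. It assumes the version 4 notebook format (keys such as `cells`, `cell_type` and `source`); the cell contents are invented for the example, and older IPython releases used a different layout.

```python
import json

# A hand-built notebook document, assuming the nbformat 4 JSON layout.
# The cell contents are made up purely for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            # Human-readable text, written in Markdown.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A heading\n", "Some *emphasised* human text."],
        },
        {
            # Code that the notebook server can execute.
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# An .ipynb file is this structure serialised as JSON text.
serialised = json.dumps(notebook, indent=1)
cell_types = [cell["cell_type"] for cell in json.loads(serialised)["cells"]]
print(cell_types)
```

Because the file is plain JSON, it can be shipped inside a virtual machine or downloaded into a shared folder like any other document.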
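
Note 21 points out that cells can be reordered freely unless one cell depends on state created by another. A minimal sketch of that constraint follows. It stands in for a notebook with a list of code strings and a shared dictionary in place of the kernel's state; both are simplifying assumptions, not how the notebook server is actually implemented.

```python
# Each string stands in for one notebook code cell.
cells = [
    "total = sum([1, 2, 3])",  # creates the name `total`
    "average = total / 3",     # depends on `total` already existing
]


def run_cells(cell_sources):
    """Execute the cells in the given order against one shared namespace."""
    namespace = {}
    for source in cell_sources:
        exec(source, namespace)
    return namespace


# Running the cells in their original order works.
in_order = run_cells(cells)
print(in_order["average"])

# Swapping the two cells breaks the dependency.
try:
    run_cells(list(reversed(cells)))
    reordering_failed = False
except NameError:
    reordering_failed = True
print(reordering_failed)
```

Cells that only contain independent steps could be swapped without any such failure, which is what makes the "write non-linearly, present linearly" workflow possible.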
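
Note 23 describes wrapping a useful run of lines in a function in one cell and calling it from another. The slide's actual code is not reproduced in these notes, so the following is a made-up example of the same pattern; the function name `count_words` and the sample text are hypothetical.

```python
# --- cell 1: lines that did a useful task, now wrapped in a function ---
def count_words(text):
    """Return a dict mapping each lower-cased word to how often it occurs."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# --- cell 2: call the function by name on some data ---
result = count_words("the cat saw the other cat")
print(result)
```

The same function could be applied to a different piece of text in a later cell without pasting its body again, which is the messiness the note says function calls avoid.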
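
Note 27 suggests that reproducing results amounts to rerunning the cells and comparing the fresh outputs with the original ones. A hedged sketch of that idea, using only the Python standard library, is shown below. The cells and the "recorded" outputs are invented for the example, and a real notebook diffing tool would work on the notebook file rather than on plain strings.

```python
import contextlib
import difflib
import io


def run_cell(source, namespace):
    """Run one cell and return whatever it printed, as text."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


# Two stand-in cells, plus the outputs saved from an earlier run (made up here).
cells = ["values = [3, 1, 2]", "print(sorted(values))"]
recorded_outputs = ["", "[1, 2, 3]\n"]

# Rerun every cell, in order, in a fresh namespace.
namespace = {}
fresh_outputs = [run_cell(source, namespace) for source in cells]

# Collect any line-level differences between the old and new text outputs.
differences = []
for old, new in zip(recorded_outputs, fresh_outputs):
    differences.extend(
        difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    )

reproduced = not differences
print(reproduced)
```

This only covers text outputs; rich outputs such as the interactive map mentioned in the note would need a different comparison.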