PowerPoint version control: there is no such thing as a final version
1. Version Control and Management
PhD Workshop Carousel
20170407, Hilde van Zeeland & Hugo Besemer
2. Categories in survey feedback
• What is a logical folder/file structure?
• Choices what to keep
• Multiple computers & syncing
• Collaborating on documents
• Relationships between files
• We’re human and other facts of life
New at the meeting: “what did I read and where did I read it”
3. Program
• Part 1: Managing folders and files
– What is a logical file and folder structure?
– File naming
– Choosing what to keep
– Relationships between files
– Literature and ‘what did I read where?’
• Part 2: Applications and platforms
– Synchronising files
– Collaborating on files
5. What is a logical folder structure?
• Rule of thumb: ≤ 15 items per folder
• Maximum 4 levels (some say three)
• Do not mix subject content and form
[Figure: a folder “Hierdense beek” (subject) next to a folder “Pictures” (form); where does Hierdensebeek.tif belong?]
“My organization of folders seemed intuitive when
I made it at the beginning of my PhD, but now it's
a bit of a mess and I end up searching through
trees of folders, hoping I filed it where I think it
should be...”
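The rules of thumb on this slide (at most 15 items per folder, at most 4 levels) can be checked automatically. The following is a minimal sketch, not a prescribed tool: the function name `audit_tree` and the two constants are illustrative assumptions, and depth is counted from the folder you pass in.

```python
import os

MAX_ITEMS = 15  # rule of thumb from the slide: at most 15 items per folder
MAX_DEPTH = 4   # rule of thumb from the slide: at most 4 levels

def audit_tree(root):
    """Return (relative_path, problem) pairs for folders that break the rules of thumb."""
    problems = []
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # The root itself is level 0; each subfolder below it adds one level.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        n_items = len(dirnames) + len(filenames)
        if n_items > MAX_ITEMS:
            problems.append((rel, f"{n_items} items"))
        if depth > MAX_DEPTH:
            problems.append((rel, f"depth {depth}"))
    return problems
```

Calling `audit_tree("path/to/project")` returns an empty list when every folder stays within both limits.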
6. Example
Study to examine the effects of diet on health
- Conducted over 3 years by 3 researchers – Peter, Lisa
and Anna
There are many ways to organise the data. We will look at
three:
- By researcher
- By year
- By activity
7. Example
It is now the summer holidays in 2016. Peter and Anna
are on holiday, and Lisa has received some urgent
questions from the reviewers. They need to know:
- the procedure used to produce the high-protein diet
- which bureau measured the data
- what sort of preprocessing was carried out on the data.
8. A librarian’s tale:
Pre-coordination and post-coordination
• Pre-coordination: systematic shelving of books,
e.g. Universal Decimal Classification
– 63 agriculture
• 636 livestock
– 636.2 cattle
» 636.2.033
– 636.5 poultry
» 636.5.033 meat poultry
• Post-coordination: the book can be anywhere if you
can search
– Keyword Poultry AND Keyword meat
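Post-coordination can be illustrated with a tiny keyword index. This is a sketch under stated assumptions: the catalogue, its file names and the function `search_all` are hypothetical and only show how a Boolean AND over keywords replaces a fixed shelf position.

```python
# Hypothetical catalogue: each item carries free keywords instead of one fixed place.
CATALOGUE = {
    "broiler_growth.pdf": {"poultry", "meat", "growth"},
    "laying_hens.pdf": {"poultry", "eggs"},
    "beef_quality.pdf": {"cattle", "meat"},
}

def search_all(catalogue, *keywords):
    """Return the items tagged with every given keyword (a Boolean AND)."""
    wanted = {k.lower() for k in keywords}
    # An item matches when the wanted keywords are a subset of its tags.
    return sorted(name for name, tags in catalogue.items() if wanted <= tags)
```

`search_all(CATALOGUE, "Poultry", "meat")` returns only the item tagged with both keywords, wherever it happens to be stored.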
9. A librarian’s tale 2
• Some systems (Google Docs / Drive) allow you to
have the same file in more than one folder (see the Google Docs tip)
• In Windows and other operating systems you can
use shortcuts
• There are local search engine applications (e.g. “Google
Desktop”)
“Metadata: take half of what you think you need,
and throw away half of that”
Tim Bray, co-author of the XML specification
11. File naming
• What to do:
– Keep names consistent
– Keep names short (max. 25 characters)
– Use ‘_’ or ‘-’ instead of spaces or dots
– Avoid special characters (&%$#)
– Use the date convention YYYY-MM-DD or YYYYMMDD (ISO 8601)
– Go from generic to specific
– Avoid endings such as ‘new_version’, ‘latest_version’, ‘final_version’,
‘final_final_version’
– Use file versioning (v_01, v_02, etc.)
• Note in a separate document (e.g. a README-file) what codes in
your filenames mean, and what changes your versions include
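The conventions above can be applied mechanically when a file is saved. The sketch below is one possible way to do it, not a standard: the function `build_filename` and its parameters are assumptions made for illustration. It orders the elements from generic to specific, writes the date as YYYYMMDD and appends a two-digit version.

```python
import re
from datetime import date

def build_filename(elements, day, version, ext):
    """Join name elements (generic to specific), an ISO 8601 date and a version number."""
    parts = list(elements) + [day.strftime("%Y%m%d"), f"v_{version:02d}"]
    # Replace spaces, dots and special characters with underscores.
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "_", part).strip("_") for part in parts]
    return "_".join(cleaned) + "." + ext.lstrip(".")
```

For example, `build_filename(["MA", "NTC023"], date(2014, 10, 31), 2, "xls")` gives `MA_NTC023_20141031_v_02.xls`. The meaning of codes such as `MA` still has to be written down in the README.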
12. How would you name the file?
a. MA_NTC023_20141031.xls
b. MA@NTC#23~20141031.xls
c. MicroArrayData_NetherlandsToxicogenomicsCentreProject023_20141031.xls
d. microarrayntc02320141031.xls
e. MA_NTC023_31102014.xls
f. MA/NTC/Project23/OCT31st/data.xls
13. Filename conventions
DO:
• Note in a separate document what the element
codes in your filename mean
• Keep names short and relevant, about 25 characters.
• Go from generic to specific (handy with sorting
and finding)
• Use ‘_’ or ‘-’ instead of spaces or dots
Use fixed elements in your filename:
version number, date, description of content, project
number, name of researcher/team.
Taken from: Data Management Workshop for Researchers
by Tessa Pronk (Utrecht University Library)
14. Filename conventions
DON'T:
• Use special characters (&%$#), dots or whitespace.
• Name your files 'new_version', 'newer_version',
'newest_version'.
• Duplicate files in different folders
• Rely on the computer's metadata to describe your file
16. What to keep
• Keeping all intermediate files?
– False sense of safety
– “So you are not sure what to delete!”
– Define milestones, and keep milestone files
“Sometimes a document (e.g. a graph, a powerpoint or a piece of text) is used
for different purposes and has to be adapted for the specific purpose. This
leads to a large number of slightly different versions of almost the same
thing.”
18. Relationships between files
“... It can be difficult to keep track of changes in different R
scripts, and different plots (esp. which subsets are used / the
reasoning behind different sets) ...”
20. Relationships between files
A folder listing only indicates that these files
and versions co-existed in a file
system. Every other relationship
should be recorded in your script or
document history
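One way to handle such relationships in a script is to let the script itself write down which inputs produced which output. The following is only a sketch of that idea: the function `record_provenance`, the `.prov.json` suffix and the field names are assumptions for illustration, not an established convention.

```python
import hashlib
import json
from pathlib import Path

def record_provenance(output, inputs, note=""):
    """Write <output>.prov.json listing the input files (with SHA-256) behind an output."""
    entry = {
        "output": Path(output).name,
        "inputs": [
            {
                "file": Path(p).name,
                # The checksum pins down exactly which version of the input was used.
                "sha256": hashlib.sha256(Path(p).read_bytes()).hexdigest(),
            }
            for p in inputs
        ],
        "note": note,  # e.g. which subset was used and why
    }
    sidecar = Path(str(output) + ".prov.json")
    sidecar.write_text(json.dumps(entry, indent=2), encoding="utf-8")
    return sidecar
```

Called at the end of an analysis script, this leaves a small text file next to each plot or table that states where it came from, independent of how the folders are arranged.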
22. What did I read and where did I read it?
• Many people use EndNote or Mendeley to
manage literature
• A reference manager should:
– Be able to display references in different journal
styles
– Work seamlessly with e.g. MS Word
– Organize references in folders and by adding tags
27. Difficulties when synchronising files
• “I have multiple backups and manual syncing between
them makes it difficult to keep track.”
• “Problems with data backups on different locations that
are not synced.”
• “Similar (R) scripts, on two computers. Sometimes issues
with finding the most up to date one.”
Automatic syncing, across devices, of the latest version
28. Synchronising files
WUR-drives:
• W-drive (internal)
• M-drive (personal) - also syncs to C-drive on WURclient
Both accessible externally: myworkspace.wur.nl
Personal cloud storage:
• SURFdrive
• OneDrive for Business
Both discussed later
30. Difficulties when collaborating on files
• “Too many versions of the same document, that have
been under revision by multiple co-authors. Finding the
right document can be a pain.”
• “… having people use the same system to identify and
name documents.”
• “Difficulty with integrating comments from co-authors
into a single document (especially when using LaTeX).”
Easy document identification and processing
31. Collaborating on files
Sharing without real-time collaboration:
• SURFdrive
• OneDrive for Business
• Sharepoint teamsite
Sharing with real-time collaboration:
• OneNote - for notes
• Google Docs / Word Online - for text
• Overleaf - for LaTeX
33. WITHOUT REAL-TIME COLLABORATION
SURFdrive
• share directly with other SURFdrive users at WUR, or with
others via external link (password / expiration date)
• version control
• contact servicedesk.it@wur.nl
34. WITHOUT REAL-TIME COLLABORATION
OneDrive for Business
• cloud service - store, sync and share
• synchronise with desktop client (Windows & Mac OS X, not Linux)
• free
• 1 TB - but stored locally
• private by default, but easy to share (internal/external with link)
35. WITHOUT REAL-TIME COLLABORATION
OneDrive for Business
• version history (only for Office file formats)
• being piloted: contact servicedesk.it@wur.nl
• like SURFdrive: as it is personal storage space, it is best for
personal use and some file sharing, not for team projects.
37. WITHOUT REAL-TIME COLLABORATION
Sharepoint team site
Document library:
• ‘check out’ docs to avoid simultaneous authoring
• describe new versions
• be alerted of changes to files / libraries
Request team site: https://sharepoint.wur.nl
Request X-account: https://www.xaccounts.wurnet.nl/
38. To keep in mind
Note from IT services:
‘Do not use OneDrive for Business [and SURFdrive] for
critical or secret documents … If you have sensitive or
secret data use our Sharepoint, the W-drive or your
M-drive, this way the data is stored on our servers in
Wageningen and is not synchronized to e.g. your iPad
that could be lost.’
‘The preferred solution for collaboration is our
Sharepoint platform.’
41. WITH REAL-TIME COLLABORATION
OneNote
• share notebooks to collaborate (send link / on teamsite)
• edits appear in real time
• version history, changes by page/author
• tips & tricks: Data Management Support Hub
42. WITH REAL-TIME COLLABORATION
Google Docs
• share and collaborate on documents (with a Google account)
• edits, comments, chats
• revisions by date/person (only unlimited for Google formats)
43. WITH REAL-TIME COLLABORATION
Google Docs
• Tip: select a file and press Shift+Z to add it to other
folders; all versions stay synced
• Do not use Google for sensitive/secret data:
- the US government can ask Google for your data
- Google can use it too…
44. Google Terms of Service
When you upload, submit, store, send or receive
content to or through our Services, you give Google
(and those we work with) a worldwide license to use,
host, store, reproduce, modify, create derivative works
(…), communicate, publish, publicly perform, publicly
display and distribute such content.
Our automated systems analyze your content (including
emails) to provide you personally relevant product
features …
www.google.com/policies/terms
45. WITH REAL-TIME COLLABORATION
Word Online
• Word editing in browser - from Sharepoint or OneDrive
• light-weight version of desktop Word
47. WITH REAL-TIME COLLABORATION
Overleaf
• share with a link (editing or read-only)
• edits, comments
• manually save versions
48. Questions?
Also visit our Data Management Support Hub
Library > Expertise & Services > Data Management Support hub
www.wur.nl/en/Expertise-Services/Data-Management-Support-Hub.htm
49. Exercises: Google Docs and/or OneNote
Choose Google Docs and/or OneNote, and follow these steps:
OneNote
- Go to http://tinyurl.com/AccessOneNote
- Click ‘Edit in browser’
- Follow the instructions in the notebook. And just play
around a bit!
Google Docs
- Go to https://tinyurl.com/AccessGoogleDocs
- Follow the instructions in the document. And just play
around a bit!
Editor's notes
Although good tools and software will help you, you don’t need specialised lab notebook software to produce good, well-structured data and documentation. For this example we simply use files stored in folders. With a little time and effort, even such a simple system will help you a great deal.
Keep names consistent – you should know what is in the file just by reading its name. NOT: mydata.csv
Keep names short (max. 25 characters) – so it's easy to find and search through
Note in a separate document what the element codes in your filename mean
Use ‘_’ or ‘-’ instead of spaces or dots – some operating systems cannot handle these. Underscores and dashes are a good alternative
Avoid special characters (&%$#) – some operating systems cannot handle these. Underscores and dashes are a good alternative
Use the date convention YYYY-MM-DD or YYYYMMDD (ISO 8601) – files will then sort chronologically in your folders
Go from generic to specific – handy with sorting and finding
Avoid endings such as ‘new_version’, ‘latest_version’, ‘final_version’, ‘final_final_version’
Use file versioning (v_01, v_02, etc.) – allows you to track your progress and easily revert to earlier versions of your file. It is easy to jump back a few steps if you make mistakes in your analysis. Also very useful if you’re collaborating on a file and working in different locations: you always know which is the latest. You can also version your files by date, but you might find that you save the file more often (many dates) than you reach what you really see as a new milestone version.
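The v_01, v_02 scheme lends itself to a small helper that produces the next version name instead of a ‘final_final’ suffix. This is a minimal sketch, assuming names that end in `_v_NN` before the extension; the function name `next_version` is made up for the example.

```python
import re

# A name such as thesis_v_09.docx: stem, version counter, extension.
VERSION = re.compile(r"^(?P<stem>.+)_v_(?P<num>\d+)(?P<ext>\.[^.]+)$")

def next_version(filename):
    """Return the filename with its v_NN counter increased by one."""
    match = VERSION.match(filename)
    if match is None:
        raise ValueError(f"no v_NN version marker in {filename!r}")
    width = len(match["num"])          # keep the zero padding (01, 02, ...)
    number = int(match["num"]) + 1
    return f"{match['stem']}_v_{number:0{width}d}{match['ext']}"
```

For example, `next_version("thesis_v_09.docx")` gives `thesis_v_10.docx`, while a name without a version marker raises an error rather than guessing.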
a: good
b: symbols
c: too long
d: hard to distinguish the different parts of the file name, but everything is there
e: OK, but the date is not in the international format (YYYYMMDD), which is better for sorting
f: folder structure → year is missing; you need the folder structure to understand what is in the file
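The judgements above can be approximated in code. The sketch below is illustrative only: the function `check_filename` and its messages are assumptions, the 25-character limit is applied to the name without its extension, and any stand-alone run of eight digits is treated as a date.

```python
import re
from datetime import date

def check_filename(name, max_len=25):
    """Return the rule violations found in one filename (empty list = none found)."""
    stem = name.rsplit(".", 1)[0]  # assumption: the limit applies without the extension
    problems = []
    if len(stem) > max_len:
        problems.append("too long")
    if re.search(r"[^A-Za-z0-9_-]", stem):
        problems.append("special characters")
    if not re.search(r"[_-]", stem):
        problems.append("no separators")
    # Read every stand-alone 8-digit run as YYYYMMDD and see whether it is a real date.
    for digits in re.findall(r"(?<!\d)\d{8}(?!\d)", stem):
        try:
            date(int(digits[:4]), int(digits[4:6]), int(digits[6:]))
        except ValueError:
            problems.append("date not YYYYMMDD")
    return problems
```

With these checks, option a passes, option e is flagged for its date, and option d is flagged for missing separators. A check like this cannot judge whether the codes are meaningful; that still needs the README.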