1. The African Open Science Platform
Presented by Susan Veldsman
Director: Scholarly Publishing Programme
Academy of Science of South Africa (ASSAf)
Research Infrastructure Workshop, 14 May 2018
3. Trusted Research & Data
• Trust is at the centre of the process of science
• Trust research & researchers who have your
best interest at heart
• Build new research on existing research/data
• To be trusted, it needs to be managed
4. Square Kilometre Array (SKA)
• Data collection on a massive scale
• Telescope array to consist of 250,000 radio
antennas between Australia & SA
• Investment in machine learning and artificial
intelligence software tools to enable data analysis
• 400+ engineers and technicians in infrastructure,
fibre optics, data collection
• Supercomputers to process data (IBM)
• To come: supercomputer with 3x the power of
world’s current fastest computer (Tianhe-2) to cope
with SKA data
7. • African human genomic research; Central node at University of
Cape Town
• Using NetMap to monitor connectivity
• Data transfer: Africa Globus Online (668,622 files transferred
between Rhodes University & UCT; 140TB data transferred from
USA to SA)
• Challenges: slow & unstable Internet, unreliable power supply,
continent-wide obsolete computer infrastructure that ranges
from medium-scale server infrastructure to a small number of
workstations with multiple operating systems, and a lack of centralized,
secure data storage
• Other: database of participants (H3APRDB, REDCap), data analysis
incl. Galaxy, Job Management System, eBiokits, REDCap,
WebProtege, Pipelines for data execution, data repository
(European Genome-Phenome Archive)
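To give a sense of scale for the 140TB transfer and the connectivity challenge noted above, here is a rough back-of-the-envelope sketch. The link speeds and the efficiency factor are illustrative assumptions, not figures from the presentation, and the function name transfer_days is hypothetical. Terabytes are taken as decimal (10^12 bytes).

```python
def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to move size_tb terabytes over a link.

    link_mbps is the nominal link speed in megabits per second.
    efficiency (0 < efficiency <= 1) is an assumed fraction of that
    speed actually sustained end to end.
    """
    if size_tb < 0 or link_mbps <= 0 or not (0 < efficiency <= 1):
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    total_bits = size_tb * 1e12 * 8                # decimal terabytes to bits
    bits_per_second = link_mbps * 1e6 * efficiency
    return total_bits / bits_per_second / 86_400   # seconds to days


if __name__ == "__main__":
    # Hypothetical sustained link speeds, for comparison only.
    for mbps in (100, 1_000, 10_000):
        print(f"{mbps:>6} Mbit/s: {transfer_days(140, mbps):8.1f} days")
```

Under these assumptions, 140TB takes roughly 13 days at a fully sustained 1 Gbit/s and about ten times longer at 100 Mbit/s, which is why unstable links and power supply weigh so heavily on projects of this kind.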
8. Open Science Defined
“Open Science is the practice of science in such a
way that others can collaborate and contribute,
where research data, lab notes and other
research processes are freely available, under
terms that enable reuse, redistribution and
reproduction of the research and its
underlying data and methods.” - FOSTER Project,
funded by the European Commission
11. African Open Science Platform
• Platform = opportunity to engage in dialogue,
create awareness, connect all, provide continental
view
• Funded by SA Dept. of Science & Technology
through National Research Foundation
• 3 years (1 Nov. 2016 – 31 Oct. 2019)
• Managed by Academy of Science of South Africa
(ASSAf)
• Through ASSAf hosting ICSU Regional Office for Africa
(ICSU ROA)
• Direction from CODATA
http://africanopenscience.org.za/
12. Accord on Open Data in a
Big Data World
• Values of open data in
emerging scientific culture
of big data
• Need for an international
framework
• Proposes comprehensive
set of principles
• FAIR Principles
• Provides framework & plan
for African data science
capacity mobilization
initiative
• Proposes African Platform
Call to Endorse
13. Key Stakeholders
• Global Network of Science Academies (IAP)
• International Council for Science (ICSU)
• The World Academy of Sciences (TWAS)
• Research Data Alliance (RDA)
• NRENs (Internet Service Providers for Education)
• Association of African Universities (AAU)
• Network of African Science Academies (NASAC)
• African Research Councils (incl. DIRISA, funders)
• African Universities
• African Governments
• Other
17. Click to view Initiatives/Country
https://www.targetmap.com/viewer.aspx?reportId=56245
Please note: this is just a preview; the data still need to be cleaned,
updated and corrected.
19. Policy Framework
• Policies provide guidance & see to well-being of
all citizens - political will
• Policies to address (also see existing policies):
• FAIR Principles
• Raw vs Processed/other data
• Licensing
• Sensitive data
• Intellectual Property Issues
20. Policy Framework
• JKUAT (Kenya) Institutional Open Data Policy
• Uganda Draft Open Data Policy
• Madagascar Lobbying for Open Data Policy
• Towards a White Paper on Open Research Data
Strategy in Botswana
• White Paper on Science, Technology and Innovation in
South Africa
• South Africa Open Science Framework—EU/SA
dialogue
• Funder Policy: National Research Foundation (NRF)(SA)
• OECD Principles & Guidelines for Access to Research
Data from Public Funding
21. Capacity Building Framework
• Data collector vs data user vs data manager
Therefore the following are core aspects to capacity building:
• Research Data Management Planning
• Repositories
• Command Line Interpretation
• Software Development
• Data Organisation
• Data Cleaning
• Data Management & Databases
• Data Analysis & Visualisation (incl. programming)
22. Capacity Building Framework
• Engineers, Statisticians, Data Scientists, Librarians, Data
Curators, Researchers, System Administrators,
Policymakers, Auditors, Data Centre Managers, Data
Architects – Wim Hugo
• Different skills for different categories of data workers
• Existing workshops presented
• Tertiary curricula need to adapt more rapidly
• Never too early to learn to work with data, program
23. Incentives Framework
• Funder requirements changing
• Mechanisms that acknowledge publication of
datasets and promote data sharing
• How do we deal with difficulties in sharing
data—what are the solutions
• Why is sharing essential
• How do we make sharing successful
• How do we lay fears to rest and ensure buy-in
25. South African Research and Infrastructure
Roadmap, DST
• Focus on global infrastructure in South Africa
• South African Research Infrastructure
Roadmap (SARIR) has been developed to
facilitate a research infrastructure investment
programme
• “SARIR is intended to provide a strategic,
rational, medium to long term framework for
planning, implementing, monitoring and
evaluating the provision of research
infrastructures (RIs) necessary for a
competitive and sustainable national system of
innovation” (DST, 2016:2).
26. National Integrated
Cyberinfrastructure Service (NICIS)
• Research Infrastructure is dependent on cyber-
infrastructure.
• This dependency refers to access to physical sites, data
sharing, curation, provenance, protection, and developing
interoperability and metadata standards.
• From the start of any RI programme, E-science and cyber-
infrastructure need to allow virtual access and open access
to national and international data.
• Centre of High Performance Computing,
• South African National Research Network
• Data Intensive Research Initiative of South Africa
• “will provide the necessary cyber-infrastructure capabilities
for the successful operation of all the RIs on a generic basis,
this will be known as the National Integrated
Cyberinfrastructure Service” (DST, 2016:55)
27. Ilifu
• http://www.researchsupport.uct.ac.za/ilifu
• Consortium of 6 Western Cape institutions
• Data-centric, high-performance computing facility
for data-intensive research
• Proto-typing distributed, federated cloud-based
infrastructure as a platform for data-intensive
research (African Research Cloud)
• Data-processing pipelines and e-science research
tools for big data analysis, visualisation and analytics
• Development and implementation of research data
management systems and tools
• Development of platforms, portals and middleware
to support access and collaborative research by
distributed teams on data-intensive projects
28. SADC Cyber-Infrastructure Framework
• Towards strategy and action plan,
implementation plan and governance structure
• Support strategic plans on Science, Technology,
Innovation
• Guide on creating an enabling environment to
harness science, technology and innovation
• Impact socio-economic development & industrialization
• Enhance education in developing & using technologies
• Support collaborative research development &
innovation
29. • Cyber-infrastructure is a key driver for a
knowledge-based economy
• Comprises technologies, skills, people and
policies which support generation, analysis,
transport, sharing, stewardship of information
(incl. data)
• Framework provides Roadmap towards Cyber-
infrastructure Strategy
31. Closing Remarks
• Collaborate & learn from one another –
strength in diversity
• Take ownership & collect/curate data in ethical
way
• Downloaders vs Uploaders
• Trusted & valid data managed in trusted way
• Exploit data for the benefit of society (Min
Naledi Pandor)
• Tell the African story, in an African way
33. Objectives
• Identify the issues that need to be dealt with when
drafting framework/roadmap of RI
• Raise awareness within RENs regarding services
beyond connectivity
• Consult with YOU as experts to shape
our own ideas about the way forward
• Possible Phase two of project
• Identify a “writer”
• Identify a “team” to assist with a
framework/roadmap
We are living in an increasingly data-driven world: Facebook, Twitter, Airbnb, Uber
Malaria outbreak 2014-2015
World Economic Forum 2018
How to get rid of fake data
Collaborative projects in Biomedical Sciences: genomics research, catching up with outbreaks such as Ebola, malaria and more
Bioinformatics legs of H3Africa (Human Heredity and Health in Africa)
Work among 30 institutions, 15 African countries, 2 partners outside Africa
To get Africa talking to one another
Ilifu is a consortium of Western Cape institutions that together will establish and operate a data-centric, high-performance computing facility for data-intensive research. The partner institutions are
Cape Peninsula University of Technology
Stellenbosch University
Sol Plaatje University
South African Radio Astronomy Observatory (SARAO, formerly SKA South Africa).
University of Cape Town (lead institute)
University of the Western Cape.
In addition to establishing and operating a data-intensive computing facility, the consortium will – in collaboration with local and international collaborators and partners – undertake research and development programmes for
Proto-typing a distributed, federated cloud-based infrastructure as a platform for data-intensive research, the African Research Cloud.
Development of data-processing pipelines and e-science research tools for big data analysis, visualisation and analytics.
Development and implementation of research data management systems and tools.
Development of platforms, portals and middleware to support access and collaborative research by distributed teams on data-intensive projects.