The Italian Blog – CODATA-RDA Summer School
Training in Research Data Science in Trieste
Posted July 13, 2017
By Vicky Lucas,
Training Development Manager of the IEA and Human Dimensions Champion for the e-I&DM group of the Belmont Forum
What is the CODATA-RDA research data science summer school?
The research data science summer school is in full flow. Approximately 50 researchers, practitioners and trainers from six continents are sitting in a computer lab each day instead of enjoying the Italian sunshine and a swim in the Adriatic.
This July is the second time that the course has run, and it attracted over 600 applicants. Many students are able to attend thanks to sponsorship from a range of organisations, and instructors donate their time. Course co-director Dr Hugh Shanahan describes the selection process as ‘rigorous’, taking candidates from low- and middle-income countries who demonstrate an interest in and aptitude for research data science. Applicants have to provide two references, work in a relevant scientific domain and answer questions on statistics.
The International Centre for Theoretical Physics (ICTP) in Trieste is a main contributor: it hosts the course and covers accommodation, administrative overheads, publicity, assistance with travel visas and some travel contributions. Course co-director Dr Clement Onime of ICTP says: “There is a growing trend of the multi-disciplinarity of data from scientific experiments and the course aligns with this.”
Which subjects are taught in the summer school?
The course has a packed schedule. The first week covers the technical fundamentals of Unix, Git and R, then moves on to SQL. At the same time, data ethics, publishing, licensing and data management are embedded into the schedule to ensure that the full range of skills and knowledge is included for the entire research data life cycle. The second week begins with a day of visualising data, then moves on to machine learning and neural networks, and finishes with computational infrastructure. For those who want to specialise, there are three options for a third week: bioinformatics, extreme sources of data, or the internet of things and big data analytics.
Who attends the course?
Shaily Gandhi is a geoinformatics lecturer at CEPT University in Gujarat, India, who found out about the course by checking the CODATA website after attending CODATA training in Beijing in 2016. She teaches Python and SQL to master's students. Attending the course will benefit her teaching by improving her knowledge of data organisation and analysis, as well as allowing her to teach open source packages such as R. Shaily is also studying for a PhD, and the research data science skills she learns will be directly applicable to improving efficiency in her own research. The subject she is most interested in is data visualisation, especially as applied to spatial data. Shaily is sponsored to attend the course by Springer Nature, organised via CODATA. It is only the first week and Shaily is enthusiastic about the course, saying: “The best thing I’ve learnt so far is team version control using Git and I’m really enjoying the great camaraderie.”
Oscar Arbelaez Echeverri is both a university researcher and a full-stack developer for a commercial software development company. In his role at Universidad Nacional de Colombia, Oscar researches magnetic materials and information retrieval.
He has a strong interest in training. Oscar tutors undergraduate students, and favours a functional teaching style, showing what a technique or system can do to generate interest and therefore inspire learning.
(You can read his blog and watch his YouTube channel.) Oscar explains what he hopes to gain from the course: “I want to take the knowledge back to Colombia and translate it into Spanish for greater accessibility, producing stimulating content to share because there’s not a lot of data science materials for those who are not fluent in English.” His priority is learning more about computational infrastructure as well as seeing first-hand how others teach the material.
Oscar has previously attended training at the ICTP, finding this summer school by browsing their events programme. He wanted to return to learn data science and to maintain contact with the ICTP, praising their ‘commitment to training scientists in developing countries’.
Wilhelmina Nekoto graduated in software engineering in Namibia last year and holds a certificate in radioastronomy.
Attracted to the course by the skills it will develop and by the networking opportunities offered by such a diverse group, she believes that attending the course is ‘a big opportunity that will advance next generation projects in my country, software will be a bottleneck in future programmes and this course will teach me how to leverage tools for data intensive science’. Open source software and computing infrastructure are priorities for her and are part of the ethos of the course.
Data visualisation is a passion; as she notes, ‘visualisations make data accessible to all and can be much more influential than written reports’. Wilhelmina intends to start a consultancy business, including becoming a trainer for the Software Sustainability Institute, which provided several modules for the summer school.
How do I find out more?
The current summer school runs until July 28th. The next course on research data science will be held in São Paulo in December, hosted at the ICTP South American Institute for Fundamental Research. The Trieste summer school will run again in August 2018; contact CODATA for information on attending as a student or sponsoring students to attend the training. The ICTP website lists their upcoming events.