Demo workshop

Netherlands eScience Center (in person, Amsterdam)

December 28, 2023

7:00 - 8:00 CET

Instructors: Barbara Vreede, Peter Kalverla

Helpers: Sarah Alidoost, Barbara Vreede

General Information

The eScience Center offers a range of free workshops and training courses, open to all researchers affiliated with Dutch research organizations. We organize workshops covering digital skills needed to put reproducible research into practice. These include online collaboration, reproducible code and good programming practices. We also offer more advanced workshops such as GPU Programming, Parallel Programming and Deep Learning.

This hands-on workshop will provide you with the basics of machine learning using Python.

Machine learning is the field devoted to methods and algorithms that ‘learn’ from data. It can be applied to a vast range of different domains, from linguistics to physics and from medical imaging to history.

This workshop covers the basics of machine learning in a practical and hands-on manner, so that upon completion, you will be able to train your first machine learning models and understand what next steps to take to improve them.

We start with data exploration and make sure the data is suitable for machine learning. Then we learn how to train a model on the data using scikit-learn. We learn how to use different machine learning models (such as linear regression, logistic regression, decision trees and support vector machines) and how to select the best of the trained models, and we discuss some best practices for starting your own machine learning project.
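To give a flavour of the workflow described above, here is a minimal sketch of training several scikit-learn models and selecting one by cross-validation. It is not taken from the workshop materials: the synthetic dataset, the model settings and all variable names are assumptions chosen for illustration, and only classifiers are compared (linear regression, a regression model, is left out).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real tabular dataset (an assumption for this sketch).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out a test set that is used only for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate models; the scale-sensitive ones are wrapped with a scaler.
candidates = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "support vector machine": make_pipeline(StandardScaler(), SVC()),
}

# Mean 5-fold cross-validated accuracy, computed on the training data only.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}

# Pick the model with the highest cross-validated accuracy and refit it.
best_name = max(cv_scores, key=cv_scores.get)
best_model = candidates[best_name].fit(X_train, y_train)

# Evaluate the selected model once on the held-out test set.
test_accuracy = best_model.score(X_test, y_test)
print(f"Selected: {best_name} (test accuracy {test_accuracy:.2f})")
```

Selecting the model on cross-validation scores and touching the test set only once keeps the final accuracy estimate independent of the selection step.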

The workshop will NOT cover the following topics:

Who: The course aims to be accessible without a strong technical background. The requirements for this course are:

  • basic knowledge of Python programming: defining variables, writing functions, importing modules
  • some prior experience with the NumPy, pandas and Matplotlib libraries is recommended but not required
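As a rough self-check for the first requirement, the short sketch below uses all three skills: importing a module, defining a variable and writing a function. The names `radius_cm` and `circle_area` are made up for this illustration and are not part of the course material. If this code reads comfortably, your Python level is probably sufficient.

```python
import math  # importing a module from the standard library

radius_cm = 2.5  # defining a variable


def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2


# Calling the function and printing a formatted result.
area = circle_area(radius_cm)
print(f"Area: {area:.2f} cm^2")
```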

Where: Science Park 402, 1098 XH Amsterdam. Get directions with OpenStreetMap or Google Maps.

When: December 28, 2023, 7:00 - 8:00 CET.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide them.

Contact: Please email training@esciencecenter.nl for more information.


Code of Conduct

Participants are expected to follow these guidelines:

Syllabus

Machine learning concepts

The predictive modeling pipeline

Selecting the best model

Machine learning best practices

Schedule

Day 1

09:00 Welcome and icebreaker
09:15 Machine learning concepts and overview of ML models
10:15 Coffee break
10:30 Tabular data exploration
11:30 Coffee break
11:45 Fitting a scikit-learn model on numerical data
12:45 Wrap-up
13:00 END

Day 2

09:00 Welcome and icebreaker
09:15 Fitting a scikit-learn model on numerical data
10:15 Coffee break
10:30 Fitting a scikit-learn model on numerical data
11:30 Coffee break
11:45 Handling categorical data
12:45 Wrap-up
13:00 END

Day 3

09:00 Welcome and icebreaker
09:15 Handling categorical data
10:15 Coffee break
10:30 Overfitting and underfitting
11:30 Coffee break
11:45 Validation and learning curves
12:45 Wrap-up
13:00 END

Day 4

09:00 Welcome and icebreaker
09:15 Validation and learning curves
10:15 Coffee break
10:30 Bias versus variance trade-off
11:30 Coffee break
11:45 Machine learning best practices; Q&A
12:45 Wrap-up
13:00 END

All times in the schedule are in the CET timezone.


Setup

To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful as well.

Software setup

Please follow these setup instructions