March 11 - 12, 2025
9:30 - 17:00 CET
Instructors: Malte Luken, Sven van der Burg, Carsten Schnober
The eScience Center offers a range of workshops and training courses, aimed at PhD candidates and other researchers or research software engineers. We organize workshops covering digital skills needed to put reproducible research into practice. These include online collaboration, reproducible code and good programming practices. We also offer more advanced workshops such as GPU Programming, Parallel Programming, Image Processing and Deep Learning.
This hands-on workshop will provide you with the basics of machine learning using Python.
Machine learning is the field devoted to methods and algorithms that ‘learn’ from data. It can be applied to a vast range of different domains, from linguistics to physics and from medical imaging to history.
This workshop covers the basics of machine learning in a practical and hands-on manner, so that upon completion, you will be able to train your first machine learning models and understand what next steps to take to improve them.
We start with data exploration and prepare the data so that it is suitable for machine learning. Then we learn how to train a model on the data using scikit-learn. We learn how to select the best model from the trained models and how to use different machine learning models (like linear regression, logistic regression, and decision tree models). Finally, we discuss some of the best practices when starting your own machine learning project.
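As a rough illustration of that workflow, the sketch below explores a small dataset, trains two of the model types mentioned above with scikit-learn, and picks the better one by cross-validation. It is not taken from the course materials: the breast cancer dataset that ships with scikit-learn is only a stand-in, and the variable names (`candidates`, `cv_scores`, `best_name`) and hyperparameters are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in dataset bundled with scikit-learn (assumption: not the course data).
X, y = load_breast_cancer(return_X_y=True)

# Data exploration: how many samples and features, and how balanced the classes are.
print("samples, features:", X.shape)
print("samples per class:", np.bincount(y))

# Hold out a test set that is only used once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two candidate models; logistic regression gets feature scaling in a pipeline.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Model selection: mean 5-fold cross-validated accuracy on the training data.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the selected model on all training data and evaluate on the held-out test set.
best_model = candidates[best_name].fit(X_train, y_train)
test_accuracy = best_model.score(X_test, y_test)

print("cross-validation accuracy:", cv_scores)
print("selected model:", best_name)
print("test accuracy:", round(test_accuracy, 3))
```

Keeping the test set out of the model comparison is one of the practices the workshop touches on: the cross-validation scores guide the choice, and the test accuracy gives a less optimistic estimate of how the chosen model performs on unseen data.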
The course aims to be accessible without a strong technical background.
This course is for you if:
This course is not for you if:
Also have a look at the syllabus to see what topics we will cover.
If you are uncertain whether this course is for you, please send us an email.
Where: Science Park 402, 1098 XH Amsterdam. Get directions with OpenStreetMap or Google Maps.
When: March 11 - 12, 2025, 9:30 - 17:00 CET.
Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).
Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:
Materials will be provided in advance of the workshop. Large-print handouts are available on request; please notify the organizers in advance. If we can make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch using the contact details below and we will attempt to provide them.
Workshop files: You will find all slides, notebooks, archived collaborative documents, and other relevant files in the files folder of the workshop website repository after the workshop.
Contact: Please email training@esciencecenter.nl for more information.
Participants are expected to follow these guidelines:
Machine learning concepts
The predictive modeling pipeline
Selecting the best model
Machine learning algorithms
Machine learning best practices
Day 1 (March 11)

local Amsterdam time | what |
---|---|
09:30 | Welcome and icebreaker |
09:30 | Introduction to machine learning |
10:30 | Break |
10:40 | Tabular data exploration |
11:30 | Break |
11:40 | Fitting a scikit-learn model on numerical data |
12:30 | Lunch Break |
13:30 | Fitting a scikit-learn model on numerical data |
14:30 | Break |
14:40 | Fitting a scikit-learn model on numerical data |
15:30 | Break |
15:40 | Fitting a scikit-learn model on numerical data |
16:15 | Wrap-up |
16:30 | END |

Day 2 (March 12)

local Amsterdam time | what |
---|---|
09:30 | Welcome and recap |
09:45 | Handling categorical data |
10:30 | Break |
10:40 | Combining numerical and categorical data |
11:30 | Break |
11:40 | Intuitions on decision trees |
12:00 | Overfitting and underfitting |
12:30 | Lunch Break |
13:30 | Bias versus variance trade-off |
14:00 | Advanced topics |
14:30 | Break |
14:40 | Try out learned skills on US census dataset |
15:30 | Break |
15:40 | Machine learning best practices; Q&A |
16:15 | Wrap-up & Post-workshop Survey |
16:30 | Drinks |
All times in the schedule are in the CET timezone.
To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.
We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is intended as a reference for instructors, but you may find it useful too.
It is important that you set up everything on your laptop before the start of the course. This includes installing a Python environment and downloading the necessary files. Please follow these setup instructions. Send us an email if you encounter any problems.