Online
April 22 - 25, 2024
9:00 - 13:00 CEST
Instructors: Sven van der Burg, Malte Luken, Johan Hidding
Helpers: Claire Donnelly
The eScience Center offers a range of free workshops and training courses, open to all researchers affiliated with Dutch research organizations. We organize workshops covering digital skills needed to put reproducible research into practice. These include online collaboration, reproducible code and good programming practices. We also offer more advanced workshops such as GPU Programming, Parallel Programming and Deep Learning.
This hands-on workshop will provide you with the basics of machine learning using Python.
Machine learning is the field devoted to methods and algorithms that ‘learn’ from data. It can be applied to a vast range of different domains, from linguistics to physics and from medical imaging to history.
This workshop covers the basics of machine learning in a practical and hands-on manner, so that upon completion, you will be able to train your first machine learning models and understand what next steps to take to improve them.
We start with data exploration and prepare the data so that it is suitable for machine learning. Then we learn how to train a model on the data using scikit-learn. We learn how to select the best model from the trained models and how to use different machine learning models (like linear regression, logistic regression, and decision tree models). Finally, we discuss some of the best practices when starting your own machine learning project.
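As a rough illustration of that workflow (not the actual course material), the sketch below explores a dataset, trains two of the model types mentioned above with scikit-learn, and uses cross-validation to pick between them. The dataset (scikit-learn's bundled breast cancer data), the hyperparameters, and all variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed example dataset: ships with scikit-learn, so nothing is downloaded.
X, y = load_breast_cancer(return_X_y=True)

# Minimal data exploration: table size and class balance.
print("samples, features:", X.shape)
print("samples per class:", np.bincount(y))

# Hold out a test set that is only used once, at the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two candidate models; the linear model gets scaled features.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# Model selection: mean 5-fold cross-validated accuracy on the training set.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the selected model on all training data and evaluate it once.
best_model = candidates[best_name].fit(X_train, y_train)
test_accuracy = best_model.score(X_test, y_test)

print("cross-validation scores:", cv_scores)
print("selected model:", best_name)
print("test accuracy:", round(test_accuracy, 3))
```

Selecting the model on cross-validation scores and keeping the test set aside until the end is one common way to avoid an over-optimistic estimate of model quality.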
The course aims to be accessible without a strong technical background.
This course is for you if:
This course is not for you if:
Also have a look at the syllabus to see what topics we will cover.
If you are uncertain whether this course is for you, please send us an email.
Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.
When: April 22 - 25, 2024, 9:00 - 13:00 CEST.
Requirements: Participants must have access to a computer with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).
Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.
Workshop files: You will find all slides, notebooks, archived collaborative documents, and other relevant files in the files folder of the workshop website repository after the workshop.
Contact: Please email training@esciencecenter.nl for more information.
Participants are expected to follow these guidelines:
Machine learning concepts
The predictive modeling pipeline
Selecting the best model
Machine learning algorithms
Machine learning best practices
Day 1 (April 22)
| 09:00 | Welcome and icebreaker |
| 09:15 | Machine learning concepts |
| 10:15 | Coffee break |
| 10:30 | Tabular data exploration |
| 11:30 | Coffee break |
| 11:45 | First model with scikit-learn |
| 12:45 | Wrap-up |
| 13:00 | END |
Day 2 (April 23)
| 09:00 | Welcome and icebreaker |
| 09:15 | Working with numerical data |
| 10:15 | Coffee break |
| 10:30 | Preprocessing numerical features |
| 11:30 | Coffee break |
| 11:45 | Model evaluation using cross-validation |
| 12:30 | Intuitions on linear models |
| 12:45 | Wrap-up |
| 13:00 | END |
Day 3 (April 24)
| 09:00 | Welcome and icebreaker |
| 09:15 | Handling categorical data |
| 10:15 | Coffee break |
| 10:30 | Encoding categorical variables |
| 11:15 | Intuitions on tree-based models |
| 11:30 | Coffee break |
| 11:45 | Combining numerical and categorical data |
| 12:45 | Wrap-up |
| 13:00 | END |
Day 4 (April 25)
| 09:00 | Welcome and icebreaker |
| 09:15 | Theory on selecting the best model: underfitting, overfitting, and learning curves |
| 10:15 | Coffee break |
| 10:30 | Pointers to advanced topics |
| 11:00 | Try out learned skills on US Census dataset |
| 11:30 | Coffee break |
| 12:30 | Concluding remarks and Q&A |
| 12:45 | Wrap-up |
| 13:00 | END |
All times in the schedule are in the CEST timezone.
To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.
We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but may be useful to participants as well.
It is important that you set up everything on your laptop before the start of the course. This includes installing a Python environment and downloading the necessary files. Please follow these setup instructions. Send us an email if you encounter any problems.
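Once the environment is installed, a quick check along the following lines can confirm that the expected packages are importable before the first session. This is only a sketch: the package list is an assumption (only scikit-learn is named in the course description), and the helper `check_packages` is a hypothetical name, so the official setup instructions remain the authority on what is required.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed package list for illustration; adjust it to the setup instructions.
ASSUMED_PACKAGES = ["scikit-learn", "pandas", "numpy", "matplotlib"]


def check_packages(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed in check_packages(ASSUMED_PACKAGES).items():
        print(f"{name}: {installed or 'NOT INSTALLED'}")
```

Run the script inside the Python environment you created for the course; any package reported as not installed is worth resolving before the workshop starts.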
If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.
As in other Carpentries workshops, you will learn by "coding along" with the Instructors. To do this, you will need to have both the window for the tool you are learning about (a terminal, RStudio, your web browser, etc.) and the window for the Zoom video conference client open. To see both at once, we recommend using one of the following setup options: