Fundamentals of NLP

Netherlands eScience Center

May 26 - 27, 2026

9:30 - 17:00 CEST

Instructors: Angel Daza, Carsten Schnober, Alexander Hadjiivanov

General Information

The eScience Center offers a range of workshops and training courses, aimed at PhD candidates and other researchers or research software engineers. We organize workshops covering digital skills needed to put reproducible research into practice. These include online collaboration, reproducible code and good programming practices. We also offer more advanced workshops such as GPU Programming, Parallel Programming, Image Processing and Deep Learning.

This lesson teaches the fundamentals of Natural Language Processing (NLP) in Python. It will equip you with the foundational skills and knowledge needed to carry out text-based research projects. The lesson is designed with researchers in the Humanities and Social Sciences in mind, but is also applicable to other fields of research.

On the first day we will look at why linguistic principles matter when working with text data, teach basic techniques for text preprocessing, and explain the principles behind word embeddings. The second day begins with an introduction to transformers, followed by hands-on work on classification tasks with the BERT model, including basic evaluation techniques. In the afternoon, we will cover large language models, learn to work locally with open-source models, and discuss potential drawbacks and biases when using this technology.

Who: Participants should:

  • be familiar with Python
  • be comfortable working in Jupyter

Where: Science Park 402, 1098 XH Amsterdam. Get directions with OpenStreetMap or Google Maps.

When: May 26 - 27, 2026, 9:30 - 17:00 CEST.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide them.

Workshop files: You will find all slides, notebooks, archived collaborative documents, and other relevant files in the files folder of the workshop website repository after the workshop.

Contact: Please email training@esciencecenter.nl for more information.


Code of Conduct

Participants are expected to follow these guidelines:

Syllabus

Introduction

A Primer on Linguistics

From Words to Vectors

Transformers: BERT and Beyond

Large Language Models

Schedule

Day 1

09:30 Welcome and icebreaker
09:45 Introduction to NLP
10:30 Coffee break
10:40 Introduction to NLP
11:30 Coffee break
11:40 A Primer on Linguistics
12:30 Lunch
13:30 NLP Pipelines and Word Embeddings
14:30 Coffee break
14:40 Train your own Word2Vec
15:30 Tea break
15:45 Topic Modelling with BERTopic
16:45 Wrap-up
17:00 END

Day 2

09:30 Welcome, icebreaker and recap
09:45 Transformers and Attention Mechanism
10:30 Coffee break
10:40 Understanding and Using BERT
11:30 Coffee break
11:40 BERT for Text Classification
12:30 Lunch
13:30 Evaluating NLP Classifiers
14:30 Coffee break
14:40 Prompting and using local LLMs
15:30 Tea break
15:40 Drawbacks and Biases in LLMs
16:45 Post-workshop Survey
17:00 END

All times in the schedule are in the CEST timezone.


Setup

To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but participants may find it useful as well.

Software setup

Please follow these setup instructions in preparation for the workshop. This page includes the data sets to be downloaded as well.