This lesson is still being designed and assembled (Pre-Alpha version)

Data licenses

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • What is a data license?

  • Which data license should I use for climate data?

  • What are the consequences of (not) using a data license?

  • Who decides which license I can use?

Objectives
  • Find existing license policies that may apply to your data

  • Choose a suitable license given these constraints

  • Apply a license to your data

General info on licenses

Let’s get started with a small exercise.

Choosing a license

Go to https://chooser-beta.creativecommons.org/ and follow the steps.

  • What license did you arrive at?
  • Does the license fit with the FAIR principles?
  • Discuss.

One of the key parts of FAIR data is that they are reusable not only practically, but also legally. By attaching a license to your work, you specify who can reuse your data, for which purposes, and what should be done with derived work. Creative Commons (CC) licenses are the most commonly used licenses in research. They are widely recognized, easy to apply, and legally sound. The CC licenses are built on four cornerstones:

The six CC licenses are combinations of these cornerstones.
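As a rough illustration of how the six licenses arise, the sketch below enumerates them from the standard CC abbreviations: BY (Attribution), NC (NonCommercial), SA (ShareAlike) and ND (NoDerivatives). The function name `cc_licenses` is hypothetical. The sketch relies on two facts: every current CC license includes BY, and SA and ND cannot be combined, which leaves two choices for commercial use times three choices for derivatives.

```python
from itertools import product

# Attribution (BY) is part of every CC license; the other elements are optional.
COMMERCIAL_USE = ["", "NC"]     # "" means commercial use is allowed
DERIVATIVES = ["", "SA", "ND"]  # SA and ND are mutually exclusive


def cc_licenses():
    """Return the short names of the six CC licenses (hypothetical helper)."""
    names = []
    for nc, deriv in product(COMMERCIAL_USE, DERIVATIVES):
        # Always start with BY, then append whichever optional elements are set.
        parts = ["BY"] + [p for p in (nc, deriv) if p]
        names.append("CC " + "-".join(parts))
    return names


for name in cc_licenses():
    print(name)
```

The fewer elements a license carries, the fewer restrictions it places on reuse, so CC BY is the most permissive of the six.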

Note

Without a license, you retain all rights to the data by default. Others then cannot legally reuse the data in their own work without your explicit permission.

These are some points to consider when choosing a license:

Further reading:

Discussion

  • When was the last time you published some data?
  • What license did you distribute it under?
  • Where did you share the data?

Licenses in geoscience

In general, data sources in geoscience such as CMIP5 and CMIP6 are available under licenses that support the FAIR principles. That is, you can freely use the data, but some requirements still apply. For example, if you base your research on CMIP6 data, you must give proper credit to the project when you deposit your research data somewhere. This is specified in the Terms of Use.

You can find the Terms of Use for some widely used projects here:

Discussion

  • How do the terms of use for CMIP6 data differ from the Creative Commons licenses?
  • Can you reshare the data?
  • Should you attribute the authors of the data? If so, how?
  • Can derived data be used commercially?
  • Can you find additional terms of licenses that are applicable to your situation?
  • FIXME: check what the answer should be

Further reading:

Data ownership

One of the major challenges with data licenses is data ownership. Who actually owns the data? Can other parties use the data and what control do you have over the data as the owner? These points are addressed by the license file.

For any academic institute, there are three possible owners of the data:

Data ownership is defined in the data management plan. You may be restricted by your institution’s policies on licenses.

Exercise

Take a few minutes to find out whether your institute has a license policy.

  • What does it say?
  • Who owns your data?
  • What do you do if your institution does not have a policy?

Exercise

Imagine, some researchers have SOME CRAZY EXAMPLE

They have deposited the data on their institute’s website under CC-BY 4.0.

  • Do you see any problems with the license they chose?
  • Which data license should you put on it when you distribute the data?
  • Which license can you not put on it when you distribute the data?

What does the EU require?

The European Commission has fully embraced the FAIR principles. Horizon 2020 already mandates open access to all scientific publications and, since 2017, has been running a pilot in which research data are open by default, with the possibility to opt out.

How the data will be licensed is part of the data management plan. Specifically:

Exercise

Which license does the European Commission recommend for data acquired under the Horizon 2020 Programme?

Solution

The European Commission encourages researchers to provide access to their data in the broadest sense. It encourages authors to retain their copyright, but to grant adequate licenses to publishers. It recommends the Creative Commons licenses as a useful legal tool for providing open access.

Further reading:

How are licenses applied to research data?

In principle, a license is valid when it is mentioned or referred to in the same place where you upload your data. Some repositories like Zenodo allow you to select a license, and they make sure it is clear under which license the data are available.

You can apply a license in one of these ways:

The license chooser can help you generate a plain-text or HTML snippet to add to your README or website.
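To give an idea of what such a plain-text snippet might look like, here is a minimal sketch that builds a license notice for a README. The function `license_notice`, the dataset title and the author name are hypothetical. The wording is only an approximation and not the exact text the license chooser produces.

```python
def license_notice(title, author, license_name, license_url):
    """Build a plain-text license notice for a README (hypothetical helper)."""
    return (
        f"{title} by {author} is licensed under {license_name}.\n"
        f"To view a copy of this license, visit {license_url}\n"
    )


notice = license_notice(
    title="Example dataset",   # hypothetical title
    author="A. Researcher",    # hypothetical author
    license_name="CC BY 4.0",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(notice)
```

Placing the notice in the README next to the data keeps the license visible in the same place where the data are downloaded.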

Note

Specific to climate-related research, CMIP6 explicitly requires you to add the license to the metadata.
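As a rough sketch of what adding a license to the metadata could look like, the example below stores the license in a `license` entry of a metadata dictionary and prints it as JSON. In practice the metadata usually live in the data file itself, for example as a global attribute of a NetCDF file. A plain dictionary is used here only so that the sketch runs without extra libraries. The helper `add_license`, the attribute name and all values are assumptions for illustration; check the CMIP6 documentation for the exact wording it requires.

```python
import json


def add_license(metadata, license_text):
    """Return a copy of the metadata with a 'license' entry (hypothetical helper)."""
    updated = dict(metadata)  # copy, so the original metadata stay untouched
    updated["license"] = license_text
    return updated


# Hypothetical metadata of a derived dataset.
metadata = {"title": "Example derived dataset", "institution": "Example Institute"}

licensed = add_license(
    metadata,
    "CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)",
)
print(json.dumps(licensed, indent=2))
```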

Key Points

  • A permissive license ensures the re-usability of your data.

  • Many big inter-comparison projects already have suitable licenses.

  • For derived work, existing licenses may restrict your choice of license.

  • Ownership of data (FIXME)