Licensing

Overview

Teaching: 5 min
Exercises: 0 min
Questions
  • What licensing information should I include with my work?

Objectives
  • Explain why adding licensing information to a repository is important.

  • Learn how to check who owns the copyright for research works.

  • Choose a proper license.

  • Explain differences in licensing and social expectations.

When a repository with source code, a manuscript or other creative works becomes public, it should include a file named LICENSE or LICENSE.txt in the base directory of the repository that clearly states under which license the content is being made available. This is because creative works are automatically eligible for intellectual property (and thus copyright) protection. Reusing creative works without a license is dangerous, because the copyright holders could sue you for copyright infringement.
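As a quick illustration of the file-naming convention above, the following minimal Python sketch checks whether a directory contains a LICENSE or LICENSE.txt file at its top level. The function name find_license_file and the LICENSE_NAMES tuple are hypothetical names chosen for this example, not part of any tool used in this lesson, and the check says nothing about which license the file contains.

```python
from pathlib import Path
from typing import Optional

# Assumed list of conventional file names, taken from the text above.
# Some projects use other names (for example COPYING); extend as needed.
LICENSE_NAMES = ("LICENSE", "LICENSE.txt")


def find_license_file(repo_root: str) -> Optional[Path]:
    """Return the first license file found directly in repo_root, or None."""
    root = Path(repo_root)
    for name in LICENSE_NAMES:
        candidate = root / name
        # Only accept regular files in the base directory, not subdirectories.
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_license_file(".")
    if found is None:
        print("No LICENSE or LICENSE.txt found in the base directory.")
    else:
        print(f"License file found: {found}")
```

Running the script from the base directory of a repository reports whether one of the two file names is present.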

A license solves this problem by granting rights to others (the licensees) that they would otherwise not have. What rights are being granted under which conditions differs, often only slightly, from one license to another.

Before you assign a license to the code or manuscript in your Git repository, verify that you hold the copyright or otherwise have the right to assign a license. This step is often overlooked, but it is essential to protect yourself from potential legal repercussions in the future. For instance, the Bayh-Dole Act (see here) allows universities to retain ownership of certain research products created with federal funding, so these are generally owned by the university rather than by individual researchers. Academic publishers also sometimes request that the copyright for manuscripts be transferred to them (see here), though this is changing as open access journals gain traction.

Columbia’s policy on who owns the copyright for a work produced at the university can be found here, and you can obtain general information from Copyright Advisory Services. The Stan project, a major open source initiative in the Department of Statistics, has written a detailed account of the steps it took to ensure legal compliance here.

Once you’ve verified that you can assign a license, you need to choose one. A few licenses are by far the most popular, and choosealicense.com can help you find a common license that suits your needs. An important consideration is how widely the license is used.

Choosing a license that is in common use makes life easier for contributors and users, because they are more likely to already be familiar with the license and don’t have to wade through unfamiliar legal jargon to decide whether they are comfortable with it. The Open Source Initiative and the Free Software Foundation both maintain lists of licenses that are good choices.

This article provides an excellent overview of licensing and licensing options from the perspective of scientists who also write code.

Ultimately, what matters is that there is a clear statement of what the license is. The license is also best chosen from the start, even for a repository that is not public. Postponing the decision only makes it more complicated later: each new collaborator who contributes also holds copyright, and will therefore need to be asked for approval once a license is chosen.

What licenses have I already accepted?

Many of the software tools we use on a daily basis (including in this workshop) are released as open-source software. Pick a project on GitHub from the list below, or one of your own choosing. Find its license (usually in a file called LICENSE or COPYING) and talk about how it restricts your use of the software. Is it one of the licenses discussed in this session? How is it different?

  • Git, the source-code management tool
  • CPython, the standard implementation of the Python language
  • Jupyter, the project behind the web-based Python notebooks we’ll be using

Key Points

  • Ensure that your manuscript or source code is assigned a license from the outset to avoid issues down the line.

  • Ensure that you have the legal right to assign a license before assigning one.

  • Multiple websites list very good options for open source licenses. Unless you are actively consulting with a lawyer, it is best to use one of these licenses or another established license.