This document is part of the brainstorming for the cihai project. It is kept for historical purposes only.

Written in late 2013.

Datasets should be available under the most permissive license possible, so long as users give attribution to the person or institution that created them.

Cihai believes in permissive licensing: do as you like, whether for fun, academic work, or profit, but give attribution.

Common concerns people have about their datasets are:

  • What if someone uses my dataset for profit purposes?
  • What if someone doesn’t give attribution to me, my colleagues, my institution, or my effort?
  • What if someone doesn’t contribute modifications back to me, my colleagues, my institution, or my effort?

If you have not participated in an open source software effort, you would be surprised how happy people are to contribute to a common effort. Efforts like GNU/Linux are worldwide collaborations that have produced a rock-solid OS powering supercomputers and the internet.

Permissive-licensing your dataset

Even if you have taken the fruits of open source software efforts into consideration, you should know that some of them are built upon restrictive, viral licenses that are commercially unfriendly and complicated. Borrowing their code means that any non-GPL-compatible effort, whether private or freely available under a permissive license, can’t use it without turning the software into GPL too! But GPL-licensed software can use permissive software in its efforts! It opened my mind after years of thinking GPL was all principle and virtue! Defend the weak! Protect the innocent!


To add to the confusion, some people refer to this as “Free” software. In reality, it provides open source but inhibits the real-world needs of a value-adding, passionate and expressive society, where people want it to be their own choice whether they distribute their changes.

If you ever built something special and felt your hard work deserved to be placed on the market to see how people receive its value, you would understand. But others may find this selfish!

If you have not seen a permissively licensed software project, you would be surprised to see that people contribute to these projects even though there is no fine print requiring them to.

Look into your roots. If at your core you want to open a useful work to the world, why hinder it with a minefield of caveats?

The success of open source efforts isn’t a product of rules. It is a result of fast computers, fast internet, convenient developer tools, brain power, and a community of self-interested and passionate individuals who put in the hours, had the discipline to learn programming, and had the courage and desire to help.

Case study: IPython

This is an observation and doesn’t imply endorsement by IPython or its contributors.

There is no requirement to provide an open source derivative (or an upstream patch) for a modification, so does IPython development go stale? No: 93 pages of patches have been committed to the project.

There is no requirement restricting large corporations from using it and giving nothing back, yet Microsoft donated $100K to IPython.

Perhaps academic institutions will snub them for using a permissive license? The core developers are academics.

Perhaps non-profits will snub them for using a permissive license? IPython received a $1.15M Sloan Foundation grant.


A collaborative open source effort is passionate and self-interested parties coming together to be constructive.

Laozi (老子) pioneered the concept of wu wei (無爲): doing without doing. In a totally uncontrived way, without any requirement, people from around the world crossed political, language and geographical barriers to bring together a common creation. It furthers one to see that.

Whether you are a copyleft advocate, an academic, a private party, or none of the above, providing your data under a permissive license will open your work to the world.