Let’s imagine an SNS called “ChangeIt” that works like this:
- I take a photo of my cat that looks like it is laughing.
- I upload the photo to ChangeIt and tag it with a Creative Commons license.
- At the same time, the image is distributed to my Facebook, Twitter, and Flickr accounts.
- A Flickr user, John, sees my image and thinks he can make something interesting with it.
- John recognizes that it is under a CC license from the statement below the image, so he downloads the image through the link and adds a smiling dog beside the cat.
- John uploads the result to ChangeIt with a CC license, stating that it is a modification of my image.
- When John uploads it, the “Change” count of my cat posting increases.
- Two weeks later, I find that my cat image has a Change count of 20. When I click the number, I can see who changed it, shown as a tree. Only John changed my image directly, but someone else turned John’s image into an outline illustration, and another person made a Flash animation from it. I also find that a student forked the outline illustration and used it in his slides for a presentation about pets.
- All derived content has a link to the history of that content as a chain (tree). The original image may disappear at some point, but as long as somebody is using content derived from my image, my content’s “Change” count keeps increasing.
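The scenario above can be sketched as a small tree of works. This is only an illustration under my own assumptions: the `Work` class, its method names, and the rule that the “Change” count equals the number of direct and indirect derivatives are not part of any existing service.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity comparison, which avoids walking the
# parent/children links when two works are compared.
@dataclass(eq=False)
class Work:
    """One uploaded piece of content; `parent` is the work it derives from."""
    title: str
    author: str
    parent: Optional["Work"] = None
    children: List["Work"] = field(default_factory=list)

    def derive(self, title: str, author: str) -> "Work":
        """Register a new work derived from this one and return it."""
        child = Work(title, author, parent=self)
        self.children.append(child)
        return child

    def change_count(self) -> int:
        """Count every direct and indirect derivative (assumed rule)."""
        return sum(1 + child.change_count() for child in self.children)

    def history(self) -> List[str]:
        """Titles from the original work down to this one."""
        chain, node = [], self
        while node is not None:
            chain.append(node.title)
            node = node.parent
        return chain[::-1]


cat = Work("laughing cat", "me")
cat_dog = cat.derive("cat with smiling dog", "John")
outline = cat_dog.derive("outline illustration", "user A")
flash = cat_dog.derive("flash animation", "user B")
slides = outline.derive("pet presentation slides", "student")

print(cat.change_count())   # derivatives of my image, at any depth
print(slides.history())     # chain from the original to the slides
```

Because each derivative keeps a link to its parent, the history stays walkable from any leaf, and the count on the original grows even when only a derivative is forked.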
What does this mean?
Creative Commons says “sharing is how society grows, how culture develops, and how innovation happens”, and I completely agree. Before concepts such as copyright, licenses, and patents appeared, every human work was free to use. All inventions and innovations built on the work of predecessors. An academic paper, for example, has dozens of “works cited” or reference entries. Even a paper about a simple technology may cite dozens of previous studies. If we trace the chain of influence all the way back, it may reach Archimedes or Plato. This happens all the time in academic work, so why not for images and audio?
Let’s say that I upload a smiling cat’s image to my Facebook. My Facebook friends will see it, leave some comments, and press the “like” button. Some of them will share it with their own SNS friends. I may feel good about it for a while. But the influence of the content disappears within a week. Nobody, including me, thinks about it anymore. The content dies. This happens every moment in the SNS world.
When I write and upload something, it is automatically “all rights reserved” or carries some kind of license, so nobody can reuse it without permission. Creative Commons, in contrast, guarantees that anybody can use the content to make other content within the license terms. It promotes reuse, which means the content’s lifetime can be “forever”. The creator can also feel that the work is valuable, which motivates the creator to put more effort into the next piece. That is the virtuous circle of content creation. Who knows, I might see my cat’s image in the movie Spider-Man 5.
My assumptions are:
- Everybody wants to be famous.
- Nothing is new; however, anybody can create content.
Approach and technical requirements
To make this kind of service possible, several system components are needed. In the previous post (Technical Overview of Creative Commons), I surveyed the tools and projects that have been developed so far around Creative Commons. Some of these projects and technologies can be reused, and some need to be built.
Tagging CCL into the file format
Embedding a CCL tag into the content file is the typical way. The solution is the XMP format, which carries CC REL metadata. There are libraries such as libLicense and XMP Toolkit, and applications such as CC Publisher and License Tagger. However, this approach has a critical drawback: metadata inside the file is fragile. When we convert the image format or take a screen capture, the tag easily disappears. Also, when an image is uploaded to an SNS, the service resizes and re-encodes it. Therefore, embedding a tag is not the ultimate solution.
What if we store a reference image that contains all the metadata, including the XMP CCL information? A hash of the file, metadata included, could then be stored together with a timestamp for authentication.
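As a rough sketch of that idea, the snippet below hashes the bytes of a reference file and stores the digest with a UTC timestamp. The record layout, the function names, and the choice of SHA-256 are my assumptions; the bytes stand in for a real image file with its XMP block.

```python
import hashlib
from datetime import datetime, timezone


def make_fingerprint(file_bytes: bytes, license_url: str) -> dict:
    """Build a registration record (hypothetical layout) for a reference file."""
    return {
        # Hash covers the whole file, so embedded metadata is included.
        "sha256": hashlib.sha256(file_bytes).hexdigest(),
        "license": license_url,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }


def is_reference_file(file_bytes: bytes, record: dict) -> bool:
    """Check whether the given bytes are exactly the registered file."""
    return hashlib.sha256(file_bytes).hexdigest() == record["sha256"]


# Placeholder bytes; a real service would read the uploaded file instead.
reference = b"fake image bytes with XMP license metadata"
record = make_fingerprint(reference, "https://creativecommons.org/licenses/by/4.0/")

print(is_reference_file(reference, record))            # same bytes
print(is_reference_file(reference + b"\x00", record))  # any change breaks the match
```

Note that a cryptographic hash only proves that a file is the untouched reference copy. It cannot recognize a resized or re-encoded version, which is why it complements rather than replaces content-based matching.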
Stating license information in the HTML
Another approach is to expose the license information on the HTML page itself. RDFa is used for this. Projects such as “Open Attribute”, “Metadata Scraper” and “WpLicense” are attempts at this. Content creators put the HTML code generated by “Open Attribute” on their pages, and this code can be crawled by “Metadata Scraper”. This approach has problems:
- HTML tags are easily lost, because the writer cannot spot broken tags by looking at the rendered page.
- Not everyone can edit the HTML code of a page.
- Not every service offers HTML editing.
A possible solution is for service providers such as Facebook, Instagram, and Flickr to do this automatically. Providing an API and having the service providers use it could work. The “Partner Interface” of Creative Commons provides an API for generating the HTML tag.
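To show what a service provider might emit automatically, here is a minimal sketch that builds an RDFa license statement in the style of the markup Creative Commons tools produce (`rel="license"`, `cc:attributionName`, `dct:title`). The `license_html` function is hypothetical and is not the Partner Interface API.

```python
from html import escape


def license_html(title: str, author: str, author_url: str,
                 license_url: str, license_name: str) -> str:
    """Return an RDFa snippet that states the license of one work.

    Hypothetical helper: a service could call it when rendering a post,
    so users never have to edit HTML by hand.
    """
    return (
        '<div xmlns:cc="http://creativecommons.org/ns#" '
        'xmlns:dct="http://purl.org/dc/terms/">'
        f'<span property="dct:title">{escape(title)}</span> by '
        '<a rel="cc:attributionURL" property="cc:attributionName" '
        f'href="{escape(author_url)}">{escape(author)}</a> '
        'is licensed under <a rel="license" '
        f'href="{escape(license_url)}">{escape(license_name)}</a>.'
        '</div>'
    )


snippet = license_html(
    "Laughing cat <3",
    "me",
    "https://example.com/me",  # placeholder profile URL
    "https://creativecommons.org/licenses/by/4.0/",
    "CC BY 4.0",
)
print(snippet)
```

Escaping every user-supplied value keeps the generated tag well-formed, which addresses the broken-tag problem listed above.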
License Registry and Content Registry
Creative Commons says that it is not building a central registry.
Metadata that contains license information can be stored separately in a license server. The metadata for a piece of content has a unique content ID that points to the content. A DOI (Digital Object Identifier) can be used as the content identifier. One content ID can have multiple service URLs where the content is in use.
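A minimal in-memory sketch of such a registry could look like the following. The class names, fields, and the DOI-like identifier are made up for illustration; a real license server would need persistence and authentication.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ContentRecord:
    """License metadata for one content ID (hypothetical layout)."""
    content_id: str
    license_url: str
    service_urls: Set[str] = field(default_factory=set)


class LicenseRegistry:
    """Toy stand-in for a license server, keyed by content ID."""

    def __init__(self) -> None:
        self._records: Dict[str, ContentRecord] = {}

    def register(self, content_id: str, license_url: str) -> ContentRecord:
        if content_id in self._records:
            raise ValueError(f"{content_id} is already registered")
        record = ContentRecord(content_id, license_url)
        self._records[content_id] = record
        return record

    def add_location(self, content_id: str, url: str) -> None:
        """Attach one more service URL where the content is used."""
        self._records[content_id].service_urls.add(url)

    def lookup(self, content_id: str) -> ContentRecord:
        return self._records[content_id]


registry = LicenseRegistry()
cat_id = "10.1234/changeit.cat"  # made-up DOI-style identifier
registry.register(cat_id, "https://creativecommons.org/licenses/by/4.0/")
registry.add_location(cat_id, "https://example.com/flickr/cat")
registry.add_location(cat_id, "https://example.com/facebook/cat")

print(sorted(registry.lookup(cat_id).service_urls))
```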
Content based tracking
Let’s say I found an image that fits my new work, but I am not sure whether it has a license or who made it. In this case, the license information could be looked up from the image itself. The idea is that once keypoint information has been extracted from an image, similarity can still be detected even after the image has been resized or modified. Google’s Content ID (used on YouTube) offers this kind of functionality for video. For image files, the SIFT (Scale-Invariant Feature Transform) algorithm can be used. OpenSIFT, an open source SIFT implementation, would be a starting point. I am not sure how reliable it would be.
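The matching step can be sketched without any image library. The code below assumes the keypoint descriptors have already been extracted by a SIFT implementation, and applies the nearest-neighbour ratio test that is commonly used with SIFT descriptors. The 4-dimensional toy vectors (real SIFT descriptors have 128 dimensions), the 0.75 threshold, and the `similarity` score are my own assumptions.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def match_descriptors(query: List[Descriptor], reference: List[Descriptor],
                      ratio: float = 0.75) -> List[Tuple[int, int]]:
    """Return (query_index, reference_index) pairs that pass the ratio test."""
    matches: List[Tuple[int, int]] = []
    if len(reference) < 2:
        return matches  # the test needs a best and a second-best neighbour
    for qi, q in enumerate(query):
        dists = sorted((math.dist(q, r), ri) for ri, r in enumerate(reference))
        (best, best_index), (second, _) = dists[0], dists[1]
        # Keep the match only if it is clearly better than the runner-up.
        if best < ratio * second:
            matches.append((qi, best_index))
    return matches


def similarity(query: List[Descriptor], reference: List[Descriptor],
               ratio: float = 0.75) -> float:
    """Fraction of query descriptors with a confident match (assumed score)."""
    if not query:
        return 0.0
    return len(match_descriptors(query, reference, ratio)) / len(query)


# Toy descriptors: `resized` is a slightly perturbed copy of `original`.
original = [(0, 0, 1, 1), (5, 5, 0, 0), (9, 1, 9, 1)]
resized = [(0.1, 0, 1, 0.9), (5, 5.2, 0, 0.1), (9, 1, 8.8, 1)]
unrelated = [(3, 3, 3, 3), (3.2, 3, 3, 3.1), (2.9, 3, 3.1, 3)]

print(similarity(resized, original))
print(similarity(unrelated, original))
```

A registry could store the descriptors of each reference image and report a likely match when the score exceeds some threshold, though choosing that threshold is exactly where the reliability question comes in.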