On Mon, 2011-06-06 at 20:19 -0400, James Westby wrote:
On Mon, 06 Jun 2011 18:05:54 -0500, Kate Stewart kate.stewart@canonical.com wrote:
After every FileName: there should be a FileChecksum: field.
For each file listed in the package, the fields that are mandatory and should show up are:
- FileName:
- FileChecksum:
- LicenseConcluded:
- LicensesInfoInFile:
- CopyrightText:
The rest are optional, and can be included or not.
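To make the mandatory list concrete, here is a minimal sketch of a check a tool could run over the file entries of a tag-value document. It is only an illustration: the function names and the sample entries are made up, the field names are spelled exactly as in the list above, and it assumes every value fits on a single line (multi-line values are not handled).

```python
# Field names copied from the mandatory list above.
MANDATORY_FILE_FIELDS = (
    "FileName",
    "FileChecksum",
    "LicenseConcluded",
    "LicensesInfoInFile",
    "CopyrightText",
)


def split_file_entries(text):
    """Group 'Tag: value' lines into one dict per FileName: entry.

    Hypothetical helper: lines before the first FileName: (for example
    package-level fields) are ignored, and single-line values are assumed.
    """
    entries = []
    current = None
    for line in text.splitlines():
        tag, sep, value = line.partition(":")
        if not sep:
            continue  # not a tag line
        tag = tag.strip()
        if tag == "FileName":
            current = {}
            entries.append(current)
        if current is not None:
            current[tag] = value.strip()
    return entries


def missing_fields(entry):
    """Return the mandatory fields absent from one file entry, in list order."""
    return [field for field in MANDATORY_FILE_FIELDS if field not in entry]


# Made-up sample: the second entry is deliberately incomplete.
sample = """\
FileName: src/main.c
FileChecksum: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseConcluded: GPL-2.0
LicensesInfoInFile: GPL-2.0
CopyrightText: Copyright 2011 Example Author
FileName: src/util.c
FileChecksum: da39a3ee5e6b4b0d3255bfef95601890afd80709
LicenseConcluded: GPL-2.0
"""

for entry in split_file_entries(sample):
    print(entry["FileName"], missing_fields(entry))
```

Optional fields such as FileType: pass through untouched, since only the five mandatory tags are checked.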
You say "for each file listed in the package"; is it mandatory to list all files? It seems to me that it would be much better to allow for gradual implementation of what will be a large job for some projects, so I hope it is optional.
Afraid not. Each file that goes in a "package/tarball/whatever you choose to group together" gets its own file name listed. Package could be interpreted as a tarball equally well as a .deb.
So, yeah, it's a large job for some projects. The problem is that right now a lot of folks are already doing the work in multiple formats, as a necessary condition of license compliance for shipping code. So some of the original files may not be produced by the project itself, but rather by some 3rd party, and can then be 'reviewed' and signed off by others.
Yes, tools will help. I'd like to get some open source tools available to do the generation, and am working with Ninka and FOSSology on some prototypes. (help is always welcome ;) ). There are some commercial tool providers who have built up plans to produce the RDF variant of files already with their tools - BlackDuck, Open Logic, Protecode are all active participants. Tag value and the RDF have been designed to be interchangeable for this reason.
I guess it's intended for the format to always be output from some tool rather than hand-edited, unless it's possible to put glob patterns or similar in the FileName field?
For small projects or even single binary files that get packaged, it should be possible to hand generate. For larger packages with multiple source files, yeah, tools are going to be the key here. See comments above. ;)
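Since every file needs its own entry rather than a glob pattern, the mechanical part is easy to script. The sketch below is not one of the Ninka/FOSSology prototypes, just an illustration of the idea: it walks a directory and emits a skeleton entry per file. The use of SHA-1 for FileChecksum:, the UNREVIEWED placeholder, and the function names are all assumptions made for the example; the license and copyright fields still have to be filled in by a scanner or a reviewer.

```python
import hashlib
import os

# Hypothetical marker for fields that a scanner or reviewer must fill in.
PLACEHOLDER = "UNREVIEWED"


def sha1_of(path):
    """Hex SHA-1 of a file, read in chunks (the algorithm choice is an assumption)."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_stanzas(root):
    """Return skeleton tag-value text with one entry per file under root."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            relative = os.path.relpath(full, root).replace(os.sep, "/")
            lines += [
                "FileName: " + relative,
                "FileChecksum: " + sha1_of(full),
                "LicenseConcluded: " + PLACEHOLDER,
                "LicensesInfoInFile: " + PLACEHOLDER,
                "CopyrightText: " + PLACEHOLDER,
            ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints skeleton entries for the current directory.
    print(file_stanzas("."), end="")
```

For a single packaged binary this produces one entry, which is about what you would write by hand; for a 30K+ file source tree it is the same loop, just longer output.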
LicenseConcluded: GPL-2.0
From the spec:
The licensing that the preparer of this SPDX document has concluded, based on the evidence, actual applies to the package.
I think this is where the lawyer would say, this is the license.
Yeah. Again my question is source or binary?
This is captured as an optional field (FileType:) at the file level.
Well, this applies to the whole package by my understanding, so the particular file types may not matter.
Actually, if you're shipping a binary file as its own package, like the linux kernel, then it's just a package with one file, and the FileType: applies. If you're shipping a source tarball with 30K+ files in it, then the type doesn't make much sense at the package level, since there could be multiple types involved. There are firmware binaries in the linux kernel sources, for instance. ;)
Take for instance a package which is dual licensed GPL-2 and something else at the source level. If you choose to build this package and link it against GPL-2 only code then some people's interpretation of the GPL would say that the binary is GPL-2 only, despite the source being dual licensed.
This is the sort of argument that led us to the notion of ConcludedLicense vs. LicenseInfoInFile (at the file and package levels).
Perhaps that's a minority opinion, but I expect there are other cases where the source and binary licenses may not match, so I am wondering whether the expectation is that this defines the source or the binary concluded license.
Yup. It's not just source and binary that cause conflicts, though, BTW.
Perhaps what you are saying is that there will be different SPDX files for the source and binary, and the difference would be captured there?
We're still figuring out all the use cases, but I would expect the source to be packaged separately from the binary (thinking about how the archive works, this seems to be the case in practice), and there would be different SPDX files as a result.
My overall impression is that this is rather a large additional overhead to just be able to say
the kernel was built from 0A2E345 of git://git.linaro.org/jcrigby/linux-linaro-natty.git
which is the main thrust of kiko's request as I understand it.
If you just want to state the build origins, then yes, it's overkill for that. If you want to be able to pass on a package that has a set of patches off a known public kernel (with its own SPDX file ;) ) then it permits the licensing and copyrights to be clearly articulated.
Yeah. Is the kernel going to be released with an SPDX file that we can base our work on?
That's been the request from others as well. Am working with Ninka/FOSSology to see if we can make it happen in an Open Source way, and then it will be a packaging step. Linux is fairly clean license-wise compared to most large packages. If you're curious: http://repo.fossology.org/?mod=nomoslicense&show=detail&upload=149&a... (this takes about 7 minutes for FOSSology to generate, for instance), so it is feasible to add SPDX file generation to it, since the hard part is recognizing those licenses. The socialization of Linux generating this as a deliverable with the releases is being pursued.
Proprietary tool vendors will be providing RDF variants for specific instances of linux from time to time as well.
My point was more that I don't see it being worth Linaro engineering time to push for adoption of either format inside Debian packages at this time.
This is a format that can go outside the packages (as well as in), so can be used without marshalling the entire Debian community to adopt it. So, am not advocating it be pushed by Linaro engineering for adoption inside Debian packages. Just be considered for use when describing the licensing and copyrights of what you're providing, rather than come up with yet another version... ;)
The problem SPDX is trying to solve is the software bill of materials one, so that the supply chain can know what they have to comply with from a licensing and copyright perspective, and each sender/recipient doesn't have to scan through the same files over and over again as they build on the work of others.
Thanks, Kate