On Mon, 2011-06-06 at 16:37 -0400, James Westby wrote:
On Mon, 6 Jun 2011 14:37:18 -0500, Zach Pfeffer zach.pfeffer@linaro.org wrote:
The spec says:
4.6 Source Information

4.6.1 Purpose: This is a free form text field that contains additional comments about the origin of the package. For instance, this field might include comments indicating whether the package been pulled from a source code management system or has been repackaged.
4.6.2 Intent: Here, by providing a freeform field, reviewers can provide any additional information to describe any anomalies, or discoveries, in the determination of the origin of the package.
4.6.3 Cardinality: Optional, one
4.6.4 Data Format: single line of free form text
4.6.5 Tag: SourceInfo

Example:
SourceInfo: uses glibc-2_11-branch from git://sourceware.org/git/glibc.git.
So it looks like we'd have to define our own microformat here (though it's going to be consumed by humans at least to start with, so consistency doesn't really matter at this stage)
If there are some pretty common trends, we can look at adding fields to support them. It's in beta right now, so this sort of feedback can be gathered after all ;)
What's listed here seems fairly tricky to produce automatically.
What part do you think would be tricky?
It depends when we are generating this file, but the format of what you specify seems a little clever for humans.
If the content is freeform then we can obviously choose something that is easy to generate.
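As a minimal sketch of "something that is easy to generate": the helper below builds a one-line SourceInfo entry in the same shape as the spec's example. The function name and the "uses <ref> from <url>" wording are assumptions for illustration, not something the spec mandates.

```python
def source_info_line(ref, repo_url):
    """Build a single-line SourceInfo entry (hypothetical helper)."""
    text = f"uses {ref} from {repo_url}"
    # The spec excerpt asks for a single line of free-form text.
    if "\n" in text or "\r" in text:
        raise ValueError("SourceInfo must fit on one line")
    return f"SourceInfo: {text}"


print(source_info_line("glibc-2_11-branch", "git://sourceware.org/git/glibc.git"))
```

Since humans are the first consumers, a fixed phrasing like this mostly buys consistency if a tool later wants to parse the field.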
FileName: file1
FileName: file2
FileName: file3
FileChecksum: SHA1: calculated
This is all the files in the source?
Yeah.
I guess the cost of that in a kernel build is pretty small.
You only list one FileChecksum here. Can that line follow every FileName line?
After every FileName: there should be a FileChecksum: field.
For each file listed in the package, the fields that are mandatory and should show up are:
- FileName:
- FileChecksum:
- LicenseConcluded:
- LicensesInfoInFile:
- CopyrightText:
The rest are optional, and can be included or not.
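To give a feel for the cost of listing every file, here is a rough sketch that walks a source tree and emits a FileName: line followed by a FileChecksum: line for each file. The function names are made up for illustration, the "SHA1: <hex>" layout is taken from the example quoted earlier in this thread, and the licensing and copyright fields are left out because they cannot be computed mechanically like this.

```python
import hashlib
import os


def sha1_of_file(path):
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_entries(root):
    """Return FileName/FileChecksum line pairs for every regular file under root."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # keep the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and other non-regular entries
            lines.append(f"FileName: {os.path.relpath(full, root)}")
            lines.append(f"FileChecksum: SHA1: {sha1_of_file(full)}")
    return lines
```

Calling `print("\n".join(file_entries("path/to/source")))` would print the pairs for a tree; the path here is a placeholder.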
LicenseConcluded: GPL-2.0
From the spec:
The licensing that the preparer of this SPDX document has concluded, based on the evidence, actually applies to the package.
I think this is where the lawyer would say, this is the license.
Yeah. Again my question is source or binary?
This is captured as an optional field (FileType:) at the file level.
I presume this can be an AND/OR list again?
LicenseInfoFromFiles: GPL-2.0
This is a field that has all the licenses found in the package.
Just a dumping ground of every license found?
If a package has multiple files, it's a way of having a summary of all the licenses encountered in those files. Ideally the LicenseConcluded: at the package level will match the LicenseInfoFromFiles:. By making this visible, it's hoped that over time the package owners will look into clearing up the ambiguities.
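A small sketch of that summary idea: collect the licenses seen in each file, take the union, and check whether it matches what was concluded at the package level. The function names, the dictionary layout and the sample license identifiers are assumptions for illustration only.

```python
def license_summary(per_file_licenses):
    """Return the sorted set of every license seen across all files."""
    found = set()
    for licenses in per_file_licenses.values():
        found.update(licenses)
    return sorted(found)


def matches_concluded(concluded, per_file_licenses):
    """True when the package-level conclusion equals the per-file summary."""
    return set(concluded) == set(license_summary(per_file_licenses))


# Hypothetical per-file findings, keyed by file name.
files = {
    "file1": ["GPL-2.0"],
    "file2": ["GPL-2.0", "BSD-3-Clause"],
}
print(license_summary(files))
print(matches_concluded(["GPL-2.0"], files))
```

A mismatch like the one in this sample is exactly the kind of ambiguity that making the summary visible is meant to surface.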
My overall impression is that this is rather a large additional overhead to just be able to say
the kernel was built from 0A2E345 of git://git.linaro.org/jcrigby/linux-linaro-natty.git
which is the main thrust of kiko's request as I understand it.
If you just want to state the build origins, then yes, it's overkill for that. If you want to be able to pass on a package that has a set of patches off a known public kernel (with its own SPDX file ;) ) then it permits the licensing and copyrights to be clearly articulated.
Debian packages already contain licensing info (they also have a proposed standard to make that info machine readable.) Is it worth Linaro's time to try and move everything to one of these new formats at this stage?
The Debian package standard has no way to track the accuracy of the file-level information, so it could change underneath without there being a way of detecting it, which causes problems. If you look at http://dep.debian.net/deps/dep5/ (machine readable info), you'll see this aspect missing (the wildcards preclude it). The DEP5 proposal was analyzed extensively last year, but commercial companies didn't feel they could rely on the accuracy of the information provided through it. There is, however, a lot of common ground in the fields provided, file paragraphs, separate licensing section, etc. We've had a team of volunteer corporate lawyers (Motorola, HP, etc.) working on SPDX for the last year, reviewing and modifying the proposed fields extensively.
Kate