Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405222341380.18601@ppc970.osdl.org>
Date: Sun, 23 May 2004 06:48:09 GMT
Message-ID: <fa.iro1nns.16kcajm@ifi.uio.no>

Hola!

This is a request for discussion..

Some of you may have heard of this crazy company called SCO (aka "Smoking
Crack Organization") who seem to have a hard time believing that open
source works better than their five engineers do. They've apparently made
a couple of outlandish claims about where our source code comes from,
including claiming to own code that was clearly written by me over a
decade ago.

People have been pretty good (understatement of the year) at debunking
those claims, but the fact is that part of that debunking involved
searching kernel mailing list archives from 1992 etc. Not much fun.

For example, in the case of "ctype.h", what made it so clear that it was
original work was the horrible bugs it contained originally, and since we
obviously don't do bugs any more (right?), we should probably plan on
having other ways to document the origin of the code.

So, to avoid these kinds of issues ten years from now, I'm suggesting that
we put in more of a process to explicitly document not only where a patch
comes from (which we do actually already document pretty well in the
changelogs), but the path it came through.

Why the full path, and not just originator?

These days, most of the patches in the kernel don't actually get sent
directly to me. Not only would that not scale, but the fact is, there's a
lot of subsystems I have no clue about, and thus no way of judging how
good the patch is. So I end up seeing mostly the maintainers of the
subsystem, and when a bug happens, what I want to see is the maintainer
name, not a random developer who I don't even know if he is active any
more. So at least for me, the _chain_ is actually mostly more important
than the actual originator.

There is also another issue, namely the fact that when I (or anybody else,
for that matter) get an emailed patch, the only thing I can see directly
is the sender information, and that's the part I trust. When Andrew sends
me a patch, I trust it because it comes from him - even if the original
author may be somebody I don't know. So the _path_ the patch came in
through actually documents that chain of trust - we all tend to know the
"next hop", but we do _not_ necessarily have direct knowledge of the full
chain.

So what I'm suggesting is that we start "signing off" on patches, to show
the path it has come through, and to document that chain of trust.  It
also allows middle parties to edit the patch without somehow "losing"
their names - quite often the patch that reaches the final kernel is not
exactly the same as the original one, as it has gone through a few layers
of people.

The plan is to make this very light-weight, and to fit in with how we
already pass patches around - just add the sign-off to the end of the
explanation part of the patch. That sign-off would be just a single line
at the end (possibly after _other_ people's sign-offs), saying:

	Signed-off-by: Random J Developer <random@developer.org>
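
Mechanically, adding the line could look something like the sketch below. This is only an illustration: the function name `add_signoff`, the sample description and the second address are made up, and nothing here is an existing tool.

```python
def add_signoff(description: str, name: str, email: str) -> str:
    """Append a Signed-off-by line to the end of a patch description.

    Earlier sign-offs are left in place, so the lines accumulate in the
    order the patch passed through people.
    """
    line = f"Signed-off-by: {name} <{email}>"
    body = description.rstrip("\n")
    if not body:
        return line + "\n"
    last = body.splitlines()[-1]
    # Keep existing sign-offs as one contiguous block; otherwise put a
    # blank line between the explanation and the first sign-off.
    sep = "\n" if last.startswith("Signed-off-by:") else "\n\n"
    return body + sep + line + "\n"


# Hypothetical description, signed first by the author, then by a maintainer.
desc = "Fix off-by-one in foo buffer handling.\n"
desc = add_signoff(desc, "Random J Developer", "random@developer.org")
desc = add_signoff(desc, "Some Maintainer", "maintainer@example.org")
print(desc)
```

The second call adds its line after the first one, so the order of the lines is the order of the hops.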

To keep the rules as simple as possible, and yet making it clear what it
means to sign off on the patch, I've been discussing a "Developer's
Certificate of Origin" with a random collection of other kernel
developers (mainly subsystem maintainers).  This would basically be what
a developer (or a maintainer that passes through a patch) signs up for
when he signs off, so that the downstream (upstream?) developers know
that it's all ok:

	Developer's Certificate of Origin 1.0

	By making a contribution to this project, I certify that:

	(a) The contribution was created in whole or in part by me and I
	    have the right to submit it under the open source license
	    indicated in the file; or

	(b) The contribution is based upon previous work that, to the best
	    of my knowledge, is covered under an appropriate open source
	    license and I have the right under that license to submit that
	    work with modifications, whether created in whole or in part
	    by me, under the same open source license (unless I am
	    permitted to submit under a different license), as indicated
	    in the file; or

	(c) The contribution was provided directly to me by some other
	    person who certified (a), (b) or (c) and I have not modified
	    it.

This basically allows people to sign off on other people's patches, as long
as they see that the previous entry in the chain has been signed off on.
And at the same time it makes the "personal trust" explicit to people who
don't necessarily understand how these things work.

The above also allows for companies that have "release criteria" to have
the company "release person" sign off on a patch, so that a company can
easily incorporate their own internal release procedures and see that all
the patches have gone through the right channel. At the same time it is
meant to _not_ cause anybody to have to change how they work (ie there is
no "extra paperwork" at any point).

Comments, improvements, ideas? And yes, I know about digital signatures
etc, and that is _not_ what this is about. This is not about proving
authorship - it's about documenting the process. This does not replace or
preclude things like PGP-signed emails, this is _documenting_ how we work,
so that we can show people who don't understand the open source process.

			Linus



Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405230855250.25502@ppc970.osdl.org>
Date: Sun, 23 May 2004 16:02:50 GMT
Message-ID: <fa.ir7vno1.174qa3j@ifi.uio.no>

On Sun, 23 May 2004, Ian Stirling wrote:
>
> Has anyone ever tried to forge the name on a patch, and get it included?

Not to my knowledge. It's a bit harder than just technically forging the
email, you also have to forge a certain "context", since most developers
know the "next hop" person anyway, and thus kind of know what to expect.
You may not see the other person, but that doesn't mean that you can't
recognize his/her way of doing things.

And if you do _not_ know the person that the forged message comes in as,
then you have to check the patch anyway, so ...

That said, forged emails are not what this process would be about. Quite
frankly, I hope we'll some day have "trusted email", but that's kind of an
independent issue, in that I hope it moves in that direction _regardless_
of any patch documentation issues..

		Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405230840520.25502@ppc970.osdl.org>
Date: Sun, 23 May 2004 15:55:12 GMT
Message-ID: <fa.isnpnfs.14kkabm@ifi.uio.no>

On Sun, 23 May 2004, Arjan van de Ven wrote:
>
> Can we make this somewhat less cumbersome even by say, allowing
> developers to file a gpg key and sign a certificate saying "all patches
> that I sign with that key are hereby under this regime". I know you hate
> it but the FSF copyright assignment stuff at least has such "do it once
> for forever" mechanism making the pain optionally only once.

One reason that I'd prefer not to is simply the question of "who maintains
the certificates?"

I certainly don't want to maintain any stateful paperwork with lots of
people. This is why I personally would prefer it all to be totally
state-less.

Also, there is a _fundamental_ problem with signing a patch in a global
setting: the patches _do_ get modified as they move through the system
(maybe just bug-fixes, maybe adding a missing piece, maybe removing a
controversial part). So the signature ends up being valid only on your
part of the communication, and then after that it needs something else.

And what I do _not_ want to see is a system where if somebody makes a
trivial change, it then has to go back to you to be re-signed. That just
would be horrible.
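
A rough way to see the problem, as a sketch only: a plain SHA-256 digest stands in here for a real PGP signature (it is not one), and the patch text and names are invented for the example.

```python
import hashlib


def digest(text: str) -> str:
    # Stand-in for a signature: any change to the text changes the digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical patch body as the original author sent it.
original = "-\tif (i <= len)\n+\tif (i < len)\n"
signed = digest(original)

# A maintainer adds a missing piece on the way through.
modified = original + "+\treturn 0;\n"
still_valid = digest(modified) == signed
print(still_valid)  # False: the author's digest no longer covers the patch

# The sign-off lines are not tied to the exact bytes, so the path is
# still documented after the edit.
signoffs = [
    "Signed-off-by: Random J Developer <random@developer.org>",
    "Signed-off-by: Some Maintainer <maintainer@example.org>",
]
print(len(signoffs))
```

Anything computed over the exact patch text breaks at the first edit, while a list of names just grows by one line per hop.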

With those (pretty basic) caveats in mind, I don't see any fundamental
problem in a PGP key approach, if it's a "local" thing between developers.
In fact, I think PGP-signed patches are something we may want to look at
from a "trust the email" standpoint, but I think it should be a _local_
trust. And part of that "local" trust might be a private agreement between
developers that "it's ok to add the sign-off line for Arjan when the
patch has come with that PGP signature" when the patch is passed on.

So to me, the sign-off procedure is really about documenting the path, and
if a PGP key is there in certain parts of the path, then that would be a
good thing, but I think it's a separate thing from what I'm looking for.

		Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405241342190.32189@ppc970.osdl.org>
Date: Mon, 24 May 2004 20:47:46 GMT
Message-ID: <fa.iv7nmg7.134qbbr@ifi.uio.no>

On Mon, 24 May 2004, Davide Libenzi wrote:
>
> IANAL, but I don't think they have to ask. As with GPL, you are not required
> to sign anything to be able to use the software. By using the software you
> agree on the license. By submitting a patch to a maintainer, you agree
> with the Developer's Certificate of Origin.

No, the thing is, we want your name to show up, and we do want you to
explicitly state that not only do you know about the license, you also
have the right to release your code under the license.

Yes, that was all implied before. This is nothing new. The only new thing
is to _document_ it, and make it _explicit_.

And that means that submitters should read the DCO, and add the extra
line. That's kind of the whole point of it - making a very ingrained and
implicit assumption be explicitly documented.

In other words: this is not about changing the way we work. It's about
documenting the things we take for granted. So that outsiders can be shown
how it works.

		Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405241326400.32189@ppc970.osdl.org>
Date: Mon, 24 May 2004 20:35:47 GMT
Message-ID: <fa.j0n5m0b.11k8arv@ifi.uio.no>

On Mon, 24 May 2004, Andi Kleen wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > Hola!
> >
> > This is a request for discussion..
>
> What's not completely clear to me is how the Signed-off-by
> header is related to this:
>
> > 	Developer's Certificate of Origin 1.0
> [...]
>
> I assume you're not expecting that people actually print out and sign
> this and send it somewhere?

No.

> You're just asking that they read it and confirm to the maintainer
> that they did, right?

Right. We'd add it to the Documentation directory, and add pointers to it
to anything that mentions the "Signed-off-by:" thing (eg things like
SubmittingPatches). All just to make sure that people are aware of what it
means to say "Signed-off-by:"

> That sounds quite involved to me. I bet in some companies this
> Certificate would first be sent to the legal department for approval,
> delaying the patch for a long time

Having worked at a company like that, I can say that that is true pretty
much regardless of what the patch submission is (it's about a million
times _worse_ if you have something like the FSF copyright assignment
thing, but it's certainly true even for random open source things that
don't have the physical paperwork and copyright assignment).

> e.g. normally the maintainer would just answer "ok, looks good,
> applied". Now they would need to ask "ok, did you write this. if not
> through which hands did it pass"? and wait for a reply and then only
> add the patch when you know whom to put into all these Signed-off-by
> lines.

No. The point is that a maintainer does NOT need to do this, exactly
because we'd try to educate people to have the "Signed-off-by:" line pass
with the patch from the very beginning.
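
As an illustration of how little the receiving side would have to do, here is a small sketch. The helper `has_signoff` and the two sample submissions are hypothetical, and the check is a plain string test, not a description of any existing script.

```python
def has_signoff(description: str) -> bool:
    """True if any line of the description is a Signed-off-by line."""
    return any(
        line.strip().startswith("Signed-off-by:")
        for line in description.splitlines()
    )


# Hypothetical incoming patches, keyed by a made-up short name.
incoming = {
    "fix-foo": "Fix foo.\n\nSigned-off-by: Random J Developer <random@developer.org>\n",
    "fix-bar": "Fix bar.\n",
}

# Only the patches that arrived without the line need a follow-up question.
needs_asking = [name for name, desc in incoming.items() if not has_signoff(desc)]
print(needs_asking)
```

If the line travels with the patch from the start, the list of patches to ask about stays empty and nobody has to wait for a reply.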

> This is not unrealistic. For example, for patches that are "official
> projects" by someone it often happens that not the actual submitter
> sends the patch, but his manager (often not even cc'ing the original
> developer). In some cases companies even go through huge efforts to
> keep the original developers secret (I won't give names here, but it
> happens).

Absolutely. And the whole sign-off procedure is _designed_ for this.

The person who signs off on a patch does not need to be the author: in
fact at a company that has "release people", it's not _supposed_ to be the
author, it's supposed to be the company release person (although the
original author may well have signed off on it internally - but that's not
something that an external maintainer would know about or even care
about).

			Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405241400280.32189@ppc970.osdl.org>
Date: Mon, 24 May 2004 21:10:11 GMT
Message-ID: <fa.ivnnlg5.12kmabp@ifi.uio.no>

On Mon, 24 May 2004, Thomas Gleixner wrote:
>
> What I'm missing in this discussion is a clear distinction between patches and
> contributions.

Well, I'm not sure such a clear distinction exists.

Clearly there are patches that are so trivial that we simply don't care
about the process, because they don't contain any "new work". Spelling
fixes, and trivial one-liners.

On the other hand, I'd rather have the process be "we always have the
sign-off", coupled with just plain common sense.

Any process that doesn't allow for common sense is just broken, and
clearly from a _legal_ standpoint it doesn't matter if we track who fixed
our (atrocious) spelling errors.

On the other hand, if it becomes a habit, and we just sign-off even on the
trivial stuff, that's actually going to make the whole process a lot
easier - simply by avoiding the bother of even having to think about it.

So I'd rather encourage people to sign off on even the silly stuff, than
to have to constantly make a judgement call. At the same time, I think
that if somebody _didn't_ sign off on the simple stuff, we shouldn't just
run around in circles like hens in a hen-house, we should just say "hey,
we've got brains, the process isn't meant to be _stupid_".

			Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [RFD] Explicitly documenting patch submission
Original-Message-ID: <Pine.LNX.4.58.0405250948530.9951@ppc970.osdl.org>
Date: Tue, 25 May 2004 17:09:02 GMT
Message-ID: <fa.h1ujvma.diqqhq@ifi.uio.no>

On Tue, 25 May 2004, J. Bruce Fields wrote:
>
> The patch-submission process can be more complicated than a simple path
> up a hierarchy of maintainers--patches get bounced around a lot
> sometimes.

Yes. And documenting the complex relationships obviously can't be sanely
done. The best we can do is a "it went through these people".

Perfect is the enemy of good. If we tried to be perfect, we'd never get
anything done.

> If you're trying to document who contributes "intellectual property" to
> the kernel

No, that's not what it is either.  At least to me, equally important as
the actual author is how it got reviewed, and what path it took. Because
when problems happen (say a simple bug), I want the whole path to know.

Think of it this way (purely technical to avoid any emotional arguments):
we've hunted down a change that results in strange behaviour, and what we
want to do is get the problem explained and resolved. Maybe the thing to
do is to just revert the whole change, but usually we just want to fix it,
and regardless of whether we want to undo it or fix it, what we want to do
is get the people who were involved with not just writing the code, but
approving it too to look at the issue.

And the people who approved it literally _are_ as important as the people
who wrote it (forget any copyright issues), since (a) they need to know to
avoid the problem in the first place and (b) they usually know why the
code was added and what problems _they_ saw (or didn't see) when they
approved it.

See? That's why to me, the set of people who have been involved in the
whole patch "lifetime" is actually _more_ important than the original
author. The original author is obviously special in some respects, but
from a problem solving perspective he's not necessarily even the person to
go to.
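
Once the lines are in the changelog, getting that set of people back out is a small parsing job. The sketch below assumes the sign-offs follow the "Signed-off-by: Name <address>" form; the regular expression, the function name `signoff_chain` and the sample changelog are all made up for the example.

```python
import re

# One sign-off per line: "Signed-off-by: Name <address>".
SIGNOFF_RE = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<([^<>\s]+)>[ \t]*$",
    re.MULTILINE,
)


def signoff_chain(changelog: str) -> list:
    """Return (name, address) pairs in the order the patch was passed on."""
    return [(m.group(1), m.group(2)) for m in SIGNOFF_RE.finditer(changelog)]


# Hypothetical changelog entry for a change that turned out to be buggy.
changelog = (
    "Fix off-by-one in foo buffer handling.\n"
    "\n"
    "Signed-off-by: Random J Developer <random@developer.org>\n"
    "Signed-off-by: Some Maintainer <maintainer@example.org>\n"
)

chain = signoff_chain(changelog)
for name, address in chain:
    print(f"{name} <{address}>")
```

The result is everybody the patch went through, in order, which is the list of people to ask when the change misbehaves.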

> I guess I'm still a little vague as to exactly what sort of questions we
> expect to be able to answer using this new documentation.

See above. I explicitly picked a _technical_ reason for tracking who has
been involved with a patch, but let's say that somebody raises concerns
over any _other_ issues about the code - the fact is that the same logic
applies. The original author is a bit special, but the path it took is
still equally important.

> A couple examples (which I think aren't too farfetched):
> 	* Developer A submits a patch which is dropped by maintainer B.
> 	  I later notice this and resubmit A's patch to B.  I don't
> 	  change the patch at all, and the resubmission is my only
> 	  contribution to the process.  Do I need to tag on my own
> 	  "Signed-off-by" line?

Yup. And part of it is simply credit: trust me when I say to you that
"maintenance" of patches is a job that is at _least_ as important as
writing them in most cases.

That's not always true, of course - there are pieces of code that are just
stunning works of art, and very important, and as programmers we like to
think of those really fundamental contributions. But in real life, it's
definitely the old case of "1% inspiration, 99% perspiration", and we
should just accept that.

For example, look at the kernel developers out there, and ask yourself who
stands out. There's a couple of great coders, but I think the people who
really stand out are people like Andrew, who mostly really "organize" and
act as managers. Right?

So when you save a patch from oblivion by passing it on to the right
person, and get it submitted when it was originally dropped for some
reason, you're actually doing a fundamentally important job. Maybe it's
just one small piece of the puzzle, but hey, you'd only get one small line
in the changeset, so the credit (or blame ;) really is appropriate.

> 	* I write a patch.  Developers X and Y suggest significant
> 	  changes.  I make the changes before I submit them to maintainer
> 	  Z.  Suppose the changes are significant enough that I no longer
> 	  feel comfortable representing myself as the sole author of the
> 	  patch.  Should I also be asking developer X  and Y to add their
> 	  own "Signed-off-by" lines?

That, my friend, is a matter of your own taste and conscience. My answer
is that if you wrote it all, you clearly don't _need_ to. At the same
time, I think that it's certainly in good taste to at least _ask_ them.
Wouldn't you agree?

		Linus

