git bisect (Linus Torvalds)

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: git pull on Linux/ACPI release tree
Date: Tue, 10 Jan 2006 19:30:42 UTC
Message-ID: <fa.fv9n3b6.j067qs@ifi.uio.no>
Original-Message-ID: <Pine.LNX.4.64.0601101111110.4939@g5.osdl.org>

On Tue, 10 Jan 2006, Linus Torvalds wrote:
>
> Now, the git history is _not_ really a two-dimensional surface, so it's
> just an analogy, not an exact identity. But from a visualization
> standpoint, it's a good way to think of each "git bisect" as adding a
> _line_ on the surface rather than a point on a linear line.

Actually, the way I think of it is akin to the "light cones" in physics. A
point in space-time doesn't define a fully ordered "before and after": but
it _does_ describe a "light cone" which tells you what is reachable from
that point, and what that point reaches. Within those cones, that
particular point ("commit") has a strict ordering.

And exactly as in physics, in git there's a lot of space that is _not_
ordered by that commit. And the way to bisect is basically to find the
right points in "git space" to create the right "light cone": you want
the point where the git space that is reachable from that commit has
the same volume as the git space that isn't reachable.

And maybe that makes more sense to you (if you're into physics), or maybe
it makes less sense to you.

Now, since we always search the "git space" in the cone that is defined by
"reachable from the bad commit, but not reachable from any good commit",
the way we handle "bad" and "good" is actually not a mirror-image. If we
find a new _bad_ commit, we know that it was reachable from the old bad
commit, and thus the old bad commit is now uninteresting: the new bad
commit forms a "past light cone" that is a strict subset of the old one,
so we can totally discard the old bad commit from any future
consideration. It doesn't tell us anything new.

In contrast, if we find a new _good_ commit, the "past light cone" (aka
"set of commits reachable from it") is -not- necessarily a proper superset
of the previous set of good commits, so when we find a good commit, we
still need to carry the _other_ good commits around, and the "known good"
universe is the _union_ of all the "good commit past lightcones".

Then the "unknown space" is the set difference of the "past lightcone of
the bad commit" and of this "union of past lightcones of good commits".
It's the space that is reachable from the known-bad commit, but not
reachable from any known-good commit.
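
The bookkeeping described above can be sketched in a few lines of Python.
This is an illustration only, not git's implementation: the "parents"
mapping, the "BisectState" class and the commit names are all made up for
the example.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


class BisectState:
    """Hypothetical model of the good/bad bookkeeping (not git's code)."""

    def __init__(self, parents, bad, good):
        self.parents = parents
        self.bad = bad          # a single commit
        self.good = {good}      # a set that only ever grows

    def mark_bad(self, commit):
        # The new bad commit's cone lies inside the old one: replace it.
        self.bad = commit

    def mark_good(self, commit):
        # The new good cone need not contain the older ones: accumulate.
        self.good.add(commit)

    def unknown(self):
        known_good = set()
        for g in self.good:
            known_good |= ancestors(self.parents, g)
        return ancestors(self.parents, self.bad) - known_good


# Made-up history: two branches off "base", merged in "m".
parents = {
    "tip": ["m"],
    "m": ["x2", "y2"],
    "x2": ["x1"], "x1": ["base"],
    "y2": ["y1"], "y1": ["base"],
    "base": [],
}
state = BisectState(parents, bad="tip", good="base")
state.mark_good("x2")   # "base" stays in the good set as well
print(sorted(state.unknown()))
```

Note that only the bad side is ever overwritten; the good side is a set,
which is exactly the asymmetry the text describes.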

So this means that when doing bisection, what we want to do is find the
point in git space that has _new_ "reachability" within that unknown space
that is as close to half the volume of that space as possible. And that's
exactly what "git-rev-list --bisect" calculates.

So every time, we try to either move the "known bad" light-cone down in
time in the unknown space, _or_ we add a new "known good" light-cone. In
either case, the "unknown git space" keeps shrinking by half each time.

("by half" is not exact, because git space is not only quantized, it
also has a rather strange "distance function". In other words, we're
talking about a rather strange space. The good news is that the space is
small enough that we can just enumerate every quantum and simply
calculate the volume it defines in that space. IOW, we do a very
brute-force thing, and it works fine).
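
A rough Python sketch of that brute-force enumeration follows. It is not
the code behind "git-rev-list --bisect"; the function names and the toy
linear history are assumptions made for illustration.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def pick_bisect_point(parents, bad, goods):
    """Pick the unknown commit whose reach is closest to half the unknown space."""
    known_good = set()
    for g in goods:
        known_good |= ancestors(parents, g)
    unknown = ancestors(parents, bad) - known_good
    half = len(unknown) / 2

    def volume(commit):
        # Number of unknown commits that `commit` reaches (itself included).
        return len(ancestors(parents, commit) & unknown)

    # Sorting only makes the tie-break deterministic in this sketch.
    return min(sorted(unknown), key=lambda c: abs(volume(c) - half))


# Toy linear history c0 <- c1 <- ... <- c8, with c0 good and c8 bad.
parents = {f"c{i}": [f"c{i - 1}"] for i in range(1, 9)}
parents["c0"] = []
print(pick_bisect_point(parents, "c8", ["c0"]))
```

Every candidate gets its volume computed from scratch, which is wasteful
but matches the "enumerate every quantum" description.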

			Linus

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: rc6 keeps hanging and blanking displays where rc4-mm1 works
Date: Fri, 12 Aug 2005 16:54:50 UTC
Message-ID: <fa.g0ad5b8.m0o5qk@ifi.uio.no>
Original-Message-ID: <Pine.LNX.4.58.0508120937140.3295@g5.osdl.org>

On Fri, 12 Aug 2005, Helge Hafting wrote:
>
> > at the moment. The setup is fine with 2.6.13-rc4-mm1 x86-64, no
> > problems there.
>
> The problem still exists in 2.6.13-rc6.  Usually, all I get is a
> suddenly black display, solvable by resizing.

Is there any chance you could try bisecting the problem? Either just
binary-searching the patches or by using the git bisect helper scripts?

Obviously the git approach needs a "good" kernel in git, but if
2.6.13-rc4-mm1 is ok, then I assume that 2.6.13-rc4 is ok too? That's a
fair number of changes:

	 git-rev-list v2.6.13-rc4..v2.6.13-rc6 | wc
	    340     340   13940

but if you can tighten it up a bit (you already had trouble at rc5, I
think), it shouldn't require testing more than a few kernels.

Git has had bisection support for a while, but the helper scripts to use
it sanely are fairly new, so I think you'd need the git-0.99.4 release for
those. But then you'd just do

	git bisect start
	git bisect bad v2.6.13-rc5
	git bisect good v2.6.13-rc4

and start bisecting (that will check out a mid-way point automatically,
you build it, and then do "git bisect bad" or "git bisect good" depending
on whether the result is bad or good - it will continue to try to find
half-way points until it has found the point that turns from good to
bad..)

		Linus

From: Linus Torvalds <torvalds@osdl.org>
Newsgroups: fa.linux.kernel
Subject: Re: Simple script that locks up my box with recent kernels
Date: Sat, 07 Oct 2006 21:27:08 UTC
Message-ID: <fa./oMGTFml4fnJMGJZgMEDjT9Ip0Q@ifi.uio.no>

On Sat, 7 Oct 2006, Jesper Juhl wrote:
>
> > Can I bother you to just bisect it?
>
> Sure, but it will take a little while since building + booting +
> starting the test + waiting for the lockup takes a fair bit of time
> for each kernel

Sure. That said, we've tried to narrow down things that took hours or days
(under real loads, not some nice test-script) to reproduce, and while it
doesn't always work, the real problem tends to be if the problem case
isn't really reproducible. It sounds like yours is pretty clear-cut, and
that will make things much easier.

> and also due to the fact that my git skills are pretty
> limited, but I'll figure it out (need to improve those git skills
> anyway) :-)

"git bisect" in particular isn't that hard to use, and it will really do
a lot of heavy lifting for you.

Although since it will just select a random commit (well, it's not
"random": it's strictly as half-way as it can possibly be, but it's
automated without any regard for anything else), you can sometimes hit a
situation where git will ask you to test a kernel that simply doesn't work
at all, and you can't even test whether it reproduces your particular bug
or not.

For example, "git bisect" might pick a kernel that just doesn't compile,
because of some stupid bug that was fixed almost immediately afterwards.
In those cases, the total automation of "git bisect" ends up being
something that has to be helped along by hand, and then it definitely
helps to know more about how git works.

Anyway, the quick tutorial about "git bisect" is that once you've given it
the required first "good" and "bad" points, it will create a new branch in
the repository (called "bisect", in case you care), and after that point
it will do a search in the commit DAG (aka "history tree" - it's not a
tree, it's a DAG, since merges will join branches together) for the next
commit that will neatly "split" the DAG into two equal pieces. It will
keep splitting the commit history until you get fed up, or until it has
pinpointed the single commit that caused the problem.

The nicest tool to use during bisection is to just do a

	git bisect visualize

that simply starts up "gitk" (the default git history visualizer) to show
what the current state of bisection is. Now, if there are thousands and
thousands of commits, you'll have a really hard time getting a visual clue
about what is going on, but especially once you get to a smaller set of
commits, it's very useful indeed.

And it's _especially_ useful if you hit one of the problem spots where you
can't test the resulting tree for some unrelated reason. When that
happens, you should _not_ mark the problematic commit as being "bad",
because you really don't know - the "badness" of that commit is probably
not related to the "badness" that you're actually searching for.

Instead, you should say "ok, I refuse to test this commit at all, because
it's got other problems, and I will select another commit instead". The
bisection algorithm doesn't care which commit you pick, as long as it's
within the set of "unknown" commits that you'll see with the visualization
tool.

Of course, for efficiency reasons, the _closer_ you get to the half-way
mark, the better. So it's useful to try to pick a commit that is close to
the one that "git bisect" originally chose for you, but that's not a
correctness issue, that's just an issue of "if we have a thousand
potential commits, we're better off bisecting it 400/600 rather than
1/999, even if the exact half-way point isn't testable".

So if you need to decide to pick another point than the one "git bisect"
chose for you automatically, just select that commit in the visualizer
(which will copy its SHA1 name), and then do

	git reset --hard <paste-sha1-here>

to reset the "bisect" branch to that point instead. And then compile and
test that kernel instead (and then if that's good or bad, you can do the
"git bisect good" or "git bisect bad" thing to mark it so, and git will
continue to bisect the set of commits).
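
The efficiency argument can be made concrete with a small Python sketch.
The helper names and the ten-commit linear history are hypothetical; the
"volumes" mapping stands for "how many unknown commits does each candidate
reach".

```python
def worst_case(volume, total):
    """Size of the larger group left after testing a commit of this volume."""
    return max(volume, total - volume)


def pick_alternative(volumes, untestable):
    """Choose the testable commit that leaves the smallest worst case."""
    total = len(volumes)
    candidates = [c for c in volumes if c not in untestable]
    return min(candidates, key=lambda c: worst_case(volumes[c], total))


# The numbers from the text: a 400/600 split beats a 1/999 split.
print(worst_case(400, 1000), worst_case(1, 1000))

# Made-up linear history of ten unknown commits; c5 does not compile.
volumes = {f"c{i}": i for i in range(1, 11)}
print(pick_alternative(volumes, untestable={"c5"}))
```

Any testable commit in the unknown set would still give a correct result;
the choice only affects how many more kernels have to be built.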

It can be a bit boring, but damn, it's effective. I've used "git bisect"
several times when I've been too lazy to try to really think about what is
going on - I'll happily brute-force bug-finding even if it might take a
little longer, if it's guaranteed to find it (and if the bug is
reproducible, git bisect definitely guarantees to find what made it
appear, even if that may not necessarily be the deeper _cause_ of the bug)

		Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: 2.6.21-rc1: known regressions (part 2)
Date: Thu, 01 Mar 2007 23:40:46 UTC
Message-ID: <fa.63Ct4ClRD8hjd7Z3taISujCscb8@ifi.uio.no>

On Thu, 1 Mar 2007, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@elte.hu> wrote:
>
> > update: f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38 works for me too, and
> > 01363220f5d23ef68276db8974e46a502e43d01d is broken. I too will attempt
> > to bisect this.
>
> hm. There's some weird bisection artifact here. Here are the commits i
> tested, in git-log order:
>
> #1 commit 01363220f5d23ef68276db8974e46a502e43d01d bad
> #2 commit ee404566f97f9254433399fbbcfa05390c7c55f7 bad
> #3 commit f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38 good
> #4 commit c827ba4cb49a30ce581201fd0ba2be77cde412c7 bad

Use "git bisect visualize" to see what bisect ends up doing.

> if i tell git-bisect that #1 is bad and #3 is good, then it offers me #2
> - that's OK. But when i tell it that #2 is bad, it offers #4 - which is
> out of order!

No it's not. "git bisect" does exactly the right thing. There is no simple
ordering in a complex branch-merge scenario, you can't just put the
commits in some "ordering" and test things in time order. That would be
totally broken, and idiotic. It doesn't give the right results.

What git bisect does is to find the commit that most closely *bisects* the
history of commits, so that if it is marked good/bad, it will leave you
with about 50% of the commits left. But if you are looking at date order,
you're entirely confused.

For example, let's take a really simple case

	    a <- bad
	   / \
	  b   c
	  |   |
	  d   e
	  |   |
	  f   g
	   \ /
	    h
	    |
	    * <- good

and if you are looking to find something "in the middle", you might think
that "d" or "e" are the best choices, since time-wise, they are in the
middle.

But that's not true AT ALL.

If you actually want to bisect that kind of history, you need to choose
"b" or "c", even though they may both be *much* more "recent" than the
others. Why? Because if you pick "d", you're really only testing three
commits ('d' 'f' and 'h') out of the 8 commits you have to test.

In contrast, if you pick 'b', you are testing the effects of *four*
commits ('b', 'd', 'f' and 'h') and you have thus neatly bisected the
commits into two equal groups for testing (one group _with_ those four
commits, and one group _without_) instead of having partitioned them as 3
commits vs 5 commits.

So please realize that non-linear history very much means that you MUST
NOT think that you just pick a commit "in the middle". No, git bisect is a
LOT smarter than that - it picks a commit that *reaches* about half the
commits you have left to test.
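
The counts in this example can be checked mechanically. The following
Python sketch encodes the diagram above as a parent map (the name "good"
stands for the commit marked "*") and counts how many of the eight unknown
commits each candidate reaches. It illustrates the counting only and is
not how git stores or walks history.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


# The diagram from the text: 'a' merges the b-d-f and c-e-g branches.
parents = {
    "a": ["b", "c"],
    "b": ["d"], "d": ["f"], "f": ["h"],
    "c": ["e"], "e": ["g"], "g": ["h"],
    "h": ["good"],
    "good": [],
}

unknown = ancestors(parents, "a") - ancestors(parents, "good")
reach = {c: len(ancestors(parents, c) & unknown) for c in sorted(unknown)}
for commit, count in reach.items():
    print(commit, count, "vs", len(unknown) - count)
```

With this map, 'd' reaches three of the eight commits and 'b' reaches
four, matching the 3 vs 5 and 4 vs 4 splits described above.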

> The bisection goes off into la-la land after that and
> never gets back to a commit that is /after/ the good commit. How is this
> possible? (I upgraded from git-1.4.4 to 1.5.0 to make sure this isnt
> some git bug that's already fixed.)

It's possible because git knows what it is doing, and you didn't think
things through.

The commits that "git bisect" picked out are the right ones. Quite often,
there may be two or more "equally good" commits (in my example above, you
can choose either "b" or "c", and it will bisect the set of untested
commits equally well - in two groups of four, but two *different* groups
of four commits), and yes, it's possible that git has a bug that makes it
pick the wrong ones, but quite frankly, I seriously doubt it. "git bisect"
has been very successful indeed, and is generally a *lot* better at
picking a commit "in the middle" than people are, exactly because it's
quite hard to see which commit "reaches" half the commits if you have lots
of merges and branches.

Try out

	git bisect visualize

and it will literally show you what it is doing.

What can be confusing is that if the "good" and "bad" markers are ON
DIFFERENT BRANCHES OF DEVELOPMENT, you may not even *see* the "good"
marker, because you may well have something like this:

	a <- bad
	|
	b   * <- good
	|   |
	c   d
	 \ /
	  e
	  |
	  f
	  |
	 ...

and what do you think "git bisect visualize" will actually show you?

Since 'd', 'e' and 'f' are all in the "good" set (they all exist as
commits in something leading up to a commit that has already been deemed
fine), they aren't *interesting* - they can't be introducing the bug,
since if that was the case, the good commit wouldn't have been good. So as
far as bisection is concerned, the tree actually looks like

	 a <- bad
	 |
	 b
	 |
	 c
	 |
	...

and you have just three commits that are potentially interesting: 'a', 'b'
and 'c'.

Now, with three commits, you cannot test them half-and-half, so you have
to test it in groups of 1 vs 2 commits, so it's arbitrary whether you
choose 'b' or 'c' to test, but you'd test one of them. Say that you choose
'b', and it turns out to be good. If so, you're done: 'a' is bad and 'b'
is good, so the bug was introduced in 'a'. But if it turns out to be bad,
you'll still have to test 'c' too, since you don't know if the bug was
*introduced* in 'b' or not.

See?
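
The same reasoning as a short Python sketch, with the second diagram
written as a parent map ("good" stands for the commit marked "*"). The
"culprit" helper is hypothetical and simply follows the order of tests
described above: 'b' first, then 'c' only if needed.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


parents = {
    "a": ["b"], "b": ["c"], "c": ["e"],
    "good": ["d"], "d": ["e"],
    "e": ["f"], "f": [],
}

# 'd', 'e' and 'f' drop out because the good commit reaches them.
unknown = ancestors(parents, "a") - ancestors(parents, "good")
print(sorted(unknown))


def culprit(is_bad):
    """Return (first bad commit, commits tested) for the a-b-c chain."""
    if not is_bad("b"):
        return "a", ["b"]
    if not is_bad("c"):
        return "b", ["b", "c"]
    return "c", ["b", "c"]


# Suppose the bug was introduced in 'b': both 'a' and 'b' show it.
print(culprit(lambda commit: commit in {"a", "b"}))
```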

> i'll try to straighten this out manually

Don't. You're just going to make your bisection much less effective. The
whole point of bisection is that you can usually cut the number of commits
to test pretty exactly in half.  If you start mucking with the commits to
test, and you don't understand about the reachability graph, you'll just
choose a much worse set of commits to test than "git bisect" will do.

So learn to trust "git bisect". It really does know what it is doing.

		Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: 2.6.21-rc1: known regressions (part 2)
Date: Fri, 02 Mar 2007 00:31:00 UTC
Message-ID: <fa.NTa0foc6yshUr2H9+kfHCLAmL60@ifi.uio.no>

On Thu, 1 Mar 2007, Ingo Molnar wrote:
>
> git-bisect gets royally confused on those ACPI merge branches around
> commit c0cd79d11412969b6b8fa1624cdc1277db82e2fe. Here are my test
> results so far:

Looks like git bisect worked for you, and wasn't confused at all. You
started out with 2931 commits between your first known-bad and known-good
commits, which means that you usually end up having to check "log2(n)+1"
kernels, ie I'd have expected you to have to do 12-13 bisection attempts
to cut it down to one.
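
As a quick check of that arithmetic, here is the rule of thumb in Python.
The function name is made up; the formula and the commit count are the
ones quoted above.

```python
import math


def expected_tests(n_commits):
    """Rule-of-thumb number of kernels to test: log2(n) + 1."""
    return math.log2(n_commits) + 1


# 2931 commits between the first known-good and known-bad points.
print(f"{expected_tests(2931):.1f}")
```

For 2931 commits the result falls between 12 and 13, which is where the
"12-13 bisection attempts" estimate comes from.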

You seem to have done 14 (you list 16 commits, two of which are the
starting points), which is right in that range. The reason you sometimes
get more is:

 - you "help" git bisect by choosing other commits than the optimal ones.

 - with bad luck, it can be hard to get really close to "half the commits"
   in the reachability analysis, especially if you have lots of merges
   (and *especially* if you have octopus merges that merge more than two
   branches of development). For example, say that you have something like

	           a
	           |
	   +---+---+---+---+
	   |   |   |   |   |
	   b   c   d   e   f

   where you have six commits - you can't test any "combinations" at all,
   since they are all independent, so "git bisect" cannot test them three
   and three to cut down the time, so if you don't know which one is bad,
   you'll basically end up testing them all.
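
A small Python sketch shows why such a merge resists bisection. The parent
map mirrors the diagram above, plus an assumed common known-good "base"
commit that the diagram does not show.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


# Octopus merge 'a' joining five independent commits.
# "base" is an assumption: a shared, already known-good starting point.
parents = {"a": ["b", "c", "d", "e", "f"], "base": []}
for branch in "bcdef":
    parents[branch] = ["base"]

unknown = ancestors(parents, "a") - ancestors(parents, "base")
total = len(unknown)
reach = {c: len(ancestors(parents, c) & unknown) for c in sorted(unknown)}

# The best achievable split: size of the smaller side for the best candidate.
best = max(min(v, total - v) for v in reach.values())
print(reach, "best split:", best, "vs", total - best)
```

No candidate reaches more than one of the other unknown commits, so the
best split available is 1 vs 5 rather than 3 vs 3.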

The bad luck case never really happens to that extreme in practice, and
even when it does you can sometimes be lucky and just hit on the bug early
(so "bad luck" may end up being "good luck" after all), but it explains
why you can get more - or less - than log2(n)+1 attempts. Most commonly
it's just one more.

A much *bigger* problem is if you mark something good or bad that isn't
really. Ie if the bug comes and goes (it might be timing-dependent, for
example), the problem will be that you'll always narrow things down
(that's what bisection does), but you may not narrow it down to the right
thing!

We've had that happen several times. If the bug (for example) means that
suspend *often* breaks, but sometimes works just by luck, you might mark a
kernel "good" when it really wasn't and then "git bisect" will *really* go
out in the weeds, and won't even try to test the commits that may have
introduced the bug, because you told it that those commits resulted in a
good kernel..
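
The effect of a single wrong "good" is easy to simulate. The Python sketch
below bisects a purely linear, made-up history of 1000 commits in which
commit 300 introduced the bug; all the numbers are invented, and real
history is of course not linear.

```python
def bisect_linear(n, verdict):
    """Bisect commits 1..n, where 0 is known good and n is known bad.

    `verdict(i)` returns True when commit i is reported as bad.
    Returns the commit that ends up blamed.
    """
    good, bad = 0, n
    while bad - good > 1:
        mid = (good + bad) // 2
        if verdict(mid):
            bad = mid
        else:
            good = mid
    return bad


def honest(i):
    # The bug is present from commit 300 onwards.
    return i >= 300


def one_lucky_boot(i):
    # Same bug, but commit 500 happened to work once and got marked good.
    return i >= 300 and i != 500


print(bisect_linear(1000, honest), bisect_linear(1000, one_lucky_boot))
```

After the false "good" at commit 500, everything at or below 500 is never
looked at again, so the search still narrows down, just to the wrong
commit.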

>  commit 01363220f5d23ef68276db8974e46a502e43d01d: bad
>  commit 255f0385c8e0d6b9005c0e09fffb5bd852f3b506: bad
>  commit c0cd79d11412969b6b8fa1624cdc1277db82e2fe: bad
>  commit c24e912b61b1ab2301c59777134194066b06465c: good
>  commit e9e2cdb412412326c4827fc78ba27f410d837e6e: bad
>  commit 79bf2bb335b85db25d27421c798595a2fa2a0e82: bad
>  commit fc955f670c0a66aca965605dae797e747b2bef7d: good
>  commit 70c0846e430881967776582e13aefb81407919f1: good
>  commit 414f827c46973ba39320cfb43feb55a0eeb9b4e8: bad
>  commit f3ccb06f3b8e0cf42b579db21f3ca7f17fcc3f38: good
>  commit 5f0b1437e0708772b6fecae5900c01c3b5f9b512: bad
>  commit b878ca5d37953ad1c4578b225a13a3c3e7e743b7: bad
>  commit c2902c8ae06762d941fab64198467f78cab6f8cd: bad
>  commit 12e74f7d430655f541b85018ea62bcd669094bd7: bad
>  commit 3388c37e04ec0e35ebc1b4c732fdefc9ea938f3b: bad
>  commit 9f4bd5dde81b5cb94e4f52f2f05825aa0422f1ff: bad

Looks like it's claiming that 9f4bd5dde81b5cb94e4f52f2f05825aa0422f1ff is
the bad commit. Which is extremely unlikely, since it only seems to affect
the emu10k sound driver, which I don't think even exists on any ThinkPad
laptops (correct me if I'm wrong).

Btw, you seem to have re-ordered the commits - the above is not the order
you did the bisection in. The known-good commit (f3ccb06..) is in the
middle. That's totally bogus. Please use the git bisection log (see
.git/BISECT_LOG), and don't think that you know some "better" order. You
really don't.

> the results are totally reproducible (i re-tried a few of both the good
> and the bad commits), i.e. it's not a sporadic condition. Also, a number
> of the 'bad' commits have no dynticks stuff in them at all, so i'd
> exclude dynticks.
>
> could someone suggest a sane way to go with this? Perhaps suggest
> specific commit IDs to test?

You claim that 9f4bd5dd is bad, but you indirectly claim that its direct
parent (5986a2ec) is good by saying that f3ccb06f is good. This is why
"git bisect" will claim that 9f4bd5dd must be the bad commit.

I would suggest testing commit 5986a2ec explicitly. If that one is good,
then, since you claim that 9f4bd5dd is bad, then yes, 9f4bd5dd *is* the
bad commit (because 5986a2ec is its direct parent).

But most likely, 9f4bd5dd is actually already bad, and what you are seeing
is two *different* bugs that just have the same symptoms ("suspend doesn't
work").

What happens is that you've chased them *both*, and you cannot bisect that
kind of behaviour totally automatically and mindlessly, simply because
when you say "git bisect bad", that means that *one* of the bugs is
active, but not necessarily both of them. So you may well be marking
kernels that are "good" (as far as the other bug is concerned) as bad -
and that just means that bisection won't even test them.

When that happens, you need to basically

 - be able to separate the bugs out some way (so that you can still mark a
   non-working kernel "good" if it's good *with*respect*to* the particular
   bug you're chasing)

   This is often practically impossible, _especially_ with suspend, where
   the behaviour is so unhelpful that it's usually not possible to
   separate out "ACPI is broken" from "one particular device driver is
   broken", because they both have exactly the same symptoms: the machine
   doesn't resume.

HOWEVER. Even if you can't actually separate the bugs out, you can usually
find where *one* of the bugs starts, and at that point you can generally find
the fix for it too. In this case, we already know one of the bugs: it's
the ACPI bug that was apparently fixed by f3ccb06f3 (or maybe another one
in that series).

Once you have that, you now actually have a way to "correct" for that
known bug, and by correcting for the known bug, you now *can* separate the
behaviour of the two bugs:

 - You can now re-do a totally mindless git bisection for the *other* bug,
   but what you now need to do is that at each bisection step, you look at
   whether the bisection point has the known bug, and if so, you apply the
   known fix for that known bug, and thus you can test the kernel
   *without* the interaction of the bug you already found.

This makes bisection a lot less automated (you have to apply the "fix" for
the other bug at each step), but it still allows "total automation" in the
sense that you don't actually need to know at all what you're looking for:
you're just following a known pattern, and you're basically just
correcting for the effects of another bug that you're no longer interested
in, since you already know what the fix for that bug was.
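
To illustrate the idea of correcting for a known bug, here is a Python
simulation on a made-up linear history. Bug A (the known one) is present
in commits 200 to 599 and bug B from commit 450 onwards; both produce the
same symptom. The commit numbers and helper names are invented for the
example.

```python
def bisect_linear(n, verdict):
    """Bisect commits 1..n, where 0 is known good and n is known bad."""
    good, bad = 0, n
    while bad - good > 1:
        mid = (good + bad) // 2
        if verdict(mid):
            bad = mid
        else:
            good = mid
    return bad


def has_bug_a(i):
    return 200 <= i < 600    # known bug, fixed upstream at commit 600


def has_bug_b(i):
    return i >= 450          # the bug still being hunted


def fails(i):
    # Same symptom either way: the machine doesn't resume.
    return has_bug_a(i) or has_bug_b(i)


def fails_with_fix_applied(i):
    # Applying the known fix at each step removes bug A from the picture.
    return has_bug_b(i)


print(bisect_linear(1000, fails))                   # where bug A starts
print(bisect_linear(1000, fails_with_fix_applied))  # where bug B starts
```

The first run only finds where one of the bugs starts; the second run,
with the known fix applied at every step, isolates the other one.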

The other alternative is to actually have a clue what you're searching
for, and/or look deeply at where the fix was merged, and trying to narrow
things down by actually understanding the problem. But at that point,
bisection won't much help you, except perhaps as a way to find a mid-way
point to test out theories with ("which drivers that I actually use have
changed in between" kinds of experiments where you simply undo part of
the changes entirely, and bisection ends up being just a way to pick
points that are hopefully "interestingly far apart").

			Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: 2.6.21-rc1: known regressions (part 2)
Date: Fri, 02 Mar 2007 00:45:25 UTC
Message-ID: <fa.8mGUFyTdx8CqCcb1RxAuXU3++W0@ifi.uio.no>

On Thu, 1 Mar 2007, Linus Torvalds wrote:
>
> Once you have that, you now actually have a way to "correct" for that
> known bug, and by correcting for the known bug, you now *can* separate the
> behaviour of the two bugs:
>
>  - You can now re-do a totally mindless git bisection for the *other* bug,
>    but what you now need to do is that at each bisection step, you look at
>    whether the bisection point has the known bug, and if so, you apply the
>    known fix for that known bug, and thus you can test the kernel
>    *without* the interaction of the bug you already found.
>
> This makes bisection a lot less automated (you have to apply the "fix" for
> the other bug at each step)

Side note: it's still usually fairly easy. Especially if you have a known
fix for the other bug, you can usually just do the equivalent of

	git cherry-pick <fixcommit>

at each point during this bisect (or just have a known patch that you keep
applying and un-applying), and you're largely done.

Of course, if the area with the fix keeps changing, or if the fix is
really intrusive and nasty, this gets hairy, but at least in this case the
patch is fairly trivial and it shouldn't cause any trouble at all to do
this.

The only real down-side is just the mindless extra work, and the possible
added confusion you get from modifying the history at the points you're
testing. "git bisect" is not necessarily happy about auto-picking a new
bisection point with a dirty tree, for example, so before you mark
something "good" or "bad", you should generally try to do so with a clean
git tree (ie if you apply a patch at each stage, do "git reset --hard" to
remove the patch before you do the "git bisect bad/good" stage).
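
One way to make that routine hard to get wrong is to wrap it in a small
helper. The Python sketch below is an assumption-laden variant rather than
the procedure from the text: it uses "git cherry-pick --no-commit" so that
no extra commit is created, and the "kernel_is_good" callback (build, boot
and test) is something the caller has to supply. The "run" parameter
exists only so the command sequence can be inspected without running git.

```python
import subprocess


def bisect_step_with_fix(fix_commit, kernel_is_good, run=subprocess.check_call):
    """Test the current bisection point with a known fix temporarily applied."""
    try:
        # Apply the fix to the working tree only; no new commit is made.
        run(["git", "cherry-pick", "--no-commit", fix_commit])
        good = kernel_is_good()
    finally:
        # Always return to a clean tree before telling bisect anything.
        run(["git", "reset", "--hard"])
    run(["git", "bisect", "good" if good else "bad"])
    return good


# Dry run: record the commands instead of executing them.
commands = []
bisect_step_with_fix("<fixcommit>", lambda: False, run=commands.append)
for command in commands:
    print(" ".join(command))
```

If the cherry-pick or the test raises, the tree is still reset and nothing
is marked, so a failed step can simply be retried.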

Similarly, especially at the end of the bisection run, if you actually use
"git cherry-pick" to *add* a commit, the bisection will now take that
added commit into account when trying to pick the next commit to check,
which is not what you really want. It probably doesn't matter that much
during the early stages (when bisection is really jumping around wildly
anyway, and one commit more or less doesn't really matter), but again, it
might be a good idea to make a habit of undoing the cherry-pick, the same
way you'd undo a patch (eg "git reset --hard HEAD^" would do it, if you
have exactly one cherry-pick that you tested).

		Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: Regression: USB is nfg after suspend/resume(RAM) cycle on Intel
Date: Tue, 29 May 2007 17:42:48 UTC
Message-ID: <fa.zWzTluJWEfNpa88JpCctkkanoBA@ifi.uio.no>

On Tue, 29 May 2007, Mark Lord wrote:
>
> Ugh.  Is there a way to tell bisect to only work around the USB updates?

Well, you _can_ actually give "git bisect" a pathspec (the same way you
give one to "git log" and friends), and tell it to only care about stuff
that changed that pathspec.

However, that was broken in some older git versions, and in general hasn't
had a huge amount of testing even in new ones, so I'm a tad nervous about
recommending people do it. But yes, you should be able to say

	git bisect start drivers/usb/
	git bisect good v2.6.21
	git bisect bad v2.6.22-rc3

and off you go. However, if you do this, you need to make sure that you
have at LEAST git-1.5.1.

The other downside of path limiting is that if it turns out that the bug
was really introduced by something else, and just happened to _look_ like
it's USB-related, the path limiting will then cause "git bisect" to blame
a commit that just happens to be the next commit after the _real_ bug was
introduced.
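
That misattribution can be shown with a toy model in Python. It assumes a
linear history of 14 commits in which only commits 2, 5, 9 and 12 touch
drivers/usb/, while the real bug went in with commit 7; all of these
numbers are invented, and git's actual path limiting works on the commit
graph rather than on a list.

```python
def first_bad(candidates, is_bad):
    """Binary-search `candidates` (in history order) for the first bad one.

    Assumes the last candidate is bad.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


def is_bad(commit):
    return commit >= 7       # the real bug arrived in commit 7


all_commits = list(range(1, 15))
usb_commits = [2, 5, 9, 12]  # the only commits touching drivers/usb/

print(first_bad(all_commits, is_bad))   # unrestricted bisection
print(first_bad(usb_commits, is_bad))   # path-limited bisection
```

The path-limited run can only ever answer with a USB commit, so it blames
the first USB change after the real culprit.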

In this case, I don't think it's likely to be an issue, but it *could*
obviously be something else.

(In contrast, the non-path-limited "git bisect" should work in pretty much
any situation and with any git version)

> Still, that'll take a few hours, and frankly I'm getting sick of having
> to re-debug the USB layer with each new kernel rev.

Yeah, I'm not surprised. USB is probably the worst possible case for
suspend/resume (at least if you ignore ACPI-related problems). It's nasty.

> Got a pointer to the "bisect how-to" ?

It's so disgustingly simple that I don't think we've ever done any
specific bisection tutorial, but the "git-bisect" man-page does exist, and
it talks about the only half-way interesting case, namely the case where
the automatic selection of a half-way point causes git to pick a point
that doesn't work for some other reason (ie stupid compile problem or
whatever). In which case you have to pick another one manually.

So that kind of gotcha is at least _mentioned_ in the git-bisect man-page,
even if it doesn't get much further than that.

There's also the git users manual, but I think the man-page is more
detailed. But for future reference, just do

	git user manual

in google, and press "I'm feeling lucky". It finds the right thing at
least for me (and at least right now).

			Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: Linux 2.6.22 released
Date: Tue, 10 Jul 2007 15:39:56 UTC
Message-ID: <fa.DEwDXyqrl1AOg838sXIQpMmK16U@ifi.uio.no>

On Tue, 10 Jul 2007, Stefano Rivoir wrote:
>
> 2.6.22 hangs at boot on my box. Here attached a original dmesg from 2.6.21,
> and a copy of it where it stops on 2.6.22 (I can't attach the original 2.6.22
> dmesg because it's not logged to disk yet); it actually stops right after
> 'init' launches.

Ok, without any oops or hang in any really obvious place, that doesn't
really tell anybody anything specific enough to even start trying to debug
this, so you'd need to do one of two things:

 - poke a lot at the machine to try to get more specific information. In
   particular, get things like SysRQ-T output. That, in turn, probably
   would mean trying to get a serial console hooked up or something.

   The next thing that you got on 2.6.21 after the point it hangs was
   apparently

	..
	input: PC Speaker as /class/input/input1
	ieee1394: Initialized config rom entry `ip1394'
	ACPI: PCI Interrupt Link [APC4] enabled at IRQ 19
	ACPI: PCI Interrupt 0000:04:05.0[A] -> Link [APC4] -> GSI 19 (level, low) -> IRQ 19
	usbcore: registered new interface driver hiddev
	..

   so you could _try_ to disable the PC speaker or firewire, and see
   what's up. Did you switch from the old firewire drivers to the new one,
   for example? Or if you didn't, try it.

   IOW, we'd need a lot more debug information.

The second alternative will take some time, but is really a lot easier:

 - Get a kernel git tree, and do a "git bisect".

   There's almost 7000 commits in between 2.6.21 and 22, but that still
   means that in about fourteen recompiles/reboots, "git bisect" should
   tell us where your problem starts, which will hopefully make it obvious
   what the problem is (or at least pinpoint it a *lot*).

Doing a git bisect isn't really that hard, but fourteen compiles/reboots
will take some time (well, the compiles will, the reboots aren't that
bad). But even if you're not a git user, it really is very simple:

 - get started with 'git': on most distros it's now as simple as doing
   something like

	yum install git

   and while you might not get the latest version (Debian stable is at
   some ancient 1.4.4.4 version that isn't as nice as the 1.5.x series),
   for something like this you won't care that deeply.

 - get the kernel git tree (this will take a while to download about
   180MB)

	git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

 - start the "git bisect" with

	git bisect start
	git bisect good v2.6.21
	git bisect bad v2.6.22

   and it will pick a kernel version about half-way between the two
   points, and you can now start testing. For each kernel you try, if it
   boots fine, do "git bisect good", otherwise boot into a working kernel,
   and then do "git bisect bad". Git will then pick the next "halfway"
   kernel for that case.

Thanks,

		Linus

From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: GIT bisection range errors
Date: Thu, 08 May 2008 23:00:39 UTC
Message-ID: <fa.lMeHf8njqvCZ0Y4BjVIQtc3VesI@ifi.uio.no>

On Fri, 9 May 2008, Rene Herman wrote:
>
> I'm in a git bisect and am experiencing strangeness. I did a
>
> $ git checkout -b rc v2.6.26-rc1
> $ git bisect start
> $ git bisect bad
> $ git bisect good v2.6.25
>
> Yet, during this I'm finding myself at 2.6.25-rc6 and 2.6.25-rc8
> as the last two results (both good...).

This is very normal.

Why?

Because a lot (in fact, *most*) of the code that was merged after v2.6.25
was released was actually *written* and committed long before v2.6.25.

It just got merged into my tree much later.

So what happens? The bisection run starts walking into all that history,
and that history is *not* based on the released v2.6.25 at all, it's based
on much earlier kernels (eg the -rc kernels).
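
A toy commit graph in Python makes this visible. The tag names come from
the discussion above, but the shape of the graph and the "merge",
"topic1" and "topic2" commits are invented for illustration.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


parents = {
    "v2.6.26-rc1": ["merge"],
    "merge": ["v2.6.25", "topic2"],   # topic branch merged after the release
    "topic2": ["topic1"],
    "topic1": ["v2.6.25-rc6"],        # ...but forked from an -rc kernel
    "v2.6.25": ["v2.6.25-rc8"],
    "v2.6.25-rc8": ["v2.6.25-rc6"],
    "v2.6.25-rc6": [],
}
tags = {"v2.6.26-rc1", "v2.6.25", "v2.6.25-rc8", "v2.6.25-rc6"}


def based_on(commit):
    """Nearest tagged commit found by following first parents."""
    while commit not in tags:
        commit = parents[commit][0]
    return commit


unknown = ancestors(parents, "v2.6.26-rc1") - ancestors(parents, "v2.6.25")
for commit in sorted(unknown):
    print(commit, "is based on", based_on(commit))
```

Both topic commits are in the range being bisected, yet checking either
one out gives a tree that calls itself 2.6.25-rc6.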

So what you see is perfectly normal and expected. It's only unexpected if
you think of history as a linear thing, but it isn't - it's full of
merging of code that was branched off from (much) earlier code points.

		Linus
