Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH] optimize ia32 memmove
Original-Message-ID: <Pine.LNX.4.58.0312300202520.2065@home.osdl.org>
Date: Tue, 30 Dec 2003 10:07:38 GMT
Message-ID: <fa.j5bsql4.1e24kgs@ifi.uio.no>

On Tue, 30 Dec 2003, Andreas Dilger wrote:
>
> The non-overlapping cases are probably very common and worth optimizing for:

No, almost all non-overlapping users already just use "memcpy()".

So most of the kernel uses of "memmove()" are likely overlapping - and
just optimizing the non-overlap case probably doesn't help a lot.

That's why you optimize the overlapping case in one direction. We really
should do the other case right too.

		Linus


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: [PATCH] optimize ia32 memmove
Original-Message-ID: <Pine.LNX.4.58.0312300152250.2065@home.osdl.org>
Date: Tue, 30 Dec 2003 10:00:53 GMT
Message-ID: <fa.j3s0rt4.1diclos@ifi.uio.no>

On Tue, 30 Dec 2003, Jeff Garzik wrote:
>
> I'm confused... that doesn't say anything to me about overlap.
>
> They can still overlap:  Consider if dest is 1 byte less than src, and
> n==128...

But then anything that does the loads in ascending order is still ok, so
it shouldn't matter - by the time "dest" has been overwritten, the source
data has already been read. And all the "memcpy()" implementations had
better do that anyway, in order to get nice memory access patterns. "rep
movsl" certainly does.
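
To make the example above concrete, here is a minimal sketch of a copy that loads in ascending order, run on the case where dest is one byte below src and n is 128. The names copy_ascending() and demo_dest_below_src() are hypothetical, made up for illustration; this is not the kernel's memcpy(), only a byte-wise stand-in for any copy that walks upward through memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte-wise copy that always walks upward in memory. */
static void *copy_ascending(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;	/* source byte i is loaded before dest byte i is stored */
	return dest;
}

/* The case from the quoted mail: dest is 1 byte less than src, n == 128. */
static void demo_dest_below_src(void)
{
	unsigned char buf[129];
	size_t i;

	for (i = 0; i < sizeof(buf); i++)
		buf[i] = (unsigned char)i;

	/* dest = buf, src = buf + 1: the two regions overlap in 127 bytes */
	copy_ascending(buf, buf + 1, 128);

	/* Every byte arrived intact: each store only hit a location
	 * whose original value had already been loaded. */
	for (i = 0; i < 128; i++)
		assert(buf[i] == (unsigned char)(i + 1));
}
```

Each store lands one byte behind the next load, so no source byte is clobbered before it has been read.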

So assuming we have an ascending "memcpy()", the only case we need to care
about is "overlap && dest > src".

Now, if we have a non-ascending memcpy(), we have trouble.
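
The reasoning above could be sketched roughly as follows. This is an illustration only, not the patch under discussion: sketch_memmove() and demo_sketch_memmove() are hypothetical names, and the ascending branch is written as an explicit byte loop because standard C does not promise that the library memcpy() is ascending or safe for overlapping buffers. Comparing pointers into unrelated objects is also not strictly portable C; the sketch assumes a flat address space, as the kernel does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: only "overlap && dest > src" needs a descending copy. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	if (d <= s || d >= s + n) {
		/* No overlap, or dest below src: an ascending copy is safe. */
		while (n--)
			*d++ = *s++;
	} else {
		/* overlap && dest > src: start at the end and walk downward,
		 * so the tail of the source is read before it is overwritten. */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}
	return dest;
}

static void demo_sketch_memmove(void)
{
	char up[] = "abcdefgh";
	char down[] = "abcdefgh";

	sketch_memmove(up + 2, up, 6);		/* overlap && dest > src */
	assert(memcmp(up, "ababcdef", 8) == 0);

	sketch_memmove(down, down + 2, 6);	/* overlap && dest < src */
	assert(memcmp(down, "cdefghgh", 8) == 0);
}
```

With an ascending memcpy() available, the first branch could simply call it; the explicit loop here just keeps the ordering assumption visible.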

		Linus
