Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address.
v2: - In each iteration sg_pgoffset pointed incorrectly at the next page not the current one.
Signed-off-by: Imre Deak imre.deak@intel.com --- include/linux/scatterlist.h | 50 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 4bd6c06..72578b5 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -231,6 +231,56 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, */ #define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist))
+struct sg_page_iter { + struct scatterlist *sg; + int sg_pgoffset; + struct page *page; +}; + +static inline int +sg_page_cnt(struct scatterlist *sg) +{ + BUG_ON(sg->offset || sg->length & ~PAGE_MASK); + + return sg->length >> PAGE_SHIFT; +} + +static inline struct page * +sg_page_iter_get_page(struct sg_page_iter *iter) +{ + while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) { + iter->sg_pgoffset -= sg_page_cnt(iter->sg); + iter->sg = sg_next(iter->sg); + } + + return iter->sg ? nth_page(sg_page(iter->sg), iter->sg_pgoffset) : NULL; +} + +static inline void +sg_page_iter_next(struct sg_page_iter *iter) +{ + iter->sg_pgoffset++; + iter->page = sg_page_iter_get_page(iter); +} + +static inline void +sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist, + unsigned long pgoffset) +{ + iter->sg = sglist; + iter->sg_pgoffset = pgoffset; + iter->page = sg_page_iter_get_page(iter); +} + +/* + * Simple sg page iterator, starting off at the given page offset. Each entry + * on the sglist must start at offset 0 and can contain only full pages. + * iter->page will point to the current page, iter->sg_pgoffset to the page + * offset within the sg holding that page. + */ +#define for_each_sg_page(sglist, iter, pgoffset) \ + for (sg_page_iter_start((iter), (sglist), (pgoffset)); \ + (iter)->page; sg_page_iter_next(iter))
/* * Mapping sg iterator
On Mon, 11 Feb 2013 20:50:04 +0200 Imre Deak imre.deak@intel.com wrote:
Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is
What is "the mapping iterator"?
meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address.
Where will this new macro be used? What is driving this effort?
v2:
- In each iteration sg_pgoffset pointed incorrectly at the next page not the current one.
Signed-off-by: Imre Deak imre.deak@intel.com
include/linux/scatterlist.h | 50 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 4bd6c06..72578b5 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -231,6 +231,56 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, */ #define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist)) +struct sg_page_iter {
- struct scatterlist *sg;
- int sg_pgoffset;
- struct page *page;
+};
Some documentation wouldn't hurt. What it's used for, why it exists.
+static inline int +sg_page_cnt(struct scatterlist *sg)
unneeded newline here.
A more typical name would be "sg_page_count". Stripping words of their vowels makes the symbols harder to remember.
+{
- BUG_ON(sg->offset || sg->length & ~PAGE_MASK);
- return sg->length >> PAGE_SHIFT;
+}
+static inline struct page * +sg_page_iter_get_page(struct sg_page_iter *iter) +{
- while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) {
iter->sg_pgoffset -= sg_page_cnt(iter->sg);
iter->sg = sg_next(iter->sg);
- }
- return iter->sg ? nth_page(sg_page(iter->sg), iter->sg_pgoffset) : NULL;
+}
+static inline void +sg_page_iter_next(struct sg_page_iter *iter) +{
- iter->sg_pgoffset++;
- iter->page = sg_page_iter_get_page(iter);
+}
+static inline void +sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist,
unsigned long pgoffset)
+{
- iter->sg = sglist;
- iter->sg_pgoffset = pgoffset;
- iter->page = sg_page_iter_get_page(iter);
+}
All the above are undocumented also. I guess that's acceptable if they are only ever to be used by for_each_sg_page(). Although if that's the case then perhaps the identifiers should be a bit more obscure-looking. Usually we prefix them with "__" to say "this is in internal thing".
+/*
- Simple sg page iterator, starting off at the given page offset. Each entry
- on the sglist must start at offset 0 and can contain only full pages.
- iter->page will point to the current page, iter->sg_pgoffset to the page
- offset within the sg holding that page.
- */
+#define for_each_sg_page(sglist, iter, pgoffset) \
- for (sg_page_iter_start((iter), (sglist), (pgoffset)); \
(iter)->page; sg_page_iter_next(iter))
Because all the helper functions are inlined, this will expand to a quite large amount of code. And large code can be slow code due to I-cache eviction.
I don't know *how* big this thing will be because the patch didn't include a caller and I can't be bothered writing my own. (And the lack of any caller means that the code will not be tested).
So, exactly how big is this thing, and how do we know it's better this way than if we were to uninline some/all of the helpers?
On Mon, 2013-02-11 at 12:54 -0800, Andrew Morton wrote:
On Mon, 11 Feb 2013 20:50:04 +0200 Imre Deak imre.deak@intel.com wrote:
Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is
What is "the mapping iterator"?
It's the one implemented by sg_miter_{start,stop} in scatterlist.c. It also iterates through a scatterlist a page at a time, but it also kmaps these pages. Since in our use case we don't need to map the pages we needed a solution without this overhead.
meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address.
Where will this new macro be used? What is driving this effort?
At the moment the only user of the macro would be the i915 driver, see [1] for the patches that takes it into use. In the patchset the macro was added as a DRM specific macro, but since it might be useful in the future for other drivers too (anything using dma-buf) I'd like to add it to a more generic place.
v2:
- In each iteration sg_pgoffset pointed incorrectly at the next page not the current one.
Signed-off-by: Imre Deak imre.deak@intel.com
include/linux/scatterlist.h | 50 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 4bd6c06..72578b5 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -231,6 +231,56 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, */ #define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist)) +struct sg_page_iter {
- struct scatterlist *sg;
- int sg_pgoffset;
- struct page *page;
+};
Some documentation wouldn't hurt. What it's used for, why it exists.
Ok, will add it.
+static inline int +sg_page_cnt(struct scatterlist *sg)
unneeded newline here.
A more typical name would be "sg_page_count". Stripping words of their vowels makes the symbols harder to remember.
Ok, will fix this.
+{
- BUG_ON(sg->offset || sg->length & ~PAGE_MASK);
- return sg->length >> PAGE_SHIFT;
+}
+static inline struct page * +sg_page_iter_get_page(struct sg_page_iter *iter) +{
- while (iter->sg && iter->sg_pgoffset >= sg_page_cnt(iter->sg)) {
iter->sg_pgoffset -= sg_page_cnt(iter->sg);
iter->sg = sg_next(iter->sg);
- }
- return iter->sg ? nth_page(sg_page(iter->sg), iter->sg_pgoffset) : NULL;
+}
+static inline void +sg_page_iter_next(struct sg_page_iter *iter) +{
- iter->sg_pgoffset++;
- iter->page = sg_page_iter_get_page(iter);
+}
+static inline void +sg_page_iter_start(struct sg_page_iter *iter, struct scatterlist *sglist,
unsigned long pgoffset)
+{
- iter->sg = sglist;
- iter->sg_pgoffset = pgoffset;
- iter->page = sg_page_iter_get_page(iter);
+}
All the above are undocumented also. I guess that's acceptable if they are only ever to be used by for_each_sg_page(). Although if that's the case then perhaps the identifiers should be a bit more obscure-looking. Usually we prefix them with "__" to say "this is in internal thing".
Yes, they are meant to be used only internally, so I'll add the __ prefix.
+/*
- Simple sg page iterator, starting off at the given page offset. Each entry
- on the sglist must start at offset 0 and can contain only full pages.
- iter->page will point to the current page, iter->sg_pgoffset to the page
- offset within the sg holding that page.
- */
+#define for_each_sg_page(sglist, iter, pgoffset) \
- for (sg_page_iter_start((iter), (sglist), (pgoffset)); \
(iter)->page; sg_page_iter_next(iter))
Because all the helper functions are inlined, this will expand to a quite large amount of code. And large code can be slow code due to I-cache eviction.
I don't know *how* big this thing will be because the patch didn't include a caller and I can't be bothered writing my own. (And the lack of any caller means that the code will not be tested).
So, exactly how big is this thing, and how do we know it's better this way than if we were to uninline some/all of the helpers?
I admit I only hoped compiler optimization would keep the inlined parts at a minimum, but now I actually checked (on Intel CPU). I applied the patchset from [1] and uninlined sg_page_iter_start as it's not significant for speed:
size drivers/gpu/drm/i915/i915.ko 514855 15996 272 531123 81ab3 drivers/gpu/drm/i915/i915.ko
Then uninlined all helpers: size drivers/gpu/drm/i915/i915.ko 513447 15996 272 529715 81533 drivers/gpu/drm/i915/i915.ko
Since there are 8 invocations of the macro, the overhead for a single invocation is about (531123 - 529715) / 8 = 191 bytes.
For speed, I benchmarked a simple loop which was basically:
page = vmalloc(sizeof(*page) * 1000, GFP_KERNEL); for_each_sg_page(sglist, iter, 0) *page++ = iter.page;
where each entry on the sglist contained 16 consecutive pages. This takes ~10% more time for the uninlined version to run. This is a rather artificial test and I couldn't come up with something more real-life using only the i915 driver's ioctl interface that would show a significant change in speed.
So at least for now I'm ok with just uninlining all the helpers.
Thanks for the review, Imre
[1] http://lists.freedesktop.org/archives/intel-gfx/2013-February/024589.html
Hello,
On Tue, Feb 12, 2013 at 07:07:20PM +0200, Imre Deak wrote:
It's the one implemented by sg_miter_{start,stop} in scatterlist.c. It also iterates through a scatterlist a page at a time, but it also kmaps these pages. Since in our use case we don't need to map the pages we needed a solution without this overhead.
I'm not against having non-mapping iterator but please consider that kmaps are no-ops on many configurations. It matters only for archs w/ high memory.
where each entry on the sglist contained 16 consecutive pages. This takes ~10% more time for the uninlined version to run. This is a rather artificial test and I couldn't come up with something more real-life using only the i915 driver's ioctl interface that would show a significant change in speed.
So at least for now I'm ok with just uninlining all the helpers.
Can we reimplement mapping iters using the new ones?
Thanks.
On Tue, 2013-02-12 at 09:13 -0800, Tejun Heo wrote:
Hello,
On Tue, Feb 12, 2013 at 07:07:20PM +0200, Imre Deak wrote:
It's the one implemented by sg_miter_{start,stop} in scatterlist.c. It also iterates through a scatterlist a page at a time, but it also kmaps these pages. Since in our use case we don't need to map the pages we needed a solution without this overhead.
I'm not against having non-mapping iterator but please consider that kmaps are no-ops on many configurations. It matters only for archs w/ high memory.
Ok, I haven't thought about that. But in any case we care about those archs too and would like to avoid the mapping there as well.
where each entry on the sglist contained 16 consecutive pages. This takes ~10% more time for the uninlined version to run. This is a rather artificial test and I couldn't come up with something more real-life using only the i915 driver's ioctl interface that would show a significant change in speed.
So at least for now I'm ok with just uninlining all the helpers.
Can we reimplement mapping iters using the new ones?
Yes I think it's a good idea, I will follow up with a new patchset addressing this and Andrew's comments.
Thanks, Imre
On Tue, 12 Feb 2013 19:07:20 +0200 Imre Deak imre.deak@intel.com wrote:
So, exactly how big is this thing, and how do we know it's better this way than if we were to uninline some/all of the helpers?
I admit I only hoped compiler optimization would keep the inlined parts at a minimum, but now I actually checked (on Intel CPU). I applied the patchset from [1] and uninlined sg_page_iter_start as it's not significant for speed:
size drivers/gpu/drm/i915/i915.ko 514855 15996 272 531123 81ab3 drivers/gpu/drm/i915/i915.ko
Then uninlined all helpers: size drivers/gpu/drm/i915/i915.ko 513447 15996 272 529715 81533 drivers/gpu/drm/i915/i915.ko
Since there are 8 invocations of the macro, the overhead for a single invocation is about (531123 - 529715) / 8 = 191 bytes.
For speed, I benchmarked a simple loop which was basically:
page = vmalloc(sizeof(*page) * 1000, GFP_KERNEL); for_each_sg_page(sglist, iter, 0) *page++ = iter.page;
where each entry on the sglist contained 16 consecutive pages. This takes ~10% more time for the uninlined version to run. This is a rather artificial test and I couldn't come up with something more real-life using only the i915 driver's ioctl interface that would show a significant change in speed.
10% for the function call overhead sounds reasonable. Of course, that test is biased in one direction. A test which was biased in the other direction would exercise all eight of the macro's callsites and would investigate the performance impact of a 1kbyte increase in L1 cache utilisation.
And I must say, it would need to be a pretty damn carefully crafted test case to be able to trigger enough cache thrashing to cause a 10% hit.
So at least for now I'm ok with just uninlining all the helpers.
OK.
Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address.
v2: - sg_pgoffset pointed incorrectly at the next page not the current one.
v3: - formatting fixes, documentation (Andrew) - uninline helper functions, as they are too big (Andrew) - support for terminating the iteration after a maximum number sglist->nents entries, needed by the next patch
Signed-off-by: Imre Deak imre.deak@intel.com --- include/linux/scatterlist.h | 35 +++++++++++++++++++++++++++++++++++ lib/scatterlist.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 73 insertions(+)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 4bd6c06..788a853 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -231,6 +231,41 @@ size_t sg_copy_to_buffer(struct scatterlist *sgl, unsigned int nents, */ #define SG_MAX_SINGLE_ALLOC (PAGE_SIZE / sizeof(struct scatterlist))
+/* + * sg page iterator + * + * Iterates over sg entries page-by-page. On each successful iteration, + * @piter->page points to the current page, @piter->sg to the sg holding this + * page and @piter->sg_pgoffset to the page's page offset within the sg. The + * iteration will stop either when a maximum number of sg entries was reached + * or a terminating sg (sg_last(sg) == true) was reached. + */ +struct sg_page_iter { + struct page *page; /* current page */ + struct scatterlist *sg; /* sg holding the page */ + unsigned int sg_pgoffset; /* page offset within the sg */ + + /* these are internal states, keep away */ + unsigned int __nents; /* remaining sg entries */ + int __pg_advance; /* nr pages to advance at the + * next step */ +}; + +bool __sg_page_iter_next(struct sg_page_iter *piter); +void __sg_page_iter_start(struct sg_page_iter *piter, + struct scatterlist *sglist, unsigned int nents, + unsigned long pgoffset); + +/** + * for_each_sg_page - iterate over the pages of the given sg list + * @sglist: sglist to iterate over + * @piter: page iterator to hold current page, sg, sg_pgoffset + * @nents: maximum number of sg entries to iterate over + * @pgoffset: starting page offset + */ +#define for_each_sg_page(sglist, piter, nents, pgoffset) \ + for (__sg_page_iter_start((piter), (sglist), (nents), (pgoffset)); \ + __sg_page_iter_next(piter);)
/* * Mapping sg iterator diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 7874b01..a1d1564 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -394,6 +394,44 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, } EXPORT_SYMBOL(sg_alloc_table_from_pages);
+void __sg_page_iter_start(struct sg_page_iter *piter, + struct scatterlist *sglist, unsigned int nents, + unsigned long pgoffset) +{ + piter->__pg_advance = 0; + piter->__nents = nents; + + piter->page = NULL; + piter->sg = sglist; + piter->sg_pgoffset = pgoffset; +} +EXPORT_SYMBOL(__sg_page_iter_start); + +static int sg_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; +} + +bool __sg_page_iter_next(struct sg_page_iter *piter) +{ + if (!piter->__nents || !piter->sg) + return false; + + piter->sg_pgoffset += piter->__pg_advance; + piter->__pg_advance = 1; + + while (piter->sg_pgoffset >= sg_page_count(piter->sg)) { + piter->sg_pgoffset -= sg_page_count(piter->sg); + piter->sg = sg_next(piter->sg); + if (!--piter->__nents || !piter->sg) + return false; + } + piter->page = nth_page(sg_page(piter->sg), piter->sg_pgoffset); + + return true; +} +EXPORT_SYMBOL(__sg_page_iter_next); + /** * sg_miter_start - start mapping iteration over a sg list * @miter: sg mapping iter to be started
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo tj@kernel.org.
Signed-off-by: Imre Deak imre.deak@intel.com --- include/linux/scatterlist.h | 6 +++--- lib/scatterlist.c | 46 ++++++++++++++++++++----------------------- 2 files changed, 24 insertions(+), 28 deletions(-)
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 788a853..a6cd692 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -295,9 +295,9 @@ struct sg_mapping_iter { size_t consumed; /* number of consumed bytes */
/* these are internal states, keep away */ - struct scatterlist *__sg; /* current entry */ - unsigned int __nents; /* nr of remaining entries */ - unsigned int __offset; /* offset within sg */ + unsigned int __offset; /* offset within page */ + struct sg_page_iter __piter; /* page iterator */ + unsigned int __remaining; /* remaining bytes on page */ unsigned int __flags; };
diff --git a/lib/scatterlist.c b/lib/scatterlist.c index a1d1564..4e4974a 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -449,9 +449,7 @@ void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl, { memset(miter, 0, sizeof(struct sg_mapping_iter));
- miter->__sg = sgl; - miter->__nents = nents; - miter->__offset = 0; + __sg_page_iter_start(&miter->__piter, sgl, nents, 0); WARN_ON(!(flags & (SG_MITER_TO_SG | SG_MITER_FROM_SG))); miter->__flags = flags; } @@ -476,36 +474,33 @@ EXPORT_SYMBOL(sg_miter_start); */ bool sg_miter_next(struct sg_mapping_iter *miter) { - unsigned int off, len; - - /* check for end and drop resources from the last iteration */ - if (!miter->__nents) - return false; - sg_miter_stop(miter);
- /* get to the next sg if necessary. __offset is adjusted by stop */ - while (miter->__offset == miter->__sg->length) { - if (--miter->__nents) { - miter->__sg = sg_next(miter->__sg); - miter->__offset = 0; - } else + /* + * Get to the next page if necessary. + * __remaining, __offset is adjusted by sg_miter_stop + */ + if (!miter->__remaining) { + struct scatterlist *sg; + unsigned long pgoffset; + + if (!__sg_page_iter_next(&miter->__piter)) return false; - }
- /* map the next page */ - off = miter->__sg->offset + miter->__offset; - len = miter->__sg->length - miter->__offset; + sg = miter->__piter.sg; + pgoffset = miter->__piter.sg_pgoffset;
- miter->page = nth_page(sg_page(miter->__sg), off >> PAGE_SHIFT); - off &= ~PAGE_MASK; - miter->length = min_t(unsigned int, len, PAGE_SIZE - off); - miter->consumed = miter->length; + miter->__offset = pgoffset ? 0 : sg->offset; + miter->__remaining = sg->offset + sg->length - + (pgoffset << PAGE_SHIFT) - miter->__offset; + } + miter->page = miter->__piter.page; + miter->consumed = miter->length = miter->__remaining;
if (miter->__flags & SG_MITER_ATOMIC) - miter->addr = kmap_atomic(miter->page) + off; + miter->addr = kmap_atomic(miter->page) + miter->__offset; else - miter->addr = kmap(miter->page) + off; + miter->addr = kmap(miter->page) + miter->__offset;
return true; } @@ -532,6 +527,7 @@ void sg_miter_stop(struct sg_mapping_iter *miter) /* drop resources from the last iteration */ if (miter->addr) { miter->__offset += miter->consumed; + miter->__remaining -= miter->consumed;
if (miter->__flags & SG_MITER_TO_SG) flush_kernel_dcache_page(miter->page);
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo.
v1-v3: - original version
v4: - The dw_mmc driver used sg_mapping_iter::__sg, which was marked as internal and moved by this patch to the new sg_page_iter struct. This caused a compile time breakage for the driver. Fix this by making sg_mapping_iter::piter a public interface and making the driver use sg_mapping_iter::piter.sg instead of sg_mapping_iter::__sg. Thanks to James Hogan james.hogan@imgtec.com for pointing this out.
Signed-off-by: Imre Deak imre.deak@intel.com Cc: Maxim Levitsky maximlevitsky@gmail.com Cc: Tejun Heo tj@kernel.org Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Andrew Morton akpm@linux-foundation.org --- drivers/mmc/host/dw_mmc.c | 4 ++-- include/linux/scatterlist.h | 6 +++--- lib/scatterlist.c | 46 ++++++++++++++++++++----------------------- 3 files changed, 26 insertions(+), 30 deletions(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 323c502..6b89fde 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -1445,7 +1445,7 @@ static void dw_mci_read_data_pio(struct dw_mci *host) if (!sg_miter_next(sg_miter)) goto done;
- host->sg = sg_miter->__sg; + host->sg = sg_miter->piter.sg; buf = sg_miter->addr; remain = sg_miter->length; offset = 0; @@ -1500,7 +1500,7 @@ static void dw_mci_write_data_pio(struct dw_mci *host) if (!sg_miter_next(sg_miter)) goto done;
- host->sg = sg_miter->__sg; + host->sg = sg_miter->piter.sg; buf = sg_miter->addr; remain = sg_miter->length; offset = 0; diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 788a853..2d8bdae 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -293,11 +293,11 @@ struct sg_mapping_iter { void *addr; /* pointer to the mapped area */ size_t length; /* length of the mapped area */ size_t consumed; /* number of consumed bytes */ + struct sg_page_iter piter; /* page iterator */
/* these are internal states, keep away */ - struct scatterlist *__sg; /* current entry */ - unsigned int __nents; /* nr of remaining entries */ - unsigned int __offset; /* offset within sg */ + unsigned int __offset; /* offset within page */ + unsigned int __remaining; /* remaining bytes on page */ unsigned int __flags; };
diff --git a/lib/scatterlist.c b/lib/scatterlist.c index a1d1564..2645acf 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -449,9 +449,7 @@ void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl, { memset(miter, 0, sizeof(struct sg_mapping_iter));
- miter->__sg = sgl; - miter->__nents = nents; - miter->__offset = 0; + __sg_page_iter_start(&miter->piter, sgl, nents, 0); WARN_ON(!(flags & (SG_MITER_TO_SG | SG_MITER_FROM_SG))); miter->__flags = flags; } @@ -476,36 +474,33 @@ EXPORT_SYMBOL(sg_miter_start); */ bool sg_miter_next(struct sg_mapping_iter *miter) { - unsigned int off, len; - - /* check for end and drop resources from the last iteration */ - if (!miter->__nents) - return false; - sg_miter_stop(miter);
- /* get to the next sg if necessary. __offset is adjusted by stop */ - while (miter->__offset == miter->__sg->length) { - if (--miter->__nents) { - miter->__sg = sg_next(miter->__sg); - miter->__offset = 0; - } else + /* + * Get to the next page if necessary. + * __remaining, __offset is adjusted by sg_miter_stop + */ + if (!miter->__remaining) { + struct scatterlist *sg; + unsigned long pgoffset; + + if (!__sg_page_iter_next(&miter->piter)) return false; - }
- /* map the next page */ - off = miter->__sg->offset + miter->__offset; - len = miter->__sg->length - miter->__offset; + sg = miter->piter.sg; + pgoffset = miter->piter.sg_pgoffset;
- miter->page = nth_page(sg_page(miter->__sg), off >> PAGE_SHIFT); - off &= ~PAGE_MASK; - miter->length = min_t(unsigned int, len, PAGE_SIZE - off); - miter->consumed = miter->length; + miter->__offset = pgoffset ? 0 : sg->offset; + miter->__remaining = sg->offset + sg->length - + (pgoffset << PAGE_SHIFT) - miter->__offset; + } + miter->page = miter->piter.page; + miter->consumed = miter->length = miter->__remaining;
if (miter->__flags & SG_MITER_ATOMIC) - miter->addr = kmap_atomic(miter->page) + off; + miter->addr = kmap_atomic(miter->page) + miter->__offset; else - miter->addr = kmap(miter->page) + off; + miter->addr = kmap(miter->page) + miter->__offset;
return true; } @@ -532,6 +527,7 @@ void sg_miter_stop(struct sg_mapping_iter *miter) /* drop resources from the last iteration */ if (miter->addr) { miter->__offset += miter->consumed; + miter->__remaining -= miter->consumed;
if (miter->__flags & SG_MITER_TO_SG) flush_kernel_dcache_page(miter->page);
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo.
v1-v3: - original version
v4: - The dw_mmc driver used sg_mapping_iter::__sg, which was marked as internal and moved by this patch to the new sg_page_iter struct. This caused a compile time breakage for the driver. Fix this by making sg_mapping_iter::piter a public interface and making the driver use sg_mapping_iter::piter.sg instead of sg_mapping_iter::__sg. Thanks to James Hogan for pointing this out.
v5: - Fix the mapping size to be page-size limited. Thanks to Stephen Warren for tracking down the bug.
Signed-off-by: Imre Deak imre.deak@intel.com Cc: Maxim Levitsky maximlevitsky@gmail.com Cc: Tejun Heo tj@kernel.org Cc: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Andrew Morton akpm@linux-foundation.org --- drivers/mmc/host/dw_mmc.c | 4 ++-- include/linux/scatterlist.h | 6 +++--- lib/scatterlist.c | 48 +++++++++++++++++++++---------------------- 3 files changed, 28 insertions(+), 30 deletions(-)
diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c index 323c502..6b89fde 100644 --- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -1445,7 +1445,7 @@ static void dw_mci_read_data_pio(struct dw_mci *host) if (!sg_miter_next(sg_miter)) goto done;
- host->sg = sg_miter->__sg; + host->sg = sg_miter->piter.sg; buf = sg_miter->addr; remain = sg_miter->length; offset = 0; @@ -1500,7 +1500,7 @@ static void dw_mci_write_data_pio(struct dw_mci *host) if (!sg_miter_next(sg_miter)) goto done;
- host->sg = sg_miter->__sg; + host->sg = sg_miter->piter.sg; buf = sg_miter->addr; remain = sg_miter->length; offset = 0; diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 788a853..2d8bdae 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -293,11 +293,11 @@ struct sg_mapping_iter { void *addr; /* pointer to the mapped area */ size_t length; /* length of the mapped area */ size_t consumed; /* number of consumed bytes */ + struct sg_page_iter piter; /* page iterator */
/* these are internal states, keep away */ - struct scatterlist *__sg; /* current entry */ - unsigned int __nents; /* nr of remaining entries */ - unsigned int __offset; /* offset within sg */ + unsigned int __offset; /* offset within page */ + unsigned int __remaining; /* remaining bytes on page */ unsigned int __flags; };
diff --git a/lib/scatterlist.c b/lib/scatterlist.c index a1d1564..b83c144 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -449,9 +449,7 @@ void sg_miter_start(struct sg_mapping_iter *miter, struct scatterlist *sgl, { memset(miter, 0, sizeof(struct sg_mapping_iter));
- miter->__sg = sgl; - miter->__nents = nents; - miter->__offset = 0; + __sg_page_iter_start(&miter->piter, sgl, nents, 0); WARN_ON(!(flags & (SG_MITER_TO_SG | SG_MITER_FROM_SG))); miter->__flags = flags; } @@ -476,36 +474,35 @@ EXPORT_SYMBOL(sg_miter_start); */ bool sg_miter_next(struct sg_mapping_iter *miter) { - unsigned int off, len; - - /* check for end and drop resources from the last iteration */ - if (!miter->__nents) - return false; - sg_miter_stop(miter);
- /* get to the next sg if necessary. __offset is adjusted by stop */ - while (miter->__offset == miter->__sg->length) { - if (--miter->__nents) { - miter->__sg = sg_next(miter->__sg); - miter->__offset = 0; - } else + /* + * Get to the next page if necessary. + * __remaining, __offset is adjusted by sg_miter_stop + */ + if (!miter->__remaining) { + struct scatterlist *sg; + unsigned long pgoffset; + + if (!__sg_page_iter_next(&miter->piter)) return false; - }
- /* map the next page */ - off = miter->__sg->offset + miter->__offset; - len = miter->__sg->length - miter->__offset; + sg = miter->piter.sg; + pgoffset = miter->piter.sg_pgoffset;
- miter->page = nth_page(sg_page(miter->__sg), off >> PAGE_SHIFT); - off &= ~PAGE_MASK; - miter->length = min_t(unsigned int, len, PAGE_SIZE - off); - miter->consumed = miter->length; + miter->__offset = pgoffset ? 0 : sg->offset; + miter->__remaining = sg->offset + sg->length - + (pgoffset << PAGE_SHIFT) - miter->__offset; + miter->__remaining = min_t(unsigned long, miter->__remaining, + PAGE_SIZE - miter->__offset); + } + miter->page = miter->piter.page; + miter->consumed = miter->length = miter->__remaining;
if (miter->__flags & SG_MITER_ATOMIC) - miter->addr = kmap_atomic(miter->page) + off; + miter->addr = kmap_atomic(miter->page) + miter->__offset; else - miter->addr = kmap(miter->page) + off; + miter->addr = kmap(miter->page) + miter->__offset;
return true; } @@ -532,6 +529,7 @@ void sg_miter_stop(struct sg_mapping_iter *miter) /* drop resources from the last iteration */ if (miter->addr) { miter->__offset += miter->consumed; + miter->__remaining -= miter->consumed;
if (miter->__flags & SG_MITER_TO_SG) flush_kernel_dcache_page(miter->page);
On 02/24/2013 04:05 AM, Imre Deak wrote:
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo.
Just for completeness, this updated combined patch, Tested-by: Stephen Warren swarren@wwwdotorg.org
On 02/13/2013 08:10 AM, Imre Deak wrote:
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo tj@kernel.org.
This patch appears in linux-next since next-20130220. It breaks mounting a root filesystem on an SD card on the Raspberry Pi ARM platform, with errors such as those shown below.
next-20130222 with just this patch reverted works fine.
[ 0.708426] VFS: Mounted root (ext4 filesystem) on device 179:2. [ 0.723742] devtmpfs: mounted [ 0.733064] Freeing init memory: 204K [ 0.777992] EXT4-fs error (device mmcblk0p2): ext4_iget:3814: inode #4259: comm swapper: bad extra_isize (57200 != 256) [ 0.815172] EXT4-fs error (device mmcblk0p2): ext4_lookup:1428: inode #8198: comm swapper: deleted inode referenced: 487 [ 0.826179] Kernel panic - not syncing: No init found. Try passing init= option to kernel. See Linux Documentation/init.txt for guidance.
[ 0.719365] VFS: Mounted root (ext4 filesystem) on device 179:2. [ 0.740918] devtmpfs: mounted [ 0.745219] Freeing init memory: 204K ERROR: ld.so: object '/usr/lib/arm-linux-gnueabihf/libcofi_rpi.so' from /etc/ld.so.preload cannot be preloaded: ignored. [ 0.906840] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.724018] VFS: Mounted root (ext4 filesystem) on device 179:2. [ 0.739404] devtmpfs: mounted [ 0.748741] Freeing init memory: 204K [ 0.793603] EXT4-fs error (device mmcblk0p2): ext4_iget:3814: inode #4259: comm swapper: bad extra_isize (57200 != 256) [ 0.822138] Kernel panic - not syncing: No init found. Try passing init= option to kernel. See Linux Documentation/init.txt for guidance.
On Fri, 2013-02-22 at 21:29 -0700, Stephen Warren wrote:
On 02/13/2013 08:10 AM, Imre Deak wrote:
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo tj@kernel.org.
This patch appears in linux-next since next-20130220. It breaks mounting a root filesystem on an SD card on the Raspberry Pi ARM platform, with errors such as those shown below.
next-20130222 with just this patch reverted works fine.
Thanks for tracking this down. I noticed now an obvious mistake I've made, not limiting the mapping size to page size :/ I didn't hit it since it only causes a problem when the user of miter modifies miter->consumed and I think nothing does this on my machine. Since the sdhci driver on Raspberry Pi does this the following might fix the problem you saw. Could you give it a try? It applies on top of v4 of the patch [1]:
diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 2645acf..b83c144 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -493,6 +493,8 @@ bool sg_miter_next(struct sg_mapping_iter *miter) miter->__offset = pgoffset ? 0 : sg->offset; miter->__remaining = sg->offset + sg->length - (pgoffset << PAGE_SHIFT) - miter->__offset; + miter->__remaining = min_t(unsigned long, miter->__remaining, + PAGE_SIZE - miter->__offset); } miter->page = miter->piter.page; miter->consumed = miter->length = miter->__remaining;
--Imre
[1] http://lists.linaro.org/pipermail/linaro-mm-sig/2013-February/003069.html
On Sat, 23 Feb 2013 22:04:06 +0200, Imre Deak wrote:
On Fri, 2013-02-22 at 21:29 -0700, Stephen Warren wrote:
On 02/13/2013 08:10 AM, Imre Deak wrote:
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo tj@kernel.org.
This patch appears in linux-next since next-20130220. It breaks mounting a root filesystem on an SD card on the Raspberry Pi ARM platform, with errors such as those shown below.
next-20130222 with just this patch reverted works fine.
... the following might fix the problem you saw. Could you give it a try? It applies on top of v4 of the patch [1]:
Yes, thanks very much. That incremental patch you gave solves the problem perfectly.
linaro-mm-sig@lists.linaro.org