Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 0/4] zswap: compressed swap caching

This is the latest version of the zswap patchset for compressed swap caching.
This is submitted for merging into linux-next and inclusion in v3.11.

New in this Version:

Lucky 13!
Integrated feedback from Andrew and moved zbud page metadata out of the
struct page to an in-line header structure.

Useful References:

LSFMM: In-kernel memory compression
https://lwn.net/Articles/548109/

The zswap compressed swap cache
https://lwn.net/Articles/537422/

Zswap Overview:

Zswap is a lightweight compressed cache for swap pages. It takes
pages that are in the process of being swapped out and attempts to
compress them into a dynamically allocated RAM-based memory pool.
If this process is successful, the writeback to the swap device is
deferred and, in many cases, avoided completely.  This results in
a significant I/O reduction and performance gains for systems that
are swapping.

The results of a kernel building benchmark indicate a
runtime reduction of 53% and an I/O reduction of 76% with zswap vs normal
swapping with a kernel build under heavy memory pressure (see

Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 1/4] debugfs: add get/set for atomic types

debugfs currently lacks the ability to create attributes
that set/get atomic_t values.

This patch adds support for this through a new
debugfs_create_atomic_t() function.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
---
 fs/debugfs/file.c       |   42 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/debugfs.h |    2 ++
 lib/fault-inject.c      |   21 ---------------------
 3 files changed, 44 insertions(+), 21 deletions(-)

diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c
index c5ca6ae..ff64bcd 100644
--- a/fs/debugfs/file.c
+++ b/fs/debugfs/file.c
@@ -21,6 +21,7 @@
 #include <linux/debugfs.h>
 #include <linux/io.h>
 #include <linux/slab.h>
+#include <linux/atomic.h>

 static ssize_t default_read_file(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
@@ -403,6 +404,47 @@ struct dentry *debugfs_create_size_t(const char *name, umode_t mode,
 }

Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 2/4] zbud: add to mm/

zbud is a special-purpose allocator for storing compressed pages. It is
designed to store up to two compressed pages per physical page.  While this
design limits storage density, it has simple and deterministic reclaim
properties that make it preferable to a higher density approach when reclaim
will be used.

zbud works by storing compressed pages, or "zpages", together in pairs in a
single memory page called a "zbud page".  The first buddy is "left
justified" at the beginning of the zbud page, and the last buddy is "right
justified" at the end of the zbud page.  The benefit is that if either
buddy is freed, the freed buddy space, coalesced with whatever slack space
existed between the buddies, results in the largest possible free region
within the zbud page.
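
The layout described above can be illustrated with a small user-space
sketch. This is not the code from mm/zbud.c: the struct, the function
names, and the 4096-byte page size are assumptions made for illustration,
and the in-line header mentioned in the cover letter is ignored for
simplicity.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, not from the patch */

/* Hypothetical model of one zbud page; a size of 0 means "no buddy". */
struct zbud_page_sketch {
	size_t first_size;	/* first buddy, left justified at offset 0 */
	size_t last_size;	/* last buddy, right justified at the page end */
};

/* Byte offset at which the right-justified last buddy starts. */
static size_t last_offset(const struct zbud_page_sketch *zp)
{
	return SKETCH_PAGE_SIZE - zp->last_size;
}

/* Size of the single contiguous free region between the two buddies. */
static size_t free_region(const struct zbud_page_sketch *zp)
{
	assert(zp->first_size + zp->last_size <= SKETCH_PAGE_SIZE);
	return SKETCH_PAGE_SIZE - zp->first_size - zp->last_size;
}
```

Because the buddies sit at opposite ends, freeing either one (setting its
size to 0 in this model) merges its space with the slack in the middle, so
the free space is always one contiguous region.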

zbud also provides an attractive lower bound on density. The ratio of zpages
to zbud pages can not be less than 1.  This ensures that zbud can never "do
harm" by using more pages to store zpages than the uncompressed zpages would
have used on their own.

This implementation is a rewrite of the zbud allocator internally used
by zcache in the drivers/staging tree.  The rewrite was necessary to
remove some of the zcache-specific elements that were ingrained throughout
and provide a generic allocation interface that can later be used by
zsmalloc and others.

This patch adds zbud to mm/ for later use by zswap.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
---

Bob Liu | 5 Jun 08:43 2013

Re: [PATCHv13 2/4] zbud: add to mm/

Hi Seth,

On 06/04/2013 04:33 AM, Seth Jennings wrote:
> zbud is a special-purpose allocator for storing compressed pages. It is
> designed to store up to two compressed pages per physical page.  While this
> design limits storage density, it has simple and deterministic reclaim
> properties that make it preferable to a higher density approach when reclaim
> will be used.
> 
> zbud works by storing compressed pages, or "zpages", together in pairs in a
> single memory page called a "zbud page".  The first buddy is "left
> justified" at the beginning of the zbud page, and the last buddy is "right
> justified" at the end of the zbud page.  The benefit is that if either
> buddy is freed, the freed buddy space, coalesced with whatever slack space
> existed between the buddies, results in the largest possible free region
> within the zbud page.
> 
> zbud also provides an attractive lower bound on density. The ratio of zpages
> to zbud pages can not be less than 1.  This ensures that zbud can never "do
> harm" by using more pages to store zpages than the uncompressed zpages would
> have used on their own.
> 
> This implementation is a rewrite of the zbud allocator internally used
> by zcache in the drivers/staging tree.  The rewrite was necessary to
> remove some of the zcache-specific elements that were ingrained throughout
> and provide a generic allocation interface that can later be used by
> zsmalloc and others.
> 
> This patch adds zbud to mm/ for later use by zswap.
> 

Seth Jennings | 5 Jun 15:55 2013

Re: [PATCHv13 2/4] zbud: add to mm/

On Wed, Jun 05, 2013 at 02:43:28PM +0800, Bob Liu wrote:
> Hi Seth,
> 
> On 06/04/2013 04:33 AM, Seth Jennings wrote:
> > +	/* Couldn't find unbuddied zbud page, create new one */
> 
> How about moving zswap_is_full() here?
> 
> if (zswap_is_full()) {
> 	/* Don't alloc any new page; try to reclaim and directly use the
> reclaimed page instead */

Yes, this is at the top of the list for improvements.

I have already started on this work and it isn't quite as simple as it seems.
The difficulty arises from the fact that, for now, zswap uses per-cpu
compression buffers which require preemption to be disabled. This prevents
calling zbud_reclaim_page() in zbud_alloc() because the eviction handler for
the user may do something that can wait; an allocation with GFP_WAIT, for
example.

So it's going to take some massaging in the zswap layer to get that to work.

It's very doable.  Just not in this patchset without causing a lot of code
thrash.

> }
> 
> > +	spin_unlock(&pool->lock);
> > +	page = alloc_page(gfp);

Seth Jennings | 3 Jun 22:33 2013
Picon

[PATCHv13 3/4] zswap: add to mm/

zswap is a thin backend for frontswap that takes pages that are in the process
of being swapped out and attempts to compress them and store them in a
RAM-based memory pool.  This can result in a significant I/O reduction on the
swap device and, in the case where decompressing from RAM is faster than
reading from the swap device, can also improve workload performance.

It also has support for evicting swap pages that are currently compressed in
zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
true cache in that, once the cache is full, the oldest pages can be moved out
of zswap to the swap device so newer pages can be compressed and stored in
zswap.
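
The eviction behaviour described above can be sketched in user space as a
fixed-size cache that writes back its oldest entry when full. This is only
an illustration under stated assumptions, not the logic in mm/zswap.c: the
struct, the function name, the four-entry capacity, and the strict
oldest-first order (the patch only claims an "LRU(ish)" order) are all
hypothetical.

```c
#include <stddef.h>

#define SKETCH_CAPACITY 4	/* assumed tiny pool limit, in entries */

/* Hypothetical model of the compressed cache; entries are kept oldest first. */
struct zswap_lru_sketch {
	unsigned long offsets[SKETCH_CAPACITY];	/* swap offsets, ring buffer */
	size_t head;		/* index of the oldest entry */
	size_t count;		/* entries currently cached */
	size_t written_back;	/* entries evicted to the swap device */
};

/*
 * Store one page.  If the cache is full, the oldest entry is evicted
 * first (standing in for writeback to the swap device), its offset is
 * returned through *evicted, and the function returns 1; otherwise 0.
 */
static int sketch_store(struct zswap_lru_sketch *c, unsigned long offset,
			unsigned long *evicted)
{
	int did_evict = 0;

	if (c->count == SKETCH_CAPACITY) {
		*evicted = c->offsets[c->head];
		c->head = (c->head + 1) % SKETCH_CAPACITY;
		c->count--;
		c->written_back++;
		did_evict = 1;
	}
	c->offsets[(c->head + c->count) % SKETCH_CAPACITY] = offset;
	c->count++;
	return did_evict;
}
```

In this model a fifth store into a full four-entry cache pushes the first
stored offset out to the swap device and keeps the four newest entries.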

This patch adds the zswap driver to mm/

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
---
 mm/Kconfig  |   20 ++
 mm/Makefile |    1 +
 mm/zswap.c  |  943 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 964 insertions(+)
 create mode 100644 mm/zswap.c

diff --git a/mm/Kconfig b/mm/Kconfig
index 3367ac3..eec97f2 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -487,3 +487,23 @@ config ZBUD
 	  page.  While this design limits storage density, it has simple and
 	  deterministic reclaim properties that make it preferable to a higher

Bob Liu | 17 Jun 08:20 2013

Re: [PATCHv13 3/4] zswap: add to mm/

Hi Seth,

On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
<sjenning@linux.vnet.ibm.com> wrote:
> zswap is a thin backend for frontswap that takes pages that are in the process
> of being swapped out and attempts to compress them and store them in a
> RAM-based memory pool.  This can result in a significant I/O reduction on the
> swap device and, in the case where decompressing from RAM is faster than
> reading from the swap device, can also improve workload performance.
>
> It also has support for evicting swap pages that are currently compressed in
> zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
> true cache in that, once the cache is full, the oldest pages can be moved out
> of zswap to the swap device so newer pages can be compressed and stored in
> zswap.
>
> This patch adds the zswap driver to mm/
>

Do you have any more benchmarks you can share with me, so I can figure out
whether we can benefit from zswap?

I found that zswap causes a performance drop when tested with mmtests-0.10.
The config file I'm using is: config-global-dhp__parallelio-memcachetest

The result is:
(v3.10-rc4-2G-nozswap was without zswap but the performance is better.)

                                         v3.10-rc4                   v3.10-rc4
                                     2G-zswap-base                  2G-nozswap

Andrew Morton | 18 Jun 01:02 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Mon, 17 Jun 2013 14:20:05 +0800 Bob Liu <lliubbo@gmail.com> wrote:

> Hi Seth,
> 
> On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
> <sjenning@linux.vnet.ibm.com> wrote:
> > zswap is a thin backend for frontswap that takes pages that are in the process
> > of being swapped out and attempts to compress them and store them in a
> > RAM-based memory pool.  This can result in a significant I/O reduction on the
> > swap device and, in the case where decompressing from RAM is faster than
> > reading from the swap device, can also improve workload performance.
> >
> > It also has support for evicting swap pages that are currently compressed in
> > zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
> > true cache in that, once the cache is full, the oldest pages can be moved out
> > of zswap to the swap device so newer pages can be compressed and stored in
> > zswap.
> >
> > This patch adds the zswap driver to mm/
> >
> 
> Do you have any more benchmarks you can share with me, so I can figure out
> whether we can benefit from zswap?
> 
> I found that zswap causes a performance drop when tested with mmtests-0.10.
> The config file I'm using is: config-global-dhp__parallelio-memcachetest

Thanks for testing.

> The result is:

Bob Liu | 18 Jun 13:50 2013

Re: [PATCHv13 3/4] zswap: add to mm/

>
> So the minor fault rate improved and everything else got worse?

I ran the test again in a new, clean environment.
I'm sure the config files are the same except that zswap is enabled.

                                         v3.10-rc4                   v3.10-rc4
                             2G-parallio-zswapbas         2G-parallio-nozswap
Ops memcachetest-0M               819.00 (  0.00%)           1041.00 ( 27.11%)
Ops memcachetest-198M             736.00 (  0.00%)            973.00 ( 32.20%)
Ops memcachetest-430M             700.00 (  0.00%)            892.00 ( 27.43%)
Ops memcachetest-661M             672.00 (  0.00%)            819.00 ( 21.88%)
Ops memcachetest-893M             675.00 (  0.00%)            775.00 ( 14.81%)
Ops memcachetest-1125M            665.00 (  0.00%)            764.00 ( 14.89%)
Ops memcachetest-1356M            641.00 (  0.00%)            749.00 ( 16.85%)
Ops io-duration-0M                  0.00 (  0.00%)              0.00 (  0.00%)
Ops io-duration-198M              111.00 (  0.00%)             21.00 ( 81.08%)
Ops io-duration-430M              125.00 (  0.00%)             29.00 ( 76.80%)
Ops io-duration-661M              153.00 (  0.00%)             34.00 ( 77.78%)
Ops io-duration-893M              118.00 (  0.00%)             36.00 ( 69.49%)
Ops io-duration-1125M             142.00 (  0.00%)             43.00 ( 69.72%)
Ops io-duration-1356M             156.00 (  0.00%)             50.00 ( 67.95%)
Ops swaptotal-0M               462237.00 (  0.00%)         469193.00 ( -1.50%)
Ops swaptotal-198M             490462.00 (  0.00%)         496201.00 ( -1.17%)
Ops swaptotal-430M             500469.00 (  0.00%)         520400.00 ( -3.98%)
Ops swaptotal-661M             506038.00 (  0.00%)         538872.00 ( -6.49%)
Ops swaptotal-893M             514930.00 (  0.00%)         522590.00 ( -1.49%)
Ops swaptotal-1125M            521010.00 (  0.00%)         526934.00 ( -1.14%)
Ops swaptotal-1356M            513128.00 (  0.00%)         525241.00 ( -2.36%)
Ops swapin-0M                  246425.00 (  0.00%)         251226.00 ( -1.95%)
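
The parenthesized percentages in the table above appear to be relative
differences against the first column, with the sign chosen so that a
positive value means the second column did better. That reading is
inferred from the numbers themselves, not taken from the mmtests
documentation, and the function names below are hypothetical.

```c
#include <assert.h>

/* Relative gain in percent when higher values are better (memcachetest ops). */
static double gain_higher_is_better(double base, double other)
{
	assert(base != 0.0);
	return (other - base) / base * 100.0;
}

/* Relative gain in percent when lower values are better (io-duration, swap totals). */
static double gain_lower_is_better(double base, double other)
{
	assert(base != 0.0);
	return (base - other) / base * 100.0;
}
```

For example, the memcachetest-0M row gives (1041 - 819) / 819, about
27.11%, and the io-duration-198M row gives (111 - 21) / 111, about 81.08%.
Since the first column here is the zswap run, positive values mean the run
without zswap performed better.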

Bob Liu | 18 Jun 14:29 2013

Re: [PATCHv13 3/4] zswap: add to mm/

>
> I'm not sure how representative this is of real workloads, but it does
> look rather fatal for zswap.  The differences are so large, I wonder if
> it's just some silly bug or config issue.
>

In my observation, zswap_pool_pages is always close to zswap_stored_pages
in this test.
I think this means that zswap is heavily fragmented, since in the ideal
state the number of zswap pool pages should be half the number of stored pages.

The reason may be that this workload is not suitable for compression.
Its data can't be compressed to a small fraction of the original size.
It can only be compressed to around 70% (not exactly, but at least
above 50%).
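
The arithmetic behind this observation can be sketched as follows,
assuming every zpage compresses to the same size and ignoring any per-page
metadata. The 4096-byte page size and the function name are assumptions
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/*
 * Pool pages needed to hold n_zpages zpages of one uniform compressed
 * size when at most two zpages may share a pool page.
 */
static size_t pool_pages_needed(size_t n_zpages, size_t compressed_size)
{
	assert(compressed_size > 0 && compressed_size <= SKETCH_PAGE_SIZE);
	if (2 * compressed_size <= SKETCH_PAGE_SIZE)
		return (n_zpages + 1) / 2;	/* buddies pair up */
	return n_zpages;			/* one zpage per pool page */
}
```

With pages that compress to 50% or less, 1000 zpages fit in 500 pool
pages. At about 70% (2867 bytes) no two zpages fit together, so 1000
zpages need 1000 pool pages, which matches pool pages staying close to
stored pages.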

I made a simple patch to limit the fragmentation of zswap to 70%.
The results are better but still not positive.
                                         v3.10-rc4                   v3.10-rc4
                               2G-parallio-nozswap     2G-parallio-zswapdefrag
Ops memcachetest-0M              1041.00 (  0.00%)           1058.00 (  1.63%)
Ops memcachetest-198M             973.00 (  0.00%)           1019.00 (  4.73%)
Ops memcachetest-430M             892.00 (  0.00%)            831.00 ( -6.84%)
Ops memcachetest-661M             819.00 (  0.00%)            850.00 (  3.79%)
Ops memcachetest-893M             775.00 (  0.00%)            784.00 (  1.16%)
Ops memcachetest-1125M            764.00 (  0.00%)            766.00 (  0.26%)
Ops memcachetest-1356M            749.00 (  0.00%)            782.00 (  4.41%)
Ops io-duration-0M                  0.00 (  0.00%)              0.00 (  0.00%)
Ops io-duration-198M               21.00 (  0.00%)             28.00 (-33.33%)
Ops io-duration-430M               29.00 (  0.00%)             32.00 (-10.34%)

Seth Jennings | 19 Jun 16:09 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Mon, Jun 17, 2013 at 02:20:05PM +0800, Bob Liu wrote:
> Hi Seth,
> 
> On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
> <sjenning@linux.vnet.ibm.com> wrote:
> > zswap is a thin backend for frontswap that takes pages that are in the process
> > of being swapped out and attempts to compress them and store them in a
> > RAM-based memory pool.  This can result in a significant I/O reduction on the
> > swap device and, in the case where decompressing from RAM is faster than
> > reading from the swap device, can also improve workload performance.
> >
> > It also has support for evicting swap pages that are currently compressed in
> > zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
> > true cache in that, once the cache is full, the oldest pages can be moved out
> > of zswap to the swap device so newer pages can be compressed and stored in
> > zswap.
> >
> > This patch adds the zswap driver to mm/
> >
> 
> Do you have any more benchmarks you can share with me, so I can figure out
> whether we can benefit from zswap?

The two I've done are kernbench and SPECjbb.  I'm trying out mmtests
now.  I'd like to be able to explain the numbers you are seeing at least.

Sorry for the delay.  I'll get back to you once I've figured out how
to use mmtests and get some results/explanations.

Also, how much physical RAM did this box have? I see 2G in the profile name

Bob Liu | 19 Jun 16:17 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Wed, Jun 19, 2013 at 10:09 PM, Seth Jennings
<sjenning@linux.vnet.ibm.com> wrote:
> On Mon, Jun 17, 2013 at 02:20:05PM +0800, Bob Liu wrote:
>> Hi Seth,
>>
>> On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
>> <sjenning@linux.vnet.ibm.com> wrote:
>> > zswap is a thin backend for frontswap that takes pages that are in the process
>> > of being swapped out and attempts to compress them and store them in a
>> > RAM-based memory pool.  This can result in a significant I/O reduction on the
>> > swap device and, in the case where decompressing from RAM is faster than
>> > reading from the swap device, can also improve workload performance.
>> >
>> > It also has support for evicting swap pages that are currently compressed in
>> > zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
>> > true cache in that, once the cache is full, the oldest pages can be moved out
>> > of zswap to the swap device so newer pages can be compressed and stored in
>> > zswap.
>> >
>> > This patch adds the zswap driver to mm/
>> >
>>
>> Do you have any more benchmarks you can share with me, so I can figure out
>> whether we can benefit from zswap?
>
> The two I've done are kernbench and SPECjbb.  I'm trying out mmtests

Thanks, I'll try to set them up.

> now.  I'd like to be able to explain the numbers you are seeing at least.

Seth Jennings | 20 Jun 04:37 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Mon, Jun 17, 2013 at 02:20:05PM +0800, Bob Liu wrote:
> Hi Seth,
> 
> On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
> <sjenning@linux.vnet.ibm.com> wrote:
> > zswap is a thin backend for frontswap that takes pages that are in the process
> > of being swapped out and attempts to compress them and store them in a
> > RAM-based memory pool.  This can result in a significant I/O reduction on the
> > swap device and, in the case where decompressing from RAM is faster than
> > reading from the swap device, can also improve workload performance.
> >
> > It also has support for evicting swap pages that are currently compressed in
> > zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
> > true cache in that, once the cache is full, the oldest pages can be moved out
> > of zswap to the swap device so newer pages can be compressed and stored in
> > zswap.
> >
> > This patch adds the zswap driver to mm/
> >
> 
> Do you have any more benchmarks you can share with me, so I can figure out
> whether we can benefit from zswap?
> 
> I found that zswap causes a performance drop when tested with mmtests-0.10.
> The config file I'm using is: config-global-dhp__parallelio-memcachetest
> 
> The result is:
> (v3.10-rc4-2G-nozswap was without zswap but the performance is better.)
> 
>                                          v3.10-rc4                   v3.10-rc4

Bob Liu | 20 Jun 11:42 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Thu, Jun 20, 2013 at 10:37 AM, Seth Jennings
<sjenning@linux.vnet.ibm.com> wrote:
> On Mon, Jun 17, 2013 at 02:20:05PM +0800, Bob Liu wrote:
>> Hi Seth,
>>
>> On Tue, Jun 4, 2013 at 4:33 AM, Seth Jennings
>> <sjenning@linux.vnet.ibm.com> wrote:
>> > zswap is a thin backend for frontswap that takes pages that are in the process
>> > of being swapped out and attempts to compress them and store them in a
>> > RAM-based memory pool.  This can result in a significant I/O reduction on the
>> > swap device and, in the case where decompressing from RAM is faster than
>> > reading from the swap device, can also improve workload performance.
>> >
>> > It also has support for evicting swap pages that are currently compressed in
>> > zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
>> > true cache in that, once the cache is full, the oldest pages can be moved out
>> > of zswap to the swap device so newer pages can be compressed and stored in
>> > zswap.
>> >
>> > This patch adds the zswap driver to mm/
>> >
>>
>> Do you have any more benchmarks you can share with me, so I can figure out
>> whether we can benefit from zswap?
>>
>> I found that zswap causes a performance drop when tested with mmtests-0.10.
>> The config file I'm using is: config-global-dhp__parallelio-memcachetest
>>
>> The result is:
>> (v3.10-rc4-2G-nozswap was without zswap but the performance is better.)

Seth Jennings | 20 Jun 16:23 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Thu, Jun 20, 2013 at 05:42:04PM +0800, Bob Liu wrote:
> > Just made a mmtests run of my own and got very different results:
> >
> 
> It's strange, I'll update to rc6 and try again.
> By the way, are you using the 842 hardware compressor instead of lzo?

My results were using lzo software compression.

Seth

Bob Liu | 20 Jun 16:35 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Thu, Jun 20, 2013 at 10:23 PM, Seth Jennings
<sjenning@linux.vnet.ibm.com> wrote:
> On Thu, Jun 20, 2013 at 05:42:04PM +0800, Bob Liu wrote:
>> > Just made a mmtests run of my own and got very different results:
>> >
>>
>> It's strange, I'll update to rc6 and try again.
>> By the way, are you using the 842 hardware compressor instead of lzo?
>
> My results were using lzo software compression.
>

Thanks, and today I used another machine to test zswap.
The total ram size of that machine is around 4G.
This time the result is better:
                                               rc6                         rc6
                                             zswap                        base
Ops memcachetest-0M             14619.00 (  0.00%)          15602.00 (  6.72%)
Ops memcachetest-435M           14727.00 (  0.00%)          15860.00 (  7.69%)
Ops memcachetest-944M           12452.00 (  0.00%)          11812.00 ( -5.14%)
Ops memcachetest-1452M          12183.00 (  0.00%)           9829.00 (-19.32%)
Ops memcachetest-1961M          11953.00 (  0.00%)           9337.00 (-21.89%)
Ops memcachetest-2469M          11201.00 (  0.00%)           7509.00 (-32.96%)
Ops memcachetest-2978M           9738.00 (  0.00%)           5981.00 (-38.58%)
Ops io-duration-0M                  0.00 (  0.00%)              0.00 (  0.00%)
Ops io-duration-435M               10.00 (  0.00%)              6.00 ( 40.00%)
Ops io-duration-944M               19.00 (  0.00%)             19.00 (  0.00%)
Ops io-duration-1452M              31.00 (  0.00%)             26.00 ( 16.13%)
Ops io-duration-1961M              40.00 (  0.00%)             35.00 ( 12.50%)
Ops io-duration-2469M              45.00 (  0.00%)             43.00 (  4.44%)

Dan Magenheimer | 21 Jun 17:20 2013

RE: [PATCHv13 3/4] zswap: add to mm/

> From: Bob Liu [mailto:lliubbo@gmail.com]
 Subject: Re: [PATCHv13 3/4] zswap: add to mm/
> 
> On Thu, Jun 20, 2013 at 10:23 PM, Seth Jennings
> <sjenning@linux.vnet.ibm.com> wrote:
> > On Thu, Jun 20, 2013 at 05:42:04PM +0800, Bob Liu wrote:
> >> > Just made a mmtests run of my own and got very different results:
> >> >
> >>
> >> It's strange, I'll update to rc6 and try again.
> >> By the way, are you using the 842 hardware compressor instead of lzo?
> >
> > My results were using lzo software compression.
> >
> 
> Thanks, and today I used another machine to test zswap.
> The total ram size of that machine is around 4G.
> This time the result is better:
>                                                rc6                         rc6
>                                              zswap                        base
> Ops memcachetest-0M             14619.00 (  0.00%)          15602.00 (  6.72%)
> Ops memcachetest-435M           14727.00 (  0.00%)          15860.00 (  7.69%)
> Ops memcachetest-944M           12452.00 (  0.00%)          11812.00 ( -5.14%)
> Ops memcachetest-1452M          12183.00 (  0.00%)           9829.00 (-19.32%)
> Ops memcachetest-1961M          11953.00 (  0.00%)           9337.00 (-21.89%)
> Ops memcachetest-2469M          11201.00 (  0.00%)           7509.00 (-32.96%)
> Ops memcachetest-2978M           9738.00 (  0.00%)           5981.00 (-38.58%)
> Ops io-duration-0M                  0.00 (  0.00%)              0.00 (  0.00%)
> Ops io-duration-435M               10.00 (  0.00%)              6.00 ( 40.00%)
> Ops io-duration-944M               19.00 (  0.00%)             19.00 (  0.00%)

Konrad Rzeszutek Wilk | 21 Jun 20:33 2013

Re: [PATCHv13 3/4] zswap: add to mm/

On Fri, Jun 21, 2013 at 08:20:34AM -0700, Dan Magenheimer wrote:
> > From: Bob Liu [mailto:lliubbo@gmail.com]
>  Subject: Re: [PATCHv13 3/4] zswap: add to mm/
> > 
> > On Thu, Jun 20, 2013 at 10:23 PM, Seth Jennings
> > <sjenning@linux.vnet.ibm.com> wrote:
> > > On Thu, Jun 20, 2013 at 05:42:04PM +0800, Bob Liu wrote:
> > >> > Just made a mmtests run of my own and got very different results:
> > >> >
> > >>
> > >> It's strange, I'll update to rc6 and try again.
> > >> By the way, are you using the 842 hardware compressor instead of lzo?
> > >
> > > My results were using lzo software compression.
> > >
> > 
> > Thanks, and today I used another machine to test zswap.
> > The total ram size of that machine is around 4G.
> > This time the result is better:
> >                                                rc6                         rc6
> >                                              zswap                        base
> > Ops memcachetest-0M             14619.00 (  0.00%)          15602.00 (  6.72%)
> > Ops memcachetest-435M           14727.00 (  0.00%)          15860.00 (  7.69%)
> > Ops memcachetest-944M           12452.00 (  0.00%)          11812.00 ( -5.14%)
> > Ops memcachetest-1452M          12183.00 (  0.00%)           9829.00 (-19.32%)
> > Ops memcachetest-1961M          11953.00 (  0.00%)           9337.00 (-21.89%)
> > Ops memcachetest-2469M          11201.00 (  0.00%)           7509.00 (-32.96%)
> > Ops memcachetest-2978M           9738.00 (  0.00%)           5981.00 (-38.58%)
> > Ops io-duration-0M                  0.00 (  0.00%)              0.00 (  0.00%)
> > Ops io-duration-435M               10.00 (  0.00%)              6.00 ( 40.00%)

Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 4/4] zswap: add documentation

This patch adds the documentation file for the zswap functionality

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel@redhat.com>
---
 Documentation/vm/zswap.txt |   68 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 Documentation/vm/zswap.txt

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
new file mode 100644
index 0000000..7e492d8
--- /dev/null
+++ b/Documentation/vm/zswap.txt
@@ -0,0 +1,68 @@
+Overview:
+
+Zswap is a lightweight compressed cache for swap pages. It takes pages that are
+in the process of being swapped out and attempts to compress them into a
+dynamically allocated RAM-based memory pool.  zswap basically trades CPU cycles
+for potentially reduced swap I/O.  This trade-off can also result in a
+significant performance improvement if reads from the compressed cache are
+faster than reads from a swap device.
+
+NOTE: Zswap is a new feature as of v3.11 and interacts heavily with memory
+reclaim.  This interaction has not been fully explored on the large set of
+potential configurations and workloads that exist.  For this reason, zswap
+is a work in progress and should be considered experimental.
+
+Some potential benefits:

Bob Liu | 13 Jun 14:34 2013

Re: [PATCHv13 0/4] zswap: compressed swap caching

Hi Seth,

On 06/04/2013 04:33 AM, Seth Jennings wrote:
> This is the latest version of the zswap patchset for compressed swap caching.
> This is submitted for merging into linux-next and inclusion in v3.11.
> 

Have you noticed that pool_pages >> stored_pages, like this:
[root@ca-dev32 zswap]# cat *
0
424057
99538
0
2749448
0
24
60018
16837
[root@ca-dev32 zswap]# cat pool_pages
97372
[root@ca-dev32 zswap]# cat stored_pages
53701
[root@ca-dev32 zswap]#

I think it's unreasonable to use more pool pages than stored pages!
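
The zbud patch description states that the ratio of zpages to zbud pages
cannot be less than 1. A minimal check of that bound against the counters
above might look like the sketch below. It assumes stored_pages counts
zpages and pool_pages counts zbud pages, as the names suggest; the
function name is hypothetical.

```c
/* Returns 1 if the density bound (zpages per pool page >= 1) holds, else 0. */
static int density_bound_holds(unsigned long pool_pages,
			       unsigned long stored_pages)
{
	return stored_pages >= pool_pages;
}
```

With the values shown (pool_pages = 97372, stored_pages = 53701) the check
fails, which is why these numbers point to a bug rather than to ordinary
fragmentation.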

Regards,
-Bob

Seth Jennings | 13 Jun 21:06 2013

Re: [PATCHv13 0/4] zswap: compressed swap caching

On Thu, Jun 13, 2013 at 08:34:38PM +0800, Bob Liu wrote:
> Hi Seth,
> 
> On 06/04/2013 04:33 AM, Seth Jennings wrote:
> > This is the latest version of the zswap patchset for compressed swap caching.
> > This is submitted for merging into linux-next and inclusion in v3.11.
> > 
> 
> Have you noticed that pool_pages >> stored_pages, like this:
> [root <at> ca-dev32 zswap]# cat *
> 0
> 424057
> 99538
> 0
> 2749448
> 0
> 24
> 60018
> 16837
> [root@ca-dev32 zswap]# cat pool_pages
> 97372
> [root@ca-dev32 zswap]# cat stored_pages
> 53701
> [root@ca-dev32 zswap]#
> 
> I think it's unreasonable to use more pool pages than stored pages!

Gah, in the moving of the zbud metadata for v13, I forgot to init the new
under_reclaim field of the zbud header.  Patch going out now.


Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 3/4] zswap: add to mm/

zswap is a thin backend for frontswap that takes pages that are in the process
of being swapped out and attempts to compress them and store them in a
RAM-based memory pool.  This can result in a significant I/O reduction on the
swap device and, in the case where decompressing from RAM is faster than
reading from the swap device, can also improve workload performance.

It also has support for evicting swap pages that are currently compressed in
zswap to the swap device on an LRU(ish) basis. This functionality makes zswap a
true cache in that, once the cache is full, the oldest pages can be moved out
of zswap to the swap device so newer pages can be compressed and stored in
zswap.
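
The eviction behaviour described above can be illustrated with a small
userspace sketch.  This is not the zswap code: the toy_* names, the fixed
POOL_CAPACITY and the insertion stamp are all assumptions made for the
example.  It only shows the idea that, once the cache is full, the oldest
entry is pushed out to the swap device to make room for a newer page.

```c
#include <assert.h>

#define POOL_CAPACITY 4	/* hypothetical limit on cached pages */

struct toy_entry {
	unsigned long offset;	/* swap offset identifying the page */
	unsigned long stamp;	/* insertion order; smaller means older */
	int in_use;
};

struct toy_cache {
	struct toy_entry slots[POOL_CAPACITY];
	unsigned long next_stamp;
	unsigned long written_back;	/* pages moved out to the swap device */
};

/* Return the index of a free slot, or -1 if the cache is full. */
static int toy_free_slot(const struct toy_cache *c)
{
	int i;

	for (i = 0; i < POOL_CAPACITY; i++)
		if (!c->slots[i].in_use)
			return i;
	return -1;
}

/* Return the index of the oldest occupied slot, or -1 if there is none. */
static int toy_oldest(const struct toy_cache *c)
{
	int i, oldest = -1;

	for (i = 0; i < POOL_CAPACITY; i++) {
		if (!c->slots[i].in_use)
			continue;
		if (oldest < 0 || c->slots[i].stamp < c->slots[oldest].stamp)
			oldest = i;
	}
	return oldest;
}

/*
 * Store a page.  If the cache is full, the oldest entry is evicted first;
 * its offset is reported through *evicted and 1 is returned.  Otherwise
 * 0 is returned and *evicted is left untouched.
 */
static int toy_store(struct toy_cache *c, unsigned long offset,
		     unsigned long *evicted)
{
	int slot = toy_free_slot(c);
	int did_evict = 0;

	if (slot < 0) {
		slot = toy_oldest(c);
		assert(slot >= 0);	/* a full cache always has an entry */
		*evicted = c->slots[slot].offset;
		c->written_back++;
		did_evict = 1;
	}
	c->slots[slot].offset = offset;
	c->slots[slot].stamp = c->next_stamp++;
	c->slots[slot].in_use = 1;
	return did_evict;
}
```

A strict insertion-order stamp is used here only to keep the sketch short;
the description above says "LRU(ish)", so the real ordering may differ.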

This patch adds the zswap driver to mm/

Signed-off-by: Seth Jennings <sjenning <at> linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel <at> redhat.com>
---
 mm/Kconfig  |   20 ++
 mm/Makefile |    1 +
 mm/zswap.c  |  943 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 964 insertions(+)
 create mode 100644 mm/zswap.c

diff --git a/mm/Kconfig b/mm/Kconfig
index 3367ac3..eec97f2 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -487,3 +487,23 @@ config ZBUD
 	  page.  While this design limits storage density, it has simple and
 	  deterministic reclaim properties that make it preferable to a higher

Seth Jennings | 3 Jun 22:33 2013

[PATCHv13 2/4] zbud: add to mm/

zbud is a special-purpose allocator for storing compressed pages. It is
designed to store up to two compressed pages per physical page.  While this
design limits storage density, it has simple and deterministic reclaim
properties that make it preferable to a higher density approach when reclaim
will be used.

zbud works by storing compressed pages, or "zpages", together in pairs in a
single memory page called a "zbud page".  The first buddy is "left
justified" at the beginning of the zbud page, and the last buddy is "right
justified" at the end of the zbud page.  The benefit is that if either
buddy is freed, the freed buddy space, coalesced with whatever slack space
existed between the buddies, results in the largest possible free region
within the zbud page.
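
The layout argument can be checked with a few lines of arithmetic.  The
sketch below is an illustration only, not the zbud implementation: the
toy_* names are hypothetical, a 4096-byte page is assumed, and the in-line
header is ignored.  Because the first buddy starts at offset 0 and the last
buddy ends at the end of the page, the free space is always one contiguous
region between them.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for the example */

struct toy_zbud_page {
	unsigned int first_size;	/* bytes used by the left-justified buddy, 0 if free */
	unsigned int last_size;		/* bytes used by the right-justified buddy, 0 if free */
};

/* Offset at which the last buddy starts; it ends flush with the page end. */
static unsigned int toy_last_offset(const struct toy_zbud_page *p)
{
	assert(p->last_size <= TOY_PAGE_SIZE);
	return TOY_PAGE_SIZE - p->last_size;
}

/* Size of the single contiguous free region between the two buddies. */
static unsigned int toy_free_bytes(const struct toy_zbud_page *p)
{
	assert(p->first_size + p->last_size <= TOY_PAGE_SIZE);
	return TOY_PAGE_SIZE - p->first_size - p->last_size;
}
```

For example, with a 1500-byte first buddy and a 1000-byte last buddy the
slack is 1596 bytes.  Freeing the first buddy (setting first_size to 0)
yields one 3096-byte region: the 1500 freed bytes plus the 1596 bytes of
slack, with no separate coalescing step needed.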

zbud also provides an attractive lower bound on density. The ratio of zpages
to zbud pages cannot be less than 1.  This ensures that zbud can never "do
harm" by using more pages to store zpages than the uncompressed zpages would
have used on their own.
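
Taken together with the two-zpages-per-page limit, this bound gives a
simple sanity check on the counters.  The helper below is a hypothetical
sketch (toy_density_ok is not a kernel function); it assumes every zbud
page in the pool holds one or two zpages and tests whether a pair of
pool_pages and stored_pages values is consistent with that.

```c
#include <assert.h>
#include <limits.h>

/*
 * Return 1 if the counters are consistent with each zbud page holding
 * one or two zpages, i.e. pool_pages <= stored_pages <= 2 * pool_pages.
 */
static int toy_density_ok(unsigned long pool_pages, unsigned long stored_pages)
{
	/* guard the multiplication below against overflow */
	assert(pool_pages <= ULONG_MAX / 2);

	if (pool_pages == 0)
		return stored_pages == 0;
	return stored_pages >= pool_pages && stored_pages <= 2 * pool_pages;
}
```

The values reported earlier in this thread, pool_pages = 97372 and
stored_pages = 53701, fail this check, which is why they point to a bug
rather than to poor compression.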

This implementation is a rewrite of the zbud allocator internally used
by zcache in the drivers/staging tree.  The rewrite was necessary to
remove some of the zcache-specific elements that were ingrained throughout
and provide a generic allocation interface that can later be used by
zsmalloc and others.

This patch adds zbud to mm/ for later use by zswap.

Signed-off-by: Seth Jennings <sjenning <at> linux.vnet.ibm.com>
Acked-by: Rik van Riel <riel <at> redhat.com>
---