Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 00/12] Balancing the scan rate of major caches

Hi all,

This patch balances the aging rates of active_list/inactive_list/slab.

It started out as an effort to enable the adaptive read-ahead to handle large
number of concurrent readers. Then I found it involves much more stuffs, and
deserves a standalone patchset to address the balancing problem as a whole.

The whole picture of balancing:

- In each node, inactive_list scan rates are synced with each other.
  It is done in the direct/kswapd reclaim path.

- In each zone, active_list scan rate always follows that of inactive_list.

- Slab cache scan rates always follow that of the current node.
  If the shrinkers are not NUMA aware, they will effectly sync scan rates
  with that of the most scanned node.

The patches can be grouped as follows:

- balancing stuffs
vm-kswapd-incmin.patch
mm-balance-zone-aging-supporting-facilities.patch
mm-balance-zone-aging-in-direct-reclaim.patch
mm-balance-zone-aging-in-kswapd-reclaim.patch
mm-balance-slab-aging.patch
mm-balance-active-inactive-list-aging.patch

- pure code cleanups
mm-remove-unnecessary-variable-and-loop.patch
mm-remove-swap-cluster-max-from-scan-control.patch
mm-accumulate-nr-scanned-reclaimed-in-scan-control.patch
mm-turn-bool-variables-into-flags-in-scan-control.patch

- debug code
mm-page-reclaim-debug-traces.patch

- a minor fix
mm-scan-accounting-fix.patch

Thanks,
Wu Fengguang

--

-- 
Dept. Automation                University of Science and Technology of China
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 12/12] mm: fix minor scan count bugs

- in isolate_lru_pages(): reports one more scan. Fix it.
- in shrink_cache(): 0 pages taken does not mean 0 pages scanned. Fix it.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -920,7 +920,8 @@ static int isolate_lru_pages(int nr_to_s
 	struct page *page;
 	int scan = 0;

-	while (scan++ < nr_to_scan && !list_empty(src)) {
+	while (scan < nr_to_scan && !list_empty(src)) {
+		scan++;
 		page = lru_to_page(src);
 		prefetchw_prev_lru_page(page, src, flags);

@@ -967,14 +968,15 @@ static void shrink_cache(struct zone *zo
 	update_zone_age(zone, nr_scan);
 	spin_unlock_irq(&zone->lru_lock);

-	if (nr_taken == 0)
-		return;
-
 	sc->nr_scanned += nr_scan;
 	if (current_is_kswapd())
 		mod_page_state_zone(zone, pgscan_kswapd, nr_scan);
 	else
 		mod_page_state_zone(zone, pgscan_direct, nr_scan);
+
+	if (nr_taken == 0)
+		return;
+
 	nr_freed = shrink_list(&page_list, sc);
 	if (current_is_kswapd())
 		mod_page_state(kswapd_steal, nr_freed);

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 11/12] mm: add page reclaim debug traces

Show the detailed steps of direct/kswapd page reclaim.

To enable the printk traces:
# echo y > /debug/debug_page_reclaim

Sample lines:

reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 32-32, age 2626, active to scan 6542,
hot+cold+free pages 8842+283558+352
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2626, active to scan 8018,
hot+cold+free pages 1693+200036+10360
reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 64-64, age 2627, active to scan 7564,
hot+cold+free pages 8842+283526+384
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2627, active to scan 8296,
hot+cold+free pages 1693+200018+10360
reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 64-63, age 2628, active to scan 8587,
hot+cold+free pages 8843+283495+416
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2628, active to scan 8574,
hot+cold+free pages 1693+200014+10392
reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 64-63, age 2628, active to scan 9610,
hot+cold+free pages 8844+283465+448
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2628, active to scan 8852,
hot+cold+free pages 1693+199996+10424
reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 64-64, age 2629, active to scan 10633,
hot+cold+free pages 8844+283433+480
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2629, active to scan 9130,
hot+cold+free pages 1693+199992+10456
reclaim zone3 from kswapd for watermark, prio 12, scan-reclaimed 64-64, age 2630, active to scan 11656,
hot+cold+free pages 8844+283401+512
reclaim zone2 from kswapd for aging, prio 12, scan-reclaimed 32-32, age 2630, active to scan 9408,
hot+cold+free pages 1693+199974+10488

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 69 insertions(+), 1 deletion(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -38,6 +38,7 @@
 #include <asm/div64.h>

 #include <linux/swapops.h>
+#include <linux/debugfs.h>

 /* possible outcome of pageout() */
 typedef enum {
@@ -78,6 +79,62 @@ struct scan_control {
 #define SC_MAY_WRITEPAGE	0x1
 #define SC_MAY_SWAP		0x2	/* Can pages be swapped as part of reclaim? */

+#define SC_RECLAIM_FROM_KSWAPD		0x10
+#define SC_RECLAIM_FROM_DIRECT		0x20
+#define SC_RECLAIM_FOR_WATERMARK	0x40
+#define SC_RECLAIM_FOR_AGING		0x80
+#define SC_RECLAIM_MASK			0xF0
+
+#ifdef CONFIG_DEBUG_FS
+static u32 debug_page_reclaim;
+
+static inline void debug_reclaim(struct scan_control *sc, unsigned long flags)
+{
+	sc->flags = (sc->flags & ~SC_RECLAIM_MASK) | flags;
+}
+
+static inline void debug_reclaim_report(struct scan_control *sc, struct zone *z)
+{
+	if (!debug_page_reclaim)
+		return;
+
+	printk(KERN_DEBUG "reclaim zone%d from %s for %s, "
+			"prio %d, scan-reclaimed %lu-%lu, age %lu, "
+			"active to scan %lu, "
+			"hot+cold+free pages %lu+%lu+%lu\n",
+			zone_idx(z),
+			(sc->flags & SC_RECLAIM_FROM_KSWAPD) ? "kswapd" :
+			((sc->flags & SC_RECLAIM_FROM_DIRECT) ? "direct" :
+								"early"),
+			(sc->flags & SC_RECLAIM_FOR_AGING) ?
+							"aging" : "watermark",
+			sc->priority, sc->nr_scanned, sc->nr_reclaimed,
+			z->page_age,
+			z->nr_scan_active,
+			z->nr_active, z->nr_inactive, z->free_pages);
+
+	if (atomic_read(&z->reclaim_in_progress))
+		printk(KERN_WARNING "reclaim_in_progress=%d\n",
+					atomic_read(&z->reclaim_in_progress));
+}
+
+static inline void debug_reclaim_init(void)
+{
+	debugfs_create_bool("debug_page_reclaim", 0644, NULL,
+							&debug_page_reclaim);
+}
+#else
+static inline void debug_reclaim(struct scan_control *sc, int flags)
+{
+}
+static inline void debug_reclaim_report(struct scan_control *sc, struct zone *z)
+{
+}
+static inline void debug_reclaim_init(void)
+{
+}
+#endif
+
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))

 #ifdef ARCH_HAS_PREFETCH
@@ -1141,6 +1198,7 @@ shrink_zone(struct zone *zone, struct sc

 	atomic_dec(&zone->reclaim_in_progress);

+	debug_reclaim_report(sc, zone);
 	throttle_vm_writeout();
 }

@@ -1205,11 +1263,14 @@ shrink_caches(struct zone **zones, struc
 			continue;
 		}

+		debug_reclaim(sc, SC_RECLAIM_FROM_DIRECT);
 		shrink_zone(zone, sc);
 	}

-	if (z)
+	if (z) {
+		debug_reclaim(sc, SC_RECLAIM_FROM_DIRECT|SC_RECLAIM_FOR_AGING);
 		shrink_zone(z, sc);
+	}
 }

 /*
@@ -1386,10 +1447,16 @@ loop_again:
 				if (!zone_watermark_ok(zone, order,
 							zone->pages_high,
 							0, 0)) {
+					debug_reclaim(&sc,
+							SC_RECLAIM_FROM_KSWAPD|
+							SC_RECLAIM_FOR_WATERMARK);
 					all_zones_ok = 0;
 				} else if (zone == youngest_zone &&
 						pages_more_aged(oldest_zone,
 								youngest_zone)) {
+					debug_reclaim(&sc,
+							SC_RECLAIM_FROM_KSWAPD|
+							SC_RECLAIM_FOR_AGING);
 				} else
 					continue;
 			}
@@ -1611,6 +1678,7 @@ static int __init kswapd_init(void)
 		= find_task_by_pid(kernel_thread(kswapd, pgdat, CLONE_KERNEL));
 	total_memory = nr_free_pagecache_pages();
 	hotcpu_notifier(cpu_callback, 0);
+	debug_reclaim_init();
 	return 0;
 }

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 06/12] mm: balance active/inactive list scan rates

shrink_zone() has two major design goals:
1) let active/inactive lists have equal scan rates
2) do the scans in small chunks

But the implementation has some problems:
- reluctant to scan small zones
  the callers often have to dip into low priority to free memory.

- the balance is quite rough
  the break statement in the loop breaks it.

- may scan few pages in one batch
  refill_inactive_zone can be called twice to scan 32 and 1 pages.

The new design:
1) keep perfect balance
   let active_list follow inactive_list in scan rate

2) always scan in SWAP_CLUSTER_MAX sized chunks
   simple and efficient

3) will scan at least one chunk
   the expected behavior from the callers

The perfect balance may or may not yield better performance, though it
a) is a more understandable and dependable behavior
b) together with inter-zone balancing, makes the zoned memories consistent

The atomic reclaim_in_progress is there to prevent most concurrent reclaims.
If concurrent reclaims did happen, there will be no fatal errors.

I tested the patch with the following commands:
	dd if=/dev/zero of=hot bs=1M seek=800 count=1
	dd if=/dev/zero of=cold bs=1M seek=50000 count=1
	./test-aging.sh; ./active-inactive-aging-rate.sh

Before the patch:
-----------------------------------------------------------------------------
active/inactive sizes on 2.6.14-2-686-smp:
0/1000          = 0 / 1241
563/1000        = 73343 / 130108
887/1000        = 137348 / 154816

active/inactive scan rates:
dma      38/1000        = 7731 / (198924 + 0)
normal   465/1000       = 2979780 / (6394740 + 0)
high     680/1000       = 4354230 / (6396786 + 0)

             total       used       free     shared    buffers     cached
Mem:          2027       1978         49          0          4       1923
-/+ buffers/cache:         49       1977
Swap:            0          0          0
-----------------------------------------------------------------------------

After the patch, the scan rates and the size ratios are kept roughly the same
for all zones:
-----------------------------------------------------------------------------
active/inactive sizes on 2.6.15-rc3-mm1:
0/1000          = 0 / 961
236/1000        = 38385 / 162429
319/1000        = 70607 / 221101

active/inactive scan rates:
dma      0/1000         = 0 / (42176 + 0)
normal   234/1000       = 1714688 / (7303456 + 1088)
high     317/1000       = 3151936 / (9933792 + 96)

             total       used       free     shared    buffers     cached
Mem:          2020       1969         50          0          5       1908
-/+ buffers/cache:         54       1965
Swap:            0          0          0
-----------------------------------------------------------------------------

script test-aging.sh:
------------------------------
#!/bin/zsh
cp cold /dev/null&

while {pidof cp > /dev/null};
do
        cp hot /dev/null
done
------------------------------

script active-inactive-aging-rate.sh:
-----------------------------------------------------------------------------
#!/bin/sh

echo active/inactive sizes on `uname -r`:
egrep '(active|inactive)' /proc/zoneinfo |
while true
do
	read name value
	[[ -z $name ]] && break
	eval $name=$value
	[[ $name = "inactive" ]] && echo -e "$((active * 1000 / (1 + inactive)))/1000  \t= $active / $inactive"
done

while true
do
	read name value
	[[ -z $name ]] && break
	eval $name=$value
done < /proc/vmstat

echo
echo active/inactive scan rates:
echo -e "dma \t $((pgrefill_dma * 1000 / (1 + pgscan_kswapd_dma + pgscan_direct_dma)))/1000 \t=
$pgrefill_dma / ($pgscan_kswapd_dma + $pgscan_direct_dma)"
echo -e "normal \t $((pgrefill_normal * 1000 / (1 + pgscan_kswapd_normal +
pgscan_direct_normal)))/1000 \t= $pgrefill_normal / ($pgscan_kswapd_normal + $pgscan_direct_normal)"
echo -e "high \t $((pgrefill_high * 1000 / (1 + pgscan_kswapd_high + pgscan_direct_high)))/1000 \t=
$pgrefill_high / ($pgscan_kswapd_high + $pgscan_direct_high)"

echo
free -m
-----------------------------------------------------------------------------

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 include/linux/mmzone.h |    3 --
 include/linux/swap.h   |    2 -
 mm/page_alloc.c        |    5 +---
 mm/vmscan.c            |   52 +++++++++++++++++++++++++++----------------------
 4 files changed, 33 insertions(+), 29 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -911,7 +911,7 @@ static void shrink_cache(struct zone *zo
 		int nr_scan;
 		int nr_freed;

-		nr_taken = isolate_lru_pages(sc->swap_cluster_max,
+		nr_taken = isolate_lru_pages(sc->nr_to_scan,
 					     &zone->inactive_list,
 					     &page_list, &nr_scan);
 		zone->nr_inactive -= nr_taken;
@@ -1105,56 +1105,56 @@ refill_inactive_zone(struct zone *zone, 

 /*
  * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
+ * The reclaim process:
+ * a) scan always in batch of SWAP_CLUSTER_MAX pages
+ * b) scan inactive list at least one batch
+ * c) balance the scan rate of active/inactive list
+ * d) finish on either scanned or reclaimed enough pages
  */
 static void
 shrink_zone(struct zone *zone, struct scan_control *sc)
 {
+	unsigned long long next_scan_active;
 	unsigned long nr_active;
 	unsigned long nr_inactive;

 	atomic_inc(&zone->reclaim_in_progress);

+	next_scan_active = sc->nr_scanned;
+
 	/*
 	 * Add one to `nr_to_scan' just to make sure that the kernel will
 	 * slowly sift through the active list.
 	 */
-	zone->nr_scan_active += (zone->nr_active >> sc->priority) + 1;
-	nr_active = zone->nr_scan_active;
-	if (nr_active >= sc->swap_cluster_max)
-		zone->nr_scan_active = 0;
-	else
-		nr_active = 0;
-
-	zone->nr_scan_inactive += (zone->nr_inactive >> sc->priority) + 1;
-	nr_inactive = zone->nr_scan_inactive;
-	if (nr_inactive >= sc->swap_cluster_max)
-		zone->nr_scan_inactive = 0;
-	else
-		nr_inactive = 0;
+	nr_active = zone->nr_scan_active + 1;
+	nr_inactive = (zone->nr_inactive >> sc->priority) + SWAP_CLUSTER_MAX;
+	nr_inactive &= ~(SWAP_CLUSTER_MAX - 1);

+	sc->nr_to_scan = SWAP_CLUSTER_MAX;
 	sc->nr_to_reclaim = sc->swap_cluster_max;

-	while (nr_active || nr_inactive) {
-		if (nr_active) {
-			sc->nr_to_scan = min(nr_active,
-					(unsigned long)sc->swap_cluster_max);
-			nr_active -= sc->nr_to_scan;
+	while (nr_active >= SWAP_CLUSTER_MAX * 1024 || nr_inactive) {
+		if (nr_active >= SWAP_CLUSTER_MAX * 1024) {
+			nr_active -= SWAP_CLUSTER_MAX * 1024;
 			refill_inactive_zone(zone, sc);
 		}

 		if (nr_inactive) {
-			sc->nr_to_scan = min(nr_inactive,
-					(unsigned long)sc->swap_cluster_max);
-			nr_inactive -= sc->nr_to_scan;
+			nr_inactive -= SWAP_CLUSTER_MAX;
 			shrink_cache(zone, sc);
 			if (sc->nr_to_reclaim <= 0)
 				break;
 		}
 	}

-	throttle_vm_writeout();
+	next_scan_active = (sc->nr_scanned - next_scan_active) * 1024ULL *
+					(unsigned long long)zone->nr_active;
+	do_div(next_scan_active, zone->nr_inactive | 1);
+	zone->nr_scan_active = nr_active + (unsigned long)next_scan_active;

 	atomic_dec(&zone->reclaim_in_progress);
+
+	throttle_vm_writeout();
 }

 /*
@@ -1195,6 +1195,9 @@ shrink_caches(struct zone **zones, struc
 		if (zone->all_unreclaimable && sc->priority < DEF_PRIORITY)
 			continue;	/* Let kswapd poll it */

+		if (atomic_read(&zone->reclaim_in_progress))
+			continue;
+
 		/*
 		 * Balance page aging in local zones and following headless
 		 * zones.
@@ -1415,6 +1418,9 @@ loop_again:
 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 				continue;

+			if (atomic_read(&zone->reclaim_in_progress))
+				continue;
+
 			zone->temp_priority = priority;
 			if (zone->prev_priority > priority)
 				zone->prev_priority = priority;
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -2146,7 +2146,6 @@ static void __init free_area_init_core(s
 		INIT_LIST_HEAD(&zone->active_list);
 		INIT_LIST_HEAD(&zone->inactive_list);
 		zone->nr_scan_active = 0;
-		zone->nr_scan_inactive = 0;
 		zone->nr_active = 0;
 		zone->nr_inactive = 0;
 		zone->aging_total = 0;
@@ -2302,7 +2301,7 @@ static int zoneinfo_show(struct seq_file
 			   "\n        inactive %lu"
 			   "\n        aging    %lu"
 			   "\n        age      %lu"
-			   "\n        scanned  %lu (a: %lu i: %lu)"
+			   "\n        scanned  %lu (a: %lu)"
 			   "\n        spanned  %lu"
 			   "\n        present  %lu",
 			   zone->free_pages,
@@ -2314,7 +2313,7 @@ static int zoneinfo_show(struct seq_file
 			   zone->aging_total,
 			   zone->page_age,
 			   zone->pages_scanned,
-			   zone->nr_scan_active, zone->nr_scan_inactive,
+			   zone->nr_scan_active / 1024,
 			   zone->spanned_pages,
 			   zone->present_pages);
 		seq_printf(m,
--- linux.orig/include/linux/swap.h
+++ linux/include/linux/swap.h
@@ -111,7 +111,7 @@ enum {
 	SWP_SCANNING	= (1 << 8),	/* refcount in scan_swap_map */
 };

-#define SWAP_CLUSTER_MAX 32
+#define SWAP_CLUSTER_MAX 32		/* must be power of 2 */

 #define SWAP_MAP_MAX	0x7fff
 #define SWAP_MAP_BAD	0x8000
--- linux.orig/include/linux/mmzone.h
+++ linux/include/linux/mmzone.h
@@ -142,8 +142,7 @@ struct zone {
 	spinlock_t		lru_lock;	
 	struct list_head	active_list;
 	struct list_head	inactive_list;
-	unsigned long		nr_scan_active;
-	unsigned long		nr_scan_inactive;
+	unsigned long		nr_scan_active;	/* x1024 to be more precise */
 	unsigned long		nr_active;
 	unsigned long		nr_inactive;
 	unsigned long		pages_scanned;	   /* since last reclaim */

--
Peter Zijlstra | 1 Dec 12:39
Picon

Re: [PATCH 06/12] mm: balance active/inactive list scan rates


On Thu, 2005-12-01 at 18:18 +0800, Wu Fengguang wrote: > plain text document attachment > (mm-balance-active-inactive-list-aging.patch) > shrink_zone() has two major design goals: > 1) let active/inactive lists have equal scan rates > 2) do the scans in small chunks > > The new design: > 1) keep perfect balance > let active_list follow inactive_list in scan rate > > 2) always scan in SWAP_CLUSTER_MAX sized chunks > simple and efficient > > 3) will scan at least one chunk > the expected behavior from the callers > > The perfect balance may or may not yield better performance, though it > a) is a more understandable and dependable behavior > b) together with inter-zone balancing, makes the zoned memories consistent
Nice, this patch effectively separates zone balancing from active/inactive balancing. I was thinking about doing this this morning in order to nicely abstract out all the page-replacement code. Thanks!
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 09/12] mm: accumulate sc.nr_scanned/sc.nr_reclaimed

Now that there's no need to keep track of nr_scanned/nr_reclaimed for every
single round of shrink_zone(), remove the total_scanned/total_reclaimed and
let nr_scanned/nr_reclaimed accumulate between rounds.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   36 ++++++++++++++----------------------
 1 files changed, 14 insertions(+), 22 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1229,7 +1229,6 @@ int try_to_free_pages(struct zone **zone
 {
 	int priority;
 	int ret = 0;
-	int total_scanned = 0, total_reclaimed = 0;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc;
 	int i;
@@ -1239,6 +1238,8 @@ int try_to_free_pages(struct zone **zone
 	sc.gfp_mask = gfp_mask;
 	sc.may_writepage = 0;
 	sc.may_swap = 1;
+	sc.nr_scanned = 0;
+	sc.nr_reclaimed = 0;

 	inc_page_state(allocstall);

@@ -1254,8 +1255,6 @@ int try_to_free_pages(struct zone **zone
 	/* The added 10 priorities are for scan rate balancing */
 	for (priority = DEF_PRIORITY + 10; priority >= 0; priority--) {
 		sc.nr_mapped = read_page_state(nr_mapped);
-		sc.nr_scanned = 0;
-		sc.nr_reclaimed = 0;
 		sc.priority = priority;
 		sc.nr_to_reclaim = SWAP_CLUSTER_MAX;
 		if (!priority)
@@ -1266,9 +1265,7 @@ int try_to_free_pages(struct zone **zone
 			sc.nr_reclaimed += reclaim_state->reclaimed_slab;
 			reclaim_state->reclaimed_slab = 0;
 		}
-		total_scanned += sc.nr_scanned;
-		total_reclaimed += sc.nr_reclaimed;
-		if (total_reclaimed >= SWAP_CLUSTER_MAX) {
+		if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX) {
 			ret = 1;
 			goto out;
 		}
@@ -1280,13 +1277,13 @@ int try_to_free_pages(struct zone **zone
 		 * that's undesirable in laptop mode, where we *want* lumpy
 		 * writeout.  So in laptop mode, write out the whole world.
 		 */
-		if (total_scanned > SWAP_CLUSTER_MAX * 3 / 2) {
-			wakeup_pdflush(laptop_mode ? 0 : total_scanned);
+		if (sc.nr_scanned > SWAP_CLUSTER_MAX * 3 / 2) {
+			wakeup_pdflush(laptop_mode ? 0 : sc.nr_scanned);
 			sc.may_writepage = 1;
 		}

 		/* Take a nap, wait for some writeback to complete */
-		if (sc.nr_scanned && priority < DEF_PRIORITY)
+		if (priority < DEF_PRIORITY)
 			blk_congestion_wait(WRITE, HZ/10);
 	}
 out:
@@ -1332,19 +1329,18 @@ static int balance_pgdat(pg_data_t *pgda
 	int all_zones_ok;
 	int priority;
 	int i;
-	int total_scanned, total_reclaimed;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc;
 	struct zone *youngest_zone = NULL;
 	struct zone *oldest_zone = NULL;

 loop_again:
-	total_scanned = 0;
-	total_reclaimed = 0;
 	sc.gfp_mask = GFP_KERNEL;
 	sc.may_writepage = 0;
 	sc.may_swap = 1;
 	sc.nr_mapped = read_page_state(nr_mapped);
+	sc.nr_scanned = 0;
+	sc.nr_reclaimed = 0;

 	inc_page_state(pageoutrun);

@@ -1366,8 +1362,6 @@ loop_again:

 	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
 		all_zones_ok = 1;
-		sc.nr_scanned = 0;
-		sc.nr_reclaimed = 0;
 		sc.priority = priority;
 		sc.nr_to_reclaim = nr_pages ? nr_pages : SWAP_CLUSTER_MAX;

@@ -1421,19 +1415,17 @@ loop_again:
 		reclaim_state->reclaimed_slab = 0;
 		shrink_slab(GFP_KERNEL);
 		sc.nr_reclaimed += reclaim_state->reclaimed_slab;
-		total_reclaimed += sc.nr_reclaimed;
-		total_scanned += sc.nr_scanned;

 		/*
 		 * If we've done a decent amount of scanning and
 		 * the reclaim ratio is low, start doing writepage
 		 * even in laptop mode
 		 */
-		if (total_scanned > SWAP_CLUSTER_MAX * 2 &&
-		    total_scanned > total_reclaimed+total_reclaimed/2)
+		if (sc.nr_scanned > SWAP_CLUSTER_MAX * 2 &&
+		    sc.nr_scanned > sc.nr_reclaimed + sc.nr_reclaimed / 2)
 			sc.may_writepage = 1;

-		if (nr_pages && to_free > total_reclaimed)
+		if (nr_pages && to_free > sc.nr_reclaimed)
 			continue;	/* swsusp: need to do more work */
 		if (all_zones_ok)
 			break;		/* kswapd: all done */
@@ -1441,7 +1433,7 @@ loop_again:
 		 * OK, kswapd is getting into trouble.  Take a nap, then take
 		 * another pass across the zones.
 		 */
-		if (total_scanned && priority < DEF_PRIORITY - 2)
+		if (priority < DEF_PRIORITY - 2)
 			blk_congestion_wait(WRITE, HZ/10);

 		/*
@@ -1450,7 +1442,7 @@ loop_again:
 		 * matches the direct reclaim path behaviour in terms of impact
 		 * on zone->*_priority.
 		 */
-		if ((total_reclaimed >= SWAP_CLUSTER_MAX) && (!nr_pages))
+		if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX && !nr_pages)
 			break;
 	}
 	for (i = 0; i < pgdat->nr_zones; i++) {
@@ -1463,7 +1455,7 @@ loop_again:
 		goto loop_again;
 	}

-	return total_reclaimed;
+	return sc.nr_reclaimed;
 }

 /*

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 10/12] mm: merge sc.may_writepage and sc.may_swap into sc.flags

Turn bool values into flags to make struct scan_control more compact.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   22 ++++++++++------------
 1 files changed, 10 insertions(+), 12 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -72,12 +72,12 @@ struct scan_control {
 	/* This context's GFP mask */
 	gfp_t gfp_mask;

-	int may_writepage;
-
-	/* Can pages be swapped as part of reclaim? */
-	int may_swap;
+	unsigned long flags;
 };

+#define SC_MAY_WRITEPAGE	0x1
+#define SC_MAY_SWAP		0x2	/* Can pages be swapped as part of reclaim? */
+
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))

 #ifdef ARCH_HAS_PREFETCH
@@ -487,7 +487,7 @@ static int shrink_list(struct list_head 
 		 * Try to allocate it some swap space here.
 		 */
 		if (PageAnon(page) && !PageSwapCache(page)) {
-			if (!sc->may_swap)
+			if (!(sc->flags & SC_MAY_SWAP))
 				goto keep_locked;
 			if (!add_to_swap(page, GFP_ATOMIC))
 				goto activate_locked;
@@ -518,7 +518,7 @@ static int shrink_list(struct list_head 
 				goto keep_locked;
 			if (!may_enter_fs)
 				goto keep_locked;
-			if (laptop_mode && !sc->may_writepage)
+			if (laptop_mode && !(sc->flags & SC_MAY_WRITEPAGE))
 				goto keep_locked;

 			/* Page is dirty, try to write it out here */
@@ -1236,8 +1236,7 @@ int try_to_free_pages(struct zone **zone
 	delay_prefetch();

 	sc.gfp_mask = gfp_mask;
-	sc.may_writepage = 0;
-	sc.may_swap = 1;
+	sc.flags = SC_MAY_SWAP;
 	sc.nr_scanned = 0;
 	sc.nr_reclaimed = 0;

@@ -1279,7 +1278,7 @@ int try_to_free_pages(struct zone **zone
 		 */
 		if (sc.nr_scanned > SWAP_CLUSTER_MAX * 3 / 2) {
 			wakeup_pdflush(laptop_mode ? 0 : sc.nr_scanned);
-			sc.may_writepage = 1;
+			sc.flags |= SC_MAY_WRITEPAGE;
 		}

 		/* Take a nap, wait for some writeback to complete */
@@ -1336,8 +1335,7 @@ static int balance_pgdat(pg_data_t *pgda

 loop_again:
 	sc.gfp_mask = GFP_KERNEL;
-	sc.may_writepage = 0;
-	sc.may_swap = 1;
+	sc.flags = SC_MAY_SWAP;
 	sc.nr_mapped = read_page_state(nr_mapped);
 	sc.nr_scanned = 0;
 	sc.nr_reclaimed = 0;
@@ -1423,7 +1421,7 @@ loop_again:
 		 */
 		if (sc.nr_scanned > SWAP_CLUSTER_MAX * 2 &&
 		    sc.nr_scanned > sc.nr_reclaimed + sc.nr_reclaimed / 2)
-			sc.may_writepage = 1;
+			sc.flags |= SC_MAY_WRITEPAGE;

 		if (nr_pages && to_free > sc.nr_reclaimed)
 			continue;	/* swsusp: need to do more work */

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 08/12] mm: remove swap_cluster_max from scan_control

The use of sc.swap_cluster_max is weird and redundant.

The callers should just set sc.priority/sc.nr_to_reclaim, and let
shrink_zone() decide the proper loop parameters.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   15 ++++-----------
 1 files changed, 4 insertions(+), 11 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -76,12 +76,6 @@ struct scan_control {

 	/* Can pages be swapped as part of reclaim? */
 	int may_swap;
-
-	/* This context's SWAP_CLUSTER_MAX. If freeing memory for
-	 * suspend, we effectively ignore SWAP_CLUSTER_MAX.
-	 * In this context, it doesn't matter that we scan the
-	 * whole list at once. */
-	int swap_cluster_max;
 };

 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -1125,7 +1119,6 @@ shrink_zone(struct zone *zone, struct sc
 	nr_inactive &= ~(SWAP_CLUSTER_MAX - 1);

 	sc->nr_to_scan = SWAP_CLUSTER_MAX;
-	sc->nr_to_reclaim = sc->swap_cluster_max;

 	while (nr_active >= SWAP_CLUSTER_MAX * 1024 || nr_inactive) {
 		if (nr_active >= SWAP_CLUSTER_MAX * 1024) {
@@ -1264,7 +1257,7 @@ int try_to_free_pages(struct zone **zone
 		sc.nr_scanned = 0;
 		sc.nr_reclaimed = 0;
 		sc.priority = priority;
-		sc.swap_cluster_max = SWAP_CLUSTER_MAX;
+		sc.nr_to_reclaim = SWAP_CLUSTER_MAX;
 		if (!priority)
 			disable_swap_token();
 		shrink_caches(zones, &sc);
@@ -1275,7 +1268,7 @@ int try_to_free_pages(struct zone **zone
 		}
 		total_scanned += sc.nr_scanned;
 		total_reclaimed += sc.nr_reclaimed;
-		if (total_reclaimed >= sc.swap_cluster_max) {
+		if (total_reclaimed >= SWAP_CLUSTER_MAX) {
 			ret = 1;
 			goto out;
 		}
@@ -1287,7 +1280,7 @@ int try_to_free_pages(struct zone **zone
 		 * that's undesirable in laptop mode, where we *want* lumpy
 		 * writeout.  So in laptop mode, write out the whole world.
 		 */
-		if (total_scanned > sc.swap_cluster_max + sc.swap_cluster_max/2) {
+		if (total_scanned > SWAP_CLUSTER_MAX * 3 / 2) {
 			wakeup_pdflush(laptop_mode ? 0 : total_scanned);
 			sc.may_writepage = 1;
 		}
@@ -1376,7 +1369,7 @@ loop_again:
 		sc.nr_scanned = 0;
 		sc.nr_reclaimed = 0;
 		sc.priority = priority;
-		sc.swap_cluster_max = nr_pages ? nr_pages : SWAP_CLUSTER_MAX;
+		sc.nr_to_reclaim = nr_pages ? nr_pages : SWAP_CLUSTER_MAX;

 		/* The swap token gets in the way of swapout... */
 		if (!priority)

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 02/12] mm: supporting variables and functions for balanced zone aging

The zone aging rates are currently imbalanced, the gap can be as large as 3
times, which can severely damage read-ahead requests and shorten their
effective life time.

This patch adds three variables in struct zone
	- aging_total
	- aging_milestone
	- page_age
to keep track of page aging rate, and keep it in sync on page reclaim time.

The aging_total is just a per-zone counter-part to the per-cpu
pgscan_{kswapd,direct}_{zone name}. But it is not direct comparable between
zones, so the aging_milestone/page_age are maintained based on aging_total.

The page_age is a normalized value that can be direct compared between zones
with the helper macro pages_more_aged(). The goal of balancing logics are to
keep this normalized value in sync between zones.

One can check the balanced aging progress by running:
                        tar c / | cat > /dev/null &
                        watch -n1 'grep "age " /proc/zoneinfo'

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 include/linux/mmzone.h |   14 ++++++++++++++
 mm/page_alloc.c        |   11 +++++++++++
 mm/vmscan.c            |   39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+)

--- linux.orig/include/linux/mmzone.h
+++ linux/include/linux/mmzone.h
@@ -149,6 +149,20 @@ struct zone {
 	unsigned long		pages_scanned;	   /* since last reclaim */
 	int			all_unreclaimable; /* All pages pinned */

+	/* Fields for balanced page aging:
+	 * aging_total     - The accumulated number of activities that may
+	 *                   cause page aging, that is, make some pages closer
+	 *                   to the tail of inactive_list.
+	 * aging_milestone - A snapshot of total_scan every time a full
+	 *                   inactive_list of pages become aged.
+	 * page_age        - A normalized value showing the percent of pages
+	 *                   have been aged.  It is compared between zones to
+	 *                   balance the rate of page aging.
+	 */
+	unsigned long		aging_total;
+	unsigned long		aging_milestone;
+	unsigned long		page_age;
+
 	/*
 	 * Does the allocator try to reclaim pages from the zone as soon
 	 * as it fails a watermark_ok() in __alloc_pages?
--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -123,6 +123,44 @@ static long total_memory;
 static LIST_HEAD(shrinker_list);
 static DECLARE_RWSEM(shrinker_rwsem);

+#ifdef CONFIG_HIGHMEM64G
+#define		PAGE_AGE_SHIFT  8
+#elif BITS_PER_LONG == 32
+#define		PAGE_AGE_SHIFT  12
+#elif BITS_PER_LONG == 64
+#define		PAGE_AGE_SHIFT  20
+#else
+#error unknown BITS_PER_LONG
+#endif
+#define		PAGE_AGE_MASK   ((1 << PAGE_AGE_SHIFT) - 1)
+
+/*
+ * The simplified code is: (a->page_age > b->page_age)
+ * The complexity deals with the wrap-around problem.
+ * Two page ages not close enough should also be ignored:
+ * they are out of sync and the comparison may be nonsense.
+ */
+#define pages_more_aged(a, b) 						\
+	((b->page_age - a->page_age) & PAGE_AGE_MASK) >			\
+			PAGE_AGE_MASK - (1 << (PAGE_AGE_SHIFT - 3))	\
+
+/*
+ * Keep track of the percent of cold pages that have been scanned / aged.
+ * It's not really ##%, but a high resolution normalized value.
+ */
+static inline void update_zone_age(struct zone *z, int nr_scan)
+{
+	unsigned long len = z->nr_inactive | 1;
+
+	z->aging_total += nr_scan;
+
+	if (z->aging_total - z->aging_milestone > len)
+		z->aging_milestone += len;
+
+	z->page_age = ((z->aging_total - z->aging_milestone)
+						<< PAGE_AGE_SHIFT) / len;
+}
+
 /*
  * Add a shrinker callback to be called from the vm
  */
@@ -888,6 +926,7 @@ static void shrink_cache(struct zone *zo
 					     &page_list, &nr_scan);
 		zone->nr_inactive -= nr_taken;
 		zone->pages_scanned += nr_scan;
+		update_zone_age(zone, nr_scan);
 		spin_unlock_irq(&zone->lru_lock);

 		if (nr_taken == 0)
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -1521,6 +1521,8 @@ void show_free_areas(void)
 			" active:%lukB"
 			" inactive:%lukB"
 			" present:%lukB"
+			" aging:%lukB"
+			" age:%lu"
 			" pages_scanned:%lu"
 			" all_unreclaimable? %s"
 			"\n",
@@ -1532,6 +1534,8 @@ void show_free_areas(void)
 			K(zone->nr_active),
 			K(zone->nr_inactive),
 			K(zone->present_pages),
+			K(zone->aging_total),
+			zone->page_age,
 			zone->pages_scanned,
 			(zone->all_unreclaimable ? "yes" : "no")
 			);
@@ -2145,6 +2149,9 @@ static void __init free_area_init_core(s
 		zone->nr_scan_inactive = 0;
 		zone->nr_active = 0;
 		zone->nr_inactive = 0;
+		zone->aging_total = 0;
+		zone->aging_milestone = 0;
+		zone->page_age = 0;
 		atomic_set(&zone->reclaim_in_progress, 0);
 		if (!size)
 			continue;
@@ -2293,6 +2300,8 @@ static int zoneinfo_show(struct seq_file
 			   "\n        high     %lu"
 			   "\n        active   %lu"
 			   "\n        inactive %lu"
+			   "\n        aging    %lu"
+			   "\n        age      %lu"
 			   "\n        scanned  %lu (a: %lu i: %lu)"
 			   "\n        spanned  %lu"
 			   "\n        present  %lu",
@@ -2302,6 +2311,8 @@ static int zoneinfo_show(struct seq_file
 			   zone->pages_high,
 			   zone->nr_active,
 			   zone->nr_inactive,
+			   zone->aging_total,
+			   zone->page_age,
 			   zone->pages_scanned,
 			   zone->nr_scan_active, zone->nr_scan_inactive,
 			   zone->spanned_pages,

--
Andrew Morton | 1 Dec 11:37

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote:
>
>  The zone aging rates are currently imbalanced,

ZONE_DMA is out of whack.  It shouldn't be, and I'm not aware of anyone
getting in and working out why.  I certainly wouldn't want to go and add
all this stuff without having a good understanding of _why_ it's out of
whack.  Perhaps it's just some silly bug, like the thing I pointed at in
the previous email.

> the gap can be as large as 3 times,

What's the testcase?

> which can severely damage read-ahead requests and shorten their
>  effective life time.

Have you any performance numbers for this?
Marcelo Tosatti | 1 Dec 23:28
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

Hi Andrew,

On Thu, Dec 01, 2005 at 02:37:14AM -0800, Andrew Morton wrote:

> Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > > > The zone aging rates are currently imbalanced, > > ZONE_DMA is out of whack. It shouldn't be, and I'm not aware of anyone > getting in and working out why. I certainly wouldn't want to go and add > all this stuff without having a good understanding of _why_ it's out of > whack. Perhaps it's just some silly bug, like the thing I pointed at in > the previous email.
I think that the problem is caused by the interaction between the way reclaiming is quantified and parallel allocators. The zones have different sizes, and each zone reclaim iteration scans the same number of pages. It is unfair. On top of that, kswapd is likely to block while doing its job, which means that allocators have a chance to run. It seems that scaling the number of isolated pages to zone size fixes the unbalancing problem, making the Normal zone be _more_ scanned than DMA. Which is expected since the lower zone protection logic decreases allocation pressure from DMA sending it straight to the Normal zone (therefore zeroing lower_zone_protection should make the scanning proportionally equal). Test used was FFSB using the following profile on a 128MB P-3 machine: num_filesystems=1 num_threadgroups=1 directio=0 time=300 [filesystem0] location=/mnt/hda4/ num_files=20 num_dirs=10 max_filesize=91534338 min_filesize=65535 [end0] [threadgroup0] num_threads=10 write_size=2816 write_blocksize=4096 read_size=2816 read_blocksize=4096 create_weight=100 write_weight=30 read_weight=100 [end0] And the scanning ratios are: Normal: 114688kB DMA: 16384kB Normal/DMA ratio = 114688 / 16384 = 7.000 ******* 2.6.14 vanilla ******** * kswapd scanning rates pgscan_kswapd_normal 450483 pgscan_kswapd_dma 84645 pgscan_kswapd Normal/DMA = (450483 / 88869) = 5.069 * direct scanning rates pgscan_direct_normal 23826 pgscan_direct_dma 4224 pgscan_direct Normal/DMA = (23826 / 4224) = 5.640 * global (kswapd+direct) scanning rates pgscan_normal = (450483 + 23826) = 474309 pgscan_dma = (84645 + 4224) = 88869 pgscan Normal/DMA = (474309 / 88869) = 5.337 pgalloc_normal = 794293 pgalloc_dma = 123805 pgalloc_normal_dma_ratio = (794293/123805) = 6.415 ****** 2.6.14 isolate relative ***** * kswapd scanning rates pgscan_kswapd_normal 664883 pgscan_kswapd_dma 82845 pgscan_kswapd Normal/DMA (664883/82845) = 8.025 * direct scanning rates pgscan_direct_normal 13485 pgscan_direct_dma 1745 pgscan_direct Normal/DMA = (13485/1745) = 7.727 * global (kswapd+direct) scanning rates pgscan_normal = (664883 + 13485) = 678368 pgscan_dma = (82845 + 1745) = 84590 pgscan Normal/DMA = (678368 / 84590) = 8.019 pgalloc_normal 699927 pgalloc_dma 66313 pgalloc_normal_dma_ratio = (699927/66313) = 10.554 I think it becomes pretty clear that this is really the case. On vanilla, for each DMA allocation, there are 6.415 NORMAL allocations, while the NORMAL zone is 7.000 times the size of DMA. With the patch, there are 10.5 NORMAL allocations for each DMA one. Batching of reclaim is affected by the relative isolation (now in smaller batches) though. --- mm/vmscan.c.orig 2006-01-01 12:44:39.000000000 -0200 +++ mm/vmscan.c 2006-01-01 16:43:54.000000000 -0200 @@ -616,8 +616,12 @@ { LIST_HEAD(page_list); struct pagevec pvec; + int nr_to_isolate; int max_scan = sc->nr_to_scan; + nr_to_isolate = (sc->swap_cluster_max * zone->present_pages) + / total_memory; + pagevec_init(&pvec, 1); lru_add_drain(); @@ -628,7 +632,8 @@ int nr_scan; int nr_freed; - nr_taken = isolate_lru_pages(sc->swap_cluster_max, + + nr_taken = isolate_lru_pages(nr_to_isolate, &zone->inactive_list, &page_list, &nr_scan); zone->nr_inactive -= nr_taken;
Andrew Morton | 2 Dec 00:03

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

Marcelo Tosatti <marcelo.tosatti <at> cyclades.com> wrote:
>
> Hi Andrew,
> 
> On Thu, Dec 01, 2005 at 02:37:14AM -0800, Andrew Morton wrote:
> > Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote:
> > >
> > >  The zone aging rates are currently imbalanced,
> > 
> > ZONE_DMA is out of whack.  It shouldn't be, and I'm not aware of anyone
> > getting in and working out why.  I certainly wouldn't want to go and add
> > all this stuff without having a good understanding of _why_ it's out of
> > whack.  Perhaps it's just some silly bug, like the thing I pointed at in
> > the previous email.
> 
> I think that the problem is caused by the interaction between 
> the way reclaiming is quantified and parallel allocators.

Could be.  But what about the bug which I think is there?  That'll cause
overscanning of the DMA zone.

> The zones have different sizes, and each zone reclaim iteration
> scans the same number of pages. It is unfair.

Nope.  See how shrink_zone() bases nr_active and nr_inactive on
zone->nr_active and zone_nr_inactive.  These calculations are intended to
cause the number of scanned pages in each zone to be

	(zone->nr-active + zone->nr_inactive) >> sc->priority.

> On top of that, kswapd is likely to block while doing its job, 
> which means that allocators have a chance to run.

kswapd should only block under rare circumstances - huge amounts of dirty
pages coming off the tail of the LRU.

> --- mm/vmscan.c.orig	2006-01-01 12:44:39.000000000 -0200
> +++ mm/vmscan.c	2006-01-01 16:43:54.000000000 -0200
> @@ -616,8 +616,12 @@
>  {

Please use `diff -p'.

Marcelo Tosatti | 2 Dec 02:26
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Thu, Dec 01, 2005 at 03:03:49PM -0800, Andrew Morton wrote:
> Marcelo Tosatti <marcelo.tosatti <at> cyclades.com> wrote:
> >
> > Hi Andrew,
> > 
> > On Thu, Dec 01, 2005 at 02:37:14AM -0800, Andrew Morton wrote:
> > > Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote:
> > > >
> > > >  The zone aging rates are currently imbalanced,
> > > 
> > > ZONE_DMA is out of whack.  It shouldn't be, and I'm not aware of anyone
> > > getting in and working out why.  I certainly wouldn't want to go and add
> > > all this stuff without having a good understanding of _why_ it's out of
> > > whack.  Perhaps it's just some silly bug, like the thing I pointed at in
> > > the previous email.
> > 
> > I think that the problem is caused by the interaction between 
> > the way reclaiming is quantified and parallel allocators.
> 
> Could be.  But what about the bug which I think is there?  That'll cause
> overscanning of the DMA zone. 

There were about 12Mb of inactive pages on the DMA zone. You're hypothesis 
was that there were no LRU pages to be scanned on DMA zone?

> > The zones have different sizes, and each zone reclaim iteration
> > scans the same number of pages. It is unfair.
> 
> Nope.  See how shrink_zone() bases nr_active and nr_inactive on
> zone->nr_active and zone_nr_inactive.  These calculations are intended to
> cause the number of scanned pages in each zone to be
> 
> 	(zone->nr-active + zone->nr_inactive) >> sc->priority.  

True... Well, I don't know, then.

> > On top of that, kswapd is likely to block while doing its job, 
> > which means that allocators have a chance to run.
> 
> kswapd should only block under rare circumstances - huge amounts of dirty
> pages coming off the tail of the LRU. 

Alright. I don't know - what could be the problem, then? 

> > --- mm/vmscan.c.orig	2006-01-01 12:44:39.000000000 -0200
> > +++ mm/vmscan.c	2006-01-01 16:43:54.000000000 -0200
> > @@ -616,8 +616,12 @@
> >  {
> 
> Please use `diff -p'.
Andrew Morton | 2 Dec 04:40

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


Marcelo Tosatti <marcelo.tosatti <at> cyclades.com> wrote: > > > Could be. But what about the bug which I think is there? That'll cause > > overscanning of the DMA zone. > > There were about 12Mb of inactive pages on the DMA zone. You're hypothesis > was that there were no LRU pages to be scanned on DMA zone?
No, my hypothesis was that balance_pgdat() had a bug. Looking at it again, I don't see it any more..
Wu Fengguang | 2 Dec 02:19
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Thu, Dec 01, 2005 at 03:03:49PM -0800, Andrew Morton wrote: > > > ZONE_DMA is out of whack. It shouldn't be, and I'm not aware of anyone > > > getting in and working out why. I certainly wouldn't want to go and add > > > all this stuff without having a good understanding of _why_ it's out of > > > whack. Perhaps it's just some silly bug, like the thing I pointed at in > > > the previous email. > > > > I think that the problem is caused by the interaction between > > the way reclaiming is quantified and parallel allocators. > > Could be. But what about the bug which I think is there? That'll cause > overscanning of the DMA zone.
Take for example these numbers: -------------------------------------------------------------------------------- active/inactive sizes on 2.6.14-1-k7-smp: 43/1000 = 116 / 2645 819/1000 = 54023 / 65881 active/inactive scan rates: dma 480/1000 = 31364 / (58377 + 6963) normal 985/1000 = 719219 / (645051 + 84579) high 0/1000 = 0 / (0 + 0) total used free shared buffers cached Mem: 503 497 6 0 0 328 -/+ buffers/cache: 168 335 Swap: 127 2 125 -------------------------------------------------------------------------------- cold-page-scan-rate = K * (direct-reclaim-count * direct-scan-prob + kswapd-reclaim-count * kswapd-scan-prob) * shrink-zone-prob (direct-reclaim-count : kswapd-reclaim-count) depends on memory pressure. Here it is DMA: 8 = 58377 / 6963 Normal: 7 = 645051 / 84579 (direct-scan-prob) is roughly equal for all zones. (kswapd-scan-prob) is expected to be equal too. So the equation can be simplified to: cold-page-scan-rate ~= C * shrink-zone-prob It depends largely on the shrink_zone() function: 843 zone->nr_scan_inactive += (zone->nr_inactive >> sc->priority) + 1; 844 nr_inactive = zone->nr_scan_inactive; 845 if (nr_inactive >= sc->swap_cluster_max) 846 zone->nr_scan_inactive = 0; 847 else 848 nr_inactive = 0; 849 850 sc->nr_to_reclaim = sc->swap_cluster_max; 851 852 while (nr_active || nr_inactive) { //... 860 if (nr_inactive) { 861 sc->nr_to_scan = min(nr_inactive, 862 (unsigned long)sc->swap_cluster_max); 863 nr_inactive -= sc->nr_to_scan; 864 shrink_cache(zone, sc); 865 if (sc->nr_to_reclaim <= 0) 866 break; 867 } 868 } Line 843 is the core of the scan balancing logic: priority 12 11 10 On each call nr_scan_inactive is increased by: DMA(2k pages) +1 +2 +3 Normal(64k pages) +17 +33 +65 Round it up to SWAP_CLUSTER_MAX=32, we get (scan batches/accumulate rounds): DMA 1/32 1/16 2/11 Normal 2/2 2/1 3/1 DMA:Normal ratio 1:32 1:32 2:33 This keeps the scan rate roughly balanced(i.e. 1:32) in low vm pressure. But lines 865-866 together with line 846 make most shrink_zone() invocations only run one batch of scan. The numbers become: DMA 1/32 1/16 1/11 Normal 1/2 1/1 1/1 DMA:Normal ratio 1:16 1:16 1:11 Now the scan ratio turns into something between 2:1 ~ 3:1 ! Another problem is that the equation in line 843 is quite coarse, 64k/127k pages result in the same result, leading to a large variance range. Thanks, Wu
Andrew Morton | 2 Dec 06:49

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > 865 if (sc->nr_to_reclaim <= 0) > 866 break; > 867 } > 868 } > > Line 843 is the core of the scan balancing logic: > > priority 12 11 10 > > On each call nr_scan_inactive is increased by: > DMA(2k pages) +1 +2 +3 > Normal(64k pages) +17 +33 +65 > > Round it up to SWAP_CLUSTER_MAX=32, we get (scan batches/accumulate rounds): > DMA 1/32 1/16 2/11 > Normal 2/2 2/1 3/1 > DMA:Normal ratio 1:32 1:32 2:33 > > This keeps the scan rate roughly balanced(i.e. 1:32) in low vm pressure. > > But lines 865-866 together with line 846 make most shrink_zone() invocations > only run one batch of scan.
Yes, this seems to be the problem. Sigh. By the time 2.6.8 came around I just didn't have time to do the amount of testing which any page reclaim tweak necessitates. From: Andrew Morton <akpm <at> osdl.org> Revert a patch which went into 2.6.8-rc1. The changelog for that patch was: The shrink_zone() logic can, under some circumstances, cause far too many pages to be reclaimed. Say, we're scanning at high priority and suddenly hit a large number of reclaimable pages on the LRU. Change things so we bale out when SWAP_CLUSTER_MAX pages have been reclaimed. Problem is, this change caused significant imbalance in inter-zone scan balancing by truncating scans of larger zones. Suppose, for example, ZONE_HIGHMEM is 10x the size of ZONE_NORMAL. The zone balancing algorithm would require that if we're scanning 100 pages of ZONE_HIGHMEM, we should scan 10 pages of ZONE_NORMAL. But this logic will cause the scanning of ZONE_HIGHMEM to bale out after only 32 pages are reclaimed. Thus effectively causing smaller zones to be scanned relatively harder than large ones. Now I need to remember what the workload was which caused me to write this patch originally, then fix it up in a different way... Signed-off-by: Andrew Morton <akpm <at> osdl.org> --- mm/vmscan.c | 8 -------- 1 files changed, 8 deletions(-) diff -puN mm/vmscan.c~vmscan-balancing-fix mm/vmscan.c --- devel/mm/vmscan.c~vmscan-balancing-fix 2005-12-01 21:20:44.000000000 -0800 +++ devel-akpm/mm/vmscan.c 2005-12-01 21:21:38.000000000 -0800 @@ -63,9 +63,6 @@ struct scan_control { unsigned long nr_mapped; /* From page_state */ - /* How many pages shrink_cache() should reclaim */ - int nr_to_reclaim; - /* Ask shrink_caches, or shrink_zone to scan at this priority */ unsigned int priority; @@ -901,7 +898,6 @@ static void shrink_cache(struct zone *zo if (current_is_kswapd()) mod_page_state(kswapd_steal, nr_freed); mod_page_state_zone(zone, pgsteal, nr_freed); - sc->nr_to_reclaim -= nr_freed; spin_lock_irq(&zone->lru_lock); /* @@ -1101,8 +1097,6 @@ shrink_zone(struct zone *zone, struct sc else nr_inactive = 0; - sc->nr_to_reclaim = sc->swap_cluster_max; - while (nr_active || nr_inactive) { if (nr_active) { sc->nr_to_scan = min(nr_active, @@ -1116,8 +1110,6 @@ shrink_zone(struct zone *zone, struct sc (unsigned long)sc->swap_cluster_max); nr_inactive -= sc->nr_to_scan; shrink_cache(zone, sc); - if (sc->nr_to_reclaim <= 0) - break; } } _
Marcelo Tosatti | 2 Dec 16:13
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Thu, Dec 01, 2005 at 09:49:31PM -0800, Andrew Morton wrote: > Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > > > 865 if (sc->nr_to_reclaim <= 0) > > 866 break; > > 867 } > > 868 } > > > > Line 843 is the core of the scan balancing logic: > > > > priority 12 11 10 > > > > On each call nr_scan_inactive is increased by: > > DMA(2k pages) +1 +2 +3 > > Normal(64k pages) +17 +33 +65 > > > > Round it up to SWAP_CLUSTER_MAX=32, we get (scan batches/accumulate rounds): > > DMA 1/32 1/16 2/11 > > Normal 2/2 2/1 3/1 > > DMA:Normal ratio 1:32 1:32 2:33 > > > > This keeps the scan rate roughly balanced(i.e. 1:32) in low vm pressure. > > > > But lines 865-866 together with line 846 make most shrink_zone() invocations > > only run one batch of scan. > > Yes, this seems to be the problem. Sigh. By the time 2.6.8 came around I > just didn't have time to do the amount of testing which any page reclaim > tweak necessitates.
Hi Andrew, It all makes sense to me (Wu's description of the problem and your patch), but still no good with reference to fair scanning. Moreover the patch hurts interactivity _badly_, not sure why (ssh into the box with FFSB testcase takes more than one minute to login, while vanilla takes few dozens of seconds). Follows an interesting part of "diff -u 2614-vanilla.vmstat 2614-akpm.vmstat" (they were not retrieve at the exact same point in the benchmark run, but that should not matter much): -slabs_scanned 37632 -kswapd_steal 731859 -kswapd_inodesteal 1363 -pageoutrun 26573 -allocstall 636 -pgrotated 1898 +slabs_scanned 2688 +kswapd_steal 502946 +kswapd_inodesteal 1 +pageoutrun 10612 +allocstall 90 +pgrotated 68 Note how direct reclaim (and slabs_scanned) are hugely affected. Normal: 114688kB DMA: 16384kB Normal/DMA ratio = 114688 / 16384 = 7.000 ******* 2.6.14 vanilla ******** * kswapd scanning rates pgscan_kswapd_normal 450483 pgscan_kswapd_dma 84645 pgscan_kswapd Normal/DMA = (450483 / 88869) = 5.069 * direct scanning rates pgscan_direct_normal 23826 pgscan_direct_dma 4224 pgscan_direct Normal/DMA = (23826 / 4224) = 5.640 * global (kswapd+direct) scanning rates pgscan_normal = (450483 + 23826) = 474309 pgscan_dma = (84645 + 4224) = 88869 pgscan Normal/DMA = (474309 / 88869) = 5.337 pgalloc_normal = 794293 pgalloc_dma = 123805 pgalloc_normal_dma_ratio = (794293/123805) = 6.415 ******* 2.6.14 akpm-no-nr_to_reclaim ******** * kswapd scanning rates pgscan_kswapd_normal 441936 pgscan_kswapd_dma 80520 pgscan_kswapd Normal/DMA = (441936 / 80520) = 5.488 * direct scanning rates pgscan_direct_normal 7392 pgscan_direct_dma 1188 pgscan_direct Normal/DMA = (7392/1188) = 6.222 * global (kswapd+direct) scanning rates pgscan_normal = (441936 + 7392) = 449328 pgscan_dma = (80520 + 1188) = 81708 pgscan Normal/DMA = (449328 / 81708) = 5.499 pgalloc_normal = 559994 pgalloc_dma = 84883 pgalloc_normal_dma_ratio = (559994 / 8488) = 6.597 ****** 2.6.14 isolate relative ***** * kswapd scanning rates pgscan_kswapd_normal 664883 pgscan_kswapd_dma 82845 pgscan_kswapd Normal/DMA (664883/82845) = 8.025 * direct scanning rates pgscan_direct_normal 13485 pgscan_direct_dma 1745 pgscan_direct Normal/DMA = (13485/1745) = 7.727 * global (kswapd+direct) scanning rates pgscan_normal = (664883 + 13485) = 678368 pgscan_dma = (82845 + 1745) = 84590 pgscan Normal/DMA = (678368 / 84590) = 8.019 pgalloc_normal 699927 pgalloc_dma 66313 pgalloc_normal_dma_ratio = (699927/66313) = 10.554
Andrew Morton | 2 Dec 22:39

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

Marcelo Tosatti <marcelo.tosatti <at> cyclades.com> wrote:
>
> 
> It all makes sense to me (Wu's description of the problem and your patch), 
> but still no good with reference to fair scanning.

Not so.  On a 4G x86 box doing a simple 8GB write this patch took the
highmem/normal scanning ratio from 0.7 to 3.5.  On that setup the highmem
zone has 3.6x as many pages as the normal zone, so it's bang-on-target.

There's not a lot of point in jumping straight into the complex stresstests
without having first tested the simple stuff.

> Moreover the patch hurts 
> interactivity _badly_, not sure why (ssh into the box with FFSB testcase 
> takes more than one minute to login, while vanilla takes few dozens of seconds). 

Well, we know that the revert reintroduces an overscanning problem.

How are you invoking FFSB?  Exactly?  On what sort of machine, with how
much memory?

> Follows an interesting part of "diff -u 2614-vanilla.vmstat 2614-akpm.vmstat"
> (they were not retrieve at the exact same point in the benchmark run, but 
> that should not matter much):
> 
> -slabs_scanned 37632
> -kswapd_steal 731859
> -kswapd_inodesteal 1363
> -pageoutrun 26573
> -allocstall 636
> -pgrotated 1898
> +slabs_scanned 2688
> +kswapd_steal 502946
> +kswapd_inodesteal 1
> +pageoutrun 10612
> +allocstall 90
> +pgrotated 68
>
> Note how direct reclaim (and slabs_scanned) are hugely affected. 

hm.  allocstall is much lower and pgrotated has improved and direct reclaim
has improved.  All of which would indicate that kswapd is doing more work. 
Yet kswapd reclaimed less pages.  It's hard to say what's going on as these
numbers came from different stages of the test.

> 
> Normal: 114688kB
> DMA: 16384kB
> 
> Normal/DMA ratio = 114688 / 16384 = 7.000
>
> pgscan_kswapd Normal/DMA = (450483 / 88869) = 5.069
> pgscan_direct Normal/DMA = (23826 / 4224) = 5.640
> pgscan Normal/DMA = (474309 / 88869) = 5.337
> pgscan_kswapd Normal/DMA = (441936 / 80520) = 5.488
> pgscan_direct Normal/DMA = (7392/1188) = 6.222
> pgscan Normal/DMA = (449328 / 81708) = 5.499
> pgalloc_normal_dma_ratio = (559994 / 8488) = 6.597
> pgscan_kswapd Normal/DMA (664883/82845) = 8.025
> pgscan_direct Normal/DMA = (13485/1745) = 7.727
> pgscan Normal/DMA = (678368 / 84590) = 8.019
> pgalloc_normal_dma_ratio = (699927/66313) = 10.554

All of these look close enough to me.  10-20% over- or under-scanning of
the teeny DMA zone doesn't seem very important.  Getting normal-vs-highmem
right is more important.

It's hard to say what effect the watermark thingies have on all of this. 
I'd sugget that you start out with much less complex tests and see if `echo
10000 10000 10000 > /proc/sys/vm/lowmem_reserve_ratio' changes anything. 
(I have that in my rc.local - the thing is a daft waste of memory).

I'd be more concerned about the interactivity thing, although it sounds
like the machine is so overloaded with this test that it'd be fairly
pointless to try to tune that workload first.  It's more important to tune
the system for more typical heavy loads.

Also, the choice of IO scheduler matters.  Which are you using?
Marcelo Tosatti | 3 Dec 01:26
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

On Fri, Dec 02, 2005 at 01:39:17PM -0800, Andrew Morton wrote:
> Marcelo Tosatti <marcelo.tosatti <at> cyclades.com> wrote:
> >
> > 
> > It all makes sense to me (Wu's description of the problem and your patch), 
> > but still no good with reference to fair scanning.
> 
> Not so.  On a 4G x86 box doing a simple 8GB write this patch took the
> highmem/normal scanning ratio from 0.7 to 3.5.  On that setup the highmem
> zone has 3.6x as many pages as the normal zone, so it's bang-on-target.

Humpf!  What are the pgalloc dma/normal/highmem numbers under such test?

Does this machine need bounce buffers for disk I/O?

> There's not a lot of point in jumping straight into the complex stresstests
> without having first tested the simple stuff.

Its not a really complex stresstest, though yours is simpler. There are 10 
threads operating on 20 files. You can reproduce the load using the 
following FFSB profile (I remake the filesystem each time, results are 
pretty stable):

num_filesystems=1
num_threadgroups=1
directio=0
time=300

[filesystem0]
location=/mnt/hda4/
num_files=20
num_dirs=10
max_filesize=91534338
min_filesize=65535
[end0]

[threadgroup0]
num_threads=10
write_size=2816
write_blocksize=4096
read_size=2816
read_blocksize=4096
create_weight=100
write_weight=30
read_weight=100
[end0]

> > Moreover the patch hurts 
> > interactivity _badly_, not sure why (ssh into the box with FFSB testcase 
> > takes more than one minute to login, while vanilla takes few dozens of seconds). 
> 
> Well, we know that the revert reintroduces an overscanning problem.

Can you remember the testcase for which you added the "truncate reclaim"
logic more precisely? 

> How are you invoking FFSB?  Exactly?  On what sort of machine, with how
> much memory?

Its a single processor Pentium-3 1GHz+ booted with mem=128M, 4/5 years old IDE disk.

> > Follows an interesting part of "diff -u 2614-vanilla.vmstat 2614-akpm.vmstat"
> > (they were not retrieve at the exact same point in the benchmark run, but 
> > that should not matter much):
> > 
> > -slabs_scanned 37632
> > -kswapd_steal 731859
> > -kswapd_inodesteal 1363
> > -pageoutrun 26573
> > -allocstall 636
> > -pgrotated 1898
> > +slabs_scanned 2688
> > +kswapd_steal 502946
> > +kswapd_inodesteal 1
> > +pageoutrun 10612
> > +allocstall 90
> > +pgrotated 68
> >
> > Note how direct reclaim (and slabs_scanned) are hugely affected. 
> 
> hm.  allocstall is much lower and pgrotated has improved and direct reclaim
> has improved.  All of which would indicate that kswapd is doing more work. 
> Yet kswapd reclaimed less pages.  It's hard to say what's going on as these
> numbers came from different stages of the test.

I have a feeling they came from a somewhat equivalent stage (FFSB is a cyclic
test, there are not much of "phases" after the initial creation of files).

Feel free to reproduce the testcase, you simply need the FFSB profile 
above and mem=128M.

It seems very fragile (Wu's patches attempt to address that) in general: you
tweak it here and watch it go nuts there.

> > Normal: 114688kB
> > DMA: 16384kB
> > 
> > Normal/DMA ratio = 114688 / 16384 = 7.000
> >
> > pgscan_kswapd Normal/DMA = (450483 / 88869) = 5.069
> > pgscan_direct Normal/DMA = (23826 / 4224) = 5.640
> > pgscan Normal/DMA = (474309 / 88869) = 5.337
> > pgscan_kswapd Normal/DMA = (441936 / 80520) = 5.488
> > pgscan_direct Normal/DMA = (7392/1188) = 6.222
> > pgscan Normal/DMA = (449328 / 81708) = 5.499
> > pgalloc_normal_dma_ratio = (559994 / 8488) = 6.597
> > pgscan_kswapd Normal/DMA (664883/82845) = 8.025
> > pgscan_direct Normal/DMA = (13485/1745) = 7.727
> > pgscan Normal/DMA = (678368 / 84590) = 8.019
> > pgalloc_normal_dma_ratio = (699927/66313) = 10.554
> 
> All of these look close enough to me.  10-20% over- or under-scanning of
> the teeny DMA zone doesn't seem very important.

Hopefully yes. The lowmem_reserve[] logic is there to _avoid_ over-allocation
(over-scanning) of the DMA zone by GFP_NORMAL allocations, isnt it?

Note, there should be no DMA limited hardware on this box (I'm using PIO for the 
IDE disk). BTW, why do you need lowmem_reserve for the DMA zone if you don't 
have 16MB capped ISA devices on your system?

> Getting normal-vs-highmem right is more important.
> 
> It's hard to say what effect the watermark thingies have on all of this. 
> I'd sugget that you start out with much less complex tests and see if `echo
> 10000 10000 10000 > /proc/sys/vm/lowmem_reserve_ratio' changes anything. 
> (I have that in my rc.local - the thing is a daft waste of memory).
> 
> I'd be more concerned about the interactivity thing, although it sounds
> like the machine is so overloaded with this test that it'd be fairly
> pointless to try to tune that workload first.  It's more important to tune
> the system for more typical heavy loads.

What made me notice it was the huge interactivity difference between
vanilla and your patch, again, I'm not really sure about its root.

> Also, the choice of IO scheduler matters.  Which are you using?

The default for 2.6.14. Thats AS right? 

I'll see if I can do more tests next week.

Best wishes.
Wu Fengguang | 4 Dec 07:06
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Fri, Dec 02, 2005 at 10:26:14PM -0200, Marcelo Tosatti wrote: > It seems very fragile (Wu's patches attempt to address that) in general: you > tweak it here and watch it go nuts there.
The patch still has problems, and it can lead to more page allocations in remote nodes. For NUMA systems, basicly HPC applications want locality, and file servers want cache consistency. More worse two types of applications can coexist in one single system. The general solution may be classifying pages into two types: local pages: mostly local accessed, and low latency is first priority global pages: for consistent file caching Reclaims from global pages should be balanced globally to make a seamlessly single global cache. We can allocate special zones to hold the global pages, and make the reclaims from them in sync. Nick, are you working on this? Thanks, Wu
Wu Fengguang | 2 Dec 08:18
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Thu, Dec 01, 2005 at 09:49:31PM -0800, Andrew Morton wrote: > From: Andrew Morton <akpm <at> osdl.org> > > Revert a patch which went into 2.6.8-rc1. The changelog for that patch was: > > The shrink_zone() logic can, under some circumstances, cause far too many > pages to be reclaimed. Say, we're scanning at high priority and suddenly > hit a large number of reclaimable pages on the LRU. > > Change things so we bale out when SWAP_CLUSTER_MAX pages have been > reclaimed. > > Problem is, this change caused significant imbalance in inter-zone scan > balancing by truncating scans of larger zones. > > Suppose, for example, ZONE_HIGHMEM is 10x the size of ZONE_NORMAL. The zone > balancing algorithm would require that if we're scanning 100 pages of > ZONE_HIGHMEM, we should scan 10 pages of ZONE_NORMAL. But this logic will > cause the scanning of ZONE_HIGHMEM to bale out after only 32 pages are > reclaimed. Thus effectively causing smaller zones to be scanned relatively > harder than large ones. > > Now I need to remember what the workload was which caused me to write this > patch originally, then fix it up in a different way...
Maybe it's a situation like this: __|____|________|________________|________________________________|________________________________________________________________|________________________________________________________________________________________________________________________________|________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------- _: pinned chunk -: reclaimable chunk |: shrink_zone() invocation First we run into a large range of pinned chunks, which lowered the scan priority. And then there are plenty of reclaimable chunks, bomb... Thanks, Wu
Andrew Morton | 2 Dec 08:27

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > First we run into a large range of pinned chunks, which lowered the scan > priority. And then there are plenty of reclaimable chunks, bomb...
It doesn't have to be that complex - the unreclaimable pages could be referenced, or under writeback or even simply dirty.
Wu Fengguang | 10 Dec 12:59
Picon

[BUG 2.6.15-rc5] EXT3-fs error and soft lockup detected

Hello,

I got this message when exiting qemu:

[11266.262154] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318015
[11266.276897] Aborting journal on device hda.
[11266.283815] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318016
[11266.293567] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318017
[11266.303451] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318018
[11266.313478] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318019
[11266.323347] EXT3-fs error (device hda): ext3_free_blocks_sb: bit already cleared for block 318020
[11266.333543] EXT3-fs error (device hda) in ext3_reserve_inode_write: Journal has aborted
[11266.342839] EXT3-fs error (device hda) in ext3_reserve_inode_write: Journal has aborted
[11266.351870] EXT3-fs error (device hda) in ext3_orphan_del: Journal has aborted
[11266.360540] EXT3-fs error (device hda) in ext3_truncate: Journal has aborted
[11266.412056] __journal_remove_journal_head: freeing b_committed_data
[11266.421834] ext3_abort called.
[11266.425341] EXT3-fs error (device hda): ext3_journal_start_sb: Detected aborted journal
[11266.433776] Remounting filesystem read-only
[11269.300117] md: stopping all md devices.
[11269.304244] md: md0 switched to read-only mode.
[11280.827080] BUG: soft lockup detected on CPU#0!
[11280.831556]
[11280.833147] Pid: 4045, comm:               reboot
[11280.837736] EIP: 0060:[<c010eee0>] CPU: 0
[11280.841809] EIP is at delay_pit+0x20/0x30
[11280.845681]  EFLAGS: 00000207    Not tainted  (2.6.15-rc5)
[11280.850828] EAX: 0021fb30 EBX: 00000263 ECX: 01062560 EDX: c0382ce0
[11280.856882] ESI: c039e694 EDI: 00000001 EBP: f76bfe5c DS: 007b ES: 007b
[11280.863077] CR0: 8005003b CR2: b7eed1e0 CR3: 376b6000 CR4: 00000690
[11292.459790] BUG: soft lockup detected on CPU#0!
[11292.464232]
[11292.465761] Pid: 4045, comm:               reboot
[11292.470325] EIP: 0060:[<c010eee0>] CPU: 0
[11292.474308] EIP is at delay_pit+0x20/0x30
[11292.478323]  EFLAGS: 00000203    Not tainted  (2.6.15-rc5)
[11292.483916] EAX: 001c3b0d EBX: 000000e8 ECX: 01062560 EDX: c0382ce0
[11292.490033] ESI: c039e694 EDI: 00000001 EBP: f76bfe5c DS: 007b ES: 007b
[11292.496736] CR0: 8005003b CR2: b7eed1e0 CR3: 376b6000 CR4: 00000690
[11300.161270] Restarting system.
[11300.164557] .
zsh: 20369 terminated  qemu -m 1156 ~/linux_image -kernel ~/linux-2.6.15-rc5/arch/i386/boot/bzImage


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.15-rc5
# Sat Dec 10 16:01:04 2005
#
CONFIG_X86_32=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
# CONFIG_CPUSETS is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_EMBEDDED=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_LBD=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=m
# CONFIG_DEFAULT_AS is not set
CONFIG_DEFAULT_DEADLINE=y
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="deadline"

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
CONFIG_M586=y
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_F00F_BUG=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_ALIGNMENT_16=y
CONFIG_HPET_TIMER=y
CONFIG_SMP=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_SMT=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=m
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_TOSHIBA=m
CONFIG_I8K=m
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
CONFIG_EDD=m
# CONFIG_DELL_RBU is not set
CONFIG_DCDBAS=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
# CONFIG_SPARSEMEM_STATIC is not set
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_IRQBALANCE=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_PHYSICAL_START=0x100000
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_PM_LEGACY=y
# CONFIG_PM_DEBUG is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
# CONFIG_ACPI_HOTKEY is not set
CONFIG_ACPI_FAN=m
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_ASUS=m
CONFIG_ACPI_IBM=m
CONFIG_ACPI_TOSHIBA=m
CONFIG_ACPI_BLACKLIST_YEAR=2001
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=m

#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=m
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=m
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=m
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=m
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set

#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_POWERNOW_K6=m
CONFIG_X86_POWERNOW_K7=m
CONFIG_X86_POWERNOW_K7_ACPI=y
CONFIG_X86_POWERNOW_K8=m
CONFIG_X86_POWERNOW_K8_ACPI=y
CONFIG_X86_GX_SUSPMOD=m
CONFIG_X86_SPEEDSTEP_CENTRINO=m
CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI=y
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y
CONFIG_X86_SPEEDSTEP_ICH=m
CONFIG_X86_SPEEDSTEP_SMI=m
CONFIG_X86_P4_CLOCKMOD=m
CONFIG_X86_CPUFREQ_NFORCE2=m
CONFIG_X86_LONGRUN=m
CONFIG_X86_LONGHAUL=m

#
# shared options
#
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
CONFIG_X86_SPEEDSTEP_LIB=m
CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK=y

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI_PCIE=m
# CONFIG_HOTPLUG_PCI_PCIE_POLL_EVENT_MODE is not set
CONFIG_PCI_MSI=y
# CONFIG_PCI_LEGACY_PROC is not set
# CONFIG_PCI_DEBUG is not set
CONFIG_ISA_DMA_API=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
CONFIG_SCx200=m
# CONFIG_HOTPLUG_CPU is not set

#
# PCCARD (PCMCIA/CardBus) support
#
CONFIG_PCCARD=m
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_PCMCIA=m
CONFIG_PCMCIA_LOAD_CIS=y
CONFIG_PCMCIA_IOCTL=y
CONFIG_CARDBUS=y

#
# PC-card bridges
#
CONFIG_YENTA=m
CONFIG_PD6729=m
CONFIG_I82092=m
CONFIG_I82365=m
CONFIG_TCIC=m
CONFIG_PCMCIA_PROBE=y
CONFIG_PCCARD_NONSTATIC=m

#
# PCI Hotplug Support
#
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
CONFIG_HOTPLUG_PCI_COMPAQ=m
# CONFIG_HOTPLUG_PCI_COMPAQ_NVRAM is not set
CONFIG_HOTPLUG_PCI_IBM=m
CONFIG_HOTPLUG_PCI_ACPI=m
CONFIG_HOTPLUG_PCI_ACPI_IBM=m
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_CPCI_ZT5550=m
CONFIG_HOTPLUG_PCI_CPCI_GENERIC=m
CONFIG_HOTPLUG_PCI_SHPC=m
# CONFIG_HOTPLUG_PCI_SHPC_POLL_EVENT_MODE is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_MISC=m

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=m
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_MULTIPATH=y
# CONFIG_IP_ROUTE_MULTIPATH_CACHED is not set
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_BIC=y

#
# IP: Virtual Server Configuration
#
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12

#
# IPVS transport protocol load balancing support
#
CONFIG_IP_VS_PROTO_TCP=y
CONFIG_IP_VS_PROTO_UDP=y
CONFIG_IP_VS_PROTO_ESP=y
CONFIG_IP_VS_PROTO_AH=y

#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
CONFIG_IP_VS_SED=m
CONFIG_IP_VS_NQ=m

#
# IPVS application helper
#
CONFIG_IP_VS_FTP=m
CONFIG_IPV6=m
CONFIG_IPV6_PRIVACY=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_INET6_TUNNEL=m
CONFIG_IPV6_TUNNEL=m
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y

#
# Core Netfilter Configuration
#
# CONFIG_NETFILTER_NETLINK is not set

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_CT_ACCT=y
CONFIG_IP_NF_CONNTRACK_MARK=y
# CONFIG_IP_NF_CONNTRACK_EVENTS is not set
CONFIG_IP_NF_CT_PROTO_SCTP=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
# CONFIG_IP_NF_NETBIOS_NS is not set
CONFIG_IP_NF_TFTP=m
CONFIG_IP_NF_AMANDA=m
# CONFIG_IP_NF_PPTP is not set
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_PHYSDEV=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
CONFIG_IP_NF_MATCH_SCTP=m
# CONFIG_IP_NF_MATCH_DCCP is not set
CONFIG_IP_NF_MATCH_COMMENT=m
CONFIG_IP_NF_MATCH_CONNMARK=m
# CONFIG_IP_NF_MATCH_CONNBYTES is not set
CONFIG_IP_NF_MATCH_HASHLIMIT=m
# CONFIG_IP_NF_MATCH_STRING is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
# CONFIG_IP_NF_TARGET_NFQUEUE is not set
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
CONFIG_IP_NF_NAT_SNMP_BASIC=m
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_NAT_TFTP=m
CONFIG_IP_NF_NAT_AMANDA=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
# CONFIG_IP_NF_TARGET_TTL is not set
CONFIG_IP_NF_TARGET_CONNMARK=m
CONFIG_IP_NF_TARGET_CLUSTERIP=m
CONFIG_IP_NF_RAW=m
CONFIG_IP_NF_TARGET_NOTRACK=m
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m

#
# IPv6: Netfilter Configuration (EXPERIMENTAL)
#
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_LIMIT=m
CONFIG_IP6_NF_MATCH_MAC=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_MULTIPORT=m
CONFIG_IP6_NF_MATCH_OWNER=m
CONFIG_IP6_NF_MATCH_MARK=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_AHESP=m
CONFIG_IP6_NF_MATCH_LENGTH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_PHYSDEV=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_LOG=m
# CONFIG_IP6_NF_TARGET_REJECT is not set
# CONFIG_IP6_NF_TARGET_NFQUEUE is not set
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_TARGET_MARK=m
# CONFIG_IP6_NF_TARGET_HL is not set
CONFIG_IP6_NF_RAW=m

#
# DECnet: Netfilter Configuration
#
CONFIG_DECNET_NF_GRABULATOR=m

#
# Bridge: Netfilter Configuration
#
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m

#
# DCCP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_DCCP is not set

#
# SCTP Configuration (EXPERIMENTAL)
#
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_MSG is not set
# CONFIG_SCTP_DBG_OBJCNT is not set
# CONFIG_SCTP_HMAC_NONE is not set
# CONFIG_SCTP_HMAC_SHA1 is not set
CONFIG_SCTP_HMAC_MD5=y
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
CONFIG_ATM_BR2684=m
# CONFIG_ATM_BR2684_IPFILTER is not set
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
CONFIG_DECNET=m
# CONFIG_DECNET_ROUTER is not set
CONFIG_LLC=y
CONFIG_LLC2=m
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
CONFIG_ATALK=m
CONFIG_DEV_APPLETALK=y
CONFIG_LTPC=m
CONFIG_COPS=m
CONFIG_COPS_DAYNA=y
CONFIG_COPS_TANGENT=y
CONFIG_IPDDP=m
CONFIG_IPDDP_ENCAP=y
CONFIG_IPDDP_DECAP=y
CONFIG_X25=m
CONFIG_LAPB=m
# CONFIG_NET_DIVERT is not set
CONFIG_ECONET=m
CONFIG_ECONET_AUNUDP=y
CONFIG_ECONET_NATIVE=y
CONFIG_WAN_ROUTER=m

#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set

#
# Queueing/Scheduling
#
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
CONFIG_NET_SCH_HFSC=m
CONFIG_NET_SCH_ATM=m
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
CONFIG_NET_SCH_NETEM=m
CONFIG_NET_SCH_INGRESS=m

#
# Classification
#
CONFIG_NET_CLS=y
# CONFIG_NET_CLS_BASIC is not set
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_CLS_U32_MARK is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_EMATCH is not set
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_ESTIMATOR=y

#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_IEEE80211 is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
# CONFIG_DEBUG_DRIVER is not set

#
# Connector - unified userspace <-> kernelspace linker
#
CONFIG_CONNECTOR=m

#
# Memory Technology Devices (MTD)
#
CONFIG_MTD=m
# CONFIG_MTD_DEBUG is not set
CONFIG_MTD_CONCAT=m
CONFIG_MTD_PARTITIONS=y
CONFIG_MTD_REDBOOT_PARTS=m
CONFIG_MTD_REDBOOT_DIRECTORY_BLOCK=-1
# CONFIG_MTD_REDBOOT_PARTS_UNALLOCATED is not set
# CONFIG_MTD_REDBOOT_PARTS_READONLY is not set
# CONFIG_MTD_CMDLINE_PARTS is not set

#
# User Modules And Translation Layers
#
CONFIG_MTD_CHAR=m
CONFIG_MTD_BLOCK=m
CONFIG_MTD_BLOCK_RO=m
CONFIG_FTL=m
CONFIG_NFTL=m
CONFIG_NFTL_RW=y
CONFIG_INFTL=m
# CONFIG_RFD_FTL is not set

#
# RAM/ROM/Flash chip drivers
#
CONFIG_MTD_CFI=m
CONFIG_MTD_JEDECPROBE=m
CONFIG_MTD_GEN_PROBE=m
# CONFIG_MTD_CFI_ADV_OPTIONS is not set
CONFIG_MTD_MAP_BANK_WIDTH_1=y
CONFIG_MTD_MAP_BANK_WIDTH_2=y
CONFIG_MTD_MAP_BANK_WIDTH_4=y
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set
CONFIG_MTD_CFI_I1=y
CONFIG_MTD_CFI_I2=y
# CONFIG_MTD_CFI_I4 is not set
# CONFIG_MTD_CFI_I8 is not set
CONFIG_MTD_CFI_INTELEXT=m
CONFIG_MTD_CFI_AMDSTD=m
CONFIG_MTD_CFI_AMDSTD_RETRY=0
CONFIG_MTD_CFI_STAA=m
CONFIG_MTD_CFI_UTIL=m
CONFIG_MTD_RAM=m
CONFIG_MTD_ROM=m
CONFIG_MTD_ABSENT=m
# CONFIG_MTD_OBSOLETE_CHIPS is not set

#
# Mapping drivers for chip access
#
CONFIG_MTD_COMPLEX_MAPPINGS=y
CONFIG_MTD_PHYSMAP=m
CONFIG_MTD_PHYSMAP_START=0x8000000
CONFIG_MTD_PHYSMAP_LEN=0x4000000
CONFIG_MTD_PHYSMAP_BANKWIDTH=2
CONFIG_MTD_PNC2000=m
CONFIG_MTD_SC520CDP=m
CONFIG_MTD_NETSC520=m
# CONFIG_MTD_TS5500 is not set
CONFIG_MTD_SBC_GXX=m
CONFIG_MTD_SCx200_DOCFLASH=m
# CONFIG_MTD_AMD76XROM is not set
# CONFIG_MTD_ICHXROM is not set
# CONFIG_MTD_SCB2_FLASH is not set
CONFIG_MTD_NETtel=m
CONFIG_MTD_DILNETPC=m
CONFIG_MTD_DILNETPC_BOOTSIZE=0x80000
# CONFIG_MTD_L440GX is not set
CONFIG_MTD_PCI=m
CONFIG_MTD_PCMCIA=m
# CONFIG_MTD_PCMCIA_ANONYMOUS is not set
# CONFIG_MTD_PLATRAM is not set

#
# Self-contained MTD device drivers
#
CONFIG_MTD_PMC551=m
# CONFIG_MTD_PMC551_BUGFIX is not set
# CONFIG_MTD_PMC551_DEBUG is not set
CONFIG_MTD_SLRAM=m
CONFIG_MTD_PHRAM=m
CONFIG_MTD_MTDRAM=m
CONFIG_MTDRAM_TOTAL_SIZE=4096
CONFIG_MTDRAM_ERASE_SIZE=128
CONFIG_MTD_BLKMTD=m
CONFIG_MTD_BLOCK2MTD=m

#
# Disk-On-Chip Device Drivers
#
CONFIG_MTD_DOC2000=m
CONFIG_MTD_DOC2001=m
CONFIG_MTD_DOC2001PLUS=m
CONFIG_MTD_DOCPROBE=m
CONFIG_MTD_DOCECC=m
# CONFIG_MTD_DOCPROBE_ADVANCED is not set
CONFIG_MTD_DOCPROBE_ADDRESS=0

#
# NAND Flash Device Drivers
#
CONFIG_MTD_NAND=m
# CONFIG_MTD_NAND_VERIFY_WRITE is not set
CONFIG_MTD_NAND_IDS=m
CONFIG_MTD_NAND_DISKONCHIP=m
# CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADVANCED is not set
CONFIG_MTD_NAND_DISKONCHIP_PROBE_ADDRESS=0
# CONFIG_MTD_NAND_DISKONCHIP_BBTWRITE is not set
# CONFIG_MTD_NAND_NANDSIM is not set

#
# OneNAND Flash Device Drivers
#
# CONFIG_MTD_ONENAND is not set

#
# Parallel port support
#
CONFIG_PARPORT=m
CONFIG_PARPORT_PC=m
CONFIG_PARPORT_SERIAL=m
CONFIG_PARPORT_PC_FIFO=y
# CONFIG_PARPORT_PC_SUPERIO is not set
CONFIG_PARPORT_PC_PCMCIA=m
CONFIG_PARPORT_NOT_PC=y
# CONFIG_PARPORT_GSC is not set
CONFIG_PARPORT_1284=y

#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set

#
# Protocols
#
CONFIG_ISAPNP=y
CONFIG_PNPBIOS=y
CONFIG_PNPBIOS_PROC_FS=y
CONFIG_PNPACPI=y

#
# Block devices
#
CONFIG_BLK_DEV_FD=m
CONFIG_BLK_DEV_XD=m
CONFIG_PARIDE=m
CONFIG_PARIDE_PARPORT=m

#
# Parallel IDE high-level drivers
#
CONFIG_PARIDE_PD=m
CONFIG_PARIDE_PCD=m
CONFIG_PARIDE_PF=m
CONFIG_PARIDE_PT=m
CONFIG_PARIDE_PG=m

#
# Parallel IDE protocol modules
#
CONFIG_PARIDE_ATEN=m
CONFIG_PARIDE_BPCK=m
CONFIG_PARIDE_BPCK6=m
CONFIG_PARIDE_COMM=m
CONFIG_PARIDE_DSTR=m
CONFIG_PARIDE_FIT2=m
CONFIG_PARIDE_FIT3=m
CONFIG_PARIDE_EPAT=m
# CONFIG_PARIDE_EPATC8 is not set
CONFIG_PARIDE_EPIA=m
CONFIG_PARIDE_FRIQ=m
CONFIG_PARIDE_FRPW=m
CONFIG_PARIDE_KBIC=m
CONFIG_PARIDE_KTTI=m
CONFIG_PARIDE_ON20=m
CONFIG_PARIDE_ON26=m
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_CISS_SCSI_TAPE=y
CONFIG_BLK_DEV_DAC960=m
CONFIG_BLK_DEV_UMEM=m
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=8192
CONFIG_BLK_DEV_INITRD=y
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
# CONFIG_CDROM_PKTCDVD_WCACHE is not set
CONFIG_ATA_OVER_ETH=m

#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
# CONFIG_IDEDISK_MULTI_MODE is not set
CONFIG_BLK_DEV_IDECS=m
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDETAPE=m
CONFIG_BLK_DEV_IDEFLOPPY=m
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_IDE_TASK_IOCTL is not set

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=m
CONFIG_BLK_DEV_OPTI621=m
CONFIG_BLK_DEV_RZ1000=m
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_AEC62XX=m
CONFIG_BLK_DEV_ALI15X3=m
# CONFIG_WDC_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=m
CONFIG_BLK_DEV_ATIIXP=m
CONFIG_BLK_DEV_CMD64X=m
CONFIG_BLK_DEV_TRIFLEX=m
CONFIG_BLK_DEV_CY82C693=m
CONFIG_BLK_DEV_CS5520=m
CONFIG_BLK_DEV_CS5530=m
# CONFIG_BLK_DEV_CS5535 is not set
CONFIG_BLK_DEV_HPT34X=m
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=m
CONFIG_BLK_DEV_SC1200=m
CONFIG_BLK_DEV_PIIX=m
# CONFIG_BLK_DEV_IT821X is not set
CONFIG_BLK_DEV_NS87415=m
CONFIG_BLK_DEV_PDC202XX_OLD=m
CONFIG_PDC202XX_BURST=y
CONFIG_BLK_DEV_PDC202XX_NEW=m
CONFIG_PDC202XX_FORCE=y
CONFIG_BLK_DEV_SVWKS=m
CONFIG_BLK_DEV_SIIMAGE=m
CONFIG_BLK_DEV_SIS5513=m
CONFIG_BLK_DEV_SLC90E66=m
CONFIG_BLK_DEV_TRM290=m
CONFIG_BLK_DEV_VIA82CXXX=m
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
CONFIG_RAID_ATTRS=m
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
CONFIG_CHR_DEV_SCH=m

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y

#
# SCSI Transport Attributes
#
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_FC_ATTRS=m
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m

#
# SCSI low-level drivers
#
CONFIG_ISCSI_TCP=m
CONFIG_BLK_DEV_3W_XXXX_RAID=m
CONFIG_SCSI_3W_9XXX=m
CONFIG_SCSI_7000FASST=m
CONFIG_SCSI_ACARD=m
CONFIG_SCSI_AHA152X=m
CONFIG_SCSI_AHA1542=m
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=8
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_SCSI_AIC7XXX_OLD=m
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=15000
CONFIG_AIC79XX_ENABLE_RD_STRM=y
CONFIG_AIC79XX_DEBUG_ENABLE=y
CONFIG_AIC79XX_DEBUG_MASK=0
CONFIG_AIC79XX_REG_PRETTY_PRINT=y
CONFIG_SCSI_DPT_I2O=m
CONFIG_SCSI_ADVANSYS=m
CONFIG_SCSI_IN2000=m
CONFIG_MEGARAID_NEWGEN=y
CONFIG_MEGARAID_MM=y
CONFIG_MEGARAID_MAILBOX=y
CONFIG_MEGARAID_SAS=m
CONFIG_SCSI_SATA=m
CONFIG_SCSI_SATA_AHCI=m
CONFIG_SCSI_SATA_SVW=m
CONFIG_SCSI_ATA_PIIX=m
# CONFIG_SCSI_SATA_MV is not set
CONFIG_SCSI_SATA_NV=m
# CONFIG_SCSI_PDC_ADMA is not set
CONFIG_SCSI_SATA_QSTOR=m
CONFIG_SCSI_SATA_PROMISE=m
CONFIG_SCSI_SATA_SX4=m
CONFIG_SCSI_SATA_SIL=m
CONFIG_SCSI_SATA_SIL24=m
CONFIG_SCSI_SATA_SIS=m
CONFIG_SCSI_SATA_ULI=m
CONFIG_SCSI_SATA_VIA=m
CONFIG_SCSI_SATA_VITESSE=m
CONFIG_SCSI_SATA_INTEL_COMBINED=y
CONFIG_SCSI_BUSLOGIC=m
# CONFIG_SCSI_OMIT_FLASHPOINT is not set
CONFIG_SCSI_DMX3191D=m
CONFIG_SCSI_DTC3280=m
CONFIG_SCSI_EATA=m
CONFIG_SCSI_EATA_TAGGED_QUEUE=y
CONFIG_SCSI_EATA_LINKED_COMMANDS=y
CONFIG_SCSI_EATA_MAX_TAGS=16
CONFIG_SCSI_EATA_PIO=m
CONFIG_SCSI_FUTURE_DOMAIN=m
CONFIG_SCSI_GDTH=m
CONFIG_SCSI_GENERIC_NCR5380=m
CONFIG_SCSI_GENERIC_NCR5380_MMIO=m
CONFIG_SCSI_GENERIC_NCR53C400=y
CONFIG_SCSI_IPS=m
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
CONFIG_SCSI_PPA=m
CONFIG_SCSI_IMM=m
# CONFIG_SCSI_IZIP_EPP16 is not set
# CONFIG_SCSI_IZIP_SLOW_CTR is not set
CONFIG_SCSI_NCR53C406A=m
CONFIG_SCSI_SYM53C8XX_2=m
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
CONFIG_SCSI_IPR=m
# CONFIG_SCSI_IPR_TRACE is not set
# CONFIG_SCSI_IPR_DUMP is not set
CONFIG_SCSI_PAS16=m
CONFIG_SCSI_PSI240I=m
CONFIG_SCSI_QLOGIC_FAS=m
CONFIG_SCSI_QLOGIC_FC=m
CONFIG_SCSI_QLOGIC_FC_FIRMWARE=y
CONFIG_SCSI_QLOGIC_1280=m
CONFIG_SCSI_QLA2XXX=y
CONFIG_SCSI_QLA21XX=m
CONFIG_SCSI_QLA22XX=m
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA24XX is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_SEAGATE is not set
CONFIG_SCSI_SYM53C416=m
CONFIG_SCSI_DC395x=m
CONFIG_SCSI_DC390T=m
CONFIG_SCSI_T128=m
CONFIG_SCSI_U14_34F=m
CONFIG_SCSI_U14_34F_TAGGED_QUEUE=y
CONFIG_SCSI_U14_34F_LINKED_COMMANDS=y
CONFIG_SCSI_U14_34F_MAX_TAGS=8
CONFIG_SCSI_ULTRASTOR=m
CONFIG_SCSI_NSP32=m
CONFIG_SCSI_DEBUG=m

#
# PCMCIA SCSI adapter support
#
CONFIG_PCMCIA_AHA152X=m
CONFIG_PCMCIA_FDOMAIN=m
CONFIG_PCMCIA_NINJA_SCSI=m
CONFIG_PCMCIA_QLOGIC=m
CONFIG_PCMCIA_SYM53C500=m

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
CONFIG_CD_NO_IDESCSI=y
CONFIG_AZTCD=m
CONFIG_GSCD=m
# CONFIG_SBPCD is not set
CONFIG_MCDX=m
CONFIG_OPTCD=m
# CONFIG_CM206 is not set
CONFIG_SJCD=m
CONFIG_ISP16_CDI=m
# CONFIG_CDU31A is not set
CONFIG_CDU535=m

#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=m
CONFIG_MD_RAID5=m
CONFIG_MD_RAID6=m
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_BLK_DEV_DM=m
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
CONFIG_DM_MULTIPATH_EMC=m

#
# Fusion MPT device support
#
CONFIG_FUSION=y
CONFIG_FUSION_SPI=y
CONFIG_FUSION_FC=m
CONFIG_FUSION_SAS=m
CONFIG_FUSION_MAX_SGE=128
CONFIG_FUSION_CTL=m
CONFIG_FUSION_LAN=m

#
# IEEE 1394 (FireWire) support
#
CONFIG_IEEE1394=m

#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set
# CONFIG_IEEE1394_OUI_DB is not set
CONFIG_IEEE1394_EXTRA_CONFIG_ROMS=y
CONFIG_IEEE1394_CONFIG_ROM_IP1394=y
# CONFIG_IEEE1394_EXPORT_FULL_API is not set

#
# Device Drivers
#
CONFIG_IEEE1394_PCILYNX=m
CONFIG_IEEE1394_OHCI1394=m

#
# Protocol Drivers
#
CONFIG_IEEE1394_VIDEO1394=m
CONFIG_IEEE1394_SBP2=m
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
CONFIG_IEEE1394_ETH1394=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
# CONFIG_IEEE1394_CMP is not set

#
# I2O device support
#
CONFIG_I2O=m
CONFIG_I2O_EXT_ADAPTEC=y
CONFIG_I2O_CONFIG=m
CONFIG_I2O_CONFIG_OLD_IOCTL=y
CONFIG_I2O_BUS=m
CONFIG_I2O_BLOCK=m
CONFIG_I2O_SCSI=m
CONFIG_I2O_PROC=m

#
# Network device support
#
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
CONFIG_EQUALIZER=m
CONFIG_TUN=m
CONFIG_NET_SB1000=m

#
# ARCnet devices
#
CONFIG_ARCNET=m
CONFIG_ARCNET_1201=m
CONFIG_ARCNET_1051=m
CONFIG_ARCNET_RAW=m
CONFIG_ARCNET_CAP=m
CONFIG_ARCNET_COM90xx=m
CONFIG_ARCNET_COM90xxIO=m
CONFIG_ARCNET_RIM_I=m
CONFIG_ARCNET_COM20020=m
CONFIG_ARCNET_COM20020_ISA=m
CONFIG_ARCNET_COM20020_PCI=m

#
# PHY device support
#
# CONFIG_PHYLIB is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_HAPPYMEAL=m
CONFIG_SUNGEM=m
# CONFIG_CASSINI is not set
CONFIG_NET_VENDOR_3COM=y
CONFIG_EL1=m
CONFIG_EL2=m
CONFIG_ELPLUS=m
CONFIG_EL16=m
CONFIG_EL3=m
CONFIG_3C515=m
CONFIG_VORTEX=m
CONFIG_TYPHOON=m
CONFIG_LANCE=m
CONFIG_NET_VENDOR_SMC=y
CONFIG_WD80x3=m
CONFIG_ULTRA=m
CONFIG_SMC9194=m
CONFIG_NET_VENDOR_RACAL=y
# CONFIG_NI5010 is not set
CONFIG_NI52=m
CONFIG_NI65=m

#
# Tulip family network device support
#
CONFIG_NET_TULIP=y
CONFIG_DE2104X=m
CONFIG_TULIP=m
# CONFIG_TULIP_MWI is not set
# CONFIG_TULIP_MMIO is not set
# CONFIG_TULIP_NAPI is not set
CONFIG_DE4X5=m
CONFIG_WINBOND_840=m
CONFIG_DM9102=m
# CONFIG_ULI526X is not set
CONFIG_PCMCIA_XIRCOM=m
# CONFIG_PCMCIA_XIRTULIP is not set
CONFIG_AT1700=m
CONFIG_DEPCA=m
CONFIG_HP100=m
CONFIG_NET_ISA=y
CONFIG_E2100=m
CONFIG_EWRK3=m
CONFIG_EEXPRESS=m
CONFIG_EEXPRESS_PRO=m
CONFIG_HPLAN_PLUS=m
CONFIG_HPLAN=m
CONFIG_LP486E=m
CONFIG_ETH16I=m
CONFIG_NE2000=m
CONFIG_ZNET=m
CONFIG_SEEQ8005=m
CONFIG_NET_PCI=y
CONFIG_PCNET32=m
CONFIG_AMD8111_ETH=m
# CONFIG_AMD8111E_NAPI is not set
CONFIG_ADAPTEC_STARFIRE=m
# CONFIG_ADAPTEC_STARFIRE_NAPI is not set
CONFIG_AC3200=m
CONFIG_APRICOT=m
CONFIG_B44=m
CONFIG_FORCEDETH=m
CONFIG_CS89x0=m
CONFIG_DGRS=m
CONFIG_EEPRO100=m
CONFIG_E100=y
CONFIG_FEALNX=m
CONFIG_NATSEMI=m
CONFIG_NE2K_PCI=y
CONFIG_8139CP=m
CONFIG_8139TOO=m
CONFIG_8139TOO_PIO=y
CONFIG_8139TOO_TUNE_TWISTER=y
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_SIS900=m
CONFIG_EPIC100=m
CONFIG_SUNDANCE=m
# CONFIG_SUNDANCE_MMIO is not set
CONFIG_TLAN=m
CONFIG_VIA_RHINE=m
CONFIG_VIA_RHINE_MMIO=y
CONFIG_NET_POCKET=y
CONFIG_ATP=m
CONFIG_DE600=m
CONFIG_DE620=m

#
# Ethernet (1000 Mbit)
#
CONFIG_ACENIC=m
# CONFIG_ACENIC_OMIT_TIGON_I is not set
CONFIG_DL2K=m
CONFIG_E1000=m
# CONFIG_E1000_NAPI is not set
CONFIG_NS83820=m
CONFIG_HAMACHI=m
CONFIG_YELLOWFIN=m
CONFIG_R8169=m
# CONFIG_R8169_NAPI is not set
CONFIG_R8169_VLAN=y
CONFIG_SIS190=m
CONFIG_SKGE=m
CONFIG_SK98LIN=m
CONFIG_VIA_VELOCITY=m
CONFIG_TIGON3=m
CONFIG_BNX2=m

#
# Ethernet (10000 Mbit)
#
# CONFIG_CHELSIO_T1 is not set
CONFIG_IXGB=m
# CONFIG_IXGB_NAPI is not set
CONFIG_S2IO=m
# CONFIG_S2IO_NAPI is not set

#
# Token Ring devices
#
CONFIG_TR=y
CONFIG_IBMTR=m
CONFIG_IBMOL=m
CONFIG_IBMLS=m
CONFIG_3C359=m
CONFIG_TMS380TR=m
CONFIG_TMSPCI=m
CONFIG_SKISA=m
CONFIG_PROTEON=m
CONFIG_ABYSS=m
# CONFIG_SMCTR is not set

#
# Wireless LAN (non-hamradio)
#
CONFIG_NET_RADIO=y

#
# Obsolete Wireless cards support (pre-802.11)
#
CONFIG_STRIP=m
CONFIG_ARLAN=m
CONFIG_WAVELAN=m
CONFIG_PCMCIA_WAVELAN=m
CONFIG_PCMCIA_NETWAVE=m

#
# Wireless 802.11 Frequency Hopping cards support
#
CONFIG_PCMCIA_RAYCS=m

#
# Wireless 802.11b ISA/PCI cards support
#
CONFIG_AIRO=m
CONFIG_HERMES=m
CONFIG_PLX_HERMES=m
CONFIG_TMD_HERMES=m
# CONFIG_NORTEL_HERMES is not set
CONFIG_PCI_HERMES=m
CONFIG_ATMEL=m
CONFIG_PCI_ATMEL=m

#
# Wireless 802.11b Pcmcia/Cardbus cards support
#
CONFIG_PCMCIA_HERMES=m
# CONFIG_PCMCIA_SPECTRUM is not set
CONFIG_AIRO_CS=m
CONFIG_PCMCIA_ATMEL=m
CONFIG_PCMCIA_WL3501=m

#
# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support
#
CONFIG_PRISM54=m
# CONFIG_HOSTAP is not set
CONFIG_NET_WIRELESS=y

#
# PCMCIA network device support
#
CONFIG_NET_PCMCIA=y
CONFIG_PCMCIA_3C589=m
CONFIG_PCMCIA_3C574=m
CONFIG_PCMCIA_FMVJ18X=m
CONFIG_PCMCIA_PCNET=m
CONFIG_PCMCIA_NMCLAN=m
CONFIG_PCMCIA_SMC91C92=m
CONFIG_PCMCIA_XIRC2PS=m
CONFIG_PCMCIA_AXNET=m
CONFIG_ARCNET_COM20020_CS=m
CONFIG_PCMCIA_IBMTR=m

#
# Wan interfaces
#
CONFIG_WAN=y
CONFIG_HOSTESS_SV11=m
CONFIG_COSA=m
CONFIG_DSCC4=m
CONFIG_DSCC4_PCISYNC=y
CONFIG_DSCC4_PCI_RST=y
CONFIG_LANMEDIA=m
CONFIG_SEALEVEL_4021=m
CONFIG_SYNCLINK_SYNCPPP=m
CONFIG_HDLC=m
CONFIG_HDLC_RAW=y
CONFIG_HDLC_RAW_ETH=y
CONFIG_HDLC_CISCO=y
CONFIG_HDLC_FR=y
CONFIG_HDLC_PPP=y
CONFIG_HDLC_X25=y
CONFIG_PCI200SYN=m
CONFIG_WANXL=m
CONFIG_PC300=m
CONFIG_PC300_MLPPP=y
CONFIG_N2=m
CONFIG_C101=m
CONFIG_FARSYNC=m
CONFIG_DLCI=m
CONFIG_DLCI_COUNT=24
CONFIG_DLCI_MAX=8
CONFIG_SDLA=m
CONFIG_WAN_ROUTER_DRIVERS=y
# CONFIG_VENDOR_SANGOMA is not set
CONFIG_CYCLADES_SYNC=m
CONFIG_CYCLOMX_X25=y
CONFIG_LAPBETHER=m
CONFIG_X25_ASY=m
CONFIG_SBNI=m
# CONFIG_SBNI_MULTILINE is not set

#
# ATM drivers
#
# CONFIG_ATM_DUMMY is not set
CONFIG_ATM_TCP=m
CONFIG_ATM_LANAI=m
CONFIG_ATM_ENI=m
# CONFIG_ATM_ENI_DEBUG is not set
# CONFIG_ATM_ENI_TUNE_BURST is not set
CONFIG_ATM_FIRESTREAM=m
CONFIG_ATM_ZATM=m
# CONFIG_ATM_ZATM_DEBUG is not set
CONFIG_ATM_NICSTAR=m
# CONFIG_ATM_NICSTAR_USE_SUNI is not set
# CONFIG_ATM_NICSTAR_USE_IDT77105 is not set
CONFIG_ATM_IDT77252=m
# CONFIG_ATM_IDT77252_DEBUG is not set
# CONFIG_ATM_IDT77252_RCV_ALL is not set
CONFIG_ATM_IDT77252_USE_SUNI=y
CONFIG_ATM_AMBASSADOR=m
# CONFIG_ATM_AMBASSADOR_DEBUG is not set
CONFIG_ATM_HORIZON=m
# CONFIG_ATM_HORIZON_DEBUG is not set
CONFIG_ATM_IA=m
# CONFIG_ATM_IA_DEBUG is not set
CONFIG_ATM_FORE200E_MAYBE=m
CONFIG_ATM_FORE200E_PCA=y
CONFIG_ATM_FORE200E_PCA_DEFAULT_FW=y
# CONFIG_ATM_FORE200E_USE_TASKLET is not set
CONFIG_ATM_FORE200E_TX_RETRY=16
CONFIG_ATM_FORE200E_DEBUG=0
CONFIG_ATM_FORE200E=m
CONFIG_ATM_HE=m
CONFIG_ATM_HE_USE_SUNI=y
CONFIG_FDDI=y
CONFIG_DEFXX=m
CONFIG_SKFP=m
CONFIG_HIPPI=y
CONFIG_ROADRUNNER=m
# CONFIG_ROADRUNNER_LARGE_RINGS is not set
CONFIG_PLIP=m
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPP_MPPE=m
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
CONFIG_SLIP_MODE_SLIP6=y
CONFIG_NET_FC=y
CONFIG_SHAPER=m
CONFIG_NETCONSOLE=y
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=m
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
CONFIG_INPUT_TSDEV=m
CONFIG_INPUT_TSDEV_SCREEN_X=240
CONFIG_INPUT_TSDEV_SCREEN_Y=320
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_EVBUG=m

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_KEYBOARD_SUNKBD=m
CONFIG_KEYBOARD_LKKBD=m
CONFIG_KEYBOARD_XTKBD=m
CONFIG_KEYBOARD_NEWTON=m
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=m
CONFIG_MOUSE_SERIAL=m
CONFIG_MOUSE_INPORT=m
# CONFIG_MOUSE_ATIXL is not set
CONFIG_MOUSE_LOGIBM=m
CONFIG_MOUSE_PC110PAD=m
CONFIG_MOUSE_VSXXXAA=m
CONFIG_INPUT_JOYSTICK=y
CONFIG_JOYSTICK_ANALOG=m
CONFIG_JOYSTICK_A3D=m
CONFIG_JOYSTICK_ADI=m
CONFIG_JOYSTICK_COBRA=m
CONFIG_JOYSTICK_GF2K=m
CONFIG_JOYSTICK_GRIP=m
CONFIG_JOYSTICK_GRIP_MP=m
CONFIG_JOYSTICK_GUILLEMOT=m
CONFIG_JOYSTICK_INTERACT=m
CONFIG_JOYSTICK_SIDEWINDER=m
CONFIG_JOYSTICK_TMDC=m
CONFIG_JOYSTICK_IFORCE=m
CONFIG_JOYSTICK_IFORCE_USB=y
CONFIG_JOYSTICK_IFORCE_232=y
CONFIG_JOYSTICK_WARRIOR=m
CONFIG_JOYSTICK_MAGELLAN=m
CONFIG_JOYSTICK_SPACEORB=m
CONFIG_JOYSTICK_SPACEBALL=m
CONFIG_JOYSTICK_STINGER=m
# CONFIG_JOYSTICK_TWIDJOY is not set
CONFIG_JOYSTICK_DB9=m
CONFIG_JOYSTICK_GAMECON=m
CONFIG_JOYSTICK_TURBOGRAFX=m
CONFIG_JOYSTICK_JOYDUMP=m
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_GUNZE=m
# CONFIG_TOUCHSCREEN_ELO is not set
# CONFIG_TOUCHSCREEN_MTOUCH is not set
# CONFIG_TOUCHSCREEN_MK712 is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=m
# CONFIG_INPUT_WISTRON_BTNS is not set
CONFIG_INPUT_UINPUT=m

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=m
CONFIG_SERIO_CT82C710=m
CONFIG_SERIO_PARKBD=m
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
CONFIG_GAMEPORT=m
CONFIG_GAMEPORT_NS558=m
CONFIG_GAMEPORT_L4=m
CONFIG_GAMEPORT_EMU10K1=m
CONFIG_GAMEPORT_FM801=m

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
CONFIG_ROCKETPORT=m
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_ESPSERIAL is not set
# CONFIG_MOXA_INTELLIO is not set
CONFIG_MOXA_SMARTIO=m
# CONFIG_ISI is not set
CONFIG_SYNCLINK=m
CONFIG_SYNCLINKMP=m
CONFIG_N_HDLC=m
# CONFIG_RISCOM8 is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
CONFIG_STALDRV=y
# CONFIG_STALLION is not set
# CONFIG_ISTALLION is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_CS=m
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
CONFIG_SERIAL_8250_RSA=y
# CONFIG_SERIAL_8250_FOURPORT is not set
# CONFIG_SERIAL_8250_ACCENT is not set
# CONFIG_SERIAL_8250_BOCA is not set
# CONFIG_SERIAL_8250_HUB6 is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_PRINTER=m
# CONFIG_LP_CONSOLE is not set
CONFIG_PPDEV=m
CONFIG_TIPAR=m

#
# IPMI
#
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m

#
# Watchdog Cards
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set

#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_ACQUIRE_WDT=m
CONFIG_ADVANTECH_WDT=m
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_SC520_WDT=m
CONFIG_EUROTECH_WDT=m
CONFIG_IB700_WDT=m
# CONFIG_IBMASR is not set
CONFIG_WAFER_WDT=m
# CONFIG_I6300ESB_WDT is not set
CONFIG_I8XX_TCO=m
CONFIG_SC1200_WDT=m
CONFIG_SCx200_WDT=m
CONFIG_60XX_WDT=m
# CONFIG_SBC8360_WDT is not set
CONFIG_CPU5_WDT=m
CONFIG_W83627HF_WDT=m
CONFIG_W83877F_WDT=m
# CONFIG_W83977F_WDT is not set
CONFIG_MACHZ_WDT=m

#
# ISA-based Watchdog Cards
#
CONFIG_PCWATCHDOG=m
CONFIG_MIXCOMWD=m
CONFIG_WDT=m
CONFIG_WDT_501=y

#
# PCI-based Watchdog Cards
#
CONFIG_PCIPCWATCHDOG=m
CONFIG_WDTPCI=m
CONFIG_WDT_501_PCI=y

#
# USB-based Watchdog Cards
#
CONFIG_USBPCWATCHDOG=m
CONFIG_HW_RANDOM=m
CONFIG_NVRAM=m
CONFIG_RTC=m
CONFIG_GEN_RTC=m
CONFIG_GEN_RTC_X=y
CONFIG_DTLK=m
CONFIG_R3964=m
CONFIG_APPLICOM=m
CONFIG_SONYPI=m

#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
CONFIG_AGP_ALI=m
CONFIG_AGP_ATI=m
CONFIG_AGP_AMD=m
CONFIG_AGP_AMD64=m
CONFIG_AGP_INTEL=m
CONFIG_AGP_NVIDIA=m
CONFIG_AGP_SIS=m
CONFIG_AGP_SWORKS=m
CONFIG_AGP_VIA=m
CONFIG_AGP_EFFICEON=m
CONFIG_DRM=m
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m
CONFIG_DRM_I810=m
CONFIG_DRM_I830=m
CONFIG_DRM_I915=m
CONFIG_DRM_MGA=m
CONFIG_DRM_SIS=m
CONFIG_DRM_VIA=m
CONFIG_DRM_SAVAGE=m

#
# PCMCIA character devices
#
CONFIG_SYNCLINK_CS=m
# CONFIG_CARDMAN_4000 is not set
# CONFIG_CARDMAN_4040 is not set
CONFIG_MWAVE=m
CONFIG_SCx200_GPIO=m
CONFIG_RAW_DRIVER=m
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=m

#
# TPM devices
#
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set

#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m

#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_ALGOPCA=m

#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD756_S4882=m
CONFIG_I2C_AMD8111=m
# CONFIG_I2C_ELEKTOR is not set
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_ISA=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
CONFIG_SCx200_I2C=m
CONFIG_SCx200_I2C_SCL=12
CONFIG_SCx200_I2C_SDA=13
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
CONFIG_I2C_STUB=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
CONFIG_I2C_PCA_ISA=m

#
# Miscellaneous I2C Chip support
#
# CONFIG_SENSORS_DS1337 is not set
# CONFIG_SENSORS_DS1374 is not set
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
# CONFIG_SENSORS_PCA9539 is not set
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
# CONFIG_SENSORS_MAX6875 is not set
# CONFIG_RTC_X1205_I2C is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set

#
# Dallas's 1-wire bus
#
CONFIG_W1=m
CONFIG_W1_MATROX=m
CONFIG_W1_DS9490=m
CONFIG_W1_DS9490_BRIDGE=m
CONFIG_W1_THERM=m
CONFIG_W1_SMEM=m
# CONFIG_W1_DS2433 is not set

#
# Hardware Monitoring support
#
CONFIG_HWMON=y
CONFIG_HWMON_VID=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1031=m
# CONFIG_SENSORS_ADM9240 is not set
CONFIG_SENSORS_ASB100=m
# CONFIG_SENSORS_ATXP1 is not set
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
# CONFIG_SENSORS_FSCPOS is not set
CONFIG_SENSORS_GL518SM=m
# CONFIG_SENSORS_GL520SM is not set
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
# CONFIG_SENSORS_LM92 is not set
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_PC87360=m
# CONFIG_SENSORS_SIS5595 is not set
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_SMSC47B397=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
# CONFIG_SENSORS_W83792D is not set
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
# CONFIG_SENSORS_W83627EHF is not set
# CONFIG_SENSORS_HDAPS is not set
# CONFIG_HWMON_DEBUG_CHIP is not set

#
# Misc devices
#
CONFIG_IBM_ASM=m

#
# Multimedia Capabilities Port drivers
#

#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m

#
# Video For Linux
#

#
# Video Adapters
#
CONFIG_VIDEO_BT848=m
# CONFIG_VIDEO_BT848_DVB is not set
# CONFIG_VIDEO_SAA6588 is not set
CONFIG_VIDEO_PMS=m
CONFIG_VIDEO_BWQCAM=m
CONFIG_VIDEO_CQCAM=m
CONFIG_VIDEO_W9966=m
CONFIG_VIDEO_CPIA=m
CONFIG_VIDEO_CPIA_PP=m
CONFIG_VIDEO_CPIA_USB=m
CONFIG_VIDEO_SAA5246A=m
CONFIG_VIDEO_SAA5249=m
CONFIG_TUNER_3036=m
CONFIG_VIDEO_STRADIS=m
CONFIG_VIDEO_ZORAN=m
CONFIG_VIDEO_ZORAN_BUZ=m
CONFIG_VIDEO_ZORAN_DC10=m
CONFIG_VIDEO_ZORAN_DC30=m
CONFIG_VIDEO_ZORAN_LML33=m
CONFIG_VIDEO_ZORAN_LML33R10=m
# CONFIG_VIDEO_ZR36120 is not set
CONFIG_VIDEO_MEYE=m
CONFIG_VIDEO_SAA7134=m
CONFIG_VIDEO_SAA7134_DVB=m
CONFIG_VIDEO_SAA7134_DVB_ALL_FRONTENDS=y
CONFIG_VIDEO_MXB=m
CONFIG_VIDEO_DPC=m
CONFIG_VIDEO_HEXIUM_ORION=m
CONFIG_VIDEO_HEXIUM_GEMINI=m
# CONFIG_VIDEO_CX88 is not set
# CONFIG_VIDEO_EM28XX is not set
CONFIG_VIDEO_OVCAMCHIP=m
# CONFIG_VIDEO_AUDIO_DECODER is not set
# CONFIG_VIDEO_DECODER is not set

#
# Radio Adapters
#
CONFIG_RADIO_CADET=m
CONFIG_RADIO_RTRACK=m
CONFIG_RADIO_RTRACK2=m
CONFIG_RADIO_AZTECH=m
CONFIG_RADIO_GEMTEK=m
CONFIG_RADIO_GEMTEK_PCI=m
CONFIG_RADIO_MAXIRADIO=m
CONFIG_RADIO_MAESTRO=m
CONFIG_RADIO_MIROPCM20=m
CONFIG_RADIO_MIROPCM20_RDS=m
CONFIG_RADIO_SF16FMI=m
CONFIG_RADIO_SF16FMR2=m
CONFIG_RADIO_TERRATEC=m
CONFIG_RADIO_TRUST=m
CONFIG_RADIO_TYPHOON=m
CONFIG_RADIO_TYPHOON_PROC_FS=y
CONFIG_RADIO_ZOLTRIX=m

#
# Digital Video Broadcasting Devices
#
CONFIG_DVB=y
CONFIG_DVB_CORE=m

#
# Supported SAA7146 based PCI Adapters
#
CONFIG_DVB_AV7110=m
CONFIG_DVB_AV7110_OSD=y
CONFIG_DVB_BUDGET=m
CONFIG_DVB_BUDGET_CI=m
CONFIG_DVB_BUDGET_AV=m
CONFIG_DVB_BUDGET_PATCH=m

#
# Supported USB Adapters
#
# CONFIG_DVB_USB is not set
CONFIG_DVB_TTUSB_BUDGET=m
CONFIG_DVB_TTUSB_DEC=m
CONFIG_DVB_CINERGYT2=m
# CONFIG_DVB_CINERGYT2_TUNING is not set

#
# Supported FlexCopII (B2C2) Adapters
#
# CONFIG_DVB_B2C2_FLEXCOP is not set

#
# Supported BT878 Adapters
#
CONFIG_DVB_BT8XX=m

#
# Supported Pluto2 Adapters
#
# CONFIG_DVB_PLUTO2 is not set

#
# Supported DVB Frontends
#

#
# Customise DVB Frontends
#

#
# DVB-S (satellite) frontends
#
CONFIG_DVB_STV0299=m
CONFIG_DVB_CX24110=m
CONFIG_DVB_TDA8083=m
CONFIG_DVB_TDA80XX=m
CONFIG_DVB_MT312=m
CONFIG_DVB_VES1X93=m
CONFIG_DVB_S5H1420=m

#
# DVB-T (terrestrial) frontends
#
CONFIG_DVB_SP8870=m
CONFIG_DVB_SP887X=m
CONFIG_DVB_CX22700=m
CONFIG_DVB_CX22702=m
CONFIG_DVB_L64781=m
CONFIG_DVB_TDA1004X=m
CONFIG_DVB_NXT6000=m
CONFIG_DVB_MT352=m
CONFIG_DVB_DIB3000MB=m
CONFIG_DVB_DIB3000MC=m

#
# DVB-C (cable) frontends
#
CONFIG_DVB_ATMEL_AT76C651=m
CONFIG_DVB_VES1820=m
CONFIG_DVB_TDA10021=m
CONFIG_DVB_STV0297=m

#
# ATSC (North American/Korean Terresterial DTV) frontends
#
CONFIG_DVB_NXT2002=m
CONFIG_DVB_NXT200X=m
CONFIG_DVB_OR51211=m
# CONFIG_DVB_OR51132 is not set
# CONFIG_DVB_BCM3510 is not set
CONFIG_DVB_LGDT330X=m
CONFIG_VIDEO_SAA7146=m
CONFIG_VIDEO_SAA7146_VV=m
CONFIG_VIDEO_VIDEOBUF=m
CONFIG_VIDEO_TUNER=m
CONFIG_VIDEO_BUF=m
CONFIG_VIDEO_BUF_DVB=m
CONFIG_VIDEO_BTCX=m
CONFIG_VIDEO_IR=m
CONFIG_VIDEO_TVEEPROM=m

#
# Graphics support
#
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
# CONFIG_FB_MACMODES is not set
CONFIG_FB_MODE_HELPERS=y
CONFIG_FB_TILEBLITTING=y
CONFIG_FB_CIRRUS=m
CONFIG_FB_PM2=m
CONFIG_FB_PM2_FIFO_DISCONNECT=y
CONFIG_FB_CYBER2000=m
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
CONFIG_FB_HGA=m
# CONFIG_FB_HGA_ACCEL is not set
# CONFIG_FB_S1D13XXX is not set
CONFIG_FB_NVIDIA=m
CONFIG_FB_NVIDIA_I2C=y
CONFIG_FB_RIVA=m
CONFIG_FB_RIVA_I2C=y
CONFIG_FB_RIVA_DEBUG=y
CONFIG_FB_I810=m
# CONFIG_FB_I810_GTF is not set
CONFIG_FB_INTEL=m
# CONFIG_FB_INTEL_DEBUG is not set
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y
CONFIG_FB_MATROX_MYSTIQUE=y
CONFIG_FB_MATROX_G=y
CONFIG_FB_MATROX_I2C=m
CONFIG_FB_MATROX_MAVEN=m
CONFIG_FB_MATROX_MULTIHEAD=y
CONFIG_FB_RADEON_OLD=m
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
# CONFIG_FB_RADEON_DEBUG is not set
CONFIG_FB_ATY128=m
CONFIG_FB_ATY=m
CONFIG_FB_ATY_CT=y
CONFIG_FB_ATY_GENERIC_LCD=y
# CONFIG_FB_ATY_XL_INIT is not set
CONFIG_FB_ATY_GX=y
CONFIG_FB_SAVAGE=m
# CONFIG_FB_SAVAGE_I2C is not set
# CONFIG_FB_SAVAGE_ACCEL is not set
CONFIG_FB_SIS=m
CONFIG_FB_SIS_300=y
CONFIG_FB_SIS_315=y
CONFIG_FB_NEOMAGIC=m
CONFIG_FB_KYRO=m
CONFIG_FB_3DFX=m
# CONFIG_FB_3DFX_ACCEL is not set
CONFIG_FB_VOODOO1=m
# CONFIG_FB_CYBLA is not set
CONFIG_FB_TRIDENT=m
# CONFIG_FB_TRIDENT_ACCEL is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_GEODE is not set
CONFIG_FB_VIRTUAL=m

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_MDA_CONSOLE=m
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_ROTATION is not set
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y

#
# Logo configuration
#
# CONFIG_LOGO is not set
CONFIG_BACKLIGHT_LCD_SUPPORT=y
CONFIG_BACKLIGHT_CLASS_DEVICE=m
CONFIG_BACKLIGHT_DEVICE=y
CONFIG_LCD_CLASS_DEVICE=m
CONFIG_LCD_DEVICE=y

#
# Sound
#
CONFIG_SOUND=m

#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_AC97_BUS=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
CONFIG_SND_SEQ_RTCTIMER_DEFAULT=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
CONFIG_SND_GENERIC_DRIVER=y

#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_OPL4_LIB=m
CONFIG_SND_VX_LIB=m
CONFIG_SND_DUMMY=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
CONFIG_SND_SERIAL_U16550=m
CONFIG_SND_MPU401=m

#
# ISA devices
#
CONFIG_SND_AD1848_LIB=m
CONFIG_SND_CS4231_LIB=m
CONFIG_SND_AD1816A=m
CONFIG_SND_AD1848=m
CONFIG_SND_CS4231=m
CONFIG_SND_CS4232=m
CONFIG_SND_CS4236=m
CONFIG_SND_ES968=m
CONFIG_SND_ES1688=m
CONFIG_SND_ES18XX=m
CONFIG_SND_GUS_SYNTH=m
CONFIG_SND_GUSCLASSIC=m
CONFIG_SND_GUSEXTREME=m
CONFIG_SND_GUSMAX=m
CONFIG_SND_INTERWAVE=m
CONFIG_SND_INTERWAVE_STB=m
CONFIG_SND_OPTI92X_AD1848=m
CONFIG_SND_OPTI92X_CS4231=m
CONFIG_SND_OPTI93X=m
CONFIG_SND_SB8=m
CONFIG_SND_SB16=m
CONFIG_SND_SBAWE=m
CONFIG_SND_SB16_CSP=y
CONFIG_SND_WAVEFRONT=m
CONFIG_SND_ALS100=m
CONFIG_SND_AZT2320=m
CONFIG_SND_CMI8330=m
CONFIG_SND_DT019X=m
CONFIG_SND_OPL3SA2=m
CONFIG_SND_SGALAXY=m
CONFIG_SND_SSCAPE=m

#
# PCI devices
#
CONFIG_SND_ALI5451=m
CONFIG_SND_ATIIXP=m
CONFIG_SND_ATIIXP_MODEM=m
CONFIG_SND_AU8810=m
CONFIG_SND_AU8820=m
CONFIG_SND_AU8830=m
CONFIG_SND_AZT3328=m
CONFIG_SND_BT87X=m
# CONFIG_SND_BT87X_OVERCLOCK is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
CONFIG_SND_CS4281=m
CONFIG_SND_EMU10K1=m
CONFIG_SND_EMU10K1X=m
CONFIG_SND_CA0106=m
CONFIG_SND_KORG1212=m
CONFIG_SND_MIXART=m
CONFIG_SND_NM256=m
CONFIG_SND_RME32=m
CONFIG_SND_RME96=m
CONFIG_SND_RME9652=m
CONFIG_SND_HDSP=m
CONFIG_SND_HDSPM=m
CONFIG_SND_TRIDENT=m
CONFIG_SND_YMFPCI=m
# CONFIG_SND_AD1889 is not set
CONFIG_SND_ALS4000=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_ENS1370=m
CONFIG_SND_ENS1371=m
CONFIG_SND_ES1938=m
CONFIG_SND_ES1968=m
CONFIG_SND_MAESTRO3=m
CONFIG_SND_FM801=m
CONFIG_SND_FM801_TEA575X=m
CONFIG_SND_ICE1712=m
CONFIG_SND_ICE1724=m
CONFIG_SND_INTEL8X0=m
CONFIG_SND_INTEL8X0M=m
CONFIG_SND_SONICVIBES=m
CONFIG_SND_VIA82XX=m
CONFIG_SND_VIA82XX_MODEM=m
CONFIG_SND_VX222=m
CONFIG_SND_HDA_INTEL=m

#
# USB devices
#
CONFIG_SND_USB_AUDIO=m
CONFIG_SND_USB_USX2Y=m

#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_PDAUDIOCF is not set

#
# Open Sound System
#
CONFIG_SOUND_PRIME=m
# CONFIG_OBSOLETE_OSS_DRIVER is not set
CONFIG_SOUND_FUSION=m
CONFIG_SOUND_ICH=m
CONFIG_SOUND_TRIDENT=m
# CONFIG_SOUND_MSNDCLAS is not set
# CONFIG_SOUND_MSNDPIN is not set
CONFIG_SOUND_OSS=m
# CONFIG_SOUND_TRACEINIT is not set
# CONFIG_SOUND_DMAP is not set
# CONFIG_SOUND_AD1816 is not set
CONFIG_SOUND_AD1889=m
CONFIG_SOUND_ADLIB=m
CONFIG_SOUND_ACI_MIXER=m
CONFIG_SOUND_VMIDI=m
CONFIG_SOUND_TRIX=m
CONFIG_SOUND_MSS=m
CONFIG_SOUND_MPU401=m
CONFIG_SOUND_PAS=m
CONFIG_SOUND_PSS=m
CONFIG_PSS_MIXER=y
CONFIG_SOUND_SB=m
CONFIG_SOUND_OPL3SA2=m
CONFIG_SOUND_UART6850=m
CONFIG_SOUND_AEDSP16=m
CONFIG_SC6600=y
CONFIG_SC6600_JOY=y
CONFIG_SC6600_CDROM=4
CONFIG_SC6600_CDROMBASE=0x0
# CONFIG_AEDSP16_MSS is not set
# CONFIG_AEDSP16_SBPRO is not set
# CONFIG_AEDSP16_MPU401 is not set
CONFIG_SOUND_TVMIXER=m
CONFIG_SOUND_KAHLUA=m

#
# USB support
#
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set

#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_BANDWIDTH=y
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set

#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_SPLIT_ISO=y
CONFIG_USB_EHCI_ROOT_HUB_TT=y
# CONFIG_USB_ISP116X_HCD is not set
CONFIG_USB_OHCI_HCD=m
# CONFIG_USB_OHCI_BIG_ENDIAN is not set
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_SL811_HCD=m
# CONFIG_USB_SL811_CS is not set

#
# USB Device Class drivers
#
# CONFIG_OBSOLETE_OSS_USB_DRIVER is not set
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# may also be needed; see USB_STORAGE Help for more information
#
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
# CONFIG_USB_STORAGE_USBAT is not set
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y

#
# USB Input Devices
#
CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
CONFIG_USB_HIDDEV=y

#
# USB HID Boot Protocol drivers
#
CONFIG_USB_KBD=m
CONFIG_USB_MOUSE=m
CONFIG_USB_AIPTEK=m
CONFIG_USB_WACOM=m
# CONFIG_USB_ACECAD is not set
CONFIG_USB_KBTAB=m
CONFIG_USB_POWERMATE=m
CONFIG_USB_MTOUCH=m
# CONFIG_USB_ITMTOUCH is not set
CONFIG_USB_EGALAX=m
# CONFIG_USB_YEALINK is not set
CONFIG_USB_XPAD=m
CONFIG_USB_ATI_REMOTE=m
# CONFIG_USB_KEYSPAN_REMOTE is not set
# CONFIG_USB_APPLETOUCH is not set

#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m

#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
CONFIG_USB_VICAM=m
CONFIG_USB_DSBR=m
CONFIG_USB_IBMCAM=m
CONFIG_USB_KONICAWC=m
CONFIG_USB_OV511=m
CONFIG_USB_SE401=m
CONFIG_USB_SN9C102=m
CONFIG_USB_STV680=m
# CONFIG_USB_W9968CF is not set
# CONFIG_USB_PWC is not set

#
# USB Network Adapters
#
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_USBNET=m
CONFIG_USB_NET_AX8817X=m
CONFIG_USB_NET_CDCETHER=m
# CONFIG_USB_NET_GL620A is not set
CONFIG_USB_NET_NET1080=m
# CONFIG_USB_NET_PLUSB is not set
# CONFIG_USB_NET_RNDIS_HOST is not set
# CONFIG_USB_NET_CDC_SUBSET is not set
CONFIG_USB_NET_ZAURUS=m
# CONFIG_USB_ZD1201 is not set
CONFIG_USB_MON=y

#
# USB port drivers
#
CONFIG_USB_USS720=m

#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_AIRPRIME is not set
# CONFIG_USB_SERIAL_ANYDATA is not set
CONFIG_USB_SERIAL_BELKIN=m
# CONFIG_USB_SERIAL_WHITEHEAT is not set
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
# CONFIG_USB_SERIAL_CP2101 is not set
CONFIG_USB_SERIAL_CYPRESS_M8=m
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
CONFIG_USB_SERIAL_GARMIN=m
CONFIG_USB_SERIAL_IPW=m
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_HP4X is not set
CONFIG_USB_SERIAL_SAFE=m
# CONFIG_USB_SERIAL_SAFE_PADDED is not set
CONFIG_USB_SERIAL_TI=m
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
# CONFIG_USB_SERIAL_OPTION is not set
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_EZUSB=y

#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
CONFIG_USB_EMI26=m
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
CONFIG_USB_LEGOTOWER=m
CONFIG_USB_LCD=m
CONFIG_USB_LED=m
CONFIG_USB_CYTHERM=m
CONFIG_USB_PHIDGETKIT=m
CONFIG_USB_PHIDGETSERVO=m
CONFIG_USB_IDMOUSE=m
# CONFIG_USB_SISUSBVGA is not set
# CONFIG_USB_LD is not set
CONFIG_USB_TEST=m

#
# USB DSL modem support
#
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
# CONFIG_USB_CXACRU is not set
# CONFIG_USB_XUSBATM is not set

#
# USB Gadget Support
#
CONFIG_USB_GADGET=m
# CONFIG_USB_GADGET_DEBUG_FILES is not set
CONFIG_USB_GADGET_SELECTED=y
CONFIG_USB_GADGET_NET2280=y
CONFIG_USB_NET2280=m
# CONFIG_USB_GADGET_PXA2XX is not set
# CONFIG_USB_GADGET_GOKU is not set
# CONFIG_USB_GADGET_LH7A40X is not set
# CONFIG_USB_GADGET_OMAP is not set
# CONFIG_USB_GADGET_DUMMY_HCD is not set
CONFIG_USB_GADGET_DUALSPEED=y
CONFIG_USB_ZERO=m
CONFIG_USB_ETH=m
CONFIG_USB_ETH_RNDIS=y
CONFIG_USB_GADGETFS=m
CONFIG_USB_FILE_STORAGE=m
# CONFIG_USB_FILE_STORAGE_TEST is not set
CONFIG_USB_G_SERIAL=m

#
# MMC/SD Card support
#
CONFIG_MMC=m
# CONFIG_MMC_DEBUG is not set
CONFIG_MMC_BLOCK=m
CONFIG_MMC_WBSD=m

#
# InfiniBand support
#
CONFIG_INFINIBAND=m
# CONFIG_INFINIBAND_USER_MAD is not set
# CONFIG_INFINIBAND_USER_ACCESS is not set
CONFIG_INFINIBAND_MTHCA=m
# CONFIG_INFINIBAND_MTHCA_DEBUG is not set
CONFIG_INFINIBAND_IPOIB=m
# CONFIG_INFINIBAND_IPOIB_DEBUG is not set
# CONFIG_INFINIBAND_SRP is not set

#
# SN Devices
#

#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_REISERFS_FS_XATTR is not set
CONFIG_JFS_FS=m
CONFIG_JFS_POSIX_ACL=y
CONFIG_JFS_SECURITY=y
# CONFIG_JFS_DEBUG is not set
CONFIG_JFS_STATISTICS=y
CONFIG_FS_POSIX_ACL=y
CONFIG_XFS_FS=m
CONFIG_XFS_EXPORT=y
# CONFIG_XFS_QUOTA is not set
CONFIG_XFS_SECURITY=y
CONFIG_XFS_POSIX_ACL=y
CONFIG_XFS_RT=y
CONFIG_MINIX_FS=m
CONFIG_ROMFS_FS=m
CONFIG_INOTIFY=y
CONFIG_QUOTA=y
CONFIG_QFMT_V1=m
CONFIG_QFMT_V2=m
CONFIG_QUOTACTL=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
CONFIG_FUSE_FS=m

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=936
CONFIG_FAT_DEFAULT_IOCHARSET="cp936"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
CONFIG_NTFS_RW=y

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
# CONFIG_PROC_VMCORE is not set
CONFIG_SYSFS=y
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
CONFIG_RELAYFS_FS=y

#
# Miscellaneous filesystems
#
CONFIG_ADFS_FS=m
# CONFIG_ADFS_FS_RW is not set
CONFIG_AFFS_FS=m
CONFIG_HFS_FS=m
CONFIG_HFSPLUS_FS=m
CONFIG_BEFS_FS=m
# CONFIG_BEFS_DEBUG is not set
CONFIG_BFS_FS=m
CONFIG_EFS_FS=m
CONFIG_JFFS_FS=m
CONFIG_JFFS_FS_VERBOSE=0
CONFIG_JFFS_PROC_FS=y
CONFIG_JFFS2_FS=m
CONFIG_JFFS2_FS_DEBUG=0
CONFIG_JFFS2_FS_WRITEBUFFER=y
# CONFIG_JFFS2_SUMMARY is not set
# CONFIG_JFFS2_COMPRESSION_OPTIONS is not set
CONFIG_JFFS2_ZLIB=y
CONFIG_JFFS2_RTIME=y
# CONFIG_JFFS2_RUBIN is not set
CONFIG_CRAMFS=y
CONFIG_VXFS_FS=m
CONFIG_HPFS_FS=m
CONFIG_QNX4FS_FS=m
# CONFIG_QNX4FS_RW is not set
CONFIG_SYSV_FS=m
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set

#
# Network File Systems
#
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
CONFIG_SMB_FS=m
CONFIG_SMB_NLS_DEFAULT=y
CONFIG_SMB_NLS_REMOTE="cp936"
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_XATTR is not set
# CONFIG_CIFS_EXPERIMENTAL is not set
CONFIG_NCP_FS=m
CONFIG_NCPFS_PACKET_SIGNING=y
CONFIG_NCPFS_IOCTL_LOCKING=y
CONFIG_NCPFS_STRONG=y
CONFIG_NCPFS_NFS_NS=y
CONFIG_NCPFS_OS2_NS=y
# CONFIG_NCPFS_SMALLDOS is not set
CONFIG_NCPFS_NLS=y
CONFIG_NCPFS_EXTRAS=y
CONFIG_CODA_FS=m
# CONFIG_CODA_FS_OLD_API is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
CONFIG_9P_FS=m

#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
CONFIG_ACORN_PARTITION=y
CONFIG_ACORN_PARTITION_CUMANA=y
# CONFIG_ACORN_PARTITION_EESOX is not set
CONFIG_ACORN_PARTITION_ICS=y
# CONFIG_ACORN_PARTITION_ADFS is not set
# CONFIG_ACORN_PARTITION_POWERTEC is not set
CONFIG_ACORN_PARTITION_RISCIX=y
CONFIG_OSF_PARTITION=y
CONFIG_AMIGA_PARTITION=y
CONFIG_ATARI_PARTITION=y
CONFIG_MAC_PARTITION=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
CONFIG_LDM_PARTITION=y
# CONFIG_LDM_DEBUG is not set
CONFIG_SGI_PARTITION=y
CONFIG_ULTRIX_PARTITION=y
CONFIG_SUN_PARTITION=y
CONFIG_EFI_PARTITION=y

#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="cp437"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_ASCII=m
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m

#
# Instrumentation Support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
CONFIG_KPROBES=y

#
# Kernel hacking
#
CONFIG_PRINTK_TIME=y
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_DETECT_SOFTLOCKUP=y
CONFIG_SCHEDSTATS=y
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_VM=y
CONFIG_FRAME_POINTER=y
CONFIG_RCU_TORTURE_TEST=m
# CONFIG_EARLY_PRINTK is not set
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_4KSTACKS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y

#
# Security options
#
CONFIG_KEYS=y
# CONFIG_KEYS_DEBUG_PROC_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
CONFIG_SECURITY_ROOTPLUG=m
CONFIG_SECURITY_SECLVL=m
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=0
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_AVC_STATS=y
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE=1

#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_WP512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_AES_586=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_ANUBIS=m
CONFIG_CRYPTO_DEFLATE=m
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRYPTO_TEST=m

#
# Hardware crypto devices
#
CONFIG_CRYPTO_DEV_PADLOCK=m
CONFIG_CRYPTO_DEV_PADLOCK_AES=y

#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC32=y
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_REED_SOLOMON=m
CONFIG_REED_SOLOMON_DEC16=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
Andrew Morton | 2 Dec 02:30

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > 850 sc->nr_to_reclaim = sc->swap_cluster_max; > 851 > 852 while (nr_active || nr_inactive) { > //... > 860 if (nr_inactive) { > 861 sc->nr_to_scan = min(nr_inactive, > 862 (unsigned long)sc->swap_cluster_max); > 863 nr_inactive -= sc->nr_to_scan; > 864 shrink_cache(zone, sc); > 865 if (sc->nr_to_reclaim <= 0) > 866 break; > 867 } > 868 } > > Line 843 is the core of the scan balancing logic: > > priority 12 11 10 > > On each call nr_scan_inactive is increased by: > DMA(2k pages) +1 +2 +3 > Normal(64k pages) +17 +33 +65 > > Round it up to SWAP_CLUSTER_MAX=32, we get (scan batches/accumulate rounds): > DMA 1/32 1/16 2/11 > Normal 2/2 2/1 3/1 > DMA:Normal ratio 1:32 1:32 2:33 > > This keeps the scan rate roughly balanced(i.e. 1:32) in low vm pressure. > > But lines 865-866 together with line 846 make most shrink_zone() invocations > only run one batch of scan. The numbers become:
True. Need to go into a huddle with the changelogs, but I have a feeling that lines 865 and 866 aren't very important. What happens if we remove them?
Wu Fengguang | 2 Dec 03:04
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Thu, Dec 01, 2005 at 05:30:15PM -0800, Andrew Morton wrote: > > But lines 865-866 together with line 846 make most shrink_zone() invocations > > only run one batch of scan. The numbers become: > > True. Need to go into a huddle with the changelogs, but I have a feeling > that lines 865 and 866 aren't very important. What happens if we remove > them?
Maybe the answer is: can we accept to free 15M memory at one time for a 64G zone? (Or can we simply increase the DEF_PRIORITY?) btw, maybe it's time to lower the low_mem_reserve. There should be no need to keep ~50M free memory with the balancing patch. Regards, Wu
Nick Piggin | 2 Dec 03:27
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

Wu Fengguang wrote:
> On Thu, Dec 01, 2005 at 05:30:15PM -0800, Andrew Morton wrote:
> 
>>> But lines 865-866 together with line 846 make most shrink_zone() invocations
>>> only run one batch of scan. The numbers become:
>>
>>True.  Need to go into a huddle with the changelogs, but I have a feeling
>>that lines 865 and 866 aren't very important.  What happens if we remove
>>them?
> 
> 
> Maybe the answer is: can we accept to free 15M memory at one time for a 64G zone?
> (Or can we simply increase the DEF_PRIORITY?)
> 

0.02% of the memory? Why not? I think you should be more worried
about what happens when the priority winds up.

I think your proposal to synch reclaim rates between zones is fine
when all pages have similar properties, but could behave strangely
when you do have different requirements on different zones.

> btw, maybe it's time to lower the low_mem_reserve.
> There should be no need to keep ~50M free memory with the balancing patch.
> 

min_free_kbytes? This number really isn't anything to do with balancing
and more to do with the amount of reserve kept for things like GFP_ATOMIC
and recursive allocations. Let's not lower it ;)

--

-- 
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com 

Wu Fengguang | 2 Dec 03:43
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

On Fri, Dec 02, 2005 at 01:27:06PM +1100, Nick Piggin wrote:
> Wu Fengguang wrote:
> >On Thu, Dec 01, 2005 at 05:30:15PM -0800, Andrew Morton wrote:
> >
> >>>But lines 865-866 together with line 846 make most shrink_zone() 
> >>>invocations
> >>>only run one batch of scan. The numbers become:
> >>
> >>True.  Need to go into a huddle with the changelogs, but I have a feeling
> >>that lines 865 and 866 aren't very important.  What happens if we remove
> >>them?
> >
> >
> >Maybe the answer is: can we accept to free 15M memory at one time for a 
> >64G zone?
> >(Or can we simply increase the DEF_PRIORITY?)
> >
> 
> 0.02% of the memory? Why not? I think you should be more worried
> about what happens when the priority winds up.

Yes, sounds reasonable.

> I think your proposal to synch reclaim rates between zones is fine
> when all pages have similar properties, but could behave strangely
> when you do have different requirements on different zones.

Thanks.
That requirement might be addressed by disabling the feature on specific zones,
or attaching them with a shrinker.seeks like ratio, or something else...

> >btw, maybe it's time to lower the low_mem_reserve.
> >There should be no need to keep ~50M free memory with the balancing patch.
> >
> 
> min_free_kbytes? This number really isn't anything to do with balancing
> and more to do with the amount of reserve kept for things like GFP_ATOMIC
> and recursive allocations. Let's not lower it ;)

ok :)

Regards,
Wu
Andrea Arcangeli | 2 Dec 03:36
Picon
Favicon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Fri, Dec 02, 2005 at 01:27:06PM +1100, Nick Piggin wrote: > min_free_kbytes? This number really isn't anything to do with balancing > and more to do with the amount of reserve kept for things like GFP_ATOMIC > and recursive allocations. Let's not lower it ;)
Agreed. Or at the very least that should be discussed in a separate thread, it has no relation with shrink_cache changes or anything else related to zone aging IMHO.
Andrea Arcangeli | 2 Dec 03:18
Picon
Favicon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Fri, Dec 02, 2005 at 10:04:07AM +0800, Wu Fengguang wrote: > btw, maybe it's time to lower the low_mem_reserve. > There should be no need to keep ~50M free memory with the balancing patch.
low_mem_reserve is indipendent from shrink_cache, because shrink_cache can't free unfreeable pinned memory. If you want to remove low_mem_reserve you'd better start by adding migration of memory across the zones with pte updates etc... That would at least mitigate the effect of anonymous memory w/o swap. But low_mem_reserve is still needed for all other kind of allocations like kmalloc or pci_alloc_consistent (i.e. not relocatable) etc...
Andrew Morton | 2 Dec 05:45

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


Andrea Arcangeli <andrea <at> suse.de> wrote: > > low_mem_reserve
I've a suspicion that the addition of the dma32 zone might have broken this.
Wu Fengguang | 2 Dec 07:38
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

On Thu, Dec 01, 2005 at 08:45:49PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea <at> suse.de> wrote:
> >
> > low_mem_reserve
> 
> I've a suspicion that the addition of the dma32 zone might have
> broken this.

And there is a danger of (the last zone != the largest zone). This breaks my
assumption. Either we should remove the two lines in shrink_zone():

>      865                         if (sc->nr_to_reclaim <= 0)
>      866                                 break;

Or explicitly add more weight to the balancing efforts with
mm-add-weight-to-reclaim-for-aging.patch below.

Thanks,
Wu

Subject: mm: add more weight to reclaim for aging
Cc: Marcelo Tosatti <marcelo.tosatti <at> cyclades.com>, Magnus Damm <magnus.damm <at> gmail.com>
Cc: Nick Piggin <npiggin <at> suse.de>, Andrea Arcangeli <andrea <at> suse.de>

Let HighMem = the last zone, we get in normal cases:
- HighMem zone is the largest zone
- HighMem zone is mainly reclaimed for watermark, other zones is almost always
  reclaimed for aging
- While HighMem is reclaimed N times for watermark, other zones has N+1 chances
  to reclaim for aging
- shrink_zone() only scans one chunk of SWAP_CLUSTER_MAX pages to get
  SWAP_CLUSTER_MAX free pages

In the above situation, the force of balancing will win out the force of
unbalancing. But if HighMem(or the last zone) is not the largest zone, the
other larger zones can no longer catch up.

This patch multiplies the force of balancing by 8 times, which should be more
than enough.  It just prevents shrink_zone() to return prematurely, and will
not cause DMA zone to be scanned more than SWAP_CLUSTER_MAX at one time in
normal cases.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1453,12 +1453,14 @@ loop_again:
 							SC_RECLAIM_FROM_KSWAPD|
 							SC_RECLAIM_FOR_WATERMARK);
 					all_zones_ok = 0;
+					sc.nr_to_reclaim = SWAP_CLUSTER_MAX;
 				} else if (zone == youngest_zone &&
 						pages_more_aged(oldest_zone,
 								youngest_zone)) {
 					debug_reclaim(&sc,
 							SC_RECLAIM_FROM_KSWAPD|
 							SC_RECLAIM_FOR_AGING);
+					sc.nr_to_reclaim = SWAP_CLUSTER_MAX * 8;
 				} else
 					continue;
 			}
Wu Fengguang | 2 Dec 03:37
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Fri, Dec 02, 2005 at 03:18:11AM +0100, Andrea Arcangeli wrote: > On Fri, Dec 02, 2005 at 10:04:07AM +0800, Wu Fengguang wrote: > > btw, maybe it's time to lower the low_mem_reserve. > > There should be no need to keep ~50M free memory with the balancing patch. > > low_mem_reserve is indipendent from shrink_cache, because shrink_cache can't > free unfreeable pinned memory. > > If you want to remove low_mem_reserve you'd better start by adding > migration of memory across the zones with pte updates etc... That would > at least mitigate the effect of anonymous memory w/o swap. But > low_mem_reserve is still needed for all other kind of allocations like > kmalloc or pci_alloc_consistent (i.e. not relocatable) etc...
Thanks for the clarification, I was concerning too much ;) Regards, Wu
Andrea Arcangeli | 2 Dec 03:52
Picon
Favicon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging


On Fri, Dec 02, 2005 at 10:37:27AM +0800, Wu Fengguang wrote: > Thanks for the clarification, I was concerning too much ;)
You're welcome. I'm also not concerned because the cost is linear with the amount of memory (and the cost has an high bound, that is the size of the lower zones, so it's not like the struct page that is a percentage of ram guaranteed to be lost) so it's generally not noticeable at runtime, and it's most important in the big systems (where in turn the cost is higher).
Wu Fengguang | 1 Dec 13:11
Picon

Re: [PATCH 02/12] mm: supporting variables and functions for balanced zone aging

On Thu, Dec 01, 2005 at 02:37:14AM -0800, Andrew Morton wrote:
> Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote:
> >
> >  The zone aging rates are currently imbalanced,
> 
> ZONE_DMA is out of whack.  It shouldn't be, and I'm not aware of anyone
> getting in and working out why.  I certainly wouldn't want to go and add
> all this stuff without having a good understanding of _why_ it's out of
> whack.  Perhaps it's just some silly bug, like the thing I pointed at in
> the previous email.

Yep, my rule is that if ever the DMA zone is reclaimed for watermark, it will
be running wild ;) So I leave it out by setting classzone_idx=0, and let the
age balancing code to catch it up. This scheme works fine: tested to be OK from
64M to 2G memory.

> > the gap can be as large as 3 times,
> 
> What's the testcase?
> 
> > which can severely damage read-ahead requests and shorten their
> >  effective life time.
> 
> Have you any performance numbers for this?

That's months ago, if I remember it right, the number of concurrent readers the
adaptive read-ahead code can handle without much thrashing was raised from ~100
to 800 with the balancing work.

This is my original announce back then:

    The page aging problem showed up when I was testing many slow reads with
    limited memory. Pages in the DMA zone were found out to be aged about 3
    times faster than that of Normal zone in systems with 64-512M memory.
    That is a BIG threat to the read-ahead pages. So I added some code to
    make the aging rates synchronized. You can see the effect by running:
    $ tar c / | cat > /dev/null &
    $ watch -n1 'grep "age " /proc/zoneinfo'
    There are still some extra DMA scans in the direct page reclaim path.
    It tend to happen in large memory system and is therefore not a big
    problem.

And here is some numbers collected by Magnus Damm:
http://lkml.org/lkml/2005/10/25/50

Good night!
Wu
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 07/12] mm: remove unnecessary variable and loop

shrink_cache() and refill_inactive_zone() do not need loops.

Simplify them to scan one chunk at a time.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   92 ++++++++++++++++++++++++++++--------------------------------
 1 files changed, 43 insertions(+), 49 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -899,63 +899,58 @@ static void shrink_cache(struct zone *zo
 {
 	LIST_HEAD(page_list);
 	struct pagevec pvec;
-	int max_scan = sc->nr_to_scan;
+	struct page *page;
+	int nr_taken;
+	int nr_scan;
+	int nr_freed;

 	pagevec_init(&pvec, 1);

 	lru_add_drain();
 	spin_lock_irq(&zone->lru_lock);
-	while (max_scan > 0) {
-		struct page *page;
-		int nr_taken;
-		int nr_scan;
-		int nr_freed;
-
-		nr_taken = isolate_lru_pages(sc->nr_to_scan,
-					     &zone->inactive_list,
-					     &page_list, &nr_scan);
-		zone->nr_inactive -= nr_taken;
-		zone->pages_scanned += nr_scan;
-		update_zone_age(zone, nr_scan);
-		spin_unlock_irq(&zone->lru_lock);
+	nr_taken = isolate_lru_pages(sc->nr_to_scan,
+				     &zone->inactive_list,
+				     &page_list, &nr_scan);
+	zone->nr_inactive -= nr_taken;
+	zone->pages_scanned += nr_scan;
+	update_zone_age(zone, nr_scan);
+	spin_unlock_irq(&zone->lru_lock);

-		if (nr_taken == 0)
-			goto done;
+	if (nr_taken == 0)
+		return;

-		max_scan -= nr_scan;
-		sc->nr_scanned += nr_scan;
-		if (current_is_kswapd())
-			mod_page_state_zone(zone, pgscan_kswapd, nr_scan);
-		else
-			mod_page_state_zone(zone, pgscan_direct, nr_scan);
-		nr_freed = shrink_list(&page_list, sc);
-		if (current_is_kswapd())
-			mod_page_state(kswapd_steal, nr_freed);
-		mod_page_state_zone(zone, pgsteal, nr_freed);
-		sc->nr_to_reclaim -= nr_freed;
+	sc->nr_scanned += nr_scan;
+	if (current_is_kswapd())
+		mod_page_state_zone(zone, pgscan_kswapd, nr_scan);
+	else
+		mod_page_state_zone(zone, pgscan_direct, nr_scan);
+	nr_freed = shrink_list(&page_list, sc);
+	if (current_is_kswapd())
+		mod_page_state(kswapd_steal, nr_freed);
+	mod_page_state_zone(zone, pgsteal, nr_freed);
+	sc->nr_to_reclaim -= nr_freed;

-		spin_lock_irq(&zone->lru_lock);
-		/*
-		 * Put back any unfreeable pages.
-		 */
-		while (!list_empty(&page_list)) {
-			page = lru_to_page(&page_list);
-			if (TestSetPageLRU(page))
-				BUG();
-			list_del(&page->lru);
-			if (PageActive(page))
-				add_page_to_active_list(zone, page);
-			else
-				add_page_to_inactive_list(zone, page);
-			if (!pagevec_add(&pvec, page)) {
-				spin_unlock_irq(&zone->lru_lock);
-				__pagevec_release(&pvec);
-				spin_lock_irq(&zone->lru_lock);
-			}
+	spin_lock_irq(&zone->lru_lock);
+	/*
+	 * Put back any unfreeable pages.
+	 */
+	while (!list_empty(&page_list)) {
+		page = lru_to_page(&page_list);
+		if (TestSetPageLRU(page))
+			BUG();
+		list_del(&page->lru);
+		if (PageActive(page))
+			add_page_to_active_list(zone, page);
+		else
+			add_page_to_inactive_list(zone, page);
+		if (!pagevec_add(&pvec, page)) {
+			spin_unlock_irq(&zone->lru_lock);
+			__pagevec_release(&pvec);
+			spin_lock_irq(&zone->lru_lock);
 		}
-  	}
+	}
 	spin_unlock_irq(&zone->lru_lock);
-done:
+
 	pagevec_release(&pvec);
 }

@@ -982,7 +977,6 @@ refill_inactive_zone(struct zone *zone, 
 	int pgmoved;
 	int pgdeactivate = 0;
 	int pgscanned;
-	int nr_pages = sc->nr_to_scan;
 	LIST_HEAD(l_hold);	/* The pages which were snipped off */
 	LIST_HEAD(l_inactive);	/* Pages to go onto the inactive_list */
 	LIST_HEAD(l_active);	/* Pages to go onto the active_list */
@@ -995,7 +989,7 @@ refill_inactive_zone(struct zone *zone, 

 	lru_add_drain();
 	spin_lock_irq(&zone->lru_lock);
-	pgmoved = isolate_lru_pages(nr_pages, &zone->active_list,
+	pgmoved = isolate_lru_pages(sc->nr_to_scan, &zone->active_list,
 				    &l_hold, &pgscanned);
 	zone->pages_scanned += pgscanned;
 	zone->nr_active -= pgmoved;

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 03/12] mm: balance zone aging in direct reclaim path

Add 10 extra priorities to the direct page reclaim path, which makes 10 round of
balancing effort(reclaim only from the least aged local/headless zone) before
falling back to the reclaim-all scheme.

Ten rounds should be enough to get enough free pages in normal cases, which
prevents unnecessarily disturbing remote nodes. If further restrict the first
round of page allocation to local zones, we might get what the early zone
reclaim patch want: memory affinity/locality.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   31 ++++++++++++++++++++++++++++---
 1 files changed, 28 insertions(+), 3 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1186,6 +1186,7 @@ static void
 shrink_caches(struct zone **zones, struct scan_control *sc)
 {
 	int i;
+	struct zone *z = NULL;

 	for (i = 0; zones[i] != NULL; i++) {
 		struct zone *zone = zones[i];
@@ -1200,11 +1201,34 @@ shrink_caches(struct zone **zones, struc
 		if (zone->prev_priority > sc->priority)
 			zone->prev_priority = sc->priority;

-		if (zone->all_unreclaimable && sc->priority != DEF_PRIORITY)
+		if (zone->all_unreclaimable && sc->priority < DEF_PRIORITY)
 			continue;	/* Let kswapd poll it */

+		/*
+		 * Balance page aging in local zones and following headless
+		 * zones.
+		 */
+		if (sc->priority > DEF_PRIORITY) {
+			if (zone->zone_pgdat != zones[0]->zone_pgdat) {
+				cpumask_t cpu = node_to_cpumask(
+						zone->zone_pgdat->node_id);
+				if (!cpus_empty(cpu))
+					break;
+			}
+
+			if (!z)
+				z = zone;
+			else if (pages_more_aged(z, zone))
+				z = zone;
+
+			continue;
+		}
+
 		shrink_zone(zone, sc);
 	}
+
+	if (z)
+		shrink_zone(z, sc);
 }

 /*
@@ -1248,7 +1272,8 @@ int try_to_free_pages(struct zone **zone
 		lru_pages += zone->nr_active + zone->nr_inactive;
 	}

-	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
+	/* The added 10 priorities are for scan rate balancing */
+	for (priority = DEF_PRIORITY + 10; priority >= 0; priority--) {
 		sc.nr_mapped = read_page_state(nr_mapped);
 		sc.nr_scanned = 0;
 		sc.nr_reclaimed = 0;
@@ -1282,7 +1307,7 @@ int try_to_free_pages(struct zone **zone
 		}

 		/* Take a nap, wait for some writeback to complete */
-		if (sc.nr_scanned && priority < DEF_PRIORITY - 2)
+		if (sc.nr_scanned && priority < DEF_PRIORITY)
 			blk_congestion_wait(WRITE, HZ/10);
 	}
 out:

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 05/12] mm: balance slab aging

The current slab shrinking code is way too fragile.
Let it manage aging pace by itself, and provide a simple and robust interface.

The design considerations:
- use the same syncing facilities as that of the zones
- keep the age of slabs in line with that of the largest zone
  this in effect makes aging rate of slabs follow that of the most aged node.

- reserve a minimal number of unused slabs
  the size of reservation depends on vm pressure

- shrink more slab caches only when vm pressure is high
  the old logic, `mmap pages found' - `shrink more caches' - `avoid swapping',
  sounds not quite logical, so the code is removed.

- let sc->nr_scanned record the exact number of cold pages scanned
  it is no longer used by the slab cache shrinking algorithm, but good for other
  algorithms(e.g. the active_list/inactive_list balancing).

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 include/linux/mm.h |    4 +
 mm/vmscan.c        |  118 +++++++++++++++++++++++------------------------------
 2 files changed, 55 insertions(+), 67 deletions(-)

--- linux.orig/include/linux/mm.h
+++ linux/include/linux/mm.h
@@ -798,7 +798,9 @@ struct shrinker {
 	shrinker_t		shrinker;
 	struct list_head	list;
 	int			seeks;	/* seeks to recreate an obj */
-	long			nr;	/* objs pending delete */
+	unsigned long		aging_total;
+	unsigned long		aging_milestone;
+	unsigned long		page_age;
 	struct shrinker_stats	*s_stats;
 };

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -161,6 +161,18 @@ static inline void update_zone_age(struc
 						<< PAGE_AGE_SHIFT) / len;
 }

+static inline void update_slab_age(struct shrinker *s,
+					unsigned long len, int nr_scan)
+{
+	s->aging_total += nr_scan;
+
+	if (s->aging_total - s->aging_milestone > len)
+		s->aging_milestone += len;
+
+	s->page_age = ((s->aging_total - s->aging_milestone)
+						<< PAGE_AGE_SHIFT) / len;
+}
+
 /*
  * Add a shrinker callback to be called from the vm
  */
@@ -172,7 +184,9 @@ struct shrinker *set_shrinker(int seeks,
         if (shrinker) {
 	        shrinker->shrinker = theshrinker;
 	        shrinker->seeks = seeks;
-	        shrinker->nr = 0;
+	        shrinker->aging_total = 0;
+	        shrinker->aging_milestone = 0;
+	        shrinker->page_age = 0;
 		shrinker->s_stats = alloc_percpu(struct shrinker_stats);
 		if (!shrinker->s_stats) {
 			kfree(shrinker);
@@ -208,80 +222,61 @@ EXPORT_SYMBOL(remove_shrinker);
  * percentages of the lru and ageable caches.  This should balance the seeks
  * generated by these structures.
  *
- * If the vm encounted mapped pages on the LRU it increase the pressure on
- * slab to avoid swapping.
- *
- * We do weird things to avoid (scanned*seeks*entries) overflowing 32 bits.
- *
- * `lru_pages' represents the number of on-LRU pages in all the zones which
- * are eligible for the caller's allocation attempt.  It is used for balancing
- * slab reclaim versus page reclaim.
+ * If the vm pressure is high, shrink the slabs more.
  *
  * Returns the number of slab objects which we shrunk.
  */
-static int shrink_slab(unsigned long scanned, gfp_t gfp_mask,
-			unsigned long lru_pages)
+static int shrink_slab(gfp_t gfp_mask)
 {
 	struct shrinker *shrinker;
-	int ret = 0;
-
-	if (scanned == 0)
-		scanned = SWAP_CLUSTER_MAX;
+	struct pglist_data *pgdat;
+	struct zone *zone;
+	int n;

 	if (!down_read_trylock(&shrinker_rwsem))
 		return 1;	/* Assume we'll be able to shrink next time */

-	list_for_each_entry(shrinker, &shrinker_list, list) {
-		unsigned long long delta;
-		unsigned long total_scan;
-		unsigned long max_pass = (*shrinker->shrinker)(0, gfp_mask);
-
-		delta = (4 * scanned) / shrinker->seeks;
-		delta *= max_pass;
-		do_div(delta, lru_pages + 1);
-		shrinker->nr += delta;
-		if (shrinker->nr < 0) {
-			printk(KERN_ERR "%s: nr=%ld\n",
-					__FUNCTION__, shrinker->nr);
-			shrinker->nr = max_pass;
-		}
+	/* find the major zone for the slabs to catch up age with */
+	pgdat = NODE_DATA(numa_node_id());
+	zone = pgdat->node_zones;
+	for (n = 1; n < pgdat->nr_zones; n++) {
+		struct zone *z = pgdat->node_zones + n;

-		/*
-		 * Avoid risking looping forever due to too large nr value:
-		 * never try to free more than twice the estimate number of
-		 * freeable entries.
-		 */
-		if (shrinker->nr > max_pass * 2)
-			shrinker->nr = max_pass * 2;
-
-		total_scan = shrinker->nr;
-		shrinker->nr = 0;
+		if (zone->present_pages < z->present_pages)
+			zone = z;
+	}

-		while (total_scan >= SHRINK_BATCH) {
-			long this_scan = SHRINK_BATCH;
-			int shrink_ret;
+	n = 0;
+	list_for_each_entry(shrinker, &shrinker_list, list) {
+		while (pages_more_aged(zone, shrinker)) {
 			int nr_before;
+			int nr_after;

 			nr_before = (*shrinker->shrinker)(0, gfp_mask);
-			shrink_ret = (*shrinker->shrinker)(this_scan, gfp_mask);
-			if (shrink_ret == -1)
+			if (nr_before <= SHRINK_BATCH * zone->prev_priority)
+				break;
+
+			nr_after = (*shrinker->shrinker)(SHRINK_BATCH, gfp_mask);
+			if (nr_after == -1)
 				break;
-			if (shrink_ret < nr_before) {
-				ret += nr_before - shrink_ret;
-				shrinker_stat_add(shrinker, nr_freed,
-					(nr_before - shrink_ret));
+
+			if (nr_after < nr_before) {
+				int nr_freed = nr_before - nr_after;
+
+				n += nr_freed;
+				shrinker_stat_add(shrinker, nr_freed, nr_freed);
 			}
-			shrinker_stat_add(shrinker, nr_req, this_scan);
-			mod_page_state(slabs_scanned, this_scan);
-			total_scan -= this_scan;
+			shrinker_stat_add(shrinker, nr_req, SHRINK_BATCH);
+			mod_page_state(slabs_scanned, SHRINK_BATCH);
+			update_slab_age(shrinker, nr_before * DEF_PRIORITY,
+						SHRINK_BATCH * shrinker->seeks *
+							zone->prev_priority);

 			cond_resched();
 		}
-
-		shrinker->nr += total_scan;
 	}
 	up_read(&shrinker_rwsem);
-	return ret;
+	return n;
 }

 /* Called without lock on whether page is mapped, so answer is unstable */
@@ -484,11 +479,6 @@ static int shrink_list(struct list_head 

 		BUG_ON(PageActive(page));

-		sc->nr_scanned++;
-		/* Double the slab pressure for mapped and swapcache pages */
-		if (page_mapped(page) || PageSwapCache(page))
-			sc->nr_scanned++;
-
 		if (PageWriteback(page))
 			goto keep_locked;

@@ -933,6 +923,7 @@ static void shrink_cache(struct zone *zo
 			goto done;

 		max_scan -= nr_scan;
+		sc->nr_scanned += nr_scan;
 		if (current_is_kswapd())
 			mod_page_state_zone(zone, pgscan_kswapd, nr_scan);
 		else
@@ -1251,7 +1242,6 @@ int try_to_free_pages(struct zone **zone
 	int total_scanned = 0, total_reclaimed = 0;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc;
-	unsigned long lru_pages = 0;
 	int i;

 	delay_prefetch();
@@ -1269,7 +1259,6 @@ int try_to_free_pages(struct zone **zone
 			continue;

 		zone->temp_priority = DEF_PRIORITY;
-		lru_pages += zone->nr_active + zone->nr_inactive;
 	}

 	/* The added 10 priorities are for scan rate balancing */
@@ -1282,7 +1271,7 @@ int try_to_free_pages(struct zone **zone
 		if (!priority)
 			disable_swap_token();
 		shrink_caches(zones, &sc);
-		shrink_slab(sc.nr_scanned, gfp_mask, lru_pages);
+		shrink_slab(gfp_mask);
 		if (reclaim_state) {
 			sc.nr_reclaimed += reclaim_state->reclaimed_slab;
 			reclaim_state->reclaimed_slab = 0;
@@ -1386,8 +1375,6 @@ loop_again:
 	}

 	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
-		unsigned long lru_pages = 0;
-
 		all_zones_ok = 1;
 		sc.nr_scanned = 0;
 		sc.nr_reclaimed = 0;
@@ -1431,7 +1418,6 @@ loop_again:
 			zone->temp_priority = priority;
 			if (zone->prev_priority > priority)
 				zone->prev_priority = priority;
-			lru_pages += zone->nr_active + zone->nr_inactive;

 			shrink_zone(zone, &sc);

@@ -1440,7 +1426,7 @@ loop_again:
 				zone->all_unreclaimable = 1;
 		}
 		reclaim_state->reclaimed_slab = 0;
-		shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
+		shrink_slab(GFP_KERNEL);
 		sc.nr_reclaimed += reclaim_state->reclaimed_slab;
 		total_reclaimed += sc.nr_reclaimed;
 		total_scanned += sc.nr_scanned;

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 04/12] mm: balance zone aging in kswapd reclaim path

The kswapd reclaim has had one single goal:
	reclaim from zones to make their watermarks ok.

Now add another weak goal(it will not set all_zones_ok=0):
	reclaim from the least aged zone to help balance the aging rates.

Two major aspects of this algorithm:
- reclaim the least aged zone unless it catches up with the most aged zone
- reclaim for weaker watermark by calling watermark_ok() with classzone_idx=0

That garuantees reclaims-for-aging to be more than reclaims-for-watermark if
there is ever a big imbalance, thus eliminates the chance of growing gaps.

Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |   39 ++++++++++++++++++++++++++++++---------
 1 files changed, 30 insertions(+), 9 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1356,6 +1356,8 @@ static int balance_pgdat(pg_data_t *pgda
 	int total_scanned, total_reclaimed;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc;
+	struct zone *youngest_zone = NULL;
+	struct zone *oldest_zone = NULL;

 loop_again:
 	total_scanned = 0;
@@ -1371,11 +1373,20 @@ loop_again:
 		struct zone *zone = pgdat->node_zones + i;

 		zone->temp_priority = DEF_PRIORITY;
+
+		if (zone->present_pages == 0)
+			continue;
+
+		if (!oldest_zone)
+			youngest_zone = oldest_zone = zone;
+		else if (pages_more_aged(zone, oldest_zone))
+			oldest_zone = zone;
+		else if (pages_more_aged(youngest_zone, zone))
+			youngest_zone = zone;
 	}

 	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
 		unsigned long lru_pages = 0;
-		int first_low_zone = 0;

 		all_zones_ok = 1;
 		sc.nr_scanned = 0;
@@ -1387,21 +1398,31 @@ loop_again:
 		if (!priority)
 			disable_swap_token();

-		/* Scan in the highmem->dma direction */
-		for (i = pgdat->nr_zones - 1; i >= 0; i--) {
+		/*
+		 * Now scan the zone in the dma->highmem direction, stopping
+		 * at the last zone which needs scanning.
+		 *
+		 * We do this because the page allocator works in the opposite
+		 * direction.  This prevents the page allocator from allocating
+		 * pages behind kswapd's direction of progress, which would
+		 * cause too much scanning of the lower zones.
+		 */
+		for (i = 0; i < pgdat->nr_zones; i++) {
 			struct zone *zone = pgdat->node_zones + i;

 			if (!populated_zone(zone))
 				continue;

 			if (nr_pages == 0) {	/* Not software suspend */
-				if (zone_watermark_ok(zone, order,
-					zone->pages_high, first_low_zone, 0))
+				if (!zone_watermark_ok(zone, order,
+							zone->pages_high,
+							0, 0)) {
+					all_zones_ok = 0;
+				} else if (zone == youngest_zone &&
+						pages_more_aged(oldest_zone,
+								youngest_zone)) {
+				} else
 					continue;
-
-				all_zones_ok = 0;
-				if (first_low_zone < i)
-					first_low_zone = i;
 			}

 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)

--
Wu Fengguang | 1 Dec 11:18
Picon

[PATCH 01/12] vm: kswapd incmin

Explicitly teach kswapd about the incremental min logic instead of just scanning
all zones under the first low zone. This should keep more even pressure applied
on the zones.

Signed-off-by: Nick Piggin <npiggin <at> suse.de>
Signed-off-by: Wu Fengguang <wfg <at> mail.ustc.edu.cn>
---

 mm/vmscan.c |  111 ++++++++++++++++++++----------------------------------------
 1 files changed, 37 insertions(+), 74 deletions(-)

--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1310,101 +1310,65 @@ loop_again:
 	}

 	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
-		int end_zone = 0;	/* Inclusive.  0 = ZONE_DMA */
 		unsigned long lru_pages = 0;
+		int first_low_zone = 0;
+
+		all_zones_ok = 1;
+		sc.nr_scanned = 0;
+		sc.nr_reclaimed = 0;
+		sc.priority = priority;
+		sc.swap_cluster_max = nr_pages ? nr_pages : SWAP_CLUSTER_MAX;

 		/* The swap token gets in the way of swapout... */
 		if (!priority)
 			disable_swap_token();

-		all_zones_ok = 1;
-
-		if (nr_pages == 0) {
-			/*
-			 * Scan in the highmem->dma direction for the highest
-			 * zone which needs scanning
-			 */
-			for (i = pgdat->nr_zones - 1; i >= 0; i--) {
-				struct zone *zone = pgdat->node_zones + i;
+		/* Scan in the highmem->dma direction */
+		for (i = pgdat->nr_zones - 1; i >= 0; i--) {
+			struct zone *zone = pgdat->node_zones + i;

-				if (!populated_zone(zone))
-					continue;
+			if (!populated_zone(zone))
+				continue;

-				if (zone->all_unreclaimable &&
-						priority != DEF_PRIORITY)
+			if (nr_pages == 0) {	/* Not software suspend */
+				if (zone_watermark_ok(zone, order,
+					zone->pages_high, first_low_zone, 0))
 					continue;

-				if (!zone_watermark_ok(zone, order,
-						zone->pages_high, 0, 0)) {
-					end_zone = i;
-					goto scan;
-				}
+				all_zones_ok = 0;
+				if (first_low_zone < i)
+					first_low_zone = i;
 			}
-			goto out;
-		} else {
-			end_zone = pgdat->nr_zones - 1;
-		}
-scan:
-		for (i = 0; i <= end_zone; i++) {
-			struct zone *zone = pgdat->node_zones + i;
-
-			lru_pages += zone->nr_active + zone->nr_inactive;
-		}
-
-		/*
-		 * Now scan the zone in the dma->highmem direction, stopping
-		 * at the last zone which needs scanning.
-		 *
-		 * We do this because the page allocator works in the opposite
-		 * direction.  This prevents the page allocator from allocating
-		 * pages behind kswapd's direction of progress, which would
-		 * cause too much scanning of the lower zones.
-		 */
-		for (i = 0; i <= end_zone; i++) {
-			struct zone *zone = pgdat->node_zones + i;
-			int nr_slab;
-
-			if (!populated_zone(zone))
-				continue;

 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 				continue;

-			if (nr_pages == 0) {	/* Not software suspend */
-				if (!zone_watermark_ok(zone, order,
-						zone->pages_high, end_zone, 0))
-					all_zones_ok = 0;
-			}
 			zone->temp_priority = priority;
 			if (zone->prev_priority > priority)
 				zone->prev_priority = priority;
-			sc.nr_scanned = 0;
-			sc.nr_reclaimed = 0;
-			sc.priority = priority;
-			sc.swap_cluster_max = nr_pages? nr_pages : SWAP_CLUSTER_MAX;
-			atomic_inc(&zone->reclaim_in_progress);
+			lru_pages += zone->nr_active + zone->nr_inactive;
+
 			shrink_zone(zone, &sc);
-			atomic_dec(&zone->reclaim_in_progress);
-			reclaim_state->reclaimed_slab = 0;
-			nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
-						lru_pages);
-			sc.nr_reclaimed += reclaim_state->reclaimed_slab;
-			total_reclaimed += sc.nr_reclaimed;
-			total_scanned += sc.nr_scanned;
-			if (zone->all_unreclaimable)
-				continue;
-			if (nr_slab == 0 && zone->pages_scanned >=
+
+			if (zone->pages_scanned >=
 				    (zone->nr_active + zone->nr_inactive) * 4)
 				zone->all_unreclaimable = 1;
-			/*
-			 * If we've done a decent amount of scanning and
-			 * the reclaim ratio is low, start doing writepage
-			 * even in laptop mode
-			 */
-			if (total_scanned > SWAP_CLUSTER_MAX * 2 &&
-			    total_scanned > total_reclaimed+total_reclaimed/2)
-				sc.may_writepage = 1;
 		}
+		reclaim_state->reclaimed_slab = 0;
+		shrink_slab(sc.nr_scanned, GFP_KERNEL, lru_pages);
+		sc.nr_reclaimed += reclaim_state->reclaimed_slab;
+		total_reclaimed += sc.nr_reclaimed;
+		total_scanned += sc.nr_scanned;
+
+		/*
+		 * If we've done a decent amount of scanning and
+		 * the reclaim ratio is low, start doing writepage
+		 * even in laptop mode
+		 */
+		if (total_scanned > SWAP_CLUSTER_MAX * 2 &&
+		    total_scanned > total_reclaimed+total_reclaimed/2)
+			sc.may_writepage = 1;
+
 		if (nr_pages && to_free > total_reclaimed)
 			continue;	/* swsusp: need to do more work */
 		if (all_zones_ok)
@@ -1425,7 +1389,6 @@ scan:
 		if ((total_reclaimed >= SWAP_CLUSTER_MAX) && (!nr_pages))
 			break;
 	}
-out:
 	for (i = 0; i < pgdat->nr_zones; i++) {
 		struct zone *zone = pgdat->node_zones + i;

--
Andrew Morton | 1 Dec 11:33

Re: [PATCH 01/12] vm: kswapd incmin


Wu Fengguang <wfg <at> mail.ustc.edu.cn> wrote: > > Explicitly teach kswapd about the incremental min logic instead of just scanning > all zones under the first low zone. This should keep more even pressure applied > on the zones.
I spat this back a while ago. See the changelog (below) for the logic which you're removing. This change appears to go back to performing reclaim in the highmem->lowmem direction. Page reclaim might go all lumpy again. Shouldn't first_low_zone be initialised to ZONE_HIGHMEM (or pgdat->nr_zones - 1) rather than to 0, or something? I don't understand why we're passing zero as the classzone_idx into zone_watermark_ok() in the first go around the loop. And this bit, which Nick didn't reply to (wimp!). I think it's a bug. Looking at it, I am confused. In the first loop: for (i = pgdat->nr_zones - 1; i >= 0; i--) { struct zone *zone = pgdat->node_zones + i; ... if (!zone_watermark_ok(zone, order, zone->pages_high, 0, 0)) { end_zone = i; goto scan; } end_zone gets the value of the highest-numbered zone which needs scanning. Where `0' corresponds to ZONE_DMA. (correct?) In the second loop: for (i = 0; i <= end_zone; i++) { struct zone *zone = pgdat->node_zones + i; Shouldn't that be for (i = end_zone; ...; i++) or am I on crack? As kswapd is now scanning zones in the highmem->normal->dma direction it can get into competition with the page allocator: kswapd keep on trying to free pages from highmem, then kswapd moves onto lowmem. By the time kswapd has done proportional scanning in lowmem, someone has come in and allocated a few pages from highmem. So kswapd goes back and frees some highmem, then some lowmem again. But nobody has allocated any lowmem yet. So we keep on and on scanning lowmem in response to highmem page allocations. With a simple `dd' on a 1G box we get: r b swpd free buff cache si so bi bo in cs us sy wa id 0 3 0 59340 4628 922348 0 0 4 28188 1072 808 0 10 46 44 0 3 0 29932 4660 951760 0 0 0 30752 1078 441 1 6 30 64 0 3 0 57568 4556 924052 0 0 0 30748 1075 478 0 8 43 49 0 3 0 29664 4584 952176 0 0 0 30752 1075 472 0 6 34 60 0 3 0 5304 4620 976280 0 0 4 40484 1073 456 1 7 52 41 0 3 0 104856 4508 877112 0 0 0 18452 1074 97 0 7 67 26 0 3 0 70768 4540 911488 0 0 0 35876 1078 746 0 7 34 59 1 2 0 42544 4568 939680 0 0 0 21524 1073 556 0 5 43 51 0 3 0 5520 4608 976428 0 0 4 37924 1076 836 0 7 41 51 0 2 0 4848 4632 976812 0 0 32 12308 1092 94 0 1 33 66 Simple fix: go back to scanning the zones in the dma->normal->highmem direction so we meet the page allocator in the middle somewhere. r b swpd free buff cache si so bi bo in cs us sy wa id 1 3 0 5152 3468 976548 0 0 4 37924 1071 650 0 8 64 28 1 2 0 4888 3496 976588 0 0 0 23576 1075 726 0 6 66 27 0 3 0 5336 3532 976348 0 0 0 31264 1072 708 0 8 60 32 0 3 0 6168 3560 975504 0 0 0 40992 1072 683 0 6 63 31 0 3 0 4560 3580 976844 0 0 0 18448 1073 233 0 4 59 37 0 3 0 5840 3624 975712 0 0 4 26660 1072 800 1 8 46 45 0 3 0 4816 3648 976640 0 0 0 40992 1073 526 0 6 47 47 0 3 0 5456 3672 976072 0 0 0 19984 1070 320 0 5 60 35 --- 25-akpm/mm/vmscan.c | 37 +++++++++++++++++++++++++++++++++++-- 1 files changed, 35 insertions(+), 2 deletions(-) diff -puN mm/vmscan.c~kswapd-avoid-higher-zones-reverse-direction mm/vmscan.c --- 25/mm/vmscan.c~kswapd-avoid-higher-zones-reverse-direction 2004-03-12 01:33:09.000000000 -0800 +++ 25-akpm/mm/vmscan.c 2004-03-12 01:33:09.000000000 -0800 @@ -924,8 +924,41 @@ static int balance_pgdat(pg_data_t *pgda for (priority = DEF_PRIORITY; priority; priority--) { int all_zones_ok = 1; int pages_scanned = 0; + int end_zone = 0; /* Inclusive. 0 = ZONE_DMA */ - for (i = pgdat->nr_zones - 1; i >= 0; i--) { + + if (nr_pages == 0) { + /* + * Scan in the highmem->dma direction for the highest + * zone which needs scanning + */ + for (i = pgdat->nr_zones - 1; i >= 0; i--) { + struct zone *zone = pgdat->node_zones + i; + + if (zone->all_unreclaimable && + priority != DEF_PRIORITY) + continue; + + if (zone->free_pages <= zone->pages_high) { + end_zone = i; + goto scan; + } + } + goto out; + } else { + end_zone = pgdat->nr_zones - 1; + } +scan: + /* + * Now scan the zone in the dma->highmem direction, stopping + * at the last zone which needs scanning. + * + * We do this because the page allocator works in the opposite + * direction. This prevents the page allocator from allocating + * pages behind kswapd's direction of progress, which would + * cause too much scanning of the lower zones. + */ + for (i = 0; i <= end_zone; i++) { struct zone *zone = pgdat->node_zones + i; int total_scanned = 0; int max_scan; @@ -965,7 +998,7 @@ static int balance_pgdat(pg_data_t *pgda if (pages_scanned) blk_congestion_wait(WRITE, HZ/10); } - +out: for (i = 0; i < pgdat->nr_zones; i++) { struct zone *zone = pgdat->node_zones + i; _
Wu Fengguang | 1 Dec 12:40
Picon

Re: [PATCH 01/12] vm: kswapd incmin


On Thu, Dec 01, 2005 at 02:33:30AM -0800, Andrew Morton wrote: > I spat this back a while ago. See the changelog (below) for the logic > which you're removing. > > This change appears to go back to performing reclaim in the highmem->lowmem > direction. Page reclaim might go all lumpy again. > > Shouldn't first_low_zone be initialised to ZONE_HIGHMEM (or pgdat->nr_zones > - 1) rather than to 0, or something? I don't understand why we're passing > zero as the classzone_idx into zone_watermark_ok() in the first go around > the loop.
Sorry to note that I'm mainly taking its zone-range --> zones-under-watermark cleanups. The scan order is reverted back to DMA->HighMem in mm-balance-zone-aging-in-kswapd-reclaim.patch, and the first_low_zone logic is also replaced with a quite different one there. My thinking is that the overall reclaim-for-watermark should be weakened and just do minimal watermark-safeguard work, so that it will not be a major force of imbalance. Assume there are three zones. The dynamics goes something like: HighMem exhausted --> reclaim from it --> become more aged --> reclaim the other two zones for aging DMA reclaimed --> age leaps ahead --> reclaim Normal zone for aging, while HighMem is being reclaimed for watermark In the kswapd path, if there are N rounds of reclaim-for-watermark with all_zones_ok=0, there could be N+1 rounds of reclaim-for-aging with the additional 1 time of all_zones_ok=1. With this the force of balance outperforms the force of imbalance. In the direct path, there are 10 rounds of aging before unconditionally reclaim from all zones, that's pretty much force of balance. In summary: - HighMem zone is normally first exhausted and mostly reclaimed for watermark. - DMA zone is now mainly reclaimed for aging. - Normal zone will be mostly reclaimed for aging, sometimes for watermark. Thanks, Wu

Gmane