
Virtual memory usage tuning.




I'm sorry to get somewhat back on topic by asking a Linux question,
but here goes...

According to the boot-prompt howto, one can tune some of the virtual
memory parameters with the 'swap=' boot parameter, which takes 8
arguments, which set:

        MAX_PAGE_AGE
        PAGE_ADVANCE
        PAGE_DECLINE
        PAGE_INITIAL_AGE
        AGE_CLUSTER_FRACT
        AGE_CLUSTER_MIN
        PAGEOUT_WEIGHT
        BUFFEROUT_WEIGHT

Other than this, the boot-prompt howto has this to say about the above
parameters:

   Interested hackers are advised to have a read of linux/mm/swap.c and
   also make note of the goodies in /proc/sys/vm.
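As a rough illustration of how eight comma-separated 'swap=' arguments
could be mapped onto those parameters, here is a user-space sketch.  It
is not the kernel's parser: struct swap_tunables, SWAP_DEFAULTS and
apply_swap_args are made-up names, the defaults are the ones from the
struct comments further down in this message, and the rule that a zero
or empty argument leaves the default alone is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Order of the eight arguments, as listed in the boot-prompt howto. */
enum {
	SWAP_MAX_PAGE_AGE, SWAP_PAGE_ADVANCE, SWAP_PAGE_DECLINE,
	SWAP_PAGE_INITIAL_AGE, SWAP_AGE_CLUSTER_FRACT, SWAP_AGE_CLUSTER_MIN,
	SWAP_PAGEOUT_WEIGHT, SWAP_BUFFEROUT_WEIGHT, SWAP_NARGS
};

/* Hypothetical container for the tunables (not a kernel type). */
struct swap_tunables {
	int v[SWAP_NARGS];
};

/* Compiled-in defaults quoted in the swap_control struct comments. */
#define SWAP_DEFAULTS { { 20, 3, 1, 3, 32, 4, 8192, 8192 } }

/*
 * Apply a string such as "40,5,0,,64" to *t, one field per slot.
 * Assumption: a zero or empty field keeps the current value.
 * Returns the number of slots that were actually changed.
 */
int apply_swap_args(const char *args, struct swap_tunables *t)
{
	int applied = 0;
	int i;

	assert(args != NULL && t != NULL);
	for (i = 0; i < SWAP_NARGS && *args != '\0'; i++) {
		char *end;
		long val = strtol(args, &end, 10);

		if (end != args && val != 0) {
			t->v[i] = (int)val;
			applied++;
		}
		args = end;
		if (*args != ',')
			break;		/* end of string or garbage */
		args++;			/* skip the separator */
	}
	return applied;
}
```

With the defaults above, "40,5,0,,64" would change MAX_PAGE_AGE,
PAGE_ADVANCE and AGE_CLUSTER_FRACT and leave the other five slots alone.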

Does anyone have any other information on the above parameters?  Does
anyone know of any straightforward documentation for the swapping
algorithm?

I want to adjust these parameters because I don't like how kernel
2.0.29 manages memory as compared to 1.2.13.  I find that under the
newer kernel, netscape will swap a lot (once I get about 10-30 netscape
windows open), even when it's the most active process, and the system
doesn't get rid of much of the buffer cache - holding onto 8MB of it
even when netscape is starved for memory.

In case someone's interested in working out what the kernel's doing,
here's what I've figured out so far:

There are 4 files in /proc/sys/vm - bdflush, freepages, kswapd and
swapctl.  The last one seems to correspond to the global kernel
variable swap_control, which has the following structure:

      typedef struct swap_control_v5
      {
	   int	sc_max_page_age;       /* MAX_PAGE_AGE       20 */
	   int	sc_page_advance;       /* PAGE_ADVANCE        3 */
	   int	sc_page_decline;       /* PAGE_DECLINE        1 */
	   int	sc_page_initial_age;   /* PAGE_INITIAL_AGE    3 */
	   int	sc_max_buff_age;       /* MAX_BUFF_AGE       10 */
	   int	sc_buff_advance;       /* BUFF_ADVANCE        2 */
	   int	sc_buff_decline;       /* BUFF_DECLINE        2 */
	   int	sc_buff_initial_age;   /* BUFF_INITIAL_AGE    4 */
	   int	sc_age_cluster_fract;  /* AGE_CLUSTER_FRACT  32 */
	   int	sc_age_cluster_min;    /* AGE_CLUSTER_MIN     4 */
	   int	sc_pageout_weight;     /* PAGEOUT_WEIGHT   8192 */
	   int	sc_bufferout_weight;   /* BUFFEROUT_WEIGHT 8192 */
	   int 	sc_buffer_grace;       /* BUFFERMEM_GRACE  -200 */
	   int 	sc_nr_buffs_to_free;   /* NR_BUFFS_TO_FREE    1 */
	   int 	sc_nr_pages_to_free;   /* NR_PAGES_TO_FREE    1 */
	   enum RCL_POLICY sc_policy;  /* RCL_POLICY          1 */
      } swap_control_v5;
      typedef struct swap_control_v5 swap_control_t;
      extern swap_control_t swap_control;

The comments give the names of the #defines that refer to each member
of the swap_control global variable.  The numbers are the default
values compiled into swap.c.
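To make the relationship between the #define names and the struct
members concrete, here is a sketch of how such macros and the default
initializer might look.  It is pieced together from the struct and
comments above, not copied from the kernel source: RCL_POLICY_DEFAULT
and raise_max_page_age are made-up names, and only a few of the macros
are shown.

```c
#include <assert.h>

/* Hypothetical enumerator; the message only shows the default value 1. */
enum RCL_POLICY { RCL_POLICY_DEFAULT = 1 };

typedef struct swap_control_v5 {
	int sc_max_page_age;
	int sc_page_advance;
	int sc_page_decline;
	int sc_page_initial_age;
	int sc_max_buff_age;
	int sc_buff_advance;
	int sc_buff_decline;
	int sc_buff_initial_age;
	int sc_age_cluster_fract;
	int sc_age_cluster_min;
	int sc_pageout_weight;
	int sc_bufferout_weight;
	int sc_buffer_grace;
	int sc_nr_buffs_to_free;
	int sc_nr_pages_to_free;
	enum RCL_POLICY sc_policy;
} swap_control_v5;
typedef struct swap_control_v5 swap_control_t;

/* Defaults in member order, taken from the comments in the struct. */
swap_control_t swap_control = {
	20, 3, 1, 3,		/* page aging */
	10, 2, 2, 4,		/* buffer aging */
	32, 4,			/* aging cluster */
	8192, 8192,		/* pageout and bufferout weights */
	-200,			/* buffer grace */
	1, 1,			/* buffers/pages to free */
	RCL_POLICY_DEFAULT	/* policy */
};

/* Each #define is just a name for one member of the global. */
#define MAX_PAGE_AGE		(swap_control.sc_max_page_age)
#define PAGE_ADVANCE		(swap_control.sc_page_advance)
#define PAGE_DECLINE		(swap_control.sc_page_decline)
#define PAGE_INITIAL_AGE	(swap_control.sc_page_initial_age)
#define BUFFERMEM_GRACE		(swap_control.sc_buffer_grace)

/* Because the macros expand to lvalues, they can be assigned to. */
void raise_max_page_age(int new_max)
{
	assert(new_max >= PAGE_INITIAL_AGE);	/* new pages must fit the range */
	MAX_PAGE_AGE = new_max;
}
```

If the macros are defined this way, code that reads MAX_PAGE_AGE picks
up any change made to swap_control at run time without recompiling.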

As for /proc/sys/vm/kswapd, it seems to correspond to the kernel
global variable kswapd_ctl.  It's defined in swapctl.h, and
initialized in vmscan.c:

      typedef struct kswapd_control_v1
      {
	   int	maxpages;    /*  4 */
	   int	pages_buff;  /* -1 */
	   int	pages_shm;   /* -1 */
	   int	pages_mmap;  /* -1 */
	   int	pages_swap;  /* -1 */
      } kswapd_control_v1;
      typedef kswapd_control_v1 kswapd_control_t;
      extern kswapd_control_t kswapd_ctl;


As for /proc/sys/vm/bdflush, it seems to correspond to the kernel
variable bdf_prm, which is defined in buffer.c:

union bdflush_param{
	struct {
		int nfract;  /* Percentage of buffer cache dirty to 
				activate bdflush */
		int ndirty;  /* Maximum number of dirty blocks to write out per
				wake-cycle */
		int nrefill; /* Number of clean buffers to try to obtain
				each time we call refill */
		int nref_dirt; /* Dirty buffer threshold for activating bdflush
				  when trying to refill buffers. */
		int clu_nfract;  /* Percentage of buffer cache to scan to 
				    search for free clusters */
		int age_buffer;  /* Time for normal buffer to age before 
				    we flush it */
		int age_super;  /* Time for superblock to age before we 
				   flush it */
		int lav_const;  /* Constant used for load average (time
				   constant) */
		int lav_ratio;  /* Used to determine how low a lav for a
				   particular size can go before we start to
				   trim back the buffers */
	} b_un;
	unsigned int data[N_PARAM];
} bdf_prm = {{60, 500, 64, 256, 15, 30*HZ, 5*HZ, 1884, 2}};
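The union lets the same storage be read either by field name or by
index, which is presumably what makes it easy to set a parameter by
number.  Here's a small stand-alone sketch of that overlay.  N_PARAM = 9
(one per field) and HZ = 100 are assumptions, and bdflush_defaults and
bdflush_set are made-up helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define N_PARAM	9	/* assumed: one slot per named field */
#define HZ	100	/* assumed timer frequency */

union bdflush_param {
	struct {
		int nfract;
		int ndirty;
		int nrefill;
		int nref_dirt;
		int clu_nfract;
		int age_buffer;
		int age_super;
		int lav_const;
		int lav_ratio;
	} b_un;
	unsigned int data[N_PARAM];
};

/* Return a copy of the defaults quoted in the initializer above. */
union bdflush_param bdflush_defaults(void)
{
	union bdflush_param p = {{60, 500, 64, 256, 15, 30*HZ, 5*HZ, 1884, 2}};
	return p;
}

/*
 * Set parameter number `index` through the array view.
 * Returns 0 on success, -1 if the index is out of range.
 */
int bdflush_set(union bdflush_param *p, int index, unsigned int value)
{
	assert(p != NULL);
	if (index < 0 || index >= N_PARAM)
		return -1;
	p->data[index] = value;
	return 0;
}
```

Under these assumptions, data[1] and b_un.ndirty name the same slot, so
bdflush_set(&p, 1, 100) changes what p.b_un.ndirty reads back.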

I haven't found anything corresponding to /proc/sys/vm/freepages yet.

The boot-prompt howto also mentions the buff= cmd line.  Its arguments
also correspond to slots in swap_control.  One of them is also in the
swap= cmd line, and the remaining slots are unused.

Here's what I've figured out about the usage of the settable slots so
far:

    Slot                 Used in:
   MAX_PAGE_AGE      - touch_page (swapctl.h).
   PAGE_ADVANCE      - touch_page (swapctl.h).
   PAGE_DECLINE      - age_page (swapctl.h).
   PAGE_INITIAL_AGE  - EXPAND (page_alloc.c) and set_page_new (swapctl.h).
   AGE_CLUSTER_FRACT - AGE_CLUSTER_SIZE (swapctl.h).
   AGE_CLUSTER_MIN   - AGE_CLUSTER_SIZE (swapctl.h).
   PAGEOUT_WEIGHT    - swap_out (vmscan.c).
   BUFFEROUT_WEIGHT  - shrink_specific_buffers (buffer.c)
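
Going by the names in the table, the four page-aging slots look like a
simple "bump on use, decay on scan" counter.  Here's a toy model of
that idea.  It's a guess at the behaviour, not the code from swapctl.h:
struct toy_page, struct aging_params and the toy_* functions are all
made up for illustration, and the clamping rules are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four page-aging slots. */
struct aging_params {
	int max_page_age;	/* MAX_PAGE_AGE     (default 20) */
	int page_advance;	/* PAGE_ADVANCE     (default 3)  */
	int page_decline;	/* PAGE_DECLINE     (default 1)  */
	int page_initial_age;	/* PAGE_INITIAL_AGE (default 3)  */
};

struct toy_page {
	int age;
};

/* A freshly allocated page starts at the initial age. */
void toy_set_page_new(struct toy_page *page, const struct aging_params *p)
{
	assert(page != NULL && p != NULL);
	page->age = p->page_initial_age;
}

/* A used page gets older-is-better credit, capped at the maximum. */
void toy_touch_page(struct toy_page *page, const struct aging_params *p)
{
	assert(page != NULL && p != NULL);
	if (page->age < p->max_page_age - p->page_advance)
		page->age += p->page_advance;
	else
		page->age = p->max_page_age;
}

/*
 * An unused page loses credit on each scan, floored at zero.
 * Returns the new age; zero would mark the page as a swap candidate.
 */
int toy_age_page(struct toy_page *page, const struct aging_params *p)
{
	assert(page != NULL && p != NULL);
	if (page->age > p->page_decline)
		page->age -= p->page_decline;
	else
		page->age = 0;
	return page->age;
}
```

In this model, raising PAGE_ADVANCE or MAX_PAGE_AGE lets an active
process keep its pages through more scans, while raising PAGE_DECLINE
makes idle pages reach age zero sooner.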

Here's part of the call tree for the above fcns.  For each fcn I list
the places it's used; if nothing's listed, I haven't checked it yet:

touch_page:
   age_buffer      (buffer.c).
   try_to_swap_out (vmscan.c).

age_page:
   age_buffer      (buffer.c).
   init_swap_timer (buffer.c).

EXPAND:
   RMQUEUE              (page_alloc.c).

set_page_new:
   Unused?

AGE_CLUSTER_SIZE:
   swap_out         (vmscan.c)

swap_out:

shrink_specific_buffers:


RMQUEUE:
   __get_free_pages  (page_alloc.c).

__get_free_pages:
   get_kmalloc_pages (kmalloc.c).
   __get_free_page   (mm.h, define).
   __get_dma_pages   (mm.h, define).
   BufPoolAdd        (buffers.c)
   srmmu_alloc_kernel_stack (arch/sparc/srmmu.c)

This is about as far as I've gotten...

-- 
Harvey J. Stein
Berger Financial Research
abel@netvision.net.il