Windows Memory/Cache
   Manager Internals
     Sisimon Soman
Locality Theory
• If page/cluster n is accessed, there is a high probability that
  blocks near n will be accessed soon.
• All memory-based computing systems work on this
  principle.
• Windows has registry keys to configure how many
  blocks/pages to pre-fetch.
• Application-specific memory managers (databases,
  multimedia workloads, etc.) do application-aware
  pre-fetching.
Virtual Memory Manager (VMM)
• Apps feel memory is infinite – magic done by the
  VMM.
• Multiple apps run concurrently without interfering
  with each other's data.
• Each app feels the entire resource belongs to it.
• Protects OS memory from apps.
• Advanced apps may need to share memory; the
  VMM provides an easy memory-sharing
  mechanism.
VMM Continued..
• The VMM reserves a certain amount of memory
  for the kernel.
• On a 32-bit box: 2GB for the kernel and 2GB for
  user apps.
• A specific area of kernel memory, called Hyper
  Space, is reserved for process-specific data such
  as the PDE, PTEs, etc.
Segmentation and Paging
• The x86 processor supports both segmentation
  and paging.
• Paging can be enabled or disabled, but
  segmentation is always on.
• Windows uses paging.
• Since segmentation cannot be disabled, Windows
  treats the entire address space as a single segment
  (also called a 'flat segment').
Paging
• The entire physical memory is divided into equal-
  size pages (4KB on x86). These are called 'page
  frames', and the list describing them is the
  'page frame database' (PF DB).
• On x86, a PF DB entry holds a 20-bit physical
  frame number (the remaining 12 bits address the
  offset within the page).
• The PF DB also holds flags such as
  read/write underway, shared page, etc.
VMM Continued..
• The upper 2GB kernel space is common to all
  processes.
• What does that mean? Half of the PDE is common
  to all processes!
• Experiment – dump the PDEs of two processes
  and verify that half of the entries are the same.
Physical to Virtual address
             translation
• Address translation goes in both directions – when a
  PF is written to the pagefile, the VMM must update the
  corresponding PDE/PTE to record that the page is on disk.
• Done by:
  – the Memory Management Unit (MMU) of the processor;
  – the VMM, which assists the MMU.
• The VMM keeps the PDE/PTE information and hands it to
  the MMU on each process context switch.
• The MMU translates virtual addresses to physical
  addresses.
Translation Lookaside Buffer (TLB)
• Address translation is a costly operation.
• It happens frequently – on every virtual memory
  access.
• The TLB keeps a list of the most frequently used
  address translations.
• The list is tagged by process ID.
• The TLB is a generic concept – the implementation is
  architecture dependent.
• Before doing a full address translation, the MMU
  searches the TLB for the PF.
Address Translation
• In an x86 32-bit address, the 10 most significant
  bits index the PDE to select a page table. Thus a
  process's page directory has 1024 entries.
• The next 10 bits index the PTE to find the PF's
  starting address. Thus each page table also has
  1024 entries.
• The remaining 12 bits address the location within
  the PF. Thus the page size is 4KB.
What is a Zero Page
• Page frames are not tied to specific apps.
• If App1 writes sensitive data to PF1, and the VMM later pushes
  the page to the pagefile and attaches PF1 to App2, App2 can
  see that sensitive data.
• Since that would be a big security flaw, the VMM keeps a Zero
  Page list.
• Pages cannot be cleaned at the moment they are freed – that
  would be a performance problem.
• Instead, the VMM has a dedicated thread that activates when the
  system is in a low-memory situation, picks page frames from the
  free PF list, cleans them, and pushes them to the zero page list.
• The VMM allocates memory from the zero page list.
Arbitrary Thread Context
• The top layer of the driver stack gets the
  request (IRP) in the requesting process's
  context.
• A middle or lower layer driver MAY get the
  request in any thread context (e.g., at I/O
  completion) – whatever thread happens to be
  running.
• The user address in the IRP is meaningful only
  relative to the PDE/PTE of the original process
  context.
Arbitrary Thread Context
              continued..
• How do we solve the issue?
• Note that half of the PDE (the kernel area) is
  common to all processes.
• If the buffer is somehow mapped into kernel
  memory (the upper half of the PDE), it becomes
  accessible from any process context.
Mapping buffer to Kernel space
• Allocate kernel pool in the calling process's
  context and copy the user buffer into this
  kernel space.
• Memory Descriptor List (MDL) – the most
  commonly used mechanism to keep data
  accessible in kernel space.
Memory Descriptor List (MDL)
//
// I/O system definitions.
//
// Define a Memory Descriptor List (MDL)
//
// An MDL describes pages in a virtual buffer in terms of physical pages. The
// pages associated with the buffer are described in an array that is allocated
// just after the MDL header structure itself.
//
// One simply calculates the base of the array by adding one to the base
// MDL pointer:
//
//    Pages = (PPFN_NUMBER) (Mdl + 1);
//
// Notice that while in the context of the subject thread, the base virtual
// address of a buffer mapped by an MDL may be referenced using the following:
//
//    Mdl->StartVa | Mdl->ByteOffset
//

typedef struct _MDL {
    struct _MDL *Next;
    CSHORT Size;
    CSHORT MdlFlags;
    struct _EPROCESS *Process;
    PVOID MappedSystemVa;
    PVOID StartVa;
    ULONG ByteCount;
    ULONG ByteOffset;
} MDL, *PMDL;
MDL Continued..
#define MmGetSystemAddressForMdlSafe(MDL, PRIORITY)                 \
    (((MDL)->MdlFlags & (MDL_MAPPED_TO_SYSTEM_VA |                  \
                         MDL_SOURCE_IS_NONPAGED_POOL)) ?            \
        ((MDL)->MappedSystemVa) :                                   \
        (MmMapLockedPagesSpecifyCache((MDL),                        \
                                      KernelMode,                   \
                                      MmCached,                     \
                                      NULL,                         \
                                      FALSE,                        \
                                      (PRIORITY))))

#define MmGetMdlVirtualAddress(Mdl)                                 \
    ((PVOID) ((PCHAR) ((Mdl)->StartVa) + (Mdl)->ByteOffset))
Standby list
• To reclaim pages from a process, the VMM first moves them
  to the Standby list.
• The VMM keeps them there for a pre-defined number of ticks.
• If the process references a page again, the VMM removes it
  from the standby list and gives it back to the process.
• The VMM frees pages from the Standby list after the timeout
  expires.
• Pages on the standby list are neither free nor owned by a
  process.
• The VMM keeps min and max values for the free and standby
  page counts. If a count goes out of bounds, the appropriate
  events are signaled and the lists are adjusted.
Miscellaneous VMM Terms
• ZwAllocateVirtualMemory – allocates
  process-specific memory in the lower 2GB

• Paged Pool – kernel memory that can be
  paged out to disk

• Non-Paged Pool – kernel memory that is
  always resident

• Copy on write (COW) – shared pages are
  duplicated only when first written
Part 2



Cache Manager
Cache Manager concepts
• If disk heads moved at the speed of supersonic
  jets, a Cache Manager would not be required.
• Disk access is the main bottleneck that
  reduces system performance. CPUs and memory
  keep getting faster, but the disk is still in the
  stone age.
• Caching is a common concept in operating
  systems; the Unix flavor is called the 'buffer cache'.
What Cache Manager does
• Keeps a system-wide cache of frequently used
  secondary storage blocks.
• Facilitates read-ahead and write-back to
  improve overall system performance.
• With write-back, the cache manager combines
  multiple write requests and issues a single
  write request to improve performance.
  There is a risk associated with write-back.
How Cache Manager works
• The Cache Manager implements caching using
  memory mapping.
• The concept is similar to an app using a
  memory-mapped file.
• CreateFile(…, dwFlagsAndAttributes, …)
• Passing FILE_FLAG_NO_BUFFERING in
  dwFlagsAndAttributes means "I don't want the
  cache manager."
How Cache Manager works..
• The Cache Manager reserves an area in the upper 2GB
  (x86 platform) system space.
• The number of pages the Cache Manager reserves
  adjusts according to system memory demand.
• If the system runs lots of I/O-intensive tasks, it
  dynamically grows the cache.
• If the system is in a low-memory situation, it shrinks
  the buffer cache.
How cached read operation works
1. The app issues a cached read; the request crosses from user
   space to kernel space and reaches the file system.
2. The file system gets the pages from the Cache Manager (CM).
3. The Cache Manager performs the memory mapping.
4. Touching the mapped pages causes a page fault, which is
   handled by the VMM.
5. The VMM gets the blocks from disk through the disk stack
   (SCSI/Fibre Channel).
How cached write operation works
1. The app issues a cached write; the request crosses from user
   space to kernel space and reaches the file system.
2. The file system copies the pages to the Cache Manager (CM).
3. The Cache Manager performs the memory mapping and copies
   the data into VMM pages.
4. The modified page writer thread of the VMM writes the dirty
   pages to disk later.
5. The blocks are written to disk through the disk stack
   (SCSI/Fibre Channel).
Questions?
