Memory Manager in Windows

Published on June 2016

The Memory Manager in Windows Server 2003 and Windows Vista

Landy Wang
Software Design Engineer
Windows Kernel Team
Microsoft Corporation

Outline
Memory Manager (MM) improvements in Windows Server 2003 SP1
  64-bit Windows features and enhancements
  NUMA and large page support added
  Performance enhancements
  Support for No Execute (NX) capability

© 2005 Microsoft Corporation

Outline (cont'd)
Memory Manager improvements planned for Windows Vista
  Dynamic system address space
  Kernel page table pages allocated on demand
  Support for very large registries
  NUMA and large page support enhancements
  Advanced video model support
  I/O and section access improvements
  Performance improvements
  Terminal Server improvements
  Robustness and diagnosability improvements

Server 2003 SP1 - Windows 64

Windows 64-bit memory
  8TB user address space
  8TB kernel address space
  128GB pools
  128GB system page table entries (PTEs)
  1TB system cache

Support for x64 platform added
4GB virtual address space added for 32-bit large address space aware applications
  Further increases performance of WOW layer on both Itanium and x64 systems

Server 2003 SP1 - NUMA & Large Page Support

Large page support added for user images and pagefile-backed sections
Large pages now also used in 32-bit, even when booted with /3GB switch, for
  Kernel
  Page Frame Number (PFN) database
  Initial non-paged pool

Prior large page support (added in Server 2003) was for the following
  User private memory
  Device driver image mappings
  Kernel, when not booted with /3GB switch

Server 2003 SP1 - NUMA & Large Page Support

Pages zeroed in parallel and in a node-aware fashion during boot up
  Reduces boot time on large NUMA systems

Physical pages initially consumed in top-down order, instead of bottom-up
  Keeps more pages below 4GB available for drivers that require it
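
The effect of top-down consumption can be shown with a small user-mode model. This is only a sketch: the free list, the `allocate` helper, and the page counts are hypothetical and do not reflect how the Memory Manager actually tracks free pages.

```python
PAGE_SIZE = 4096
FOUR_GB_PFN = (4 * 1024**3) // PAGE_SIZE  # first page frame number at or above 4GB


def allocate(free_pfns, count, top_down):
    """Remove `count` page frame numbers from the free set and return them."""
    if count == 0:
        return []
    ordered = sorted(free_pfns)
    # Top-down takes the highest frames first; bottom-up takes the lowest.
    taken = ordered[-count:] if top_down else ordered[:count]
    for pfn in taken:
        free_pfns.remove(pfn)
    return taken


def low_pages_left(free_pfns):
    """Count free pages that are still below the 4GB boundary."""
    return sum(1 for pfn in free_pfns if pfn < FOUR_GB_PFN)
```

With 1000 free pages on each side of the 4GB boundary, allocating 500 pages top-down leaves all 1000 low pages free, while bottom-up leaves only 500 for drivers that need memory below 4GB.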

Server 2003 SP1 - Performance Enhancements

Working set management performance increases, especially in
  Areas of large shared memory and when booted with /3GB switch

Premature self-trimming and linear hash table walks eliminated
  Major perf increases for apps like Exchange & SAP

Server 2003 SP1 - Performance Enhancements

Pool tagging paths parallelized
  Introduced shared acquire mode for spinlocks
  Employed for tag table updates

Expand hash table for tagging large pages
  When we detect searches are occurring, instead of waiting for the table to be entirely filled

Overlapped asynchronous flushing for user requests to maximize I/O throughput
Pagefiles zeroed in parallel instead of serially
  Faster shutdown when "zero my pagefile" is set

Server 2003 SP1 - Performance Enhancements

Per-process working set lock used to synchronize PTE updates and working set list changes to an address space
  System, session or process
  This lock converted from a mutex to a pushlock
    Pushlocks support both shared and exclusive acquire modes
    Mutexes support only exclusive acquisitions

In conjunction with 2-byte interlocked operations this allows parallelization of many operations
  MmProbeAndLockPages
    Completely remove the PFN lock acquire from this very hot routine
  MmUnlockPages
  VirtualQuery
  etc.
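
The difference between the two acquire modes can be modeled in a few lines. This is a toy, user-mode sketch and not the kernel pushlock: the class name `SharedExclusiveLock` and its non-blocking "try" methods are hypothetical and exist only to show which acquisitions can coexist.

```python
import threading


class SharedExclusiveLock:
    """Toy lock with shared and exclusive acquire modes (non-blocking calls only)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the two fields below
        self._shared_holders = 0
        self._exclusive_held = False

    def try_acquire_shared(self):
        # Any number of shared holders may coexist, unless a writer holds the lock.
        with self._guard:
            if self._exclusive_held:
                return False
            self._shared_holders += 1
            return True

    def try_acquire_exclusive(self):
        # An exclusive holder needs the lock completely free.
        with self._guard:
            if self._exclusive_held or self._shared_holders > 0:
                return False
            self._exclusive_held = True
            return True

    def release_shared(self):
        with self._guard:
            assert self._shared_holders > 0
            self._shared_holders -= 1

    def release_exclusive(self):
        with self._guard:
            assert self._exclusive_held
            self._exclusive_held = False
```

A mutex behaves like the exclusive mode only, so two readers would serialize. With a shared mode, read-mostly paths can proceed in parallel and only real modifications need exclusive access.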

Server 2003 SP1 - Performance Enhancements

Major PFN lock reduction to improve scalability
  Reducing time held
  Replacing acquisitions with lock-free or alternative lock sequences in many places & APIs

Translation look-aside buffer (TLB) optimizations

Server 2003 SP1 - Other MM Enhancements

Support for no execute (NX) capability
New Win32 SetThreadStackGuarantee API
  Allows user applications & the CLR to specify guaranteed stack space requirements
  Requirements honored even in low resource scenarios

Support for hot-patching a running system
  Patch system without reboot to reduce down time
  Backported to Windows XP SP2

Windows Vista - Dynamic Address Space
System virtual address (VA) space allocated on-demand
  Instead of at boot time based on registry & configuration information
  Region sizes bounded only by VA limitations
  Applies to non-paged, paged, session space, mapped views, etc.

Kernel page tables allocated on demand
  No longer preallocated at system boot, saves
    1.5MB on x86 systems
    3MB on PAE systems
    16MB to 2.5GB on 64-bit machines

Boot with very large registries on 32-bit machines
  With and without /3GB switch
  Important for large multipath LUN machines
  MM locates registry VA space used by boot loader & reuses it as dynamic kernel virtual address space

Key Benefits of Dynamic Address Space
No registry editing & reboots to reconfigure systems due to resource imbalances
Maximum resources available in wide range of scenarios, w/ no human intervention
  Desktop heap exhaustion
  Terminal Server maximum scaling
  Large video clients
  /3GB SQL and Exchange machines
  HTTP servers, NFS servers, etc

Features enabled w/o reboot, yet have no cost if not used
64-bit systems grow to maximum limit regardless of underlying physical configuration
  128GB paged pool, nonpaged pool
  1TB system cache/system PTEs/special pool
  128GB session pool
  128GB session views (desktop heaps), etc

Windows Vista - Planned Enhancements for NUMA, Large System, Large Page Support
Initial nonpaged pool now NUMA aware, with separate VA ranges for each node
Per-node look-asides for full pages
Page table allocation for system PTEs, the system cache, etc. distributed across nodes
  More even locality
  Avoids exhausting free pages from the boot node

NUMA-related APIs for device drivers
  MmAllocateContiguousMemorySpecifyCacheNode
  MmAllocatePagesForMdlEx
  Default if no node is specified has been changed
    From current processor to the thread's ideal processor

Zeroing of pages for these APIs bounds number of threads more intelligently

Windows Vista - Planned Enhancements for NUMA, Large System, Large Page Support
Win32® APIs that specify nodes for allocations & mapped views on per VAD & per section basis
  VirtualAllocExNuma
  CreateFileMappingExNuma
  MapViewOfFileExNuma

Scalable query
  QueryWorkingSetEx

Higher perf for very physically sparse machines
  Example: Hewlett-Packard Superdome
    1TB gaps between chunks of physical memory

PFN database & initial nonpaged pool always mapped with large pages regardless of physical memory sparseness

Windows Vista - Planned Enhancements for NUMA, Large System, Large Page Support
/3GB mode on 32-bit systems supports up to 64GB of RAM
  Booting in /3GB mode on 32-bit systems now supports up to 64GB of RAM instead of just 16GB
  Booting without /3GB on 32-bit systems continues to support up to 128GB of RAM

Windows Vista - Planned Enhancements for NUMA, Large System, Large Page Support
Much faster large page allocations in kernel & user
Support for cache-aligned pool allocation directives
Data structures describing non-paged pool free list converted from linked list to bitmap
  Reduced lock contention by over 50%
  Bitmaps can be searched opportunistically lock-free
  Costly combining of adjacent allocations on free no longer necessary
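
Why a bitmap removes the need to combine adjacent blocks on free can be seen in a minimal sketch. The `BitmapPool` class below is a hypothetical single-threaded model with one bit per page; it does not attempt the lock-free search the slide mentions and is not the actual pool implementation.

```python
class BitmapPool:
    """Tracks the pages of a fixed-size pool with one bit per page (1 = in use)."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.bits = 0  # a Python int serves as an arbitrary-length bitmap

    def allocate(self, pages):
        """Return the index of the first run of `pages` free pages, or None."""
        mask = (1 << pages) - 1
        for start in range(self.page_count - pages + 1):
            if (self.bits >> start) & mask == 0:
                self.bits |= mask << start
                return start
        return None

    def free(self, start, pages):
        """Clear the bits for the run; neighbouring free runs merge implicitly."""
        self.bits &= ~(((1 << pages) - 1) << start)
```

With a linked free list, freeing two neighbouring blocks requires finding and splicing list entries to form one larger block. Here, freeing simply clears bits, and a later search sees one contiguous free run without any extra work.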

Windows Vista - New Video Model Support
Dramatically different video architecture in Windows Vista
  More fully exploits modern GPUs & virtual memory

MM provides new mapping type
  Rotate virtual address descriptors (VADs)
    Allow video drivers to quickly switch user views from regular application memory into cached, non-cached, write combined AGP or video RAM mappings
    Allows video architecture to use GPU to rotate unneeded clients in and out on demand

First time Windows-based OS has supported fully pageable mappings w/ arbitrary cache attributes

Windows Vista - I/O Section Access Improvements
Pervasive prefetch-style clustering for all types of page faults and system cache read ahead
Major benefits over previous clustering
  Infinite size read ahead instead of 64k max
  Dummy page usage
    So a single large I/O is always issued regardless of valid pages encountered in the cluster
  Pages for the I/O are put in transition (not valid)
  No VA space is required
    If the pages are not subsequently referenced, no working set trim and TLB flush is needed either

Further emphasizes that driver writers must be aware that MDL pages can have their contents change!
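
The dummy page idea can be sketched as follows. This is an illustrative model only: `DUMMY_FRAME`, `split_into_runs`, and `build_single_read` are hypothetical names, and a real MDL holds page frame numbers rather than Python objects.

```python
DUMMY_FRAME = "dummy"  # hypothetical stand-in for the shared dummy page


def split_into_runs(cluster, valid):
    """Without a dummy page: skip valid pages, so the cluster breaks into several reads."""
    runs, current = [], []
    for page in cluster:
        if page in valid:
            if current:
                runs.append(current)
                current = []
        else:
            current.append(page)
    if current:
        runs.append(current)
    return runs


def build_single_read(cluster, valid):
    """With a dummy page: one read covers the whole cluster. Slots for pages that are
    already valid point at the dummy frame, so the data read into them is discarded."""
    return [DUMMY_FRAME if page in valid else ("frame-for", page) for page in cluster]
```

For an 8-page cluster in which pages 2 and 5 are already valid, the first approach issues three separate reads and the second issues one. In this model the same dummy frame can appear more than once in the page list and is overwritten by every read, which is consistent with the slide's warning that MDL page contents can change.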

Windows Vista - I/O Section Access Improvements
Significant changes in pagefile writing
  Larger clusters up to 4GB
  Align near neighbors
  Sort by virtual address (VA)
  Reduced fragmentation
  Improved reads

Cache manager read ahead size limitations in thread structure removed
Improved synchronization between cache manager and memory manager data flushing to maximize filesystem/disk throughput and efficiency
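
One plausible reading of the pagefile writing changes on this slide is that sorting modified pages by VA before writing places virtual neighbours in adjacent pagefile slots, so a later clustered read can fetch them in one I/O. The sketch below illustrates only that reading; the function names and the slot numbering are assumptions, not the actual modified page writer.

```python
def plan_pagefile_write(modified_vas, first_free_slot):
    """Sort modified pages by VA and assign them consecutive pagefile slots."""
    return {va: first_free_slot + i for i, va in enumerate(sorted(modified_vas))}


def plan_in_arrival_order(modified_vas, first_free_slot):
    """Baseline for comparison: assign slots in the order pages were queued."""
    return {va: first_free_slot + i for i, va in enumerate(modified_vas)}


def slots_are_contiguous(plan, vas):
    """True if the given pages occupy one unbroken run of pagefile slots."""
    slots = sorted(plan[va] for va in vas)
    return all(b - a == 1 for a, b in zip(slots, slots[1:]))
```

With pages queued in the order 0x1000, 0x7000, 0x2000, 0x9000, 0x3000, the VA-sorted plan puts 0x1000 through 0x3000 in three adjacent slots, while the arrival-order plan scatters them.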

Windows Vista - I/O Section Access Improvements
Mapped file writing and file flushing performance increases
  Support for writes of any size up to 4GB instead of previous 64k limit per write
  Multiple asynchronous flushes can be issued, both internally and by the caller, to satisfy a single call

Pagefile fragmentation improvements
  On dirty bit faults, we use interlocked queuing operation to free the pagefile space of the corresponding page
    Avoids PFN lock acquisitions
    Reduces needless pagefile fragmentation

Windows Vista - I/O Section Access Improvements
Elimination of pagefile writes and potential subsequent re-reads of completely zero pages
  Check pages at trim time to see if they are all zero
  Optimization used to make this nearly free
    User virtual address used to check whether the first and last ULONG_PTR are zero
    If they both are, then after the page is trimmed and the TLB invalidated, a kernel mapping is used to make the final check of the entire page
    Avoids needless scans & TLB flushes
  We've measured over 90% success rate with this algorithm
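
The two-stage check described above can be sketched in user mode. The page size, the 8-byte word standing in for ULONG_PTR, and the function names are assumptions for illustration; the real check runs against user and kernel mappings of the page, which this model does not represent.

```python
PAGE_SIZE = 4096
WORD_SIZE = 8  # assumed ULONG_PTR size on a 64-bit system
ZERO_WORD = bytes(WORD_SIZE)


def looks_zero(page):
    """Cheap hint: are the first and last words of the page both zero?"""
    return page[:WORD_SIZE] == ZERO_WORD and page[-WORD_SIZE:] == ZERO_WORD


def is_all_zero(page):
    """Expensive final check: scan every byte of the page."""
    return not any(page)


def needs_pagefile_write(page):
    """Only pages that pass the cheap hint pay for the full scan."""
    if not looks_zero(page):
        return True
    return not is_all_zero(page)
```

Most non-zero pages are rejected by the two word comparisons, so the full scan runs only for pages that are likely to be zero anyway.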

Windows Vista - I/O Section Access Improvements
Access to large section performance increases
  A subsection is the name of the data structure used to describe on-disk file spans for sections
  The subsection structure was converted
    From a singly linked list (i.e., linear walk required)
    To a balanced AVL tree
  Enables huge performance gain for sections mapping large files
    User mappings & flushes, system cache mappings, flushes & purges, section-based backups, etc

Mapped page writer does flushing based on a sweep hand
  Data is written out much sooner than the prior 5 minute "flush everything" model
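
The gain from the subsection change on this slide comes from replacing an O(n) walk with an O(log n) search. The sketch below uses binary search over a sorted list as a stand-in for the AVL tree, since both give logarithmic lookups; the `(start, length)` tuples are a hypothetical simplification of a subsection.

```python
import bisect


def find_linear(subsections, offset):
    """Walk (start, length) spans in order, as a singly linked list requires.
    Returns the matching span and the number of entries visited."""
    steps = 0
    for start, length in subsections:
        steps += 1
        if start <= offset < start + length:
            return (start, length), steps
    return None, steps


def find_sorted(starts, subsections, offset):
    """Binary search over sorted start offsets, O(log n) like a balanced tree lookup."""
    index = bisect.bisect_right(starts, offset) - 1
    if index >= 0:
        start, length = subsections[index]
        if offset < start + length:
            return (start, length)
    return None
```

For a file described by 10,000 subsections, finding an offset in the last one visits all 10,000 entries in the linear walk, while the binary search needs about 14 comparisons.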

Windows Vista - I/O Section Access Improvements
Dependencies between modified writer & mapped writer removed to
  Increase parallelism
  Reduce filesystem deadlock rules
  Provide the cache manager with a way to influence which portions of files get written first
    To optimize disk seek as well as avoiding valid data length extension costs

Windows Vista - I/O Section Access Improvements
Core support for "Superfetch"
  Enables significantly faster app launch by deciding which pages should be prioritized
  Provides mechanisms to pre-fetch pages and prevents premature cannibalization
  Includes support for
    Per page priorities
    Access bit tracing
    Private page pre-fetching
    Section (including pagefile-backed) pre-fetching
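
How per page priorities could prevent premature cannibalization is sketched below. This is a guess at the general shape, not the actual design: the `PrioritizedStandby` class, the number of priority levels, and the oldest-first policy within a level are all assumptions made for illustration.

```python
from collections import deque

PRIORITY_LEVELS = 8  # assumption for this sketch; the slide does not give a count


class PrioritizedStandby:
    """Cached pages grouped by priority; low-priority pages are reused first."""

    def __init__(self):
        self.lists = [deque() for _ in range(PRIORITY_LEVELS)]

    def add(self, page, priority):
        self.lists[priority].append(page)

    def repurpose(self):
        """Take a page from the lowest-priority non-empty list, oldest first."""
        for pages in self.lists:
            if pages:
                return pages.popleft()
        return None
```

Pages that matter for application launch can be kept at a high priority, so memory pressure consumes less important cached pages before touching them.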

Windows Vista - Fast S4 Support
Hibernation converted to use memory management mirroring facilities
Hibernation time reduced by 2x, with 50% smaller hiber-file
Resume time reductions

Windows Vista - Internal Data Structure and Algorithmic Performance Enhancements
Constant PFN lock time reduction always ongoing, has included areas like
  User address space trimming and deletion
  MEM_RESET
  Page allocations
    The PFN sharecount now uses interlocked updates instead of requiring the PFN lock, etc
  Page faults
  Modified writes
  Page color generation
  MDL construction for fault I/Os, and so on

Translation look-aside buffer (TLB) optimizations

Windows Vista - Internal Data Structure and Algorithmic Performance Enhancements
The per-process address space lock used to synchronize creation/deletion/changes to user address spaces
  This lock was converted from a mutex to a pushlock
    Pushlocks support both shared and exclusive acquire modes
    Mutexes support only exclusive acquisitions
  Allowed parallelization of many operations like VirtualAlloc, etc

VirtualAlloc support has been revamped to reduce
  Conventional (non-AWE) allocations by over 30%
  AWE allocations by over 2500% (not a typo)

Address Windowing Extension (AWE) non-zeroed allocations are >10x faster than in SP1
  Can now therefore be used for HTTP responses, for example

Windows Vista - Internal Data Structure and Algorithmic Performance Enhancements
PFN database contains information about all physical memory in the machine
In the past, whenever a new page was needed:
  The PFN spinlock was acquired
  New page removed from appropriate list chained through PFN database

This has been improved by adding a zero and free page SLIST for every NUMA node and page color
Now obtain the page without needing the PFN lock in many instances where we need a single page
  Demand zero faults, copy on write faults, etc
  For example, the fault processing path length is cut in half
Alleviates pressure on both the working set pushlock & PFN lock
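
The fast path described above can be modeled as a set of small per-node, per-color stacks in front of a lock-protected global list. This sketch is single-threaded and only counts how often the slow path would take the lock; `PageAllocator` and its fields are hypothetical names, and a Python `deque` merely stands in for an interlocked SLIST.

```python
from collections import deque


class PageAllocator:
    """Per-(node, color) free lists in front of a global, lock-protected list."""

    def __init__(self, nodes, colors):
        # One LIFO list per (node, color), standing in for the per-node, per-color SLISTs.
        self.fast_lists = {(n, c): deque() for n in range(nodes) for c in range(colors)}
        self.global_free = deque()      # stand-in for lists chained through the PFN database
        self.pfn_lock_acquisitions = 0  # counts how often the slow path is taken

    def seed_fast_list(self, node, color, pages):
        self.fast_lists[(node, color)].extend(pages)

    def allocate(self, node, color):
        try:
            # Fast path: pop a page without touching the global lock.
            return self.fast_lists[(node, color)].pop()
        except IndexError:
            pass
        # Slow path: model taking the PFN lock and using the global list.
        self.pfn_lock_acquisitions += 1
        return self.global_free.pop() if self.global_free else None
```

As long as the list for the requested node and color has pages, single-page requests such as demand zero faults never reach the slow path in this model.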

Windows Vista - Terminal Services Improvements
Added Terminal Server session objects
  Enables various components to have secure session IDs and implement compartment IDs, for example

Major overhaul of Terminal Server global-per-session image support
  Eliminated multiple image control areas
    To provide single image cache & fix flush/purge/truncate races
  Only the shared subsections themselves are now per-session, instead of the entire image
  Shared subsections use AVL tree instead of a linked list, for faster searches

Support for hot-patching of session-space drivers
64-bit Windows uses demand zero pages instead of pool for WOW64 page table bitmaps

Windows Vista - Additional Robustness and Diagnosability
Capability to mark system cache views as read only
  Used by Registry to protect views from inadvertent driver corruption

Reduced data loss in the face of crashes
  Flush all modified data to its backing store (local & remote) if we are going to bugcheck due to a failed inpage
  Only failed inpages of kernel and/or drivers are fatal
    Failed inpages of user process code/data merely result in an inpage exception being handed to the application

Commit thresholds now reflected in global named events
  Apps can use this to monitor the system

Windows Vista - Additional Robustness and Diagnosability
.pagein debugger support for kernel/driver addresses added
  Allows for viewing memory addresses which have been paged out to disk when debugging crashes

Call to Action
Consider these significant Memory Manager enhancements as you develop drivers for Windows Server 2003 and Windows Vista
Use new APIs when available in Windows Vista

© 2005 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
