Memory Management
Process Isolation
The next question is, “How does Windows NT keep processes from seeing each other’s address space?” Again, the mechanism is simple. Windows NT maintains a separate page table directory for each process and switches to the directory of whichever process is executing. Because the page table directories of different processes point to different page tables, these page tables point to different physical pages, and only one directory is active at a time, no process can see any other process’s memory. When Windows NT switches the execution context, it also sets the CR3 register to point to the appropriate page table directory. The kernel-mode address space is mapped for all processes, and every page table directory has entries for the kernel address space. However, another feature of the 80386 prevents user-mode code from accessing the kernel address space: all kernel pages are marked as supervisor pages, so user-mode code cannot access them.
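The isolation rules above can be illustrated with a toy model of two-level address translation. This is only a sketch under stated assumptions: the names ToyPageTable, ToyPageDirectory, and toy_translate are hypothetical, host pointers stand in for the physical addresses of page tables, and the cr3 parameter stands in for the CR3 register. The bit positions used (bit 0 for present, bit 2 for user access) follow the 80386 entry format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_PRESENT 0x001u      /* bit 0: entry is valid */
#define ENTRY_USER    0x004u      /* bit 2: user-mode access allowed */
#define FRAME_MASK    0xFFFFF000u /* upper 20 bits: physical page address */

/* Hypothetical toy page table: 1024 entries of (frame address | flags). */
typedef struct {
    uint32_t pte[1024];
} ToyPageTable;

/* Hypothetical toy page directory: per-entry flags plus a host pointer
 * that stands in for the physical address of the page table. */
typedef struct {
    uint32_t flags[1024];
    const ToyPageTable *table[1024];
} ToyPageDirectory;

/* Translates va using the directory that "CR3" currently points to.
 * Returns 1 and stores the physical address in *pa if the access is
 * allowed; returns 0 if the page is unmapped or is a supervisor page
 * touched from user mode. */
static int toy_translate(const ToyPageDirectory *cr3, uint32_t va,
                         int user_mode, uint32_t *pa)
{
    uint32_t dir_index = va >> 22;              /* top 10 bits */
    uint32_t table_index = (va >> 12) & 0x3FFu; /* next 10 bits */
    uint32_t pte;

    assert(cr3 != NULL && pa != NULL);

    if (!(cr3->flags[dir_index] & ENTRY_PRESENT) ||
        cr3->table[dir_index] == NULL) {
        return 0;
    }
    pte = cr3->table[dir_index]->pte[table_index];
    if (!(pte & ENTRY_PRESENT)) {
        return 0;
    }
    /* User-mode code needs the user bit at both levels. */
    if (user_mode && (!(cr3->flags[dir_index] & ENTRY_USER) ||
                      !(pte & ENTRY_USER))) {
        return 0;
    }
    *pa = (pte & FRAME_MASK) | (va & 0xFFFu);
    return 1;
}
```

Passing a different directory as cr3 models a context switch: the same virtual address then resolves through a different set of page tables, while a kernel page table referenced from both directories stays reachable only when user_mode is 0.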
Code Page Sharing in DLLs
To share the code pages of a DLL, Windows NT maps the corresponding page table entries of all processes sharing the DLL onto the same set of physical pages. For example, if process A loads X.DLL at address xxxx and process B loads the same X.DLL at address yyyy, then the PTE for xxxx in process A’s page table and the PTE for yyyy in process B’s page table point to the same physical page. Figure 4-2 shows two processes sharing a page via the same page table entries. The DLL pages are marked as read-only so that a process inadvertently attempting to write to this area will not cause other processes to crash.
Note: This is guaranteed to be the case when xxxx==yyyy. However, if xxxx!=yyyy, the physical page might not be the same. We discuss the reason for this later in the chapter.
Kernel address space is shared using a similar technique. Because the entire kernel space is common to all processes, Windows NT can share the page tables directly. Figure 4-3 shows how processes share physical pages by using the same page tables. Consequently, the upper half of the page table directory entries is the same for all processes.
Listing 4-1 shows a sample program that demonstrates this.
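As a rough illustration of what sharing a DLL code page means at the PTE level, the following sketch compares two page table entries taken from two different processes. The helper names pte_frame and ptes_share_readonly_page are hypothetical, and the entry values in the usage note are made up; only the bit layout (bit 0 present, bit 1 writable, upper 20 bits physical page address) comes from the x86 entry format.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT  0x001u      /* bit 0: page is mapped */
#define PTE_WRITABLE 0x002u      /* bit 1: writes allowed */
#define PTE_FRAME    0xFFFFF000u /* upper 20 bits: physical page address */

/* Returns the physical page address stored in a present PTE. */
static uint32_t pte_frame(uint32_t pte)
{
    assert(pte & PTE_PRESENT);
    return pte & PTE_FRAME;
}

/* Returns 1 if both PTEs are present, both are read-only, and both
 * point to the same physical page; otherwise returns 0. */
static int ptes_share_readonly_page(uint32_t pte_a, uint32_t pte_b)
{
    if (!(pte_a & PTE_PRESENT) || !(pte_b & PTE_PRESENT)) {
        return 0;
    }
    if ((pte_a | pte_b) & PTE_WRITABLE) {
        return 0;
    }
    return pte_frame(pte_a) == pte_frame(pte_b);
}
```

For example, ptes_share_readonly_page(0x01234005, 0x01234025) returns 1 because the two entries differ only in a flag bit, whereas entries with different upper 20 bits, or with the writable bit set, return 0.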
Listing 4-1: SHOWDIR.C
/* Should be compiled in release mode to run properly */
#include <windows.h>
#include <stdio.h>
#include "gate.h"
/* Global array to hold the page directory */
DWORD PageDirectory[1024];
This initial portion of the SHOWDIR.C file contains, apart from the header inclusion, the global definition for the array to hold the page directory. The inclusion of the header file GATE.H is of interest. This header file prototypes the functions for using the callgate mechanism. Using the callgate mechanism, you can execute your code in the kernel mode without writing a new device driver.
XREF: We discuss the callgate mechanism in Chapter 10.
For this sample program, we need this mechanism because the page directory is not accessible to user-mode code. For now, it’s sufficient to know that the mechanism allows a function inside a normal executable to be executed in kernel mode. Turning to the definition of the page directory, we have already described that each directory entry is 4 bytes and that a page directory contains 1024 entries. Hence, PageDirectory is an array of 1024 DWORDs, each of which represents the corresponding directory entry.
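The sizes mentioned above can be checked with a little arithmetic. The sketch below assumes the 4 KB page size of the 80386; the constant and function names are hypothetical and are not part of SHOWDIR.C.

```c
#include <assert.h>
#include <stdint.h>

enum {
    PD_ENTRIES = 1024,     /* entries in one page directory */
    PD_ENTRY_BYTES = 4,    /* size of one entry (a DWORD) */
    PT_ENTRIES = 1024,     /* entries in one page table */
    PAGE_BYTES = 4096      /* assumed 80386 page size */
};

/* Size of the whole page directory in bytes. */
static uint32_t page_directory_bytes(void)
{
    return (uint32_t)PD_ENTRIES * PD_ENTRY_BYTES;
}

/* Address range covered by one directory entry: one page table
 * of 1024 pages. */
static uint64_t bytes_per_directory_entry(void)
{
    return (uint64_t)PT_ENTRIES * PAGE_BYTES;
}

/* Address range covered by the whole directory. */
static uint64_t bytes_covered_by_directory(void)
{
    return (uint64_t)PD_ENTRIES * bytes_per_directory_entry();
}

/* Sanity checks on the layout described in the text. */
static void check_directory_layout(void)
{
    assert(page_directory_bytes() == PAGE_BYTES);          /* fits one page */
    assert(bytes_per_directory_entry() == (1u << 22));     /* 4 MB */
    assert(bytes_covered_by_directory() == (1ull << 32));  /* 4 GB */
}
```

So the directory occupies exactly one page, each entry describes a 4 MB slice of the address space, and the 1024 entries together span the full 4 GB.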
/* C function called from the assembly stub */
void _stdcall CFuncGetPageDirectory()
{
    DWORD *PageDir=(DWORD *)0xC0300000;
    int i=0;
    for (i=0; i<1024; i++) {
        PageDirectory[i] = PageDir[i];
    }
}
CFuncGetPageDirectory() is the function that is executed in kernel mode using the callgate mechanism. This function simply makes a copy of the page directory in the user-mode memory area so that the user-mode parts of the program can access it. The page directory is mapped at virtual address 0xC0300000 in every process’s address space. This address is not accessible from user mode. The CFuncGetPageDirectory() function copies 1024 DWORDs from address 0xC0300000 to the global PageDirectory variable, which is accessible to the user-mode code in the program.
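One way to see where the 0xC0300000 address could come from is the following sketch. It rests on an assumption the text above does not state: that all page tables of a process are mapped as one contiguous array starting at 0xC0000000, which is the commonly described layout for 32-bit non-PAE Windows NT. Under that assumption, the page table that maps the page-table region itself is the page directory. The names PAGE_TABLE_BASE, pte_address_of, and pde_address_of are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: page tables are mapped starting at this virtual address. */
#define PAGE_TABLE_BASE 0xC0000000u

/* Virtual address of the PTE that maps va: one 4-byte entry per
 * 4 KB page, indexed by the upper 20 bits of va. */
static uint32_t pte_address_of(uint32_t va)
{
    return PAGE_TABLE_BASE + (va >> 12) * 4u;
}

/* Virtual address of the page directory entry that maps va. The page
 * directory is the "page table" that maps the page-table region, so
 * it starts at the PTE address of PAGE_TABLE_BASE. */
static uint32_t pde_address_of(uint32_t va)
{
    uint32_t directory_base = pte_address_of(PAGE_TABLE_BASE);
    uint32_t address = directory_base + (va >> 22) * 4u;

    /* The result must stay inside the one-page directory. */
    assert(address >= directory_base && address < directory_base + 4096u);
    return address;
}
```

With this layout, pte_address_of(0xC0000000) evaluates to 0xC0300000, which matches the address that CFuncGetPageDirectory() reads from.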
/* Displays the contents of page directory. Starting
 * virtual address represented by the page directory
 * entry is shown followed by the physical page
 * address of the page table
 */
void DisplayPageDirectory()
{
    int i;
    int ctr=0;
    printf("Page directory for the process, pid=%lx\n",
           GetCurrentProcessId());
    for (i=0; i<1024; i++) {
        /* Bit 0 set means the entry is valid */
        if (PageDirectory[i]&0x01) {
            if ((ctr%3)==0) {
                printf("\n");
            }
            /* Cast before shifting so that indexes of 512
             * and above do not overflow a signed int */
            printf("%08lx:%08lx ", (DWORD)i << 22,
                   PageDirectory[i] & 0xFFFFF000);
            ctr++;
        }
    }
    printf("\n");
}
The DisplayPageDirectory() function operates in user mode and prints the PageDirectory array that is initialized by the CFuncGetPageDirectory() function. The function checks the Least Significant Bit (LSB) of each entry. A page directory entry is valid only if the LSB is set, and the function skips invalid entries. The function prints three entries on every line; in other words, it prints a newline character before every third entry.
Each directory entry is printed as the logical address followed by the address of the corresponding page table as obtained from the page directory. As described earlier, the first 10 bits (or the 10 Most Significant Bits [MSB]) of the logical address are used as an index into the page directory. In other words, the directory entry at index i represents the logical addresses that have i as their first 10 bits. The function prints the base of the logical address range for each directory entry. The base address (that is, the lowest address in the range) has the last 22 bits (or 22 LSBs) as zeros. The function obtains this base address by shifting i into the first 10 bits. The address of the page table corresponding to the logical address is stored in the first 20 bits (or 20 MSBs) of the page directory entry. The 12 LSBs are the flags for the entry. The function calculates the page table address by masking off the flag bits.
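The bit manipulation described above can be isolated into a few small helpers. This is a minimal sketch in portable C; the function names are hypothetical and do not appear in SHOWDIR.C, and uint32_t is used in place of DWORD.

```c
#include <assert.h>
#include <stdint.h>

/* Base logical address covered by the directory entry at index:
 * the index becomes the 10 most significant bits. */
static uint32_t pde_base_va(unsigned index)
{
    assert(index < 1024u);
    return (uint32_t)index << 22;
}

/* Directory index for a logical address: its 10 most significant bits. */
static unsigned pde_index_of(uint32_t va)
{
    return (unsigned)(va >> 22);
}

/* An entry is valid only if its least significant bit is set. */
static int pde_is_valid(uint32_t pde)
{
    return (pde & 0x01u) != 0;
}

/* Physical address of the page table: the 20 most significant bits. */
static uint32_t pde_page_table_address(uint32_t pde)
{
    return pde & 0xFFFFF000u;
}

/* Flag bits of the entry: the 12 least significant bits. */
static uint32_t pde_flags(uint32_t pde)
{
    return pde & 0x00000FFFu;
}
```

For instance, index 0x1DF corresponds to base address 0x77C00000, and a made-up entry value of 0x0076B067 splits into page table address 0x0076B000 and flags 0x067.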
int main()
{
    WORD CallGateSelector;
    int rc;
    /* 48-bit far pointer: 32-bit offset, then 16-bit selector */
    static short farcall[3];
    /* Assembly stub that is called through callgate */
    extern void GetPageDirectory(void);
    /* Creates a callgate to read the page directory
     * from Ring 3 */
    rc = CreateCallGate(GetPageDirectory, 0,
                        &CallGateSelector);
    if (rc == SUCCESS) {
        farcall[2] = CallGateSelector;
        _asm {
            call fword ptr [farcall]
        }
        DisplayPageDirectory();
        getchar();
        /* Releases the callgate */
        rc=FreeCallGate(CallGateSelector);
        if (rc!=SUCCESS) {
            printf("FreeCallGate failed, "
                   "CallGateSelector=%x, rc=%x\n",
                   CallGateSelector, rc);
        }
    } else {
        printf("CreateCallGate failed, rc=%x\n", rc);
    }
    return 0;
}
The main() function starts by creating a callgate that sets up the GetPageDirectory() function to be executed in kernel mode. The GetPageDirectory() function is written in assembly language and is part of the RING0.ASM file. The CreateCallGate() function, which the program uses to create the callgate, is provided by CALLGATE.DLL. The function returns a callgate selector.
XREF: The mechanism of calling the desired function through callgate is explained in Chapter 10.
We’ll quickly mention a few important points here. The callgate selector returned by CreateCallGate() is a segment selector for the given function: in this case, GetPageDirectory(). To invoke the function pointed to by the callgate selector, you need to issue a far call instruction. The far call instruction expects a 16-bit segment selector and a 32-bit offset within the segment. When you call through a callgate, the offset does not matter; the processor always jumps to the start of the function pointed to by the callgate. Hence, the program initializes only the third member of the farcall array, which corresponds to the segment selector. Issuing a call through the callgate transfers execution control to the GetPageDirectory() function. This function calls the CFuncGetPageDirectory() function, which copies the page directory into the PageDirectory array. After the callgate call returns, the program prints the copied page directory by calling the DisplayPageDirectory() function. The program frees the callgate before exiting.
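The layout of the farcall array can be sketched in portable C as follows. The helper name make_far_pointer is hypothetical; the layout itself (the 32-bit offset in the first two 16-bit words, low word first, followed by the 16-bit selector) is the in-memory operand format that a far call through a 48-bit pointer reads on x86. The selector values in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills a 3-word far pointer: words 0 and 1 hold the 32-bit offset
 * (low word first), word 2 holds the 16-bit segment selector. */
static void make_far_pointer(uint16_t out[3], uint32_t offset,
                             uint16_t selector)
{
    assert(out != NULL);
    out[0] = (uint16_t)(offset & 0xFFFFu); /* offset, low 16 bits */
    out[1] = (uint16_t)(offset >> 16);     /* offset, high 16 bits */
    out[2] = selector;                     /* segment selector */
}
```

Calling make_far_pointer(fp, 0, selector) produces the same contents as the zero-initialized static farcall array in main() after its third member is assigned: the offset words stay zero because the processor ignores them when the selector refers to a callgate.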
Listing 4-2: RING0.ASM
.386
.model small
.code
include ..\include\undocnt.inc
        public _GetPageDirectory
        extrn _CFuncGetPageDirectory@0:near
;Assembly stub called from callgate
_GetPageDirectory proc
        Ring0Prolog
        call _CFuncGetPageDirectory@0
        Ring0Epilog
        retf
_GetPageDirectory endp
END
The function to be called from the callgate needs to be written in assembly language for a couple of reasons. First, the function needs to execute a prolog and an epilog, both of which are assembly macros, to allow paging in kernel mode. Second, the function needs to issue a far return at the end. The function leaves the rest of the job to the CFuncGetPageDirectory() function written in C.
If you compare the output of the showdir program for two different processes, you find that the upper halves of the two page table directories are exactly the same except for two entries: those for logical addresses c0000000 and c0400000 in Listings 4-3 and 4-4. In other words, the kernel address space that corresponds to these two entries is not shared by the two processes.
Listing 4-3: First instance of SHOWDIR
Page directory for the process, pid=6f
00000000:01026000 00400000:00f65000 10000000:0152f000
5f800000:00e46000 77c00000:0076b000 7f400000:012cb000
7fc00000:0007e000 80000000:00000000 80400000:00400000
80800000:00800000 80c00000:00c00000 81000000:01000000
81400000:01400000 81800000:01800000 81c00000:01c00000
82000000:02000000 82400000:02400000 82800000:02800000
82c00000:02c00000 83000000:03000000 83400000:03400000
83800000:03800000 83c00000:03c00000 84000000:04000000
84400000:04400000 84800000:04800000 84c00000:04c00000
85000000:05000000 85400000:05400000 85800000:05800000
85c00000:05c00000 86000000:06000000 86400000:06400000
86800000:06800000 86c00000:06c00000 87000000:07000000
87400000:07400000 87800000:07800000 87c00000:07c00000
a0000000:0153d000 c0000000:00e5d000 c0400000:00c9e000
c0c00000:00041000 c1000000:00042000 c1400000:00043000
c1800000:00044000 c1c00000:00045000 c2000000:00046000
c2400000:00047000 c2800000:00048000 c2c00000:00049000
c3000000:0004a000 c3400000:0004b000 c3800000:0004c000
c3c00000:0004d000 c4000000:0004e000 c4400000:0000f000
c4800000:00050000 c4c00000:00051000 c5000000:00052000
c5400000:00053000 c5800000:00054000 c5c00000:00055000
c6000000:00056000 c6400000:00057000 c6800000:00058000
c6c00000:00059000 c7000000:0005a000 c7400000:0005b000
c7800000:0005c000 c7c00000:0005d000 c8000000:0005e000
c8400000:0005f000 c8800000:00020000 c8c00000:00021000
c9000000:00022000 c9400000:00023000 c9800000:00024000
c9c00000:00025000 ca000000:00026000 ca400000:00027000
ca800000:00028000 cac00000:00029000 cb000000:0002a000
cb400000:0002b000 cb800000:0002c000 cbc00000:0002d000
cc000000:0002e000 cc400000:0002f000 cc800000:002f0000
ccc00000:002f1000 cd000000:002f2000 cd400000:002f3000
cd800000:002f4000 cdc00000:002f5000 ce000000:002f6000
ce400000:00037000 ce800000:00038000 cec00000:00039000
cf000000:0003a000 cf400000:0003b000 cf800000:0003c000
cfc00000:0003d000 d0000000:0003e000 d0400000:0003f000
d0800000:00380000 d0c00000:00301000 d1000000:00302000
d1400000:00303000 d1800000:00304000 d1c00000:00305000
d2000000:00306000 d2400000:00307000 d2800000:00308000
d2c00000:00309000 d3000000:0030a000 d3400000:0030b000
d3800000:0030c000 d3c00000:0030d000 d4000000:0030e000
d4400000:0004f000 d4800000:00310000 d4c00000:00311000
e1000000:00315000 e1400000:010fe000 fc400000:0038d000
fc800000:0038e000 fcc00000:0038f000 fd000000:00390000
fd400000:00391000 fd800000:00392000 fdc00000:00393000
fe000000:00394000 fe400000:00395000 fe800000:00396000
fec00000:00397000 ff000000:00398000 ff400000:00399000
ff800000:0039a000 ffc00000:00031000
Listing 4-4: Second instance of SHOWDIR
Page directory for the process, pid=7d
00000000:00fa1000 00400000:00fa0000 10000000:0110a000
5f800000:015ac000 77c00000:01a73000 7f400000:013ac000
7fc00000:0145e000 80000000:00000000 80400000:00400000
80800000:00800000 80c00000:00c00000 81000000:01000000
81400000:01400000 81800000:01800000 81c00000:01c00000
82000000:02000000 82400000:02400000 82800000:02800000
82c00000:02c00000 83000000:03000000 83400000:03400000
83800000:03800000 83c00000:03c00000 84000000:04000000
84400000:04400000 84800000:04800000 84c00000:04c00000
85000000:05000000 85400000:05400000 85800000:05800000
85c00000:05c00000 86000000:06000000 86400000:06400000
86800000:06800000 86c00000:06c00000 87000000:07000000
87400000:07400000 87800000:07800000 87c00000:07c00000
a0000000:0153d000 c0000000:00d94000 c0400000:01615000
c0c00000:00041000 c1000000:00042000 c1400000:00043000
c1800000:00044000 c1c00000:00045000 c2000000:00046000
c2400000:00047000 c2800000:00048000 c2c00000:00049000
c3000000:0004a000 c3400000:0004b000 c3800000:0004c000
c3c00000:0004d000 c4000000:0004e000 c4400000:0000f000
c4800000:00050000 c4c00000:00051000 c5000000:00052000
c5400000:00053000 c5800000:00054000 c5c00000:00055000
c6000000:00056000 c6400000:00057000 c6800000:00058000
c6c00000:00059000 c7000000:0005a000 c7400000:0005b000
c7800000:0005c000 c7c00000:0005d000 c8000000:0005e000
c8400000:0005f000 c8800000:00020000 c8c00000:00021000
c9000000:00022000 c9400000:00023000 c9800000:00024000
c9c00000:00025000 ca000000:00026000 ca400000:00027000
ca800000:00028000 cac00000:00029000 cb000000:0002a000
cb400000:0002b000 cb800000:0002c000 cbc00000:0002d000
cc000000:0002e000 cc400000:0002f000 cc800000:002f0000
ccc00000:002f1000 cd000000:002f2000 cd400000:002f3000
cd800000:002f4000 cdc00000:002f5000 ce000000:002f6000
ce400000:00037000 ce800000:00038000 cec00000:00039000
cf000000:0003a000 cf400000:0003b000 cf800000:0003c000
cfc00000:0003d000 d0000000:0003e000 d0400000:0003f000
d0800000:00380000 d0c00000:00301000 d1000000:00302000
d1400000:00303000 d1800000:00304000 d1c00000:00305000
d2000000:00306000 d2400000:00307000 d2800000:00308000
d2c00000:00309000 d3000000:0030a000 d3400000:0030b000
d3800000:0030c000 d3c00000:0030d000 d4000000:0030e000
d4400000:0004f000 d4800000:00310000 d4c00000:00311000
e1000000:00315000 e1400000:010fe000 fc400000:0038d000
fc800000:0038e000 fcc00000:0038f000 fd000000:00390000
fd400000:00391000 fd800000:00392000 fdc00000:00393000
fe000000:00394000 fe400000:00395000 fe800000:00396000
fec00000:00397000 ff000000:00398000 ff400000:00399000
ff800000:0039a000 ffc00000:00031000
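Comparing the two listings by eye is tedious, so the check can be sketched in code. The function below is a hypothetical helper, not part of SHOWDIR.C: it takes two copies of a page directory and reports which entries in the upper half (indexes 512 through 1023, that is, logical addresses 0x80000000 and above) differ. It compares only the valid bit and the page table address, which is the information SHOWDIR prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits that SHOWDIR looks at: page table address plus the valid bit. */
#define PDE_COMPARE_MASK 0xFFFFF001u

/* Stores up to max indexes of differing upper-half entries in out and
 * returns the total number of differing entries found. */
static size_t diff_upper_half(const uint32_t a[1024],
                              const uint32_t b[1024],
                              unsigned out[], size_t max)
{
    size_t count = 0;
    unsigned i;

    assert(a != NULL && b != NULL);
    for (i = 512; i < 1024; i++) {
        if ((a[i] & PDE_COMPARE_MASK) != (b[i] & PDE_COMPARE_MASK)) {
            if (out != NULL && count < max) {
                out[count] = i;
            }
            count++;
        }
    }
    return count;
}
```

Fed with the directories behind the two listings, such a helper would be expected to report indexes 768 and 769, which correspond to the entries for c0000000 and c0400000. Entries in the lower half, which belong to the private user-mode address space, are ignored.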
Thursday, November 22, 2007
Memory Management--Part 2
Posted by CABA LAM at 11:28 AM