DLL_PRELOAD: Solarish LD_PRELOAD for Windows NT

by Andy Polyakov <appro@fy.chalmers.se>


Abstract. Solaris (as well as practically every other Unix dialect) provides a mechanism for pre-loading and executing code of the user's choice upon process start-up, before execution control is passed to the program entry point. This gives you an option to intercept calls to other shared libraries (most notably libc.so:-) and modify their behavior, or simply study the way an application interacts with the system. To make a long story short, I find this functionality essential enough to devote an effort to implementing a similar one for Windows NT.


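Basically there are two [viable] options for pre-loading code into a Win32 application: registering the module in the AppInit_DLLs registry value, or injecting the DLL into the process after it has been started.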

I find neither method suitable for "every-day" usage. Indeed! AppInit_DLLs affects every single program (well, most of them, see below), not to mention that it's only read upon OS boot (i.e. you have to reboot each time you modify the value), which doesn't exactly facilitate development... Injecting a DLL after the program is started, on the other hand, might be just too late to be interesting, not to mention that it requires an external wrapper program...

But what if we implement a compact, light-weight, dumb and robust DLL which we register as the only "AppInit_DLL" module and let this module pre-load other DLLs based on per-process environmental factors? I mean it could evaluate an environment variable (I picked %DLL_PRELOAD%:-), some per-executable registry key in HKCU (I picked Software\Microsoft\Windows NT\CurrentVersion\DLL_PRELOAD\*) and finally the corresponding key in the HKLM hive, and load the listed DLLs... Of course you have to remember that the AppInit_DLLs value is interpreted by USER32.DLL's initialization procedure, meaning that the target application has to be linked with USER32.DLL (you can verify this by running DEPENDS.EXE or DUMPBIN.EXE /DEPENDENTS). This means that we can play the trick primarily on GUI applications, but for my purposes it was more than enough. An alternative could be to modify the executable and inject a reference to USER32.DLL... Note that I'm not referring to modification of the machine code, but only of the import table, where the libraries the executable is linked with are listed.

And so the journey began. First challenge. The DllMain reference page claims:

Warning The entry-point function (of our DLL_PRELOAD-ing module) should perform only simple initialization tasks. It must not call the LoadLibrary or LoadLibraryEx function (or a function that calls these functions), because this may create dependency loops in the DLL load order. This can result in a DLL being used before the system has executed its initialization code.

The first reason (dependency loops) is hardly real (DLLs are permitted to be linked with other DLLs and the run-time linker has to be able to deal with it), but the second part (running before internal data structures are initialized) is more than real and has to be respected. Unfortunately we don't know much about, or have control over, the order in which DLLs are initialized (consult this MSJ article for additional information). In other words I can't just do whatever I was so excited to do directly in DllMain and have to postpone it somehow...

As you might remember, the idea is to pre-load libraries before execution control is passed to the program entry point. Entry point! Unfortunately adjusting the pointer to it in the in-memory PE header doesn't work (at least not under W2K). Apparently the system makes a copy of the value before I get the chance to poke at it:-( So I had no other choice but to modify the code at the entry point. I.e. replace the machine instructions at the program's entry point with a call to a procedure of my own design, which restores the original instructions and upon return jumps to the original entry point (yes, one more time). Or in other words:

DLL_PRELOAD.c
/*
 * Copyright (c) 2001-2004 Andy Polyakov <appro@fy.chalmers.se>
 *
 * Build with:
 *
 *	cl -Ox -GD -GF -Zl -MD -LD DLL_PRELOAD.c advapi32.lib kernel32.lib
 *
 * See http://fy.chalmers.se/~appro/nt/DLL_PRELOAD/ for further details.
 *
 */
#ifndef _DLL
#error "_DLL is not defined."
#endif

#ifdef _WIN64
#pragma comment(linker,"/entry:DllMain")
#pragma comment(linker,"/base:0x7E000000")
#pragma comment(linker,"/merge:.rdata=.text")
#else
#pragma comment(linker,"/entry:DllMain@12")
#pragma comment(linker,"/base:0x7F000000")
#pragma comment(linker,"/align:0x2000")
#pragma comment(linker,"/section:.text,erw")
#pragma comment(linker,"/merge:.rdata=.text")
#pragma comment(linker,"/merge:.data=.text")
#endif

#define UNICODE
#define _UNICODE
#if defined(WIN32) && !defined(_WIN32)
#define _WIN32
#endif
#include <windows.h>
#include <winbase.h>
#include <winnt.h>
#include <winreg.h>
#include <tchar.h>

#define SIZEOF_A(a) (sizeof(a)/sizeof(a[0]))

/*
 * Implement some trivia in order to be excused from linking with MSVCRT.
 */
static void _memmove (void *dst, void *src, size_t n)
{
    unsigned char *d=dst,*s=src;

    while (n--) *d++ = *s++;
    return;
}

static size_t _lstrlen (TCHAR *str)
{
    int len=0;

    while (*str) { str++, len++; }
    return len;
}

static TCHAR *_lstrrchr (TCHAR *str,TCHAR c)
{
    TCHAR *p = NULL;

    while (*str) { if (*str == c) p = str; str++; }
    return p;
}

static TCHAR *_lstrchr (TCHAR *str,TCHAR c)
{
    TCHAR *p = NULL;

    while (*str) { if (*str == c) { p = str; break; } str++; }
    return p;
}

static TCHAR *_lstrncpy (TCHAR *dst,TCHAR *src,size_t n)
{
    TCHAR *ret=dst;

    while(--n && *src) { *dst++ = *src++; }
    *dst=_T('\0');
    return ret;
}

static HINSTANCE loaded_libs [256];
static int       n_loaded_libs = 0;

static void _load_lib (TCHAR *str,DWORD regtype)
{
    HINSTANCE h;

    if (n_loaded_libs < SIZEOF_A(loaded_libs))
#if 1
    {
        TCHAR path[MAX_PATH];
        DWORD len;

        if (regtype==REG_EXPAND_SZ)
        {
            len=ExpandEnvironmentStrings (str,path,SIZEOF_A(path));
            if (len==0 || len>SIZEOF_A(path))
            {
                OutputDebugString (_T("Pathname is too long, ignoring..."));
                return;
            }
            str=path;
        }
        if (h=LoadLibrary (str)) loaded_libs [n_loaded_libs++] = h;
    }
#else
    {
        TCHAR path[MAX_PATH],*bgn,*end,*p=path;
        size_t sz=SIZEOF_A(path),l;

        while ( sz>1 &&
                (bgn=_lstrchr(str,_T('%'))) &&
                (end=_lstrchr(bgn+1,_T('%'))) )
        {
            *bgn=_T('\0');
            _lstrncpy(p,str,sz);
            *bgn=_T('%');
            l=_lstrlen(p);
            p+=l, sz-=l;

            *end=_T('\0');
            GetEnvironmentVariable(bgn+1,p,sz);
            *end=_T('%');
            l=_lstrlen(p);
            p+=l, sz-=l;

            str=end+1;
        }
        if (p==path) p=str;
        else         _lstrncpy(p,str,sz), p=path;

        if (h=LoadLibrary (p)) loaded_libs [n_loaded_libs++] = h;
    }
#endif
    else
        OutputDebugString (_T("Too many DLLs to load, ignoring..."));

    return;
}

static int _load_libs (TCHAR *str,DWORD regtype)
{
    TCHAR *p;

    while (p = _lstrchr (str,_T(';')))
    {
        /*
         * Test the current item rather than the one after the
         * separator, so that the module listed just before a
         * terminating '-' is still loaded.
         */
        if (str[0]==_T('-')) return -1;
        p[0] = _T('\0');
        _load_lib (str,regtype);
        p[0] = _T(';');
        str = p+1;
    }
    if (str[0]==_T('-')) return -1;
    _load_lib (str,regtype);

    return 0;
}

static int _load_by_key (HKEY hkey,TCHAR *val)
{
    TCHAR str[1024];
    LONG  sz,ret;
    DWORD regtype;

    sz = sizeof(str);
    ret = RegQueryValueEx (hkey,val,NULL,&regtype,(BYTE *)str,&sz);
    if (ret == ERROR_FILE_NOT_FOUND)
        return 1;	/* retry */
    else if (ret == ERROR_SUCCESS)
    {
        if (regtype==REG_SZ || regtype==REG_EXPAND_SZ)
            sz /= sizeof(str[0]),
            sz = sz>=SIZEOF_A(str) ? SIZEOF_A(str)-1 : sz,
            str[sz]=_T('\0'),
            ret=_load_libs (str,regtype);
    }
    return ret;
}

static BOOL SkipHKCU()	/* most notably when running as local system */
{
    HANDLE htoken=NULL;
    struct { TOKEN_USER tu; BYTE sid[64]; } token;
    DWORD  sz;
    void  *localsyssid;
    BOOL   ret;

    if (!OpenProcessToken (GetCurrentProcess(),TOKEN_QUERY,&htoken))
        return TRUE;

    ret=GetTokenInformation (htoken,TokenUser,&token,sizeof(token),&sz);
    CloseHandle (htoken);
    if (!ret) return TRUE;

    {
        SID_IDENTIFIER_AUTHORITY ntauth = SECURITY_NT_AUTHORITY;

        if (!AllocateAndInitializeSid (&ntauth,
                    1,SECURITY_LOCAL_SYSTEM_RID,
                    0,0,0,0,0,0,0,&localsyssid))
            return TRUE;
    }

    ret=EqualSid (localsyssid,token.tu.User.Sid);
    FreeSid(localsyssid);

    return ret;
}

#define MY_ENVNAM _T("DLL_PRELOAD")
#define MY_REGKEY \
	_T("Software\\Microsoft\\Windows NT\\CurrentVersion\\DLL_PRELOAD")

#if   defined(_M_IX86)
#define I_BUNDLE 8	/* (2*sizeof(unsigned long)) */
#elif defined(_M_IA64)
#define I_BUNDLE 64	/* 3 bundles + function pointer + gp */
#elif defined(_M_AMD64)
#define I_BUNDLE 16	/* mov rax,imm64; call *rax */
#endif

static unsigned char saved_code [I_BUNDLE];

void _pre_load (void *entry_code)
{
    DWORD acc;
    TCHAR str[256],exe[64],*s;
    int   ret=0;	/* stays 0 if the HKLM key can't be opened */
    HKEY  hkey,hkey1;

    /*
     * Put the original machine code back to entry point. We managed
     * to make the code segment writable once, we should manage fine
     * now too, so I don't care about the return value...
     */
    VirtualProtect (entry_code,I_BUNDLE,PAGE_EXECUTE_WRITECOPY,&acc);
    _memmove (entry_code,saved_code,I_BUNDLE);
    VirtualProtect (entry_code,I_BUNDLE,acc,&acc);
    FlushInstructionCache (GetCurrentProcess(),entry_code,I_BUNDLE);

    /*
     * Load libraries
     */
    if (!GetModuleFileName (NULL,str,SIZEOF_A(str))) return;

    (s = _lstrrchr (str,_T('\\'))) ? s++ : (s=str);
    _lstrncpy (exe,s,SIZEOF_A(exe));

    if (RegOpenKeyEx (HKEY_LOCAL_MACHINE,MY_REGKEY,0,KEY_QUERY_VALUE,&hkey)
        == ERROR_SUCCESS)
    {
        if ((ret=_load_by_key (hkey,_T("")))>=0 &&	/* all apps!!! */
            (ret=_load_by_key (hkey,exe))>0 )
        {
            if (RegOpenKeyEx (hkey,exe,0,KEY_QUERY_VALUE,&hkey1)
                == ERROR_SUCCESS)
                ret=_load_by_key (hkey1,_T("")),
                RegCloseKey(hkey1);
        }
        RegCloseKey (hkey);
    }

    if (ret<0) return;	/* '-' was spotted, back off */

    if (SkipHKCU()) return;

    if (RegOpenKeyEx (HKEY_CURRENT_USER,MY_REGKEY,0,KEY_QUERY_VALUE,&hkey)
        == ERROR_SUCCESS)
    {
        if (_load_by_key (hkey,exe))
        {
            if (RegOpenKeyEx (hkey,exe,0,KEY_QUERY_VALUE,&hkey1)
                == ERROR_SUCCESS)
                _load_by_key (hkey1,_T("")),
                RegCloseKey(hkey1);
        }
        RegCloseKey (hkey);
    }

    if ((ret = GetEnvironmentVariable (MY_ENVNAM,str,SIZEOF_A(str))) > 0)
        if (ret < SIZEOF_A(str))
            _load_libs (str,REG_SZ);
        else
            OutputDebugString (_T("DLL_PRELOAD: Environment variable is too long"));

    return;
}

#if defined(_M_IA64)
void _my_entry_code(),_my_cutin_code();
#elif defined(_M_AMD64)
void _my_entry_code();
#elif defined(_M_IX86)
#pragma optimize("",off)
static void _my_entry_code (unsigned long arg)
{
    _asm {
        pushfd
        pushad
        lea	eax,arg
        mov	ebx,[eax-4]	; fetch the return address
        sub	ebx,I_BUNDLE	; and rebias it back to the
        mov	[eax-4],ebx	; original entry point
        push	ebx		; NB!!! it's _pre_load that is
        call	_pre_load	; responsible for putting the
        add	esp,4		; original machine code back!!!
        popad
        popfd
    }
}
#pragma optimize("",on)
#endif

BOOL WINAPI DllMain (HINSTANCE h, DWORD reason, LPVOID junk)
{
    DWORD acc;
    HMODULE hmod;
    IMAGE_DOS_HEADER *dos_header;
    IMAGE_NT_HEADERS *nt_headers;
    void *entry_code;
    int i;

    switch (reason)
    {
    case DLL_PROCESS_ATTACH:
        DisableThreadLibraryCalls(h);

        if ((hmod = GetModuleHandle (NULL)) == NULL)
        {
            OutputDebugString (_T("DLL_PRELOAD: Unable to GetModuleHandle(NULL)"));
            return FALSE;
        }

        dos_header = (IMAGE_DOS_HEADER *)hmod;
        /* Sanity check */
        if (dos_header->e_magic != IMAGE_DOS_SIGNATURE)
        {
            OutputDebugString (_T("DLL_PRELOAD: DOS Signature mismatch"));
            return FALSE;
        }

        nt_headers = (IMAGE_NT_HEADERS *)((char *)hmod + dos_header->e_lfanew);
        /* More sanity checks */
        if (nt_headers->Signature != IMAGE_NT_SIGNATURE)
        {
            OutputDebugString (_T("DLL_PRELOAD: NT/PE Signature mismatch"));
            return FALSE;
        }

        /*
         * Find the entry point and distance to _my_entry_code()
         */
        entry_code = ((unsigned char *)hmod +
                      nt_headers->OptionalHeader.AddressOfEntryPoint);

#if   defined(_WIN64) && defined(_M_IA64)
        if (nt_headers->FileHeader.Machine != IMAGE_FILE_MACHINE_IA64)
        {
            OutputDebugString (_T("DLL_PRELOAD: Unsupported architecture"));
            return FALSE;
        }
        entry_code = *(void **)entry_code;
#elif defined(_WIN64) && defined(_M_AMD64)
        if (nt_headers->FileHeader.Machine != IMAGE_FILE_MACHINE_AMD64)
        {
            OutputDebugString (_T("DLL_PRELOAD: Unsupported architecture"));
            return FALSE;
        }
#elif defined(_WIN32) && defined(_M_IX86)
        if (nt_headers->FileHeader.Machine != IMAGE_FILE_MACHINE_I386)
        {
            OutputDebugString (_T("DLL_PRELOAD: Unsupported architecture"));
            return FALSE;
        }
#else
#error "Unsupported architecture."
#endif

        _memmove (saved_code,entry_code,I_BUNDLE);

        if (VirtualProtect (entry_code,I_BUNDLE,PAGE_EXECUTE_WRITECOPY,&acc) == 0)
        {
            OutputDebugString (_T("DLL_PRELOAD: Unable to modify the code segment"));
            return FALSE;
        }

#if   defined(_M_IA64)
        /*
         * Pick first three bundles at _my_entry_code. Note that
         * _my_entry_code as seen by C code is not a pointer to first
         * instruction, but a pointer to a structure comprising
         * pointer to instruction and corresponding gp value.
         */
        _memmove (entry_code,*(void**)_my_entry_code,48);
        /* ...append _my_cutin_code's descriptor. */
        _memmove ((void**)entry_code+48/sizeof(void*),_my_cutin_code,16);
#elif defined(_M_AMD64)
        {
            unsigned long *machine_code=entry_code;
            /* integer type, so that the upper half can be shifted out */
            size_t proc64 = (size_t)_my_entry_code;

            /*
             * Replace first 16 bytes at entry point with
             * "nop; nop; mov $proc64,%rax; nop; nop; call *%rax"
             */
            machine_code [0] = 0xB8489090U;
            machine_code [1] = (unsigned long)proc64;
            machine_code [2] = (unsigned long)(proc64>>32);
            machine_code [3] = 0xD0FF9090U;
        }
#elif defined(_M_IX86)
        {
            unsigned long *machine_code=entry_code;
            size_t e8_offset;

            /*
             * Replace first 8 bytes at entry point with
             * "lea 0(%esi),%esi; call _my_entry_point" instructions
             */
            e8_offset = ( (size_t)_my_entry_code -
                          ((size_t)entry_code + I_BUNDLE) );
            machine_code [0] = 0xE800768DU;
            machine_code [1] = e8_offset;
        }
#endif

        VirtualProtect (entry_code,I_BUNDLE,acc,&acc);
        FlushInstructionCache (GetCurrentProcess(),entry_code,I_BUNDLE);
        break;

    case DLL_PROCESS_DETACH:
        for (i=n_loaded_libs-1;i>=0;i--) FreeLibrary (loaded_libs[i]);
        n_loaded_libs=0;
        break;
    }

    return TRUE;
}

As Win64 compilers don't support inline assembler, _my_entry_code (and, on IA-64, _my_cutin_code) has to live in a separate assembler module, one for IA-64 and another for AMD64, which you assemble separately and link into the corresponding binary.
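To make the x86 patch a bit more tangible, here is a minimal, portable sketch of how the 8 bytes written over the entry point are composed and how the shim gets back to the original entry point. The helper names (encode_entry_patch, entry_from_return_address) and the plain 32-bit integer addresses are my own illustration, not part of DLL_PRELOAD.c; the byte values are the ones stored by the module (0xE800768D followed by the relative offset).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATCH_LEN 8   /* same value as I_BUNDLE on x86 */

/*
 * Compose "lea 0(%esi),%esi; call rel32" for a patch placed at virtual
 * address `entry` that has to reach the shim at `target`.
 * (Illustrative helper, addresses are plain 32-bit integers here.)
 */
static void encode_entry_patch(uint8_t out[PATCH_LEN],
                               uint32_t entry, uint32_t target)
{
    /* rel32 is counted from the end of the call instruction,
     * i.e. from entry + PATCH_LEN; the arithmetic wraps modulo 2^32 */
    uint32_t rel = target - (entry + PATCH_LEN);

    assert(out != NULL);
    out[0] = 0x8D; out[1] = 0x76; out[2] = 0x00;  /* 3-byte no-op padding */
    out[3] = 0xE8;                                /* call rel32           */
    out[4] = (uint8_t)(rel);                      /* little-endian offset */
    out[5] = (uint8_t)(rel >> 8);
    out[6] = (uint8_t)(rel >> 16);
    out[7] = (uint8_t)(rel >> 24);
}

/*
 * The call pushes the address right after the patch; stepping back by
 * PATCH_LEN yields the original entry point, which is what the
 * "sub ebx,I_BUNDLE" in the shim does.
 */
static uint32_t entry_from_return_address(uint32_t ret_addr)
{
    return ret_addr - PATCH_LEN;
}
```

The padding in front of the call is what lets the return address land exactly I_BUNDLE bytes past the entry point, so a single subtraction is enough to re-enter the restored code.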

Security consideration! If you choose to deploy this module to enforce some per-application policy, do assign an absolute path to the AppInit_DLLs value, e.g. C:\WINNT\DLL_PRELOAD.DLL, so that it cannot be bypassed by another module with the same name residing elsewhere on the %PATH% or even in the current working directory. I have to recommend C:\WINNT and not, say, C:\WINNT\system32 because of the 32-char limitation on the AppInit_DLLs value length. Windows XP and later offer an alternative solution to this problem. If HKLM\System\CurrentControlSet\Control\Session Manager\SafeDllSearchMode is set to 1, then the current working directory is searched after the system one. Yet another option is to register DLL_PRELOAD as a "Known DLL," in which case the module installed in the system directory shall be loaded. Special note about Win64. Keep in mind that you want native and legacy applications to load different modules, %SystemRoot%\system32\DLL_PRELOAD.DLL and %SystemRoot%\SysWOW64\DLL_PRELOAD.DLL respectively, so special care has to be taken to make sure the search paths are kept separate.

To facilitate policy-oriented deployment, keys are actually evaluated in reverse order (opposite to the one depicted in the beginning); in particular HKLM is searched first. In addition, if you terminate the key value in HKLM with -, e.g. C:\App\Policy.dll;-, then processing stops and no other modules will be preloaded. Well, another [untrusted] module loaded by the application at a later point might have an opportunity to bypass the policy, unless you explicitly address this problem in some manner, e.g. by making sure names passed to LoadLibrary are absolute and trusted paths.
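The list handling described above can be sketched in portable C as follows. This is only a model of the behavior as the text describes it: walk_preload_list, item_fn and count_item are hypothetical names, plain char stands in for TCHAR, and a callback stands in for the actual LoadLibrary call. Unlike the module, the sketch doesn't modify the string and silently skips empty items.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: gets one list item (not NUL-terminated) */
typedef void (*item_fn)(const char *item, size_t len, void *ctx);

/*
 * Walk a ';'-separated list and hand every item to cb. An item that
 * starts with '-' ends processing: -1 is returned so that the caller
 * knows not to consult HKCU or the environment variable. 0 otherwise.
 */
static int walk_preload_list(const char *list, item_fn cb, void *ctx)
{
    assert(list != NULL && cb != NULL);

    for (;;) {
        const char *end = strchr(list, ';');
        size_t len = end ? (size_t)(end - list) : strlen(list);

        if (list[0] == '-')
            return -1;          /* policy terminator spotted, back off */
        if (len > 0)
            cb(list, len, ctx); /* the real module would load the DLL  */
        if (!end)
            return 0;
        list = end + 1;
    }
}

/* Example callback: just counts the items it is given */
static void count_item(const char *item, size_t len, void *ctx)
{
    (void)item; (void)len;
    ++*(int *)ctx;
}
```

With the value C:\App\Policy.dll;- the callback fires once for Policy.dll and the walk returns -1, which is the "load the policy module and nothing else" case.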


What's next? Well, just use your imagination:-) For example... Have you ever run into a program that insists on opening a file read-write in a place where you don't want to relax the ACL, and it does so even if it has no intention to write to it? Intercept CreateFile and mask off the write bits! Run into an application with a pre-compiled path to a non-existing location? Rebias all references to a point in the file system (say C:\TMP to be redirected to %TEMP% for the target application) by overriding the same API call. There is a catch though! The problem is that import tables, so called .idata segments, are private to DLLs, and if we want to intercept all calls to a certain API routine, then we have to examine and fix up all these tables. The latter implies that all the modules have to be loaded and processed by the run-time linker before we can proceed with our "surveillance" activities. Any other options? It should be noted that intercepting so called native API calls at the KERNEL32 "level" is way more efficient. At least it's a single point to patch, namely the reference to NtCreateFile in KERNEL32's .idata segment. Well, you would have to purchase an appropriate book to code such modules, but you do get excused from listing the loaded modules and from spying on LoadLibrary itself to cover for the cases when the application in question (or a DLL on its behalf) dynamically loads extra DLLs during the course of execution. Anyway, here is a template module which simply logs the filenames being opened/created.
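As for "mask off the write bits", a minimal sketch of what such a filter might compute is given below. The bit values are the access-mask constants from the Windows SDK headers, repeated under an ACC_ prefix so that the fragment compiles anywhere; strip_write_access itself is a hypothetical helper of mine, and which bits count as "write" is a judgment call you would want to revisit for your application.

```c
#include <assert.h>
#include <stdint.h>

/* Access-mask bits, values as in winnt.h (prefixed to avoid clashes) */
#define ACC_GENERIC_READ          0x80000000u
#define ACC_GENERIC_WRITE         0x40000000u
#define ACC_GENERIC_EXECUTE       0x20000000u
#define ACC_GENERIC_ALL           0x10000000u
#define ACC_FILE_WRITE_DATA       0x00000002u
#define ACC_FILE_APPEND_DATA      0x00000004u
#define ACC_FILE_WRITE_EA         0x00000010u
#define ACC_FILE_WRITE_ATTRIBUTES 0x00000100u

#define ACC_WRITE_BITS (ACC_GENERIC_WRITE | ACC_FILE_WRITE_DATA | \
                        ACC_FILE_APPEND_DATA | ACC_FILE_WRITE_EA | \
                        ACC_FILE_WRITE_ATTRIBUTES)

/*
 * Return `desired` with everything that grants write access removed.
 * GENERIC_ALL implies write, so it is downgraded to read + execute.
 */
static uint32_t strip_write_access(uint32_t desired)
{
    uint32_t result = desired;

    if (result & ACC_GENERIC_ALL)
        result = (result & ~ACC_GENERIC_ALL)
               | ACC_GENERIC_READ | ACC_GENERIC_EXECUTE;
    result &= ~ACC_WRITE_BITS;

    assert((result & ACC_WRITE_BITS) == 0);
    return result;
}
```

In an interceptor the result would replace the DesiredAccess argument before the call is forwarded. Keep in mind that an application which later does try to write through such a handle will then simply get an access error.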

LogNtCreateFile.c
/*
 * Copyright (c) 2003 Andy Polyakov <appro@fy.chalmers.se>
 *
 * Build with:
 *
 *	cl -Ox -GD -GF -Zl -MD -LD LogNtCreateFile.c kernel32.lib
 *
 * See http://fy.chalmers.se/~appro/nt/DLL_PRELOAD/ for further details.
 *
 */
#ifndef _DLL
#error "_DLL is not defined."
#endif

#ifdef _WIN64
#pragma comment(linker,"/entry:DllMain")
#pragma comment(linker,"/merge:.rdata=.text")
#else
#pragma comment(linker,"/entry:DllMain@12")
#pragma comment(linker,"/section:.text,erw")
#pragma comment(linker,"/merge:.rdata=.text")
#pragma comment(linker,"/merge:.data=.text")
#endif

#define UNICODE
#define _UNICODE
#if defined(WIN32) && !defined(_WIN32)
#define _WIN32
#endif
#define _WIN32_WINNT 0x0500
#include <windows.h>
#include <winbase.h>
#include <winnt.h>
#ifdef _WIN64
/* October 2002 Platform SDK is screwed up */
#define _RUNTIME_FUNCTION _RUNTIME_FUNCTION_
#define RUNTIME_FUNCTION  RUNTIME_FUNCTION_
#define PRUNTIME_FUNCTION PRUNTIME_FUNCTION_
#endif
#include <winternl.h>
#include <tchar.h>

typedef NTSTATUS (WINAPI *NtCreateFile_T) (
    OUT PHANDLE FileHandle,
    IN ACCESS_MASK DesiredAccess,
    IN POBJECT_ATTRIBUTES ObjectAttributes,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN PLARGE_INTEGER AllocationSize OPTIONAL,
    IN ULONG FileAttributes,
    IN ULONG ShareAccess,
    IN ULONG CreateDisposition,
    IN ULONG CreateOptions,
    IN PVOID EaBuffer OPTIONAL,
    IN ULONG EaLength
    );

static NtCreateFile_T _NtCreateFile=NULL,*__NtCreateFile=NULL;

static NTSTATUS WINAPI NtCreateFile_ (
    OUT PHANDLE FileHandle,
    IN ACCESS_MASK DesiredAccess,
    IN POBJECT_ATTRIBUTES ObjectAttributes,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN PLARGE_INTEGER AllocationSize OPTIONAL,
    IN ULONG FileAttributes,
    IN ULONG ShareAccess,
    IN ULONG CreateDisposition,
    IN ULONG CreateOptions,
    IN PVOID EaBuffer OPTIONAL,
    IN ULONG EaLength
    )
{
    OutputDebugStringW(ObjectAttributes->ObjectName->Buffer);
    return (*_NtCreateFile)(
        FileHandle,
        DesiredAccess,
        ObjectAttributes,
        IoStatusBlock,
        AllocationSize,
        FileAttributes,
        ShareAccess,
        CreateDisposition,
        CreateOptions,
        EaBuffer,
        EaLength);
}

typedef NTSTATUS (WINAPI *NtOpenFile_T) (
    OUT PHANDLE FileHandle,
    IN ACCESS_MASK DesiredAccess,
    IN POBJECT_ATTRIBUTES ObjectAttributes,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN ULONG ShareAccess,
    IN ULONG OpenOptions
    );

static NtOpenFile_T _NtOpenFile=NULL,*__NtOpenFile=NULL;

static NTSTATUS WINAPI NtOpenFile_ (
    OUT PHANDLE FileHandle,
    IN ACCESS_MASK DesiredAccess,
    IN POBJECT_ATTRIBUTES ObjectAttributes,
    OUT PIO_STATUS_BLOCK IoStatusBlock,
    IN ULONG ShareAccess,
    IN ULONG OpenOptions
    )
{
    OutputDebugStringW(ObjectAttributes->ObjectName->Buffer);
    return (*_NtOpenFile)(
        FileHandle,
        DesiredAccess,
        ObjectAttributes,
        IoStatusBlock,
        ShareAccess,
        OpenOptions);
}

static int _lstricmp(const char *s1, const char *s2)
{
    char c1,c2;
    int ret;

    while (c1=*s1, c2=*s2, c1&&c2)
    {
        c1|=0x20, c2|=0x20;	/* lower the case */
        if (ret=c1-c2) return ret;
        s1++, s2++;
    }
    return c1-c2;
}

BOOL WINAPI DllMain (HINSTANCE h, DWORD reason, LPVOID junk)
{
    DWORD acc;
    HMODULE hmod;
    IMAGE_DOS_HEADER *dos_header;
    IMAGE_NT_HEADERS *nt_headers;
    IMAGE_DATA_DIRECTORY *dir;
    IMAGE_IMPORT_DESCRIPTOR *idesc;
    IMAGE_THUNK_DATA *thunk;
    static void *page;
    static size_t plen;

    switch (reason)
    {
    case DLL_PROCESS_ATTACH:
        DisableThreadLibraryCalls(h);

        if (!(hmod=GetModuleHandle(_T("NTDLL.DLL"))))
        {
            OutputDebugString(_T("NTDLL.DLL not found?"));
            return FALSE;
        }

        _NtCreateFile=(NtCreateFile_T)GetProcAddress(hmod,"NtCreateFile"),
        _NtOpenFile=(NtOpenFile_T)GetProcAddress(hmod,"NtOpenFile");

        if (!(hmod=GetModuleHandle(_T("KERNEL32.DLL"))))
        {
            OutputDebugString(_T("KERNEL32.DLL not found?"));
            return FALSE;
        }

        dos_header = (IMAGE_DOS_HEADER *)hmod;
        nt_headers = (IMAGE_NT_HEADERS *)((char *)hmod + dos_header->e_lfanew);

        dir=&nt_headers->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
        idesc=(IMAGE_IMPORT_DESCRIPTOR *)((char *)hmod + dir->VirtualAddress);
        while (idesc->Name)
        {
            if (!_lstricmp((char *)hmod+idesc->Name,"NTDLL.DLL")) break;
            idesc++;
        }

        if (!idesc->Name)
        {
            OutputDebugString(_T("Can't locate NTDLL.DLL import descriptor"));
            return FALSE;
        }

        for (thunk=(IMAGE_THUNK_DATA *)((char *)hmod+idesc->FirstThunk);
             thunk->u1.Function && thunk->u1.Function!=(ULONG_PTR)_NtCreateFile;
             thunk++) ;
        __NtCreateFile=(NtCreateFile_T *)thunk;

        for (thunk=(IMAGE_THUNK_DATA *)((char *)hmod+idesc->FirstThunk);
             thunk->u1.Function && thunk->u1.Function!=(ULONG_PTR)_NtOpenFile;
             thunk++) ;
        __NtOpenFile=(NtOpenFile_T *)thunk;

        if ((void *)__NtOpenFile > (void *)__NtCreateFile)
            page=__NtCreateFile,
            plen=(size_t)__NtOpenFile-(size_t)page+sizeof(void(*)());
        else
            page=__NtOpenFile,
            plen=(size_t)__NtCreateFile-(size_t)page+sizeof(void(*)());

        if (!VirtualProtect (page,plen,PAGE_EXECUTE_READWRITE,&acc))
        {
            OutputDebugString(_T("Unable to unlock Thunk Table"));
            return FALSE;
        }
        *__NtOpenFile=NtOpenFile_;
        *__NtCreateFile=NtCreateFile_;
        VirtualProtect (page,plen,acc,&acc);
        break;

    case DLL_PROCESS_DETACH:
        VirtualProtect (page,plen,PAGE_EXECUTE_READWRITE,&acc);
        *__NtOpenFile=_NtOpenFile;
        *__NtCreateFile=_NtCreateFile;
        VirtualProtect (page,plen,acc,&acc);
        break;
    }

    return TRUE;
}

It's possible to bypass the filter by making an explicit native API call. If you strive to catch even such cases, see the next module for a suitable technique.
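Stripped of all the PE header walking, the thunk swap performed by the template boils down to the following portable model. Everything here is illustrative: api_fn, find_thunk, install_hook and the toy real_api/other_api functions are assumptions of mine, and a NULL-terminated array of function pointers stands in for the import address table. One deliberate difference: the sketch reports a missing import instead of patching whatever slot the scan stopped at.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*api_fn)(int);

static api_fn saved_original;   /* what the hook forwards to       */
static int    calls_seen;       /* stands in for OutputDebugString */

static int real_api(int x)  { return x + 1; }  /* "NtCreateFile"     */
static int other_api(int x) { return x * 2; }  /* unrelated import   */

static int hook_api(int x)
{
    calls_seen++;               /* log the call...                 */
    return saved_original(x);   /* ...and forward to the original  */
}

/* Find the slot in a NULL-terminated table that holds `target` */
static api_fn *find_thunk(api_fn *table, api_fn target)
{
    assert(table != NULL);
    for (; *table; table++)
        if (*table == target)
            return table;
    return NULL;                /* not imported by this module */
}

/* Redirect `target` to `hook`; 0 on success, -1 if it isn't imported */
static int install_hook(api_fn *table, api_fn target, api_fn hook)
{
    api_fn *slot = find_thunk(table, target);

    if (!slot)
        return -1;
    saved_original = *slot;
    *slot = hook;   /* the real table has to be made writable first */
    return 0;
}
```

Callers that go through the table now reach hook_api, while anybody who obtained the address of real_api by other means still bypasses it, which is the limitation noted above.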

Here is another cool snippet demonstrating another sophisticated interception trick. This module, so to say, sells one-way VM tickets:-) It first modifies the machine code at the NtProtectVirtualMemory entry point, then sets up a filter rejecting all attempts to manipulate page access permissions on KERNEL32.DLL, NTDLL.DLL and itself, and finally throws away the key. The latter means that when the module is done, not even the module itself will be able to tamper with the filter, as that would require modification of page access permissions, while such attempts will be scrupulously rejected. Digest this:

VirtualProtect.c
/*
 * Copyright (c) 2003 Andy Polyakov <appro@fy.chalmers.se>
 *
 * Build with:
 *
 *	cl -Ox -GD -GF -Zl -MD -LD VirtualProtect.c user32.lib kernel32.lib
 *
 * See http://fy.chalmers.se/~appro/nt/DLL_PRELOAD/ for further details.
 *
 */
#ifndef _DLL
#error "_DLL is not defined."
#endif

#pragma comment(linker,"/entry:DllMain@12")
#pragma comment(linker,"/section:.text,erw")
#pragma comment(linker,"/merge:.rdata=.text")
#pragma comment(linker,"/merge:.data=.text")

#define UNICODE
#define _UNICODE
#if defined(WIN32) && !defined(_WIN32)
#define _WIN32
#endif
#define _WIN32_WINNT 0x0500
#include <windows.h>
#include <winbase.h>
#include <winnt.h>
#include <winternl.h>
#include <tchar.h>

typedef NTSTATUS (WINAPI *NtProtectVirtualMemory_T) (
    IN HANDLE ProcessHandle,
    IN OUT PVOID *BaseAddress,
    IN OUT PULONG ProtectSize,
    IN ULONG NewProtect,
    OUT PULONG OldProtect
    );

static int magicvalue=-1,passthrough=0;
static struct { void *addr0,*addr1; char *msg; } sentinel[3];

static void WINAPI NtProtectVirtualMemory__(
    IN HANDLE ProcessHandle,
    IN OUT PVOID *BaseAddress,
    IN OUT PULONG ProtectSize,
    IN ULONG NewProtect,
    OUT PULONG OldProtect
    )
{
    /* return whatever next function returns */
}

static int NtProtectVirtualMemory_(
    void *caller,
    IN HANDLE ProcessHandle,
    IN OUT PVOID *BaseAddress,
    IN OUT PULONG ProtectSize,
    IN ULONG NewProtect,
    OUT PULONG OldProtect
    )
{
    int i;
    void *p;

    if (passthrough)
    {
        size_t page1=(size_t)caller & ~4095,
               page2=(size_t)NtProtectVirtualMemory_ & ~4095;

        passthrough=0;
        if (page1 == page2)
        {
            OutputDebugString(_T("VirtualProtect Sentinel: letting myself through..."));
            return magicvalue;
        }
    }

    for (p=*BaseAddress,i=0;i<3;i++)
    {
        if (p>=sentinel[i].addr0 && p<sentinel[i].addr1)
        {
            volatile void **ra=&caller-1;

            OutputDebugStringA(sentinel[i].msg);
            *ra=NtProtectVirtualMemory__;	/* wow! */
            return STATUS_ACCESS_VIOLATION;
        }
    }

    return magicvalue;
}

static _VirtualLockup(
    HINSTANCE hmod,
    LPVOID entry,
    SIZE_T dwSize,
    DWORD flNewProtect,
    PDWORD lpflOldProtect)
{
    NtProtectVirtualMemory_T func=(NtProtectVirtualMemory_T)entry;
    SIZE_T len;
    HANDLE pid=GetCurrentProcess();

    passthrough=1;
    (*func) (pid,&entry,&dwSize,flNewProtect,lpflOldProtect);

    len=(size_t)sentinel[2].addr1-(size_t)sentinel[2].addr0;
    passthrough=1;
    /*
     * This locks up the whole module, no variable
     * modifications will be possible.
     */
    (*func) (pid,&hmod,&len,PAGE_EXECUTE_READ,lpflOldProtect);
}

BOOL WINAPI DllMain (HINSTANCE self, DWORD reason, LPVOID junk)
{
    DWORD acc;
    HMODULE hmod;
    unsigned char *entry;
    MSGBOXPARAMS m;
    IMAGE_DOS_HEADER *dos_header;
    IMAGE_NT_HEADERS *nt_headers;

    m.cbSize = sizeof(m);
    m.hwndOwner = NULL;
    m.lpszCaption = _T("VirtualProtect Sentinel");
    m.dwStyle = MB_OK;
    m.hInstance = NULL;
    m.lpszIcon = IDI_ERROR;
    m.dwContextHelpId = 0;
    m.lpfnMsgBoxCallback = NULL;
    m.dwLanguageId = MAKELANGID(LANG_ENGLISH,SUBLANG_ENGLISH_US);

    switch (reason)
    {
    case DLL_PROCESS_ATTACH:
        DisableThreadLibraryCalls(self);

        hmod = self;
        dos_header = (IMAGE_DOS_HEADER *)hmod;
        nt_headers = (IMAGE_NT_HEADERS *)((char *)hmod + dos_header->e_lfanew);
        sentinel[2].addr0=hmod;
        sentinel[2].addr1=(char *)hmod+nt_headers->OptionalHeader.SizeOfImage;
        sentinel[2].msg="VirtualProtect Sentinel: protecting myself";

        if (!(hmod=GetModuleHandle(_T("KERNEL32.DLL"))))
        {
            m.lpszText = _T("KERNEL32.DLL not found?, aborting...");
            MessageBoxIndirect(&m);
            ExitProcess(0);
            return FALSE;
        }
        dos_header = (IMAGE_DOS_HEADER *)hmod;
        nt_headers = (IMAGE_NT_HEADERS *)((char *)hmod + dos_header->e_lfanew);
        sentinel[1].addr0=hmod;
        sentinel[1].addr1=(char *)hmod+nt_headers->OptionalHeader.SizeOfImage;
        sentinel[1].msg="VirtualProtect Sentinel: protecting KERNEL32.DLL";

        if (!(hmod=GetModuleHandle(_T("NTDLL.DLL"))))
        {
            m.lpszText = _T("NTDLL.DLL not found?, aborting...");
            MessageBoxIndirect(&m);
            ExitProcess(0);
            return FALSE;
        }
        dos_header = (IMAGE_DOS_HEADER *)hmod;
        nt_headers = (IMAGE_NT_HEADERS *)((char *)hmod + dos_header->e_lfanew);
        sentinel[0].addr0=hmod;
        sentinel[0].addr1=(char *)hmod+nt_headers->OptionalHeader.SizeOfImage;
        sentinel[0].msg="VirtualProtect Sentinel: protecting NTDLL.DLL";

        entry=(unsigned char*)GetProcAddress(hmod,"NtProtectVirtualMemory");

        if (!entry || entry[0] != 0xb8)	/* mov eax,imm32 */
        {
            m.lpszText = _T("Unexpected instruction, aborting...");
            MessageBoxIndirect(&m);
            ExitProcess(0);
            return FALSE;
        }

        if (!VirtualProtect (entry,5,PAGE_EXECUTE_READWRITE,&acc))
        {
            m.lpszText = _T("Unable to unlock code page, aborting...");
            MessageBoxIndirect(&m);
            ExitProcess(0);
            return FALSE;
        }

        magicvalue=*((unsigned int *)(entry+1));
        entry[0] = 0xe8;	/* call relative */
        *((unsigned int *)(entry+1)) = (unsigned char *)NtProtectVirtualMemory_
                                       -(entry+5);
        FlushInstructionCache(GetCurrentProcess(),entry,5);

        _VirtualLockup (self,entry,5,acc,&acc);
        break;

    case DLL_PROCESS_DETACH:
        return FALSE;
        break;
    }

    return TRUE;
}

Well, there actually is a simple way around it: just implement your own gate to NtProtectVirtualMemory's kernel-space counterpart. So this module has to be complemented with one making sure modules loaded with LoadLibrary are all trusted. An alternative is to design a kernel module which makes sure the system calls originate from within NTDLL.DLL. It should be noted that this code can't be used together with the LogNtCreateFile.c sample: the latter causes the former to [naturally!] crash at program exit. Well, the last two snippets are basically templates designed to free your imagination, not some kind of production code anyway:-)
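The heart of the sentinel is a plain address-range test. A portable sketch of such a test follows; struct guarded_range and find_guard are hypothetical names, and integers stand in for module base addresses. Note one assumption that goes beyond the module above: the sketch tests the whole requested region for overlap, whereas the filter in VirtualProtect.c looks only at the base address of the request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [lo, hi) occupied by a protected module */
struct guarded_range {
    uintptr_t   lo, hi;
    const char *name;
};

/*
 * Return the name of the first guarded range that overlaps the request
 * [base, base+size), or NULL if the request touches none of them.
 */
static const char *find_guard(const struct guarded_range *g, size_t n,
                              uintptr_t base, size_t size)
{
    uintptr_t end;
    size_t i;

    assert(g != NULL || n == 0);

    if (size == 0)
        size = 1;                 /* treat empty request as one byte */
    end = base + size;
    if (end < base)
        end = UINTPTR_MAX;        /* clamp on address wrap-around    */

    for (i = 0; i < n; i++)
        if (base < g[i].hi && end > g[i].lo)
            return g[i].name;

    return NULL;
}
```

A request that starts just below a protected image and extends into it is caught by the overlap test, which is the case a base-address-only comparison would let through.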

As you might have noticed, the last module is not ported to Win64/IA-64 [yet]. There actually is a way to intercept [native] calls without modifying the machine code and coding an assembler shim. As already pointed out (in the DLL_PRELOAD.c commentary), a pointer to a Win64/IA-64 function is not the address of its first instruction, but a pointer to a structure comprising the pointer to the instruction and the accompanying global pointer value... Yes, I mean that it's perfectly possible to intercept a call by modifying this structure instead of the machine code. Unfortunately internal calls, those made from within the same DLL, slip through, but on the plus side you can intercept external KERNEL32.DLL calls as effectively as native ones (recall the reason why the native API came into the picture in the first place).
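A toy model of the descriptor trick, in portable C, might look like this. It is only a sketch of the idea: func_desc mimics the {code pointer, gp} pair, and call_external/call_internal are made-up stand-ins for a call through the descriptor and a direct intra-module branch respectively. Nothing here touches a real IA-64 descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Model of a function descriptor: code address plus global pointer */
typedef struct {
    int  (*entry)(int);
    void  *gp;
} func_desc;

static func_desc original;      /* descriptor contents before hooking */
static int       calls_logged;

static int real_impl(int x) { return x + 100; }

static int hook_impl(int x)
{
    calls_logged++;             /* observe the call...        */
    return original.entry(x);   /* ...then run the real code  */
}

/* Callers from other modules go through the descriptor */
static int call_external(const func_desc *fd, int x)
{
    return fd->entry(x);
}

/* Callers inside the same module branch straight to the code */
static int call_internal(int x)
{
    return real_impl(x);
}

/* Intercept by rewriting the descriptor, machine code stays intact */
static void hook_descriptor(func_desc *fd, int (*hook)(int), void *hook_gp)
{
    assert(fd != NULL && hook != NULL);
    original  = *fd;
    fd->entry = hook;
    fd->gp    = hook_gp;        /* the hook needs its own module's gp */
}
```

After hook_descriptor every call_external is logged, while call_internal still reaches real_impl unnoticed, which mirrors the "internal calls slip through" remark.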

A closing note. Some of you might wonder: can't I just wrap KERNEL32.DLL or NTDLL.DLL to intercept the calls? You know, create a fake DLL which would redirect all the procedures to the original library, except the one(s) of interest? Well, you should have known better:-) KERNEL32.DLL and NTDLL.DLL are "Known DLLs," they are pre-linked upon system boot and just appear pre-mapped into the process address space, so they simply can't be wrapped. Well, you can make them less "known," but at the cost of performance at application start-up. Besides, wrapping is not a universal panacea. Most notably, if your application refers to a resource in the target DLL, then you're in trouble.

Happy Hacking...