Saturday, 20 February 2016

Tracking Down the Root Cause of a Windows File Handling Bug

This blog post is about a bug in the Windows Explorer shell (useless from a security perspective I believe) that I thought I'd document. I'll explain the bug then go through how I tracked down the code responsible for the bug. Hopefully it serves as a brief tutorial on how you'd go about doing the same thing for other issues.

The Bug

The Windows Explorer shell has supported the concept of Shortcut files for as long as it's been around. These are your traditional LNK files. The underlying Windows operating system has no concept of these as shortcuts to other files; they're only treated specially by Explorer.

Since Vista the shell has supported another link format, NTFS symbolic links. You might think I'm slightly crazy at this point: surely a symbolic link is just treated like any other file which happens to point to another file? While that would make more sense, it seems that the Explorer developers really did implement special support, and not only that, they got it wrong, as we'll see with this bug.

NTFS Symbolic Links use the Reparse Point feature of NTFS to change a file or directory into a symbolic link which the kernel will follow when opening the file. Under the hood the data structure set in the Reparse Point looks like the following:

typedef struct _REPARSE_DATA_BUFFER {
  ULONG  ReparseTag;
  USHORT ReparseDataLength;
  USHORT Reserved;
  USHORT SubstituteNameOffset;
  USHORT SubstituteNameLength;
  USHORT PrintNameOffset;
  USHORT PrintNameLength;
  ULONG  Flags;
  WCHAR  PathBuffer[1];
} REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER;

This is a variable length structure and contains two strings, the Substitute Name and the Print Name. Why two names? Well, the first is the native NT path which represents the target of the symbolic link; this will be something like \??\C:\TargetFile. The Print Name is normally set to what the user "thinks" the path is, so in this case C:\TargetFile. The rationale behind this is that the NT name is ugly and somewhat unexpected for a user, so if a program has to show the target the Print Name shows what the user might expect. However, there's no requirement for these two to match in any way. We can test this out using my Symbolic Link Testing Tools (available on Github here). The CreateNtfsSymbolicLink tool allows you to specify an arbitrary Print Name value (which the built-in MKLINK tool does not). Let's try it out (note you need to be an administrator):

Nothing too surprising, both links point to cmd.exe, but for the second one I've changed the Print Name to point to calc.exe instead. You can see the Print Names by just doing a directory listing. If you execute these files from the shell you'll find they both run cmd.exe as shown in the screenshot.
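To make the layout concrete, here's a minimal sketch in portable C that packs a symbolic link buffer by hand with a Substitute Name and Print Name that disagree, then reads both back. It doesn't touch the file system or call any Windows API. The helper names (symlink_header, build_symlink_buffer, read_name), the example paths and the ASCII-only widening are assumptions made for illustration, and it assumes a little-endian host so the header bytes match the on-disk layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IO_REPARSE_TAG_SYMLINK 0xA000000Cu
#define SYMLINK_HEADER_SIZE    20u  /* bytes before PathBuffer */

/* Fixed part of the structure shown above, without PathBuffer (hypothetical name). */
typedef struct {
    uint32_t ReparseTag;
    uint16_t ReparseDataLength;
    uint16_t Reserved;
    uint16_t SubstituteNameOffset;
    uint16_t SubstituteNameLength;
    uint16_t PrintNameOffset;
    uint16_t PrintNameLength;
    uint32_t Flags;
} symlink_header;

/* Widen an ASCII string to UTF-16LE bytes at dst; returns bytes written. */
static size_t put_utf16le(uint8_t *dst, const char *ascii)
{
    size_t n = 0;
    for (; *ascii != '\0'; ascii++) {
        dst[n++] = (uint8_t)*ascii;
        dst[n++] = 0;
    }
    return n;
}

/* Packs header + Substitute Name + Print Name into out.
   Returns the total size in bytes, or 0 if it does not fit. */
size_t build_symlink_buffer(uint8_t *out, size_t out_size,
                            const char *substitute, const char *print_name)
{
    size_t sub_bytes = strlen(substitute) * 2;
    size_t print_bytes = strlen(print_name) * 2;
    size_t total = SYMLINK_HEADER_SIZE + sub_bytes + print_bytes;
    symlink_header h;

    assert(sizeof h == SYMLINK_HEADER_SIZE);
    /* Reparse data is limited to 16 KB in total. */
    if (total > out_size || total > 16 * 1024)
        return 0;

    memset(&h, 0, sizeof h);
    h.ReparseTag = IO_REPARSE_TAG_SYMLINK;
    /* Length of everything after the first 8 bytes (tag, length, reserved). */
    h.ReparseDataLength = (uint16_t)(total - 8);
    /* Offsets and lengths are in bytes, relative to the start of PathBuffer. */
    h.SubstituteNameOffset = 0;
    h.SubstituteNameLength = (uint16_t)sub_bytes;
    h.PrintNameOffset = (uint16_t)sub_bytes;
    h.PrintNameLength = (uint16_t)print_bytes;
    h.Flags = 0; /* 0 means an absolute target path */

    memcpy(out, &h, sizeof h);
    put_utf16le(out + SYMLINK_HEADER_SIZE, substitute);
    put_utf16le(out + SYMLINK_HEADER_SIZE + sub_bytes, print_name);
    return total;
}

/* Narrow the UTF-16LE name at PathBuffer + offset back to ASCII (low bytes only). */
void read_name(const uint8_t *buf, uint16_t offset, uint16_t length,
               char *dst, size_t dst_size)
{
    size_t chars = length / 2, i;

    if (dst_size == 0)
        return;
    if (chars > dst_size - 1)
        chars = dst_size - 1;
    for (i = 0; i < chars; i++)
        dst[i] = (char)buf[SYMLINK_HEADER_SIZE + offset + 2 * i];
    dst[chars] = '\0';
}
```

Calling build_symlink_buffer with "\\??\\C:\\Windows\\System32\\cmd.exe" as the Substitute Name and "C:\\Windows\\System32\\calc.exe" as the Print Name gives a buffer where the two names simply sit side by side; nothing in the format ties one to the other.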

Now let's look at these files in the Explorer shell:

Hopefully you can immediately see the problem. And it's not just the icons: if you double click link1.exe in Explorer you get cmd.exe, but link2.exe runs calc.exe instead. Whoops. It's pretty clear that the shell must be explicitly handling NTFS Symbolic Links as if they are Shortcut files and then picking the Print Name over the actual target file in the link.

This feature does have some nice properties, for example the symbolic link can have any extension you like, so your link can have a .pdf extension but double clicking it will run cmd.exe (regardless of the extension you use). But then you could do that anyway with a LNK file as Explorer removes the .lnk extension. It might have been useful to attack a sandbox which calls ShellExecute on a file, but first checks the file name extension for allowed files. However as you need Administrator privileges to use this it's not especially useful in practice.
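As a rough illustration of the kind of check such a sandbox might make, here's a hypothetical name-only allowlist in C. The function name extension_is_allowed and the allowed extensions are made up for this sketch; the point is only that a check like this never looks at what the file actually is, so a symbolic link named report.pdf would pass.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of two ASCII strings; returns 1 if equal. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Returns 1 if file_name (a bare file name, no directory part) ends in an
   allowed extension. Only the name is inspected, never the file itself. */
int extension_is_allowed(const char *file_name)
{
    static const char *const allowed[] = { ".pdf", ".txt", ".png" };
    const char *dot;
    size_t i;

    assert(file_name != NULL);
    dot = strrchr(file_name, '.');
    if (dot == NULL)
        return 0;
    for (i = 0; i < sizeof allowed / sizeof allowed[0]; i++) {
        if (equals_ignore_case(dot, allowed[i]))
            return 1;
    }
    return 0;
}
```

Here extension_is_allowed("report.pdf") succeeds and extension_is_allowed("link2.exe") fails, regardless of where either name really leads.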

Tracking Down the Root Cause

Okay, so we can guess what it's doing, but let's track down the buggy code to confirm it. We'd probably want to do this if we were sending an actual bug report to Microsoft (which I'm not of course, but they're more than welcome to fix a 9-year-old bug if they like). Whenever I encounter a file based issue, my go-to tool is Process Monitor, which helps find the code responsible for handling the file contents.

To aid in this we need to configure Process Monitor to support symbol loading which you can do through the menu Options > Configure Symbols. If you go there you'll see the following dialog:

You'd assume that everything is already set up for you, but if you try and get a stack trace from a monitored event you'll be disappointed. The version of the dbghelp library which ships with Windows (even Windows 10) doesn't support pulling symbols from a remote symbol server, so it's only useful for applications you've compiled yourself. We can remedy that though by installing WinDBG from the SDK or WDK and using its copy of dbghelp.dll. If you've installed the Windows 10 SDK then you'll find it under %PROGRAMFILES(x86)%\Windows Kits\10\Debuggers\x64 for 64 bit platforms. Select the DLL and you should be good to go.

Now we can set a filter on link2.exe and see what's processing it. We're primarily looking for an event doing a FileSystemControl operation to read the Reparse Point data with FSCTL_GET_REPARSE_POINT.

Okay good, the expected event is there. Now if we open that event we can look at the Stack tab to see the culprit.

Well CShellLink::_LoadFromSymLink sounds very much like the culprit we're looking for, it's the last call before going into DeviceIoControl which ends up reading the Reparse Point information. Let's finally confirm by disassembling it in your application of choice. If you use IDA Pro it should try and load the public symbol file using the DIA library. We end up with something which looks like:

HRESULT CShellLink::_LoadFromSymLink(LPCWSTR pszInputPath) {
  PREPARSE_DATA_BUFFER ReparseBuffer = // Allocate reparse buffer
  HANDLE hFile = CreateFileW(pszInputPath, FILE_READ_EA, ...,
                             FILE_FLAG_OPEN_REPARSE_POINT);
  size_t PathLength = 0;
  offset_t PathOffset = NULL;
  WCHAR pszPath[MAX_PATH];

  if (hFile != INVALID_HANDLE_VALUE) {
    DeviceIoControl(hFile, FSCTL_GET_REPARSE_POINT, 0, 0, ReparseBuffer, ...);

    if (ReparseBuffer->ReparseTag == IO_REPARSE_TAG_SYMLINK) {
      PathLength = ReparseBuffer->PrintNameLength >> 1;
      PathOffset = (ReparseBuffer->PrintNameOffset >> 1) + 10;
    } else if (ReparseBuffer->ReparseTag == IO_REPARSE_TAG_MOUNT_POINT) {
      PathLength = ReparseBuffer->PrintNameLength >> 1;
      PathOffset = (ReparseBuffer->PrintNameOffset >> 1) + 8;
    } else {
      return E_FAIL;
    }

    StringCchCopyN(pszPath, MAX_PATH,
                   (WCHAR*)ReparseBuffer + PathOffset, PathLength);
    _SetSimplePIDL(&pszPath);
    _ResetDirty();
  }

  return S_OK;
}

We can see the bug here pretty clearly: it's using the PrintName value rather than the SubstituteName which the kernel actually follows. I guess it might be intentional, as you can see this code also supports normal mount points and has the same issue there. Fortunately, for mount points there seems to be no way of directly tricking Explorer into parsing the directory as anything else, so that case might only trick another application which uses the ShellLink CoClass directly.
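The magic + 10 and + 8 in that code are just the header sizes counted in WCHARs: PathBuffer starts 20 bytes into a symbolic link buffer and 16 bytes into a mount point buffer, which has no Flags field. Here's a portable sketch of the same arithmetic working on raw bytes, with a switch to pick either name so the two choices can be compared. The names extract_name and name_kind are hypothetical, the narrowing to ASCII is a simplification, and this is not how the shell itself is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IO_REPARSE_TAG_MOUNT_POINT 0xA0000003u
#define IO_REPARSE_TAG_SYMLINK     0xA000000Cu

enum name_kind { NAME_SUBSTITUTE, NAME_PRINT };

/* Little-endian field readers, so the sketch works on any host. */
static uint16_t read_u16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copies the chosen name out of a raw reparse buffer as ASCII.
   Returns 0 on success, -1 for an unsupported tag or bad bounds. */
int extract_name(const uint8_t *buf, size_t buf_size, enum name_kind kind,
                 char *dst, size_t dst_size)
{
    size_t path_buffer, field, offset, length, chars, i;
    uint32_t tag;

    assert(buf != NULL && dst != NULL);
    if (buf_size < 16 || dst_size == 0)
        return -1;

    tag = read_u32(buf);
    if (tag == IO_REPARSE_TAG_SYMLINK)
        path_buffer = 20; /* 10 WCHARs: the "+ 10" above */
    else if (tag == IO_REPARSE_TAG_MOUNT_POINT)
        path_buffer = 16; /* 8 WCHARs: the "+ 8" above, no Flags field */
    else
        return -1;

    /* The offset/length pairs sit at the same byte positions for both tags:
       Substitute Name at bytes 8..11, Print Name at bytes 12..15. */
    field = (kind == NAME_SUBSTITUTE) ? 8 : 12;
    offset = read_u16(buf + field);
    length = read_u16(buf + field + 2);
    if (path_buffer + offset + length > buf_size)
        return -1;

    chars = length / 2;
    if (chars >= dst_size)
        return -1;
    for (i = 0; i < chars; i++)
        dst[i] = (char)buf[path_buffer + offset + 2 * i];
    dst[chars] = '\0';
    return 0;
}
```

With NAME_PRINT this returns the same string the shell ends up with; with NAME_SUBSTITUTE it returns the NT path (including the \??\ prefix) that the kernel would follow.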

Anyway I hope this is useful as a very brief tutorial on how to find where vulnerable code lies in Windows, at least when dealing with files. It's a shame that this bug wasn't more serious, but fortunately the fact that Symbolic Links need administrator permissions might have worked in Microsoft's favour.