At the beginning, all I wanted was to learn and implement a DLL unhooking technique called Perun’s Fart, originally created by SEKTOR7. Then I got ambitious and wanted to add some basic evasion techniques I had learned, in order to create a complete POC ready to execute a Meterpreter or Cobalt Strike shellcode.
Night Walker is a project which includes various AV/EDR bypass techniques such as NTDLL unhooking, function call obfuscation, shellcode encryption, CreateThread and APC injection, IAT hooking, heap encryption, parent process ID spoofing, AMSI patching, and ETW patching.
I tested it against Kaspersky (premium), Defender, Defender ATP, SentinelOne, and I successfully bypassed them. I also tried to use NT APIs whenever possible.
In some cases, like SentinelOne, there were some alerts in the console.
For my French friends: check out Processus’ blog about AV/EDR bypass. He has a good explanation in French of most of the techniques described here, and he also assisted me during the development of this project.
NTDLL Unhooking - PerunsFart
As mentioned before, I heard about this technique on the SEKTOR7 blog. I did some googling and found this awesome blog from dosxuz123. He even helped me when I had some issues trying to reproduce this technique using Nt APIs such as NtAllocateVirtualMemory instead of VirtualAlloc.
Instead of copying only the syscall stubs as he did, I copied the whole .text section. I got this idea from here. Here is the final overview of what I did.
First, I created a suspended process using
Then, I retrieved and stored the size of the NTDLL module in dllSize. After that, I allocated an empty buffer the size of NTDLL, pRemoteCode, in my local process. I continued by reading NTDLL from the suspended process and copying it into that previously allocated buffer, pRemoteCode. Now that we have the clean NTDLL in our local process, we just have to do some copy-paste. The unhook function is responsible for copying the clean .text section over the hooked one.
But in order to understand this, you need to know how to get to the .text section of a DLL module. Open PE-Bear and load ntdll.dll. Looking at the figure below, the section headers come right after the Optional Header field of the NT headers structure.
We could manually parse the DOS header all the way down to the section headers, but winnt.h has a macro called IMAGE_FIRST_SECTION, which helps access the first section header in the array of section headers of a PE file. Here’s the definition of this macro:
It essentially adds the size of the NT headers (the offset of the OptionalHeader field plus FileHeader.SizeOfOptionalHeader) to the base address of IMAGE_NT_HEADERS to get the address of the first section header.
There’s another macro in winnt.h called IMAGE_SIZEOF_SECTION_HEADER. It represents the size of each section header in a PE file, which is useful since we are going to loop through multiple sections until we find the one we want. So, in order to get a pointer to the section header, here’s what we are going to do:
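As a rough, portable sketch of that walk (not the project's actual code): the helper below works on a raw byte buffer with hand-written offsets instead of the winnt.h types, so it compiles without windows.h. The names find_section_header, pe_rd16, pe_rd32 and PE_SECTION_HEADER_SIZE are made up for this illustration; on Windows you would use IMAGE_FIRST_SECTION and IMAGE_SIZEOF_SECTION_HEADER directly.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in; same value as IMAGE_SIZEOF_SECTION_HEADER in winnt.h. */
#define PE_SECTION_HEADER_SIZE 40u

/* Read little-endian integers without assuming alignment. */
static uint16_t pe_rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t pe_rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the section header named `name` inside `image`,
   or 0 if the image is malformed or the section is absent. */
static size_t find_section_header(const uint8_t *image, size_t size, const char *name)
{
    if (size < 0x40 || pe_rd16(image) != 0x5A4D)          /* "MZ" */
        return 0;
    size_t nt = pe_rd32(image + 0x3C);                    /* e_lfanew */
    if (nt > size || size - nt < 24 ||
        pe_rd32(image + nt) != 0x00004550)                /* "PE\0\0" */
        return 0;

    uint16_t count   = pe_rd16(image + nt + 6);           /* FileHeader.NumberOfSections */
    uint16_t optsize = pe_rd16(image + nt + 20);          /* FileHeader.SizeOfOptionalHeader */

    /* Same arithmetic as IMAGE_FIRST_SECTION:
       signature (4) + file header (20) + optional header size. */
    size_t first = nt + 24 + optsize;

    char padded[8] = {0};                                 /* names are 8 bytes, NUL-padded */
    strncpy(padded, name, sizeof padded);

    for (uint16_t i = 0; i < count; i++) {
        size_t hdr = first + (size_t)i * PE_SECTION_HEADER_SIZE;
        if (hdr > size || size - hdr < PE_SECTION_HEADER_SIZE)
            return 0;                                     /* header would run past the buffer */
        if (memcmp(image + hdr, padded, 8) == 0)
            return hdr;
    }
    return 0;
}
```

Calling find_section_header(buffer, length, ".text") gives the offset of the .text section header, from which the section's VirtualAddress and VirtualSize fields can be read. Returning an offset rather than a pointer keeps the bounds checks explicit.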
The unhook function will look like this:
First, I manually went from the DOS header to the NT headers. From there, I used the IMAGE_FIRST_SECTION and IMAGE_SIZEOF_SECTION_HEADER macros to loop through the section headers until I found one with the name .text.
After I found it, I changed the protection of the hooked .text section to RWX. Then, I copied the clean .text section to the hooked one and I finally restored the protection. We now have an unhooked NTDLL.
It’s a good idea to patch AMSI within our executable if we plan to use PowerShell post-exploitation scripts. We are going to apply a single-byte patch in the AmsiScanBuffer function inside amsi.dll. By doing so, each time AMSI scans a PowerShell script, AmsiScanBuffer will always return false. The original code is from MrUn1k0d3r, and I only modified it to use NT APIs and a custom GetProcAddress.
ETW Patching - NtTraceEvent
Patching ETW events won’t do anything against kernel callbacks and minifilters, or anything working from the kernel.
Our goal here is to try to limit the events sent by our process to ETW (Event Tracing for Windows).
Instead of patching EtwEventWrite, we are going to patch NtTraceEvent directly, which is the last function called from userland in regard to event registration.
The patch is simply a ret (return) instruction, 0xc3, so the function returns as soon as it is called. Again, the original code is from MrUn1k0d3r.
Within a PE file, there’s an array of data structures, one per imported DLL. Each of these structures gives the name of the imported DLL and points to an array of function pointers. The array of function pointers is known as the import address table (IAT). Each imported API has its own reserved spot in the IAT where the address of the imported function is written by the Windows loader.
IAT hooking is a technique used to replace the function pointers specified in the IAT with the address of another function we want to execute. When the IAT is hooked, the program flow goes as follows:
- The program calls
- The program looks up the CreateRemoteThread address in the IAT.
- Because the IAT has been tampered with, the CreateRemoteThread address in the IAT is pointing to a rogue
- The program jumps to the HookedCreateRemThr retrieved in step 3.
- HookedCreateRemThr intercepts the CreateRemoteThread parameters and executes some malicious code.
- HookedCreateRemThr calls the legitimate kernel32!CreateRemoteThread routine. (I cheated and ended up calling NtCreateThreadEx instead :)
In order to successfully apply our hook inside the IAT, we first need to know how to access the Import Address Table with C/C++.
Open PE-Bear again, but this time load calc.exe. Hover your mouse where it says Imports and you’ll see that it says Data Directory: Imports. This just means that the imports come from the Data Directory array at index 1. The Data Directory is an array of data structures located in the PE file’s Optional Header.
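To make the "index 1" part concrete, here is a minimal sketch that reads one Data Directory entry straight out of a raw PE buffer. It is only an illustration under stated assumptions: read_data_directory, data_dir_t, dd_le16, dd_le32 and DIR_ENTRY_IMPORT are hypothetical names, the offsets are the PE32 and PE32+ Optional Header layouts written by hand, and the returned value is an RVA that still has to be resolved against the image base or the section table.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in; same value as IMAGE_DIRECTORY_ENTRY_IMPORT in winnt.h. */
#define DIR_ENTRY_IMPORT 1u

/* Mirrors the two fields of IMAGE_DATA_DIRECTORY. */
typedef struct { uint32_t rva; uint32_t size; } data_dir_t;

static uint16_t dd_le16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t dd_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills *out with Data Directory entry `index`.
   Returns 1 on success, 0 if the image is malformed or the entry is absent. */
static int read_data_directory(const uint8_t *image, size_t size,
                               uint32_t index, data_dir_t *out)
{
    if (size < 0x40 || dd_le16(image) != 0x5A4D)           /* "MZ" */
        return 0;
    size_t nt = dd_le32(image + 0x3C);                     /* e_lfanew */
    if (nt > size || size - nt < 24 ||
        dd_le32(image + nt) != 0x00004550)                 /* "PE\0\0" */
        return 0;

    size_t   opt     = nt + 24;                            /* start of the Optional Header */
    uint16_t optsize = dd_le16(image + nt + 20);           /* SizeOfOptionalHeader */
    if (optsize < 2 || size - opt < optsize)
        return 0;

    /* The Data Directory array starts at a different offset in PE32 and PE32+. */
    uint16_t magic = dd_le16(image + opt);
    size_t dirs;
    if (magic == 0x10B)      dirs = 96;                    /* PE32  */
    else if (magic == 0x20B) dirs = 112;                   /* PE32+ */
    else return 0;
    if (optsize < dirs)
        return 0;

    uint32_t count = dd_le32(image + opt + dirs - 4);      /* NumberOfRvaAndSizes */
    if (index >= count || (size_t)optsize < dirs + ((size_t)index + 1) * 8)
        return 0;

    const uint8_t *entry = image + opt + dirs + (size_t)index * 8;
    out->rva  = dd_le32(entry);                            /* VirtualAddress */
    out->size = dd_le32(entry + 4);                        /* Size */
    return 1;
}
```

Passing DIR_ENTRY_IMPORT returns the RVA and size of the import directory, which is the same entry PE-Bear labels as Imports.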
There is a Windows API function called ImageDirectoryEntryToDataEx that helps us retrieve the address of a directory entry. We just have to pass the base address of our current program and the directory entry index we wish to retrieve, which in our case is 1, the import directory (IMAGE_DIRECTORY_ENTRY_IMPORT).
Once we are in the import table, a nice trick from SEKTOR7 is to use the original function address as a reference to search directly inside the IAT.
PIMAGE_THUNK_DATA represents a pointer to an entry in the First Thunk. The definition of PIMAGE_THUNK_DATA in winnt.h is as follows:
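For reference, a portable mirror of the 64-bit variant (IMAGE_THUNK_DATA64) with ULONGLONG swapped for uint64_t so it compiles without windows.h. The type name thunk_data64_t and the read-only helper find_thunk_index are hypothetical and only illustrate looking a slot up by function address; they are not the project's code.

```c
#include <stdint.h>
#include <stddef.h>

/* Portable mirror of IMAGE_THUNK_DATA64 from winnt.h. */
typedef struct {
    union {
        uint64_t ForwarderString; /* address of a forwarder string */
        uint64_t Function;        /* address of the imported function once loaded */
        uint64_t Ordinal;         /* ordinal value when importing by ordinal */
        uint64_t AddressOfData;   /* RVA of an IMAGE_IMPORT_BY_NAME entry */
    } u1;
} thunk_data64_t;

/* Read-only lookup: index of the first slot whose u1.Function equals `target`,
   or -1 if it is not present. The array must end with an all-zero entry,
   as thunk arrays in a loaded image do. */
static long find_thunk_index(const thunk_data64_t *thunks, uint64_t target)
{
    for (long i = 0; thunks[i].u1.Function != 0; i++) {
        if (thunks[i].u1.Function == target)
            return i;
    }
    return -1;
}
```

All four members share the same 8 bytes, so which one is meaningful depends on whether the loader has already resolved the import.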
The u1 union is used to represent different types of data that can be stored in the First Thunk. Since we are using the original function address as a reference, we are interested in u1.Function because it holds the memory address of the function we are looking for. Here is the final code:
OPSEC: In an environment with Microsoft Defender for Endpoint, my payload executed successfully, but there was a medium alert regarding the fact I queued a user APC. Probably a kernel notification when a program uses
The main advantage of this technique is that our shellcode gets executed by the main thread of a suspended program, which means the EDR didn’t have time to hook it yet. Here’s a high-level overview of the technique:
- Create a program in suspended state.
- Allocate memory for our shellcode inside the suspended program.
- Copy our shellcode to the buffer.
- Queue a user APC pointing to our shellcode memory inside the suspended program.
Heap Encryption while sleeping
This is mainly for Cobalt Strike shellcode. The idea here is to encrypt the heap, which contains the CS configuration, while we are sleeping.
Please refer to the original blog, which explains everything in depth.
Let’s see how this basic shellcode runner performed against AV and EDR.
Completely bypassed using Earlybird or a combination of CreateRemoteThread + IAT hooking.
Completely bypassed using a combination of CreateRemoteThread + IAT hooking. Earlybird didn’t work here.
Completely bypassed using Earlybird.
Defender for Endpoint (MDE)
The shellcode is executed and keeps running, but there’s an alert regarding the usage of the QueueUserAPC function.
The program gets blocked if we try to patch ETW, but when we remove the ETW patching, it gets executed successfully.
The shellcode is executed and keeps running, but there’s the following alert:
This is a shellcode runner I developed to learn more about the PerunsFart technique, but I ended up adding more functionality.
It enhanced my knowledge of the Windows API, malware development, debugging and C pointers.
A natural evolution from this is to look at techniques such as indirect syscalls with dynamic syscall IDs and try to implement them. Also, retrieving the shellcode from a remote server instead of hardcoding it inside our program should be fun to implement. Using SystemFunction033 for shellcode encryption and decryption is also better OPSEC. All of these already have a public POC out there, so do your research and have fun.
People you need to follow who are doing an amazing job.
- Processus - Also assisted me and helped with the testing and development in this project.
- dosxuz123 - Helped me with some issues I had.
- AliceCliment - Check her amazing latest blog here.
- TheD1rkMtr - He’s constantly releasing really cool projects about malware dev.