Intro

This article is about reversing a Rust procedural macro that obfuscates strings in the compiled binary.
Another point that will be discussed is the development of a Binary Ninja plugin. I’m writing the article as I go along,
to capture the process as closely as possible, with its changes of direction, disappointments and achievements :)
The purpose of this story is not to provide a turnkey plugin, but rather to explain my reverse engineering methodology and thoughts,
and to make progress with the Binary Ninja API. No more spoilers, let’s go!

First attempt

The target

Litcrypt is a Rust proc macro that obfuscates text using a basic XOR method.
Strings are XORed with a key at compile time to hide them, and are then “decrypted” at runtime.
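To make the idea concrete, here is a minimal Python sketch of a repeating-key XOR. It is not litcrypt's actual Rust implementation: the function name `xor_with_key`, the sample string and the key are made up for illustration. It shows why one routine is enough for both hiding and recovering a string, since XORing twice with the same key gives back the original bytes.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical values, for illustration only.
plaintext = b"You failed..."
key = b"demo-key"

hidden = xor_with_key(plaintext, key)     # what would end up in the binary
recovered = xor_with_key(hidden, key)     # what the program rebuilds at runtime
print(hidden != plaintext, recovered == plaintext)
```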

The code

I wrote a dumb crackme program to study the macro. Here is the source code:

#[macro_use]
extern crate litcrypt;
use_litcrypt!();

use std::env;
use std::process;
use std::process::ExitCode;

fn banner()
{
    println!("{}", lc!("Welcome to the FlagKeeper\n"));
    println!("{}", lc!("Can you find the hidden secret ?"));
}

fn main()
{
    // Get command line arguments
    let args: Vec<String> = env::args().collect();

    // Check if there are enough arguments
    if args.len() != 3 {
        println!("{}", lc!("Usage: ./delitcrypt <int> <int>"));
        std::process::exit(1);
    }

    // Parse command line arguments into integers
    let num1: i32 = args[1].parse().expect("Invalid argument 1");
    let num2: i32 = args[2].parse().expect("Invalid argument 2");

    // Compare integers
    if num1 == num2 + 1300 + 37 {
        println!("{}", lc!("Congratz, here is your flag:"));
        println!("{}", lc!("FLAG{X0r_Isn7_S@f3}"));
    }
    else {
        println!("{}", lc!("You failed..."));
    }
}

Every string that must be hidden is wrapped in the lc! macro.

Let’s build the binary and check if strings are obfuscated.

As expected, strings are not visible in the compiled binary.

Reversing

Having a look at the main function, we quickly identify the decrypt function.

Knowing that this is a simple xor routine, it is unnecessary to reverse the function, but we need to figure out which data is xored.
Looking at the call, we notice that five arguments are involved. We can reasonably deduce the role of each parameter:

  • The first one is a stack variable, probably used to store the resulting string
  • The second one is a pointer to initialized data, and seems to be the “encrypted” string
  • The third one is an integer, and turns out to be the length of the encrypted string
  • The last two parameters should be the xor key and its length

The key can be read directly from the binary view:

>>> bv.read(0x430c6,5)
b'!jx15'

Let’s xor this data and see what happens:

>>> def xor_bytes_with_key(data, key):
...     result = bytearray(len(data))
...     for i in range(len(data)):
...         result[i] = data[i] ^ key[i % len(key)]
...     return bytes(result)
...
>>>
>>> xor_bytes_with_key(bv.read(0x43001,0x1b),bv.read(0x430c6,5))
b'Welcome to the FlagKeeper\\n'

From this point, it is possible to decrypt each string. Time to automate :)

Plugin dev

The strategy for this plugin consists of several steps:

  • Get the full “decrypt_bytes” symbol and deal with rust name mangling
  • Find all references to this symbol
  • Parse operands and decrypt the strings
  • Display the strings

Catch the symbol

To obtain the symbol, the plugin gets all symbols from the binary and checks whether “decrypt_bytes” is part of each name. I don’t know if this is the best way to get the symbol object but, well, it works. Feel free to open a PR if you have a better idea ¯\_(ツ)_/¯

>>> syms = bv.get_symbols()
>>> type(syms)
<class 'list'>
>>> target_sym = next((sym for sym in syms if "decrypt_bytes" in sym.name ), None)
>>> target_sym
<FunctionSymbol: "delitcrypt::litcrypt_internal::decrypt_bytes::h4a7ea04fe8edb027" @ 0x9010>
>>> target_sym.full_name
'delitcrypt::litcrypt_internal::decrypt_bytes::h4a7ea04fe8edb027'
>>> target_sym.name
'_ZN10delitcrypt17litcrypt_internal13decrypt_bytes17h4a7ea04fe8edb027E'

I use a substring test (“in”) to get the symbol, so that the lookup still works if the mangled name changes, for instance with a future litcrypt version.
From the symbol object, we can easily obtain all cross-references with the powerful Binary Ninja API :D
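The lookup can be wrapped in a small helper. This is only a sketch: `find_symbol_by_substring` is a hypothetical name, and it relies on nothing more than the `name` and `full_name` attributes seen in the session above, so it can be tried on stand-in objects outside Binary Ninja.

```python
from types import SimpleNamespace


def find_symbol_by_substring(symbols, needle="decrypt_bytes"):
    """Return the first symbol whose mangled or demangled name contains needle."""
    for sym in symbols:
        names = (getattr(sym, "name", ""), getattr(sym, "full_name", ""))
        if any(needle in (n or "") for n in names):
            return sym
    return None


# Stand-in objects mimicking the two attributes used above.
fake_symbols = [
    SimpleNamespace(name="main", full_name="main", address=0x8000),
    SimpleNamespace(
        name="_ZN10delitcrypt17litcrypt_internal13decrypt_bytes17h4a7ea04fe8edb027E",
        full_name="delitcrypt::litcrypt_internal::decrypt_bytes::h4a7ea04fe8edb027",
        address=0x9010,
    ),
]
target = find_symbol_by_substring(fake_symbols)
print(hex(target.address))
```

Inside Binary Ninja, the same helper would be fed with `bv.get_symbols()`.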

Get references

As said above, the API can be used to obtain each code reference to the decrypt_bytes symbol:

>>> refs = bv.get_code_refs(target_sym.address)
>>>
>>> refs
<generator object BinaryView.get_code_refs at 0x7f530c041470>

The next step is to iterate over the generator object and get the operands of each call.
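One possible way to walk the generator is sketched below, assuming that every reference exposes `address` and `hlil` as in the sessions of this article. `collect_calls` is a hypothetical helper; it skips references for which no HLIL is available instead of crashing.

```python
from types import SimpleNamespace


def collect_calls(refs):
    """Return (address, hlil) pairs for every reference that has HLIL."""
    calls = []
    for ref in refs:
        try:
            hlil = ref.hlil
        except Exception:
            # Some references may sit in code that could not be lifted.
            hlil = None
        if hlil is not None:
            calls.append((ref.address, hlil))
    return calls


# Stand-in references; in Binary Ninja they would come from
# bv.get_code_refs(target_sym.address). Addresses are made up.
fake_refs = (r for r in [
    SimpleNamespace(address=0x9500, hlil="call #1"),
    SimpleNamespace(address=0x9600, hlil=None),
    SimpleNamespace(address=0x9700, hlil="call #2"),
])
calls = collect_calls(fake_refs)
print(calls)
```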

Operands

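To get the operand values, we need to work at one of the intermediate language (IL) levels. In the snippets below, current_ref is one of the references yielded by the generator.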

>>> current_ref.hlil.operands
[<HighLevelILConstPtr: delitcrypt::litcrypt_internal::decrypt_bytes::h4a7ea04fe8edb027>, [<HighLevelILAddressOf: &var_98>, <HighLevelILConstPtr: &data_43001>, <HighLevelILConst: 0x1b>, <HighLevelILConstPtr: "!jx15called `Result::unwrap()` o…">, <HighLevelILConst: 5>]]
>>> current_ref.mlil.operands
[[], <MediumLevelILConstPtr: 0x9010>, [<MediumLevelILVar: rdi>, <MediumLevelILConstPtr: 0x43001>, <MediumLevelILConst: 0x1b>, <MediumLevelILConstPtr: "!jx15called `Result::unwrap()` o…">, <MediumLevelILConst: 5>]]
>>> current_ref.llil.operands
[<LowLevelILConstPtr: 0x9010>]

Comparing the different representations, the HLIL seems to be a good candidate, as it is the easiest to parse.

Decrypt

First we get the encrypted string:

>>> s1 = current_ref.hlil.operands[1][1].value.value
>>> s1 = bv.read(s1,current_ref.hlil.operands[1][2].value.value)
>>>
>>> s1
b'v\x0f\x14RZL\x0fXEZ\x01\x1e\x10T\x15g\x06\x19V~D\x0f\x08TG}\x04'

Then grab the key in the same way:

>>> s2 = current_ref.hlil.operands[1][3].value.value
>>>
>>> s2
274630
>>> s2 = bv.read(s2,current_ref.hlil.operands[1][4].value.value)
>>>
>>> s2
b'!jx15'

And here we go, we can now xor the extracted bytes:

>>> xor_bytes_with_key(s1,s2)
b'Welcome to the FlagKeeper\\n'

And get the string in plain text \o/
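The Operands and Decrypt steps can be folded into one function. The sketch below is an illustration under several assumptions: `decrypt_call`, `FakeView` and `const` are hypothetical names, the operand layout is the one printed above (`operands[1]` holds the five arguments, and each constant exposes `.value.value`), and the only thing expected from the view is `read(address, length)`.

```python
from itertools import cycle
from types import SimpleNamespace


def decrypt_call(view, hlil_call):
    """Decrypt the string passed to one decrypt_bytes call."""
    args = hlil_call.operands[1]
    # args[0] is the output buffer; the next four are data, length, key, length.
    data_ptr, data_len, key_ptr, key_len = (a.value.value for a in args[1:5])
    data = view.read(data_ptr, data_len)
    key = view.read(key_ptr, key_len)
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


class FakeView:
    """Tiny stand-in for a BinaryView: a dict of address -> bytes."""

    def __init__(self, blobs):
        self.blobs = blobs

    def read(self, address, length):
        return self.blobs[address][:length]


def const(value):
    """Mimic an IL constant operand exposing .value.value."""
    return SimpleNamespace(value=SimpleNamespace(value=value))


view = FakeView({
    0x43001: b"v\x0f\x14RZL\x0fXEZ\x01\x1e\x10T\x15g\x06\x19V~D\x0f\x08TG}\x04",
    0x430C6: b"!jx15called `Result::unwrap()`",
})
call = SimpleNamespace(operands=[
    const(0x9010),
    [const(0), const(0x43001), const(0x1B), const(0x430C6), const(5)],
])
print(decrypt_call(view, call))
```

Reading the key with its explicit length matters here: the bytes that follow the key in memory belong to an unrelated string.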

Set the string somewhere

Now, the decrypted string must be linked to the decrypt function call. I chose to put it as a comment in the decompiled code.
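The article does not show the exact call used for this step, so the following is only a sketch. It assumes the view offers a `set_comment_at(address, comment)` method, as Binary Ninja's BinaryView does; `annotate_call`, the recording `FakeView` and the address are hypothetical.

```python
def annotate_call(view, address, decrypted: bytes) -> str:
    """Attach the decrypted string as a comment at the call address."""
    text = decrypted.decode("utf-8", errors="replace")
    view.set_comment_at(address, text)
    return text


class FakeView:
    """Stand-in that records comments instead of touching a database."""

    def __init__(self):
        self.comments = {}

    def set_comment_at(self, address, comment):
        self.comments[address] = comment


view = FakeView()
annotate_call(view, 0x9500, b"Welcome to the FlagKeeper\\n")
print(view.comments)
```

Decoding with `errors="replace"` keeps the plugin going when a wrong key or length produces bytes that are not valid UTF-8.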

This is a success!

Let’s try it in real life now

Life is hard

After some research on github, I found some projects using litcrypt.

I now realise that the library is widely used for EDR bypass and malware development, which has several consequences:

  • Binaries are likely to be stripped
  • They are mostly compiled for Windows
  • The xor key is likely to be long (let’s assume at least 16 bytes)

A new strategy

Here is the new attack plan:

  • Cross-compile from Linux to Windows
  • Use at least a 16-byte xor key
  • Strip the binary

Compilation

Let’s compile the original program again, this time with a 32-byte key and a Windows target:

ghozt@maze:~/research/delitcrypt$ export LITCRYPT_ENCRYPT_KEY="IsThisKeyReallythirty2BytesLong?"
ghozt@maze:~/research/delitcrypt$ echo $LITCRYPT_ENCRYPT_KEY
IsThisKeyReallythirty2BytesLong?
ghozt@maze:~/research/delitcrypt$ cargo build --release --target x86_64-pc-windows-gnu
warning: unused manifest key: package.strip
Compiling delitcrypt v0.1.0 (/home/ghozt/research/delitcrypt)
Finished release [optimized] target(s) in 0.16s
ghozt@maze:~/research/delitcrypt$ strip target/x86_64-pc-windows-gnu/release/delitcrypt.exe
ghozt@maze:~/research/delitcrypt$ file target/x86_64-pc-windows-gnu/release/delitcrypt.exe
target/x86_64-pc-windows-gnu/release/delitcrypt.exe: PE32+ executable (console) x86-64 (stripped to external PDB), for MS Windows, 10 sections

Reverse

Opening the freshly compiled program in Binary Ninja, everything has changed…

How to identify the decrypt_bytes function, without any symbols? What do we know?

  • The encrypted strings as well as the xor key are stored in the read-only data section (.rodata for ELF, .rdata for PE)
  • The function performs a xor operation

Let’s try to find the function manually, and then try to automate the search.

Spot the needle

Scrolling through the .rdata section, we can quickly identify data that does not look like printable strings.

This data has a single cross-reference, to a function that looks like decrypt_bytes, although its signature differs from the one we saw in the Linux binary built with a short xor key. I thus deduce that the key size changes the function signature.

I also compiled the same program on a Windows machine to see whether the function and its signature come out the same way, and they do.
From this, we can retrieve our “decrypt_bytes” function!
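The visual check done while scrolling through .rdata can be approximated in code. This is a rough heuristic and not part of the plugin: `printable_ratio` and `looks_obfuscated` are hypothetical helpers, the 0.9 threshold is arbitrary, and the obfuscated sample is the encrypted banner from the first experiment.

```python
import string

PRINTABLE = set(string.printable.encode("ascii"))


def printable_ratio(blob: bytes) -> float:
    """Fraction of bytes that are printable ASCII (0.0 for empty input)."""
    if not blob:
        return 0.0
    return sum(b in PRINTABLE for b in blob) / len(blob)


def looks_obfuscated(blob: bytes, threshold: float = 0.9) -> bool:
    """Flag blobs whose share of printable bytes is below the threshold."""
    return printable_ratio(blob) < threshold


clear = b"called `Result::unwrap()` on an `Err` value"
hidden = b"v\x0f\x14RZL\x0fXEZ\x01\x1e\x10T\x15g\x06\x19V~D\x0f\x08TG}\x04"
print(looks_obfuscated(clear), looks_obfuscated(hidden))
```

Such a filter only narrows the candidates: compressed data or binary tables would be flagged as well, so the cross-reference check remains necessary.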

Automate

Now, the idea is to dig into the Binary Ninja API and find a way to automatically identify the function, get all its references, parse the parameters, and recover the strings.

First, we need to list all data_vars, and get references for each one of them:

>>> for data_var in bv.data_vars:
...     l = list(bv.get_code_refs(data_var))
...     if l:
...         l
[<ref: x86_64@0x14000f304>, <ref: x86_64@0x14001bed3>, <ref: x86_64@0x14001bedc>, <ref: x86_64@0x14001bef9>]
[<ref: x86_64@0x14001beec>]
[<ref: x86_64@0x14001bef7>]
[<ref: x86_64@0x14001bf04>]
[<ref: x86_64@0x14001bf1a>]
[<ref: x86_64@0x140001017>]
[<ref: x86_64@0x1400018d0>, <ref: x86_64@0x140001979>]
.....

Works fine!

Now, let’s get the detailed instructions of each code reference. I made the choice to use the LLIL:

...
<LowLevelILSetReg: rcx = [rbx {0x140028348}].q>
<LowLevelILSetReg: rax = 0x140028348>
<LowLevelILRet: <return> jump(pop)>
<LowLevelILSetReg: rbx = rax>
<LowLevelILIf: if ([rax {0x140028350}].q == 0) then 43 @ 0x14001bc5c else 46 @ 0x14001bc3e>
<LowLevelILSetReg: rcx = rax>
<LowLevelILCall: call(0x14001bec0)>
<LowLevelILSetReg: rax = [rbx {0x140028350}].q>
<LowLevelILSetReg: rax = 0x140028350>
<LowLevelILRet: <return> jump(pop)>
<LowLevelILStore: [rsp + 0x98].q = 1>
<LowLevelILStore: [rsi].o = xmm0>
...

Fine! The operation we are looking for is a XOR, so its LowLevelILOperation will be LLIL_XOR.

The operation of an instruction, or of one of its operands, is available through its operation attribute.
As an example:

>>> list(bv.get_code_refs(0x14001e4b8))[0].llil.operation
<LowLevelILOperation.LLIL_SET_REG: 1>

When the result of a xor is stored in a register, the LLIL_SET_REG instruction has two operands: an ILRegister (the destination) and a LowLevelILXor (the source expression).
Logically, the xor operation will be the second operand.

Let’s put it all together:

>>> for data_var in bv.data_vars:
...     l = list(bv.get_code_refs(data_var))
...     if l:
...         for r in l:
...             if len(r.llil.operands) == 2:
...                 if not type(r.llil.operands[1]) is dict: # Sometimes, operands can be dict
...                     if r.llil.operands[1].operation == LowLevelILOperation.LLIL_XOR:
...                         print("FOUND XOR for reference {}".format(r))
...
...
FOUND XOR for reference <ref: x86_64@0x140001b77>
FOUND XOR for reference <ref: x86_64@0x14001a897>
>>>

In the plugin code, with some error handling:

from binaryninja.enums import LowLevelILOperation

def find_xor(bv):
    for data_var in bv.data_vars:
        # Get references for the current data_var
        data_var_references = list(bv.get_code_refs(data_var))
        # Need to check if there is at least one reference
        if data_var_references:
            for ref in data_var_references:
                try:
                    rllil = ref.llil
                except Exception:
                    print("Cannot get llil for {}".format(ref))
                    continue
                if not rllil:
                    continue
                if len(rllil.operands) == 2:
                    # Sometimes, operands can be dict
                    if not isinstance(rllil.operands[1], dict):
                        if rllil.operands[1].operation == LowLevelILOperation.LLIL_XOR:
                            print("FOUND XOR operation on {} at {}".format(hex(data_var),hex(ref.address)))

At this point, we can list every piece of data involved in a xor operation. This code may be useful for malware analysis (or CTF). Indeed, XOR is widely used in malware development, partly because it keeps the entropy of the obfuscated data low compared with real encryption.
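The entropy remark can be checked with a few lines of Python. The sketch below is illustrative only (`shannon_entropy`, `xor_with_key` and the sample data are assumptions, not part of the plugin). It compares the byte entropy of a text before and after a single-byte XOR, and against random bytes of the same length, which stand in for the output of a strong cipher.

```python
import math
import os
from collections import Counter
from itertools import cycle


def shannon_entropy(blob: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not blob:
        return 0.0
    total = len(blob)
    return -sum(c / total * math.log2(c / total) for c in Counter(blob).values())


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


text = b"Usage: ./delitcrypt <int> <int>\n" * 64
xored = xor_with_key(text, b"K")        # single-byte key: a byte permutation
random_blob = os.urandom(len(text))     # stand-in for strongly encrypted data

print(round(shannon_entropy(text), 2),
      round(shannon_entropy(xored), 2),
      round(shannon_entropy(random_blob), 2))
```

With a single-byte key the byte histogram is only permuted, so the entropy does not change at all. Longer keys typically raise it somewhat, but the result usually stays well below what random-looking ciphertext produces.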

The next step is to find the callers of the function that performs the xor, and to check the signature used in the call. This requires several steps:

  • Find the function that contains the xor instruction
  • Get all references to this function
  • Get the HLIL for one of the references
  • Check the signature

Using a wonderful Python one-liner:

>>> list(bv.get_code_refs(xored_data_ref[0].function.symbol.address))[0].hlil
<HighLevelILCall: sub_140001ad0(&var_78, &data_14001e448, 0x1b)>

If we decompose:

refs = list(bv.get_code_refs(ref.function.symbol.address)) # all references to the function that uses the xored data (list() because get_code_refs returns a generator)
refs[0].hlil # get the HighLevelIL of one of the references

First, we can verify that the HLIL of the current reference (f_hlil below) is of type binaryninja.highlevelil.HighLevelILCall:

>>> type(f_hlil)
<class 'binaryninja.highlevelil.HighLevelILCall'>
>>> type(f_hlil) is binaryninja.highlevelil.HighLevelILCall
True

This seems to be fine. But some false positives can remain. Now let’s check the function signature.

If we check the operands:

>>> f_hlil.operands
[<HighLevelILConstPtr: sub_140001ad0>, [<HighLevelILAddressOf: &var_78>, <HighLevelILConstPtr: &data_14001e448>, <HighLevelILConst: 0x1b>]]

It results in a list containing:

  • The function pointer
  • A list of parameters (this is what we have to check)

The plugin needs to check the length of the operands list, then get the second element and verify its size:

>>> if len(f_hlil.operands) == 2:
...     if len(f_hlil.operands[1]) == 3:
...         print("Found matching number of arguments for {}".format(f_hlil))
...
Found matching number of arguments for sub_140001ad0(&var_78, &data_14001e448, 0x1b)

Many false positives should now be eliminated. A last check can be added on the parameter types, which should be:

  • A stack address
  • A data pointer
  • A constant

Here are the HLIL operation types of the three parameters (ops is f_hlil.operands[1]):

>>> for op in ops:
...     op.operation
...
<HighLevelILOperation.HLIL_ADDRESS_OF: 25>
<HighLevelILOperation.HLIL_CONST_PTR: 28>
<HighLevelILOperation.HLIL_CONST: 26>

>>> ops[0].operation == HighLevelILOperation.HLIL_ADDRESS_OF
True
>>> ops[1].operation == HighLevelILOperation.HLIL_CONST_PTR
True
>>> ops[2].operation == HighLevelILOperation.HLIL_CONST
True

This should eliminate a lot of false positives.

Let’s try an all-in-one plugin script :) We use the previous “Xor detection” code as code base. Now the signature must match the decrypt_bytes one:

if ops[0].operation == HighLevelILOperation.HLIL_ADDRESS_OF and ops[1].operation == HighLevelILOperation.HLIL_CONST_PTR and ops[2].operation == HighLevelILOperation.HLIL_CONST:
    print("delitcrypt decrypt_bytes function found!")
    ref.function.name = "decrypt_bytes"

Here is the plugin function:

import binaryninja
from binaryninja.enums import HighLevelILOperation, LowLevelILOperation

def do_the_job(bv):
    for data_var in bv.data_vars:
        # Get references for the current data_var
        data_var_references = list(bv.get_code_refs(data_var))
        # Need to check if there is at least one reference
        if data_var_references:
            for ref in data_var_references:
                try:
                    rllil = ref.llil
                except Exception:
                    print("Cannot get llil for {}".format(ref))
                    continue
                if not rllil:
                    continue
                if len(rllil.operands) == 2:
                    # Sometimes, operands can be dict
                    if not isinstance(rllil.operands[1], dict):
                        if rllil.operands[1].operation == LowLevelILOperation.LLIL_XOR:
                            print("FOUND XOR operation on {} at {}".format(hex(data_var),hex(ref.address)))
                            # Callers of the function that performs the xor
                            refs = list(bv.get_code_refs(ref.function.symbol.address))
                            if not refs:
                                continue
                            f_hlil = refs[0].hlil
                            if not type(f_hlil) is binaryninja.highlevelil.HighLevelILCall:
                                continue
                            if len(f_hlil.operands) == 2:
                                if len(f_hlil.operands[1]) == 3:
                                    # Check the parameter types: stack address, data pointer, constant
                                    ops = f_hlil.operands[1]
                                    if ops[0].operation == HighLevelILOperation.HLIL_ADDRESS_OF and ops[1].operation == HighLevelILOperation.HLIL_CONST_PTR and ops[2].operation == HighLevelILOperation.HLIL_CONST:
                                        # Signature matches: rename the function and the key
                                        ref.function.name = "decrypt_bytes"
                                        key = bv.get_data_var_at(data_var)
                                        key.name = "xor_key"
                                        break
    print("The End")

It is now possible to decrypt the string (manually at the moment).

In future developments, I will try to automate the whole process and handle the remaining errors :)

Real life example

Let’s try this on a real life example. This project will be used.
After a minute, the plugin locates the xor key as well as the decrypt_bytes function \o/

We are thus able to decrypt the strings manually (here s1 is the 32-byte key and s2 the encrypted data):

>>> s1 = bv.read(0x1401491e1,32)
>>> s2 = bv.read(0x1401490e5,0x1401491e0-0x1401490e5)
>>> xor_bytes_with_key(s2,s1)
b'\nUSAGE:\n Elevator.exe [OPTIONS] <COMMAND>\n\nARGS:\n <COMMAND> Command line to run.\n\nOPTIONS:\n -h, --help Print help information.\n -n, --new-console Set CREATE_NEW_CONSOLE flag for the new process.'

Thanks

  • Thanks @anvie for the go ahead on this research :)
  • Thanks to CryptID & Yorin & Dazax for proofreading this article :)
  • Thanks to franb for debug and the moral support :D
  • Thanks to the Binary Ninja community for API support on slack :)