Decoding AutoIT XOR Encryption Obfuscation

Recently, I started looking at an AutoIt sample that was heavily obfuscated. After running the compiled binary through Exe2Aut, I was able to review the underlying script, and I saw a lot of lines like the snippet below.

Starting with this small snippet of code:

$poe($ergsduf(hsbduoekdcbl("327A36363441344136...omitted", "2")))

This is an example of obfuscation/encryption used in AutoIt malware.

Starting from the inside out:

The “hsbduoekdcbl” function performs the XOR. Its first argument is a hex string, which is XORed with the provided key of “2”:

327A36363441344136...omitted

XORing it with that key reveals the following string:

0x446C6C43616C6C28...omitted

This is, of course, another hex string. The “ergsduf” function appears to convert that hex string into text, which reveals:

DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]

The “poe” function seems to be there just to execute this deobfuscated code.

Let's look at this snippet again, but in Python this time:

import re
snippet =  '$poe($ergsduf(hsbduoekdcbl("327A3636344134413631343334413441303A3030344034373530344734373441'
snippet += '31313130303030413032303035323536353030303041303230303734343B3530353635373433344136333441344134443431'
snippet += '3030304130323030343635353444353034363030304130323030313230303041303230303436353534443530343630303041'
snippet += '30323630343B344734333530353B364134373447303A3036363B363737433737303B3032304030323630343B344734333530'   
snippet += '353B364134373447303A3036364436443732303B3041303230303436353534443530343630303041303230303132353A3131'   
snippet += '31323132313230303041303230303436353534443530343630303041303230303132353A313631323030303B374030303132'
snippet += '30303746", "2")))'

#raw strings (r'...') keep the regex backslashes from being read as string escapes
print(r"We will use ([0-9A-F]+) to capture the Hex and (\d+) to capture the XOR key")
signature = r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'

matches = re.findall(signature,snippet)
print("The first capture group collects the ciphertext:\n",matches[0][0])
print("The second collects the XOR key:",matches[0][1])

We will use ([0-9A-F]+) to capture the Hex and (\d+) to capture the XOR key:

'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'

This returns:

[('327A36363441344136...omitted', '2')]

The first capture group collects the ciphertext:
 327A36363441344136…omitted
The second collects the XOR key: 2

With this match we can now perform the operation and return the data we need…

for match in matches:
    ciphertext = bytearray.fromhex(match[0])
    key        = int(match[1])
    firsthex  = ''
    #XOR every byte with the key; the result is another hex string starting with "0x"
    for character in ciphertext:
        firsthex += chr(character ^ key)
    #drop the leading "0x" and convert the remaining hex to bytes
    cleartext = firsthex[2:]
    cleartext = bytearray.fromhex(cleartext)
    print(cleartext.decode('utf8'))

DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]
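
Since the same two-stage decode comes up for every match, it may help to see it as a standalone helper. This is only a sketch: the name xor_hex_decode is hypothetical (it does not appear in the sample), and it assumes the XORed layer always starts with "0x", as it does in the snippet above. It is exercised here on just the first 18 bytes of the ciphertext.

```python
def xor_hex_decode(hex_blob, key):
    """Undo one obfuscated string: XOR layer first, then the inner hex layer."""
    # Stage 1: hex -> bytes, then XOR every byte with the numeric key
    xored = ''.join(chr(b ^ int(key)) for b in bytes.fromhex(hex_blob))
    # Assumption: the XORed text is itself a hex string prefixed with "0x"
    if not xored.startswith('0x'):
        raise ValueError('expected a 0x-prefixed hex string after the XOR')
    # Stage 2: strip "0x" and turn the remaining hex into readable text
    return bytes.fromhex(xored[2:]).decode('utf8')

# First 18 bytes of the ciphertext from the snippet, with the key "2"
print(xor_hex_decode('327A3636344134413631343334413441303A', '2'))
# DllCall(
```

Raising an error when the "0x" prefix is missing is a design choice for this sketch; it makes it obvious when a blob does not follow the pattern seen so far.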

Additionally, let's use RegEx to replace the original text so we can ignore the obfuscation altogether…

#a function as the replacement keeps re.sub from treating backslashes in the cleartext as escapes
decoded = re.sub(r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)', lambda m: cleartext.decode('utf8'), snippet)
print(decoded)

DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]

Let's kick this up a notch and make it scale to cover the entire file instead of just this snippet:

import re

with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
#loop through each line of the script
for line in data.split('\n'):
    signature = r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'
    regexmatches = re.findall(signature,line)
    if regexmatches:
        numofmatches += 1
        for match in regexmatches:
            ciphertext = bytearray.fromhex(match[0])
            key        = int(match[1])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            #drop the leading "0x" and decode the remaining hex
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #(count=1 swaps only the match we just decoded; the function avoids escape issues)
            line = re.sub(signature, lambda m: cleartext, line, count=1)
        newdata += ''.join([line,'\n'])
    else:
        #nothing to decode on this line, keep it as-is
        newdata += ''.join([line,'\n'])
print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)

Decoded 11 lines

What did this do? Well, line 462 of the original autoit.txt file reads like this:

33393335353434343434353433423331323033423331333535383534344034343338", "1")))

In the new file, it has been replaced with its actual code:

DllStructSetData($EUUE, 1, $IEZU)

Our signature is fairly specific to a certain combination of functions (poe, ergsduf, and hsbduoekdcbl), BUT there are others in this mess that do the same thing.

If $qnqhwwqbvarvaoqrrvcrunt = 6957 Then
    31333337333747374737423734373037443333334233313333323133333338", "1")))
    Local $mxboopmlxnjz = $poe($v1(fcoupewxiogl("3078323436373644364537343645323032363230323234353436343634363334343533373334333434363337333034333337333833353232", "0")))

Performing the same steps shows that other combinations, such as poe, e, and fcoupewxiogl, behave the same way, so let's update our signature to capture them too:

signature  = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)'
signature += r'\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
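
Typing the alternation by hand gets error-prone as the list of names grows, so one option is to build the signature from plain lists. The sketch below is an illustration, not part of the original script: build_signature and the two list names are hypothetical, while the function names inside the lists are the ones from the signature above. The test line is the match shown in the comment of the updated script.

```python
import re

# Names taken from the signature above; extend these lists for other samples
WRAPPERS = ['v1', 'e', 'gfbvo', 'rfo', 'fooi4e', 'ergsduf', 'erzgf',
            'ayudergfv', 'sdvfi', 'zeyc', 'grfeus', 'ufd']
XOR_FUNCS = ['fcoupewxiogl', 'hsbduoekdcbl']

def build_signature(wrappers, xor_funcs, executor='poe'):
    """Build the regex from name lists; re.escape guards against odd characters."""
    wrapper_alt = '|'.join(re.escape(name) for name in wrappers)
    xor_alt = '|'.join(re.escape(name) for name in xor_funcs)
    return (r'\$' + re.escape(executor) + r'\(\$(' + wrapper_alt + r')\(('
            + xor_alt + r')\("([0-9A-F]+)", "(\d+)"\)\)\)')

signature = build_signature(WRAPPERS, XOR_FUNCS)
line = ('$poe($e(fcoupewxiogl("347C31343336324232373231333733373037333133'
        '303242303D33303134303D3030363D", "4")))')
# Each match is a tuple: (wrapper name, XOR function name, hex blob, key)
print(re.findall(signature, line))
```

The capture groups come out in the same order as in the hand-written signature, so the hex blob and the key are still at indexes 2 and 3 of each match.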

Here is an updated script with all of the redundant variables that perform the same actions on the encrypted strings:

import re

with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
for line in data.split('\n'):
    signature = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
    regexmatches = re.findall(signature,line)
    if regexmatches:
        numofmatches += 1

        for match in regexmatches:
            #we'll need to adjust the indexes we are using because we now capture more fields
            #('e', 'fcoupewxiogl', '347C31343336324232373231333733373037333133303242303D33303134303D3030363D', '4')
            ciphertext = bytearray.fromhex(match[2])
            key        = int(match[3])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            #drop the leading "0x" and decode the remaining hex
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #(count=1 swaps only the match we just decoded; the function avoids escape issues)
            line = re.sub(signature, lambda m: cleartext, line, count=1)
        newdata += ''.join([line,'\n'])
    else:
        #nothing to decode on this line, keep it as-is
        newdata += ''.join([line,'\n'])
print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)

Decoded 303 lines

BAM! 303 lines of deobfuscated code!

Are we done? Nah! How about more obfuscated stuff?!

If $yufanpeqeyqcweddlyi == 8.37539918579835 Then
    Dim $payloadexist = $poe($erzgf(fcoupewxiogl("3078343636393643363534353738363937333734373332383234373036313739364336463631363435303631373436383239", fcoupewxiogl("32", "2"))))

Some of these do not decrypt because they break our RegEx pattern.
UGH! What to do? Well, if we look at them, the nested call can theoretically be ignored. Why? It is the same XOR routine, applied to the hex value 32 (the ASCII character “2”) with a key of “2”.
As in the first example, the key is used as the number 2, so 0x32 XOR 2 gives 0x30, which is the ASCII character “0”. So this equates to:

$poe($erzgf(fcoupewxiogl("3078343636393643363534353738363937333734373332383234373036313739364336463631363435303631373436383239", "0" )))
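
The collapse of the nested key can be checked in a few lines. This is a minimal sketch: xor_hex is a hypothetical stand-in for what fcoupewxiogl appears to do, assuming it behaves like the XOR loop used in the scripts above.

```python
def xor_hex(hex_blob, key):
    """XOR each byte of a hex string with a numeric key and return the text."""
    return ''.join(chr(b ^ int(key)) for b in bytes.fromhex(hex_blob))

# The nested call fcoupewxiogl("32", "2"): byte 0x32 XOR 2 = 0x30, ASCII "0"
inner_key = xor_hex('32', '2')
print(inner_key)
# 0

# With a key of 0 the XOR changes nothing, so the outer blob is already "0x..." hex
outer = xor_hex('3078343636393643363534353738363937333734373332383234373036'
                '313739364336463631363435303631373436383239', inner_key)
print(bytes.fromhex(outer[2:]).decode('utf8'))
# FileExists($payloadPath)
```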

Taking it a step further, since XORing something with 0 doesn’t change the value, the nested call technically doesn’t change our data, but since it dorks with our pattern, we will account for it. Here we are going to use a while loop to run a regex, decode, rinse, repeat cycle until the nested decryption goes away.

import re

with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
signature = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
nestedsignature = r'(fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)'

for line in data.split('\n'):
    regexmatches = re.findall(signature,line)
    nestedmatches = re.findall(nestedsignature,line)
    if regexmatches:
        numofmatches += 1
        for match in regexmatches:
            #we'll need to adjust the indexes we are using because we now capture more fields
            #('e', 'fcoupewxiogl', '347C31343336324232373231333733373037333133303242303D33303134303D3030363D', '4')
            ciphertext = bytearray.fromhex(match[2])
            key        = int(match[3])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #(count=1 swaps only the match we just decoded; the function avoids escape issues)
            line = re.sub(signature, lambda m: cleartext, line, count=1)
        newdata += ''.join([line,'\n'])
    elif nestedmatches:
        #keep decoding the innermost call until no nested calls are left on the line
        while nestedmatches:
            numofmatches += 1
            ciphertext = bytearray.fromhex(nestedmatches[0][1])
            key        = int(nestedmatches[0][2])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            if firsthex[:2] == "0x":
                cleartext = firsthex[2:]
                #Let's try to decode the message to utf8
                try:
                    cleartext = bytearray.fromhex(cleartext).decode('utf8')
                #if it doesn't work because it is binary data, ignore and just give me the hex.
                except ValueError:
                    cleartext = firsthex
            else:
                #no "0x" prefix (for example a decoded key), so keep the XOR result as-is
                cleartext = firsthex
            line = re.sub(nestedsignature, lambda m: '"'+cleartext+'"', line, count=1)
            nestedmatches = re.findall(nestedsignature,line)

        newdata += ''.join([line,'\n'])
    else:
        #nothing to decode on this line, keep it as-is
        newdata += ''.join([line,'\n'])

print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)

Decoded 390 lines

There we go! 390 lines we didn’t need to decode by hand.
Plus with minor tweaks, hopefully this could be used for the next sample and maximize my ROI.

I will be the first to admit that this decryption could perhaps be tackled more efficiently by ignoring the function names altogether. However, I wanted to make the output as clean as possible for readability and to highlight the techniques you can use to solve these sorts of problems in the future.
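
For what it is worth, a name-agnostic version might look something like the sketch below. It is not the script used above: the generic pattern, decode_match, and deobfuscate_line are all hypothetical names, and it assumes every obfuscated call has the shape $a($b(c("HEX", "KEY"))) with a plain numeric key.

```python
import re

# Assumption: any $name($name(name("HEX", "KEY"))) call is an obfuscated string
generic = re.compile(r'\$\w+\(\$\w+\(\w+\("([0-9A-F]+)", "(\d+)"\)\)\)')

def decode_match(match):
    """Replacement callback for re.sub: return the cleartext for one match."""
    try:
        key = int(match.group(2))
        xored = ''.join(chr(b ^ key) for b in bytes.fromhex(match.group(1)))
        if not xored.startswith('0x'):
            return match.group(0)
        return bytes.fromhex(xored[2:]).decode('utf8')
    except (ValueError, OverflowError):
        # Bad hex, binary data, or an oversized key: leave the call untouched
        return match.group(0)

def deobfuscate_line(line):
    return generic.sub(decode_match, line)

line = ('Dim $payloadexist = $poe($erzgf(fcoupewxiogl("30783436363936433635'
        '3435373836393733373437333238323437303631373936433646363136343530'
        '3631373436383239", "0")))')
print(deobfuscate_line(line))
# Dim $payloadexist = FileExists($payloadPath)
```

The trade-off is that a loose pattern can also hit unrelated calls that happen to take a hex string and a number, which is why the callback falls back to the original text whenever the decode does not look right.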

Well, that is it. Thanks for reading!