Decoding AutoIt XOR Encryption Obfuscation
Recently, I started looking at an AutoIt sample that contained some heavy obfuscation. After I ran the compiled binary through Exe2Aut, I was able to review the underlying script and I saw a lot of this stuff:
Starting with this small snippet of code:
$poe($ergsduf(hsbduoekdcbl("327A3636344134413631343334413441303A303034403437353034473437344131313130  
303030413032303035323536353030303041303230303734343B353035363537343334413633344134413444343130303041  
3032303034363535344435303436303030413032303031323030304130323030343635353444353034363030304130323630  
343B344734333530353B364134373447303A3036363B363737433737303B3032304030323630343B344734333530353B3641  
34373447303A3036364436443732303B3041303230303436353534443530343630303041303230303132353A313131323132  
313230303041303230303436353534443530343630303041303230303132353A313631323030303B37403030313230303746  
", "2")))
This is an example of obfuscation/encryption used in AutoIt malware.
Starting from the inside out:
The “hsbduoekdcbl” function is used to XOR. The following is a hex string that is XORed with the provided key of “2”:
327A3636344134413631343334413441303A3030344034373530344734373441313131303030304130323030353235363530  
30303041303230303734343B3530353635373433344136333441344134443431303030413032303034363535344435303436  
303030413032303031323030304130323030343635353444353034363030304130323630343B344734333530353B36413437  
3447303A3036363B363737433737303B3032304030323630343B344734333530353B364134373447303A3036364436443732  
303B3041303230303436353534443530343630303041303230303132353A3131313231323132303030413032303034363535  
34443530343630303041303230303132353A313631323030303B37403030313230303746
Hex-decoding it and XORing each byte with the key reveals the following string:
0x446C6C43616C6C28226B65726E656C3332222C2022707472222C20225669727475616C416C6C6F63222C202264776F7264  
222C202230222C202264776F7264222C2042696E6172794C656E282449455A5529202B2042696E6172794C656E28244F4F50  
292C202264776F7264222C2022307833303030222C202264776F7264222C20223078343022295B2230225D
This is of course another hex string. The “ergsduf” function appears to be used to convert that hex string into text, which reveals:
DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]
The “poe” function seems to be there just to execute this deobfuscated code.
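The layers described above can be sketched in Python as follows. The names `xor_layer` and `hex_layer` are hypothetical stand-ins for `hsbduoekdcbl` and `ergsduf`; their behavior is inferred from the decoded output rather than from the AutoIt source, and the input is only the first 32 hex characters of the string shown above.

```python
def xor_layer(hex_string, key):
    """Hex-decode the input, then XOR every byte with the numeric key."""
    raw = bytes.fromhex(hex_string)
    return "".join(chr(b ^ int(key)) for b in raw)


def hex_layer(hex_text):
    """Turn a '0x...' hex string into readable text."""
    return bytes.fromhex(hex_text[2:]).decode("utf-8")


# First 32 hex characters of the obfuscated argument (an excerpt, not the full string).
prefix = "327A3636344134413631343334413441"

inner = xor_layer(prefix, "2")   # still hex, now with a leading "0x"
outer = hex_layer(inner)         # readable AutoIt text
print(inner)
print(outer)
```

Even this short excerpt is enough to expose the start of the hidden call, which is a quick way to sanity-check a guess about the key before decoding a whole file.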
Let’s look at this snippet again, but in Python this time:
import re
snippet =  '$poe($ergsduf(hsbduoekdcbl("327A3636344134413631343334413441303A3030344034373530344734373441'
snippet += '31313130303030413032303035323536353030303041303230303734343B3530353635373433344136333441344134443431'
snippet += '3030304130323030343635353444353034363030304130323030313230303041303230303436353534443530343630303041'
snippet += '30323630343B344734333530353B364134373447303A3036363B363737433737303B3032304030323630343B344734333530'   
snippet += '353B364134373447303A3036364436443732303B3041303230303436353534443530343630303041303230303132353A3131'   
snippet += '31323132313230303041303230303436353534443530343630303041303230303132353A313631323030303B374030303132'
snippet += '30303746", "2")))'
print(r"We will use ([0-9A-F]+) to capture the Hex and (\d+) to capture the XOR key")
# raw string so the regex backslashes are not treated as Python string escapes
signature = r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'
matches = re.findall(signature,snippet)
print(matches)
print("\n")
print("The first capture group collects the ciphertext:\n",matches[0][0])
print("The second collects the XOR key:",matches[0][1])
We will use ([0-9A-F]+) to capture the Hex and (\d+) to capture the XOR key:
'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'
This returns:
[('327A36363441344136...omitted', '2')]
The first capture group collects the ciphertext:
 327A36363441344136…omitted
The second collects the XOR key: 2
With this match we can now perform the operation and return the data we need…
for match in matches:
    ciphertext = bytearray.fromhex(match[0])
    key        = int(match[1])
    firsthex  = ''
    for character in ciphertext:
        firsthex += chr(character ^ key)
    cleartext = firsthex[2:]
    cleartext = bytearray.fromhex(cleartext)
    print(cleartext.decode('utf8'))
DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]
Additionally, let’s use RegEx to replace the original text so we can ignore the obfuscation altogether…
# a function replacement inserts the decoded text literally, with no escape processing
decoded = re.sub(r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)', lambda m: cleartext.decode('utf8'), snippet)
print(decoded)
DllCall("kernel32", "ptr", "VirtualAlloc", "dword", "0", "dword", BinaryLen($IEZU) + BinaryLen($OOP), "dword", "0x3000", "dword", "0x40")["0"]
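A caveat on the substitution step: `re.sub` interprets backslash escapes and group references in a replacement string, so decoded script text that contains backslashes could be altered or rejected. One way around this, sketched below, is to pass a function as the replacement so that each match is decoded and inserted literally. The helper name `decode_call` is made up for illustration, and the sample line reuses only the first 32 hex characters of the snippet above.

```python
import re

SIGNATURE = r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'


def decode_call(match):
    """Decode one obfuscated call: XOR layer first, then the hex layer."""
    key = int(match.group(2))
    first_hex = "".join(chr(b ^ key) for b in bytes.fromhex(match.group(1)))
    return bytes.fromhex(first_hex[2:]).decode("utf-8")


# Shortened sample line (excerpt of the ciphertext, same key).
line = '$poe($ergsduf(hsbduoekdcbl("327A3636344134413631343334413441", "2")))'

# re.sub calls decode_call once per match, so each call gets its own cleartext.
result = re.sub(SIGNATURE, decode_call, line)
print(result)
```

Because the function runs once per match, a line holding several obfuscated calls would have each one replaced by its own decoded text.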
Let’s kick this up a notch and make it scale to cover the entire file instead of just this snippet:
import re
with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
signature = r'\$poe\(\$ergsduf\(hsbduoekdcbl\("([0-9A-F]+)", "(\d+)"\)\)\)'
#loop through each line of the script
for line in data.split('\n'):
    regexmatches = re.findall(signature,line)
    if regexmatches:
        #count each line that contains at least one obfuscated call
        numofmatches += 1
        for match in regexmatches:
            ciphertext = bytearray.fromhex(match[0])
            key        = int(match[1])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #(function replacement = literal text; count=1 = only the call we just decoded)
            line = re.sub(signature, lambda m: cleartext, line, count=1)
        newdata += ''.join([line,'\n'])
    else:
        newdata += ''.join([line,'\n'])
print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)
Decoded 11 lines
What did this do? Well, line 462 of the original autoit.txt file reads like this:
$poe($ergsduf(hsbduoekdcbl("31793535374237423432363536333634373236353432373436353535373036353730
33393335353434343434353433423331323033423331333535383534344034343338", "1")))
In the new file, it has been replaced with its actual code:
DllStructSetData($EUUE, 1, $IEZU)
Our signature is fairly specific to a certain combination of functions (poe, ergsduf, and hsbduoekdcbl), BUT there are others in this mess that do the same thing.
If $qnqhwwqbvarvaoqrrvcrunt = 6957 Then
    $poe($e(fcoupewxiogl("31793535374237423532373037423742333933333743373436333744373437423232323333
    4437353742374233333342333133333733374737473742373437303744333333423331333334363747363632373235353437  
    4437303733374237343436374736363237323535373632343337343735373836333734373236353738374737443333334233  
    31333337333747374737423734373037443333334233313333323133333338", "1")))
Else
    Local $mxboopmlxnjz = $poe($v1(fcoupewxiogl("307832343637364436453734364532303236323032323435343  
    6343634363436333634333330333034333337333833353436333434363435343634363436343633353433333434323336343  
    5333634363433333733383335343633383436343534363436343634363337333733363435333433343336343334333337333  
    8333534363433343634353436343634363436333634333337333333353433333434363433333733383335333033303436343  
    6343634363436343633363433333633353333333333333332343333373338333533303334343634363436343634363436333  
    2343533363334333634333336343333383338333833353330333834363436343634363436343634333337333433353435333  
    0333733353337333333363335333733323433333733343335343533343333333333333332333234353336333433363336343  
    3333733343335343533383336343333363433333833383334333534353431343333373338333533303433343634363436343  
    6343634363334343533373334333434363337333034333337333833353232", "0")))
Performing the same steps shows that the combination of poe, e, and fcoupewxiogl behaves the same way, so let’s update our signature to capture the others:
signature  = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)'
signature += r'\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
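As a sketch of the same idea, the alternation can also be assembled from plain lists, which may be easier to extend when another alias turns up. The variable names `wrappers` and `xor_funcs` are assumptions for illustration; the sample line is line 462 from earlier, joined onto one line. Note that the two extra groups shift the ciphertext and key to positions 2 and 3 of each match.

```python
import re

# Aliases observed in this sample; extend these lists as new ones turn up.
wrappers = ["v1", "e", "gfbvo", "rfo", "fooi4e", "ergsduf", "erzgf",
            "ayudergfv", "sdvfi", "zeyc", "grfeus", "ufd"]
xor_funcs = ["fcoupewxiogl", "hsbduoekdcbl"]

signature = (
    r"\$poe\(\$(" + "|".join(map(re.escape, wrappers)) + r")"
    + r"\((" + "|".join(map(re.escape, xor_funcs)) + r")"
    + r'\("([0-9A-F]+)", "(\d+)"\)\)\)'
)

# Line 462 of the sample, joined onto one line.
line = ('$poe($ergsduf(hsbduoekdcbl("'
        '31793535374237423432363536333634373236353432373436353535373036353730'
        '33393335353434343434353433423331323033423331333535383534344034343338'
        '", "1")))')

# Each match is now a 4-tuple: wrapper name, XOR function name, ciphertext, key.
wrapper, xor_func, ciphertext, key = re.findall(signature, line)[0]
print(wrapper, xor_func, key)
```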
Here is an updated script that covers all of the redundant function names that perform the same actions on the encrypted strings:
import re
with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
for line in data.split('\n'):
    signature = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
    regexmatches = re.findall(signature,line)
    if regexmatches:
        numofmatches += 1
        for match in regexmatches:
            #we'll need to adjust the tuple indexes we use because we are capturing more fields
            #('e', 'fcoupewxiogl', '347C31343336324232373231333733373037333133303242303D33303134303D3030363D', '4')
            ciphertext = bytearray.fromhex(match[2])
            key        = int(match[3])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #function replacement = literal text; count=1 = only the call we just decoded
            line = re.sub(signature, lambda m: cleartext, line, count=1)
        newdata += ''.join([line,'\n'])
    else:
        newdata += ''.join([line,'\n'])
print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)
Decoded 303 lines
BAM! 303 lines of deobfuscated code!
Are we done? Nah! How about more obfuscated stuff?!
If $yufanpeqeyqcweddlyi == 8.37539918579835 Then
    Dim $payloadexist = $poe($erzgf(fcoupewxiogl("3078343636393643363534353738363937333734373332383234373036313739364336463631363435303631373436383239", fcoupewxiogl("32", "2"))))
Else
Some of these do not decrypt because they break our RegEx pattern.
UGH! What to do? Well, if we look at them… the nested call can theoretically be ignored. Why? The inner XOR routine takes the hex value 32, which is the ASCII character “2” (byte 0x32), and XORs it with the key 2.
That leaves 0x30, the ASCII character “0”, so the nested call simply returns “0”. So this equates to:
$poe($erzgf(fcoupewxiogl("3078343636393643363534353738363937333734373332383234373036313739364336463631363435303631373436383239", "0" )))
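The arithmetic can be checked directly. This small sketch assumes the inner call behaves like the decode loop used earlier (hex-decode, then XOR each byte with the numeric key); the variable names are illustrative.

```python
# Inner call: fcoupewxiogl("32", "2")
# Hex "32" is the single byte 0x32; XOR with 2 gives 0x30, the character "0".
inner = "".join(chr(b ^ int("2")) for b in bytes.fromhex("32"))
print(inner)  # this string becomes the key of the outer call

# XOR with a key of 0 leaves every byte unchanged.
sample = bytes.fromhex("3078")
unchanged = bytes(b ^ int(inner) for b in sample)
print(unchanged == sample)
```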
Taking it a step further, since XORing something with 0 doesn’t change the value, this technically doesn’t change our data, but since it dorks with our pattern, we will account for it. Here we are going to use a while loop to run a regex, decode, rinse, repeat cycle until the nested decryption goes away.
import re
with open('autoit.txt', 'r') as autoitfile:
    data = autoitfile.read()
numofmatches = 0
newdata = ''
signature = r'\$poe\(\$(v1|e|gfbvo|rfo|fooi4e|ergsduf|erzgf|ayudergfv|sdvfi|zeyc|grfeus|ufd)\((fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)\)\)'
nestedsignature = r'(fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)'
for line in data.split('\n'):
    regexmatches = re.findall(signature,line)
    nestedmatches = re.findall(nestedsignature,line)
    if regexmatches:
        numofmatches += 1
        for match in regexmatches:
            #we'll need to adjust the tuple indexes we use because we are capturing more fields
            #('e', 'fcoupewxiogl', '347C31343336324232373231333733373037333133303242303D33303134303D3030363D', '4')
            ciphertext = bytearray.fromhex(match[2])
            key        = int(match[3])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            cleartext = firsthex[2:]
            cleartext = bytearray.fromhex(cleartext).decode('utf8')
            #replace original broke, with shiny new woke
            #function replacement = literal text; count=1 = only the call we just decoded
            line = re.sub(signature, lambda m: cleartext, line, count=1)
            #print(cleartext)
        newdata += ''.join([line,'\n'])
    elif nestedmatches:
        while nestedmatches:
            numofmatches += 1
            ciphertext = bytearray.fromhex(nestedmatches[0][1])
            key        = int(nestedmatches[0][2])
            firsthex  = ''
            for character in ciphertext:
                firsthex += chr(character ^ key)
            if firsthex[:2] == "0x":
                cleartext = firsthex[2:]
                #Let's try to decode the message to utf8
                try:
                    cleartext = bytearray.fromhex(cleartext).decode('utf8')
                except ValueError:
                    #if it doesn't work because it is binary data, ignore and just give me the hex.
                    cleartext = firsthex
            else:
                cleartext = firsthex
            #replace only the call we just decoded, then look for the next one
            line = re.sub(nestedsignature, lambda m: '"'+cleartext+'"', line, count=1)
            nestedmatches = re.findall(nestedsignature,line)
        #keep the decoded line in the output
        newdata += ''.join([line,'\n'])
    else:
        newdata += ''.join([line,'\n'])
print('Decoded',numofmatches, 'lines')
with open('newautoit.txt','w') as newautoitfile:
    newautoitfile.write(newdata)
Decoded 390 lines
There we go! 390 lines we didn’t need to decode by hand.
Plus with minor tweaks, hopefully this could be used for the next sample and maximize my ROI.
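For reuse on another sample, the inside-out loop could be pulled into a small function. This is only a sketch: `resolve_nested` and `decode_args` are hypothetical names, and it assumes every XOR call follows the hex-plus-numeric-key shape seen in this sample. The test line is the nested example shown above.

```python
import re

NESTED = r'(fcoupewxiogl|hsbduoekdcbl)\("([0-9A-F]+)", "(\d+)"\)'


def decode_args(match):
    """Decode one XOR call and return the result as a quoted string literal."""
    key = int(match.group(3))
    first_hex = "".join(chr(b ^ key) for b in bytes.fromhex(match.group(2)))
    if first_hex.startswith("0x"):
        try:
            return '"' + bytes.fromhex(first_hex[2:]).decode("utf-8") + '"'
        except ValueError:
            pass  # binary payload or malformed hex: fall through and keep the hex text
    return '"' + first_hex + '"'


def resolve_nested(line):
    """Replace XOR calls one at a time until none are left on the line."""
    while re.search(NESTED, line):
        line = re.sub(NESTED, decode_args, line, count=1)
    return line


line = ('$poe($erzgf(fcoupewxiogl("3078343636393643363534353738363937333734'
        '373332383234373036313739364336463631363435303631373436383239", '
        'fcoupewxiogl("32", "2"))))')
print(resolve_nested(line))
```

The first pass turns the inner call into the key "0"; the second pass can then match and decode the outer call.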
While I will be the first to admit this decryption could perhaps be tackled in a more efficient manner by ignoring all of the names of the functions, I wanted to make the output as clean as possible for readability and highlight the techniques you can use to solve these sorts of problems in the future.
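A sketch of that name-agnostic approach might look like the following. It assumes the obfuscated calls always take the shape of a variable call wrapping a variable call wrapping an XOR function with a hex string and a numeric key, which holds for the non-nested lines shown here but is untested against the rest of the sample. `GENERIC` and `decode` are illustrative names.

```python
import re

# Any $name($name(name("HEX", "KEY"))) call, whatever identifiers are used.
GENERIC = r'\$\w+\(\$\w+\(\w+\("([0-9A-F]+)", "(\d+)"\)\)\)'


def decode(match):
    """XOR the hex-decoded bytes with the key, then hex-decode the result."""
    key = int(match.group(2))
    first_hex = "".join(chr(b ^ key) for b in bytes.fromhex(match.group(1)))
    return bytes.fromhex(first_hex[2:]).decode("utf-8")


# Line 462 of the sample, joined onto one line.
line = ('$poe($ergsduf(hsbduoekdcbl("'
        '31793535374237423432363536333634373236353432373436353535373036353730'
        '33393335353434343434353433423331323033423331333535383534344034343338'
        '", "1")))')

decoded_line = re.sub(GENERIC, decode, line)
print(decoded_line)
```

The trade-off is that a looser pattern may also match calls that are not part of this obfuscation scheme, so the named signature remains the safer default.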
Well, that is it. Thanks for reading!
