Figure 1: Original obfuscated malicious code of Locky ransomware sample.
Removing Unnecessary or Unused Variables and Code
The first method for analyzing malicious scripts is to remove any unused variables or functions from the script. For example, at line 3 in Figure 2 below, the variable “relevant” is declared once and never used anywhere else in the script. An editor such as Notepad++ can count how many times a variable name appears in the script with its “Find + Count” operation (select the variable name, press CTRL + F, and click Count).
Figure 2: Find and count the variable to eliminate unused variables
In the same way, we delete all such unused variables, functions, and code from the main script step by step:
- Lines 9 and 10 in Figure 2 define a function a() that is not used anywhere.
- The functions between lines 18 and 50, shown in Figure 2-a below, are never used anywhere.
Figure 2-a: Unused function code defined inside the malicious script
- Similarly, the function variable “Native” is declared and defined between lines 127 and 147, as shown in Figure 3, and is not used anywhere apart from its declaration and definition.
Figure 3: Function variable declared and defined but not used anywhere in the script
- Continuing line by line in the same way, you can remove the remaining unused code from the original script.
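The manual Find + Count check can also be approximated in code. The following is a minimal sketch, not part of the original analysis: the helper name countIdentifier and the three-line sample are invented for illustration. Like the editor's count, it also matches names that appear inside strings or comments, so a low count is a hint to inspect rather than proof that the code is dead.

```javascript
// Count whole-word occurrences of an identifier in a script's source text.
// A count of 1 suggests the name is declared but never referenced again.
function countIdentifier(source, name) {
  // Escape regex metacharacters so names like "$tmp" are matched literally.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Lookarounds stand in for \b, which does not treat "$" as part of a word.
  const pattern = new RegExp("(?<![\\w$])" + escaped + "(?![\\w$])", "g");
  return (source.match(pattern) || []).length;
}

// Hypothetical snippet, not the actual Locky script.
const sample = [
  'var relevant = "junk";',
  'var errant = 0;',
  'var total = errant + 1;',
].join("\n");

console.log(countIdentifier(sample, "relevant")); // 1
console.log(countIdentifier(sample, "errant"));   // 2
```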
After eliminating the junk code, we are left with the main malicious code, in which most of the strings are obfuscated using different methods. At line 194 in Figure 1, the school() function is called with two parameters. We can quickly see that the first parameter is an HTTP URL split using the + operator and the second is a random string, “yVrLrAwIvU”. The main section of the URL is Unicode encoded and is easy to decode using any of the tools available on the internet, such as the Converter tool. The URL string decoded with this tool is shown in Figure 4.
Figure 4: Unicode decoding of URL string using a Converter tool
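The same decoding can be done without an online tool. The sketch below assumes the URL uses JavaScript-style \uXXXX escapes, which the figure suggests but the text does not state explicitly. The function name decodeUnicodeEscapes and the host “example.test” are placeholders, not values from the sample.

```javascript
// Decode literal \uXXXX escape sequences found in a script's source text.
function decodeUnicodeEscapes(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Placeholder value: "example.test" escaped, standing in for the real host.
const escapedHost = String.raw`\u0065\u0078\u0061\u006d\u0070\u006c\u0065\u002e\u0074\u0065\u0073\u0074`;
const url = "http://" + decodeUnicodeEscapes(escapedHost) + "/payload";
console.log(url); // http://example.test/payload
```

Working on the escaped source text, rather than evaluating the script, keeps the analysis static: nothing from the sample is executed.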
Base64 Strings Pattern
Going through the code step by step, we see some random strings used with the function “paprikash4()”, as shown in Figure 5.
Figure 5: Random strings calling the function paprikash4()
In general, Base64 strings are made up of the characters A-Z, a-z, 0-9, “+” and “/”, with “=” as a padding character. With this format in mind, we can try to interpret the string at line 170 in Figure 5 as Base64. The string “V2luZGdezd3mona93cyBTY3JpcHQgdezd3monaSG9zdA=dezd3mona=” is passed to the function “paprikash4()”. However, if we decode this string using the Converter tool, we receive the error shown in Figure 6.
Figure 6: Converter throws an error since the string is not a valid Base64 string
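A quick well-formedness check shows why a strict decoder rejects this string. The sketch below is an illustration only: the pattern encodes the usual padded Base64 layout (groups of four characters, with “=” allowed only at the very end), and the name isWellFormedBase64 is invented here.

```javascript
// Strict check for padded, standard-alphabet Base64 (no whitespace allowed).
const BASE64_PATTERN =
  /^(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?$/;

function isWellFormedBase64(text) {
  return BASE64_PATTERN.test(text);
}

const suspicious = "V2luZGdezd3mona93cyBTY3JpcHQgdezd3monaSG9zdA=dezd3mona=";
console.log(isWellFormedBase64(suspicious)); // false
console.log(isWellFormedBase64("aGVsbG8=")); // true ("hello")
```

The string fails because a padding character appears in the middle of it and its length is not a multiple of four, both of which hint that something has been spliced into otherwise valid Base64.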
To identify the cause of the error, we need to read the code of the function “paprikash4()”, located at line 56 as shown in Figure 7.
Figure 7: Function paprikash4() code
At line 60 in Figure 7 above, the code first replaces every occurrence of the character sequence “dezd3mona” in the string passed to the function with an empty value, and the rest of the code then evaluates the result as a Base64 string. Recall the string that produced the error in Figure 6, “V2luZGdezd3mona93cyBTY3JpcHQgdezd3monaSG9zdA=dezd3mona=”: it contains “dezd3mona” three times. If we remove all these occurrences, we get the string “V2luZG93cyBTY3JpcHQgSG9zdA==”. Now we can convert this string from Base64 to text using the Converter tool, as shown in Figure 8. Inserting a junk marker like this is another obfuscation technique commonly used to hide Base64 strings.
Figure 8: Converter successfully converts the Base64 string into a text string
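The two operations described above, stripping the marker and then decoding, can be sketched in a few lines. This is an approximation of what paprikash4() is described as doing, not its actual code. The helper name decodeMarkedBase64 is invented, and Buffer assumes a Node.js environment.

```javascript
// Remove every occurrence of the junk marker, then Base64-decode the rest.
// Buffer is Node.js-specific; in a browser, atob() would play the same role.
function decodeMarkedBase64(text, marker) {
  const cleaned = text.split(marker).join("");
  return Buffer.from(cleaned, "base64").toString("utf8");
}

const obfuscated = "V2luZGdezd3mona93cyBTY3JpcHQgdezd3monaSG9zdA=dezd3mona=";
console.log(obfuscated.split("dezd3mona").join("")); // V2luZG93cyBTY3JpcHQgSG9zdA==
console.log(decodeMarkedBase64(obfuscated, "dezd3mona")); // Windows Script Host
```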
This worked and decoded our string into text. The attacker deliberately inserted the string “dezd3mona” at random places inside the Base64 strings. To get the plain text strings back, we need to take the following five steps:
- Find every occurrence of the string “dezd3mona” in the script and replace it with an empty string, using Find and Replace in Notepad++ as shown in Figure 9.
Figure 9: Find and replace hard-coded pattern string
- Replace every Base64 string with its decoded text string, as shown in Figure 10.
Figure 10: Convert Base64 strings into text strings and substitute them as the respective variables’ values
We need to find all 11 occurrences of strings that use the “paprikash4()” function and replace them with plain text strings.
- Evaluate the variable “chosen” (3 minus 2 equals 1) at line 17, shown in Figure 2-a earlier, and substitute its value.
- Find the variable “weasel” at line 157, shown in Figure 9 earlier, replace it with its value “E”, and evaluate the “+” operator wherever it occurs.
- Substitute each reference to the variable “errant” with its value 0, as shown in Figure 2 earlier.
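The first two steps can be automated instead of being repeated by hand for every occurrence. The sketch below is a rough illustration under several assumptions: calls are written as "...".paprikash4() or ("...").paprikash4() with double-quoted literals that contain no escapes, the code runs under Node.js, and the variable name host in the sample line is invented.

```javascript
// Rewrite "...".paprikash4() and ("...").paprikash4() calls as decoded literals.
// Assumes double-quoted literals without escapes and a Node.js environment.
function inlinePaprikash4(source, marker) {
  const call = /(?:\(\s*"([^"\\]*)"\s*\)|"([^"\\]*)")\s*\.paprikash4\(\)/g;
  return source.replace(call, (match, wrapped, bare) => {
    const encoded = wrapped !== undefined ? wrapped : bare;
    const cleaned = encoded.split(marker).join("");
    const decoded = Buffer.from(cleaned, "base64").toString("utf8");
    return JSON.stringify(decoded); // re-quote safely as a string literal
  });
}

// Hypothetical line in the style of the sample; the variable name is invented.
const line = 'var host = ("V2luZGdezd3mona93cyBTY3JpcHQgdezd3monaSG9zdA=dezd3mona=").paprikash4();';
console.log(inlinePaprikash4(line, "dezd3mona"));
// var host = "Windows Script Host";
```

The remaining steps (evaluating “chosen”, “weasel” and “errant”) are plain constant substitutions and are usually quick enough to do in the editor.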
Figure 11: Comma-separated strings inside opening and closing brackets evaluate to the last string
The comma operator inside opening and closing brackets evaluates each of its operands (from left to right) and returns the value of the last operand. So in the case shown in Figure 11, the bracketed expression returns the last string, “exe”, and can be replaced by it.
The + operator concatenates strings and returns a new string, so the final string becomes “.exe”.
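The behavior is easy to confirm in any JavaScript console. In the sketch below, only “exe” and the result “.exe” come from the analysis. The decoy strings “tmp” and “dat” are made up, and the leading “.” operand is inferred from the final result.

```javascript
// The comma operator evaluates operands left to right and yields the last one.
// "tmp" and "dat" are made-up decoy strings; only "exe" comes from the write-up.
const lastOperand = ("tmp", "dat", "exe");
console.log(lastOperand); // exe

// String concatenation with + then produces the final extension.
const extension = "." + ("tmp", "dat", "exe");
console.log(extension); // .exe
```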
In this way, we first evaluate all the expressions that use the comma operator and substitute the final strings at line 133, line 149, line 150, and so on in Figure 11. The new code is shown in Figure 11-a.
Figure 11-a: Script code after evaluating comma separated strings inside brackets.
Evaluating Other Functions
The script has a function named “paprikash()” at line 11 as shown in Figure 12.
Figure 12: Function paprikash() code
The array “druberri” referenced at line 133 in Figure 11-a has one element, which calls the function “paprikash()”.
This function takes a string as a parameter and replaces each matching key with the value defined in the “unlike” variable at line 5 in Figure 12. For example, “:” (colon) is replaced by “.” (dot) and “381” is replaced by “X”. Substituting these characters according to the array and then evaluating the “+” operator yields the final string.
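As an illustration of this kind of key-to-value substitution, the sketch below applies a small replacement table to a string. It is not the sample's paprikash() code. The pair-list shape of the table, the helper name applySubstitutions, and the input string are all assumptions, and only the two mappings mentioned in the text are included.

```javascript
// Replace every occurrence of each key in the table with its mapped value.
function applySubstitutions(text, table) {
  let result = text;
  for (const [key, value] of table) {
    result = result.split(key).join(value);
  }
  return result;
}

// Only these two mappings are mentioned in the text; the real table may be larger.
const unlike = [[":", "."], ["381", "X"]];

// Hypothetical input, not taken from the sample.
console.log(applySubstitutions("data:381ML", unlike)); // data.XML
```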
Similarly, at lines 149 and 150 in Figure 11-a, two variable assignments call the function “paprikash2()”:
casque = ("PkAkGdUV").paprikash2();
inflammation = ("sWVCpYbGGt" + "gotWpR").paprikash2();
Now let’s look at the “paprikash2()” function code shown in Figure 13.
Figure 13: Function paprikash2() code
This function simply returns the character at the first position of the string it is called on. So, after evaluation, the variables “casque” and “inflammation” become:
casque = "P";
inflammation = "s";
Accordingly, we substitute the strings returned by “paprikash2()” throughout the script.
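To double-check such substitutions, the described behavior can be reproduced with a stand-in function. The implementation below is an assumption based only on the description (return the first character). The sample's own code in Figure 13 may be written differently.

```javascript
// Assumed stand-in for the sample's paprikash2(): return the first character.
// Extending String.prototype mirrors how the sample calls it on string values.
String.prototype.paprikash2 = function () {
  return this.charAt(0);
};

const casque = ("PkAkGdUV").paprikash2();
const inflammation = ("sWVCpYbGGt" + "gotWpR").paprikash2();
console.log(casque, inflammation); // P s
```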
Evaluating Remaining Variables
Figure 14: Variables evaluated from array elements
After using the above methods to evaluate the array elements and assign the resulting values to their respective variables, we get the decoded variables shown in Figure 15.
Figure 15: Decoded variable values taken from the array
We are almost done evaluating all the variables and expressions in the script. Now it is time to clean up the script by removing new lines, replacing variables with their values, joining strings split by the “+” operator, and so on. A snippet of the final deobfuscated malicious script is shown in Figure 16.
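Joining split strings is tedious by hand, so a small helper can do most of it. The following is a heuristic sketch that works on the source text with a regular expression rather than a real JavaScript parser. It handles only double-quoted literals, and its output should be reviewed before being trusted. The function name foldStringConcats and the sample input are invented for illustration.

```javascript
// Merge adjacent double-quoted string literals joined by "+".
// Heuristic: works on source text, so it does not understand full JavaScript syntax.
function foldStringConcats(source) {
  // The lookahead skips cases like "a" + "b".length, where folding changes meaning.
  const pair = /"((?:[^"\\]|\\.)*)"\s*\+\s*"((?:[^"\\]|\\.)*)"(?!\s*[.\[(])/g;
  let previous;
  let current = source;
  do {
    previous = current;
    current = previous.replace(pair, (match, left, right) => '"' + left + right + '"');
  } while (current !== previous);
  return current;
}

// Hypothetical input; the URL is a placeholder.
console.log(foldStringConcats('var u = "ht" + "tp://" + "example" + ".test";'));
// var u = "http://example.test";
```

The loop repeats until nothing changes because a single pass merges only non-overlapping pairs, so a chain of four fragments needs two passes.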
Figure 16: Final deobfuscated malicious script code in human-readable form
By following the deobfuscated script code shown in Figure 16, we can easily track and understand the malicious activity performed by the script.