Monday, October 28, 2019

More about my 2019.HACK.LU Keynote talk

As promised, this is my additional notes and review about my Keynote talk in 2019.HACK.LU (link). My keynote talk title is very long actually, but it explained the description of the whole slides clearly. What was presented is about TODAY's Linux post exploitation, process injection, fileless execution from infrastructures and components that has been supporting those activities, based on the real incidents handled within these past seven years of MalwareMustDie project, from the blue-teamer point of view. The goal is for forming a better defense on our Linux platform and some methods for IR in handling those incidents.

About 2019.HACK.LU

HACK.LU is a great conference, thank you for having me this year, I could interact with a lot of infosec community who I already know but haven't met them until now, and I could also get along with old friends in the community too. 2019.HACK.LU had a warm environment, that I can trust and I met with a lot of good people from various IT security background, from various countries. It has important atmosphere to support an open and productive discussion between great professionals in the event. And a nice comparison to make between what we have in Asia/Japan to Europe.

It's not a new thing, but I can't help to notice that several new acquaintances looks surprised meeting me at the first time :) It was like "who is this guy?" or "Really?", So, guys, this is who I am! Let's don't judge people by their appearance, free your minds..like a pointer, so you can use it in an unlimited possibilities!

As you all maybe know by now that I am working in the IR field on handling malware intrusion cases for supporting forensics in daily basis, I don't have much chances going out to international conference to hang out with others, and instead, recently I have returned to write on this blog again after three years full on doing assessments of our infrastructures and fixing several OPSEC aspect we had (along with daily work matters, and I work hard too). But as MMD, our research never stops, and one of research that we are doing has just been presented in the 2019.HACK.LU. You can see those on our tweets, our Linux malware repository, our quick analysis screenshot posts and other media.

About the talk & some extra reading takeaways

As the lead of MMD, from our side, officially we don't openly share any slides or video for this talk due to the security purpose. It's because, within these past three (more) years we have figured that more of the "dark-siders" are reading our security posts more than whitehats, and that's never good. So this time I don't want to give adversaries much luxury in learning from, and, accessing our research material(s) that easily, this is explaining the TLP we put on the slides, it's "TLP Amber for our ranks" as the sharing policy. But I think the conference attendees and other good people has already received access to the material too.

The presentation is full of contents, very thorough actually, but there was so limited time to explain each concept in details on the stage because I was focusing to cover them all within 45 minutes. So, after doing several one-on-one explanation sessions to the audiences afterward, I think many of you should go look deeper beforehand into several Linux function's source codes that may trigger a "mis-using possibility" that may cause any malicious scheme to run a code or program.

For those who have the interest, I am urging you to look at the methods I presented further (in order to learn or to mitigate them more in kill-chain basis), by reading more about the related references that is needed to better understand the concepts written in the slides and its possibility to other parts of code that hasn't be described too.
Therefore herewith I listed below some reference for the above purpose:

  1. Commands listed in slide between page 26 and page 38
  2. ELF execution references that I presented in R2CON2018
  3. Details listed in slide page 43, 93, and page 96
  4. GTFO_Bins in Linux & its privilege escalation
  5. ELF binary compilation using PIE
  6. SE Linux, what can it actually does & doesn't do.
  7. Linux ALSR methods & its affect to process execution (incl.: forking, threading, libs calling).
  8. Buffer exploitation (smashing stack) & modify ret addr for code execution.
  9. Linux shared objects, how their libraries can be abused
  10. ptrace() (how the gdb and dbx are using it)
  11. /proc/{pid}/maps, /proc/{pid}/mem
  12. memfd_create()
  13. shm_open()
  14. dlopen_mode()
  15. LD_PRELOAD
  16. (I may add some gems that I forgot)

You can use the examples of three cases in the slide to test and assess the understanding of your IR team in handling Linux intrusion, and maybe regenerate several cases I presented by your own too. It is always bringing much advantages to you for doing more practice, trust me, many practice makes perfect!

Several questions and answers received in 2019.HACK.LU

I received a lot of questions (and cool tributes after keynote talk too, thank you for those). For the questions I tried to list them all in a list as per below, these items may contain one of your questions too, so please read them all before you start to ask your question via the internet.

Q: What is the bottom line of the presentation?
A: The point of my presentation is for all of us to be aware more that Linux deserves better implementation in security than we are having now (seriously, we don't have much options in protecting Linux OS compare to what we have in Windows, just saying), and the kill-chain methods to prevent us from post exploitation in the Linux platform can be applied better after we understand and after we learn to make a breakdown process in steps and its intrusion assessment possibility in the platform. The talk I did was just the start point for this level of awareness.

Q: So the talk you did was like the PoC of the threat?
A: No, the presentation is explaining the the bad activities that have already happened in the internet as attempts to do part of post-exploitation components (mostly process injections) on incidents that has been reported to us. There are more of information that is not being applied "in the wild", but I am not going to those information in the any open forum. The information I presented is the things that all blue-teamers should know, since the incidents for those concepts are actually had happened or had been implemented in some adversaries frameworks.

Q: Are there other concepts in the Linux libraries that can be used for the code injection?
A: Yes there are. not necessarily execution from the vulnerability sector, Linux is also rich of libraries to be used for the internal and development purpose. For example, the libc's dlopen_mode for example, it is used mostly to load libraries between libraries, but most of the people tend to just using Linux then understanding its source. Due to its open source state, the malicious misusing of Linux libraries can be performed by the adversaries without having much difficulties if the skill-sets are there, and the worst part is the combination of several good libraries to pivot ALSR for example, which can be used as combination to perform any kind of undetected malicious intent. We should start to open our eyes for these.

Q: About codes you wrote in the presentation can they be used?
A: I don't write any usable codes in the presentation, those are codes for explaining the concept I presented, I tweaked much of them in some (confusing, hopefully) ways. But coders know what I meant and they are following the points very well judging the discussion we had after keynote, so I'd said good luck for the copy-pasta adversary who would use the codes I presented.

Q: Are you suggesting that we should do the hot forensics instead of the cold one in Linux?
A: The idea of the talk is not pushing you to do IR works "on-memory", but to make all of us aware of the steps taken in the process injection by break them down into steps. By understanding this we can understand what can be more improve to prevent the intrusion like this to happen in the future, by killing the chains, and so on. "On-memory" analysis is do-able, with a ton of limitation of course, so it is not the top priority as the solution, but it would be very nice if you can do it, the examples in some cases presented are explaining how. Nevertheless, cold forensics is a must do too, however please keep it in mind that dealing cold forensics, like in dealing with with XFS/ZFS file system, is not as easy as what you think it is, and many more obstacle in that. So the best solution is applying kill-chain methods by some scheme, implementing more strict ALSR, making users using SELinux, applying backup, make sure that the memory can be dumped (did you ever test it?). The analysis can be done perfectly if the sysadmin is resourceful enough.

Q: Are you trying to push this project into the next level?
A: Yes, but for strictly IR/CERT/CSIRT related folks only, in example, we made the plan to do the workshop or hackathon topic for this subject in the next IR conference, and over there the repository of this project that we privately maintained will be announced. We don't go public for that matter, I will strictly transfer the know how to IR folks for incident handling.

Q: How to make sure that our boxes is not infected by code injection?
A: The best practice is to audit your box frequently, and for that you need to really knowing what's running on your box. Make sure there is no obsolete services or weak authentication and so on. What I can see in MMD handled incidents are, mostly coming from CGI vulnerabilities, services and its authentication vulnerabilities or leaks of access, these are what causing the intrusion at most.
Once the adversary gets into your system, it will be harder to figure. If a process runs in a weird way, do not hesitate to assess them, when you think it is necessary, unplug the box from the network. Many cool sysadmins gave us access to the boxes after they secured their data beforehand and together we examined the suspected processes.

Q: Why are you using radare2? Why not using other tools? What is the advantages of using radare2 instead of others for this matter?
A: I use a lot of RE tools, don't get me wrong, but I am in love with radare as tool since 2007 (or maybe 2006.. from my FreeBSD ports). radare2 is free for the license and for the concept in using it, it is just right tooling framework for the UNIX people, and I am using r2 daily for a lot of UNIX platforms and RTOS on many architectures. It is maybe hard for the first timers, but it is a very nice tooling once you know it. radare2 is the only one RE tool that support forensics and having four de-compiling engines to the binaries from wide range of platform. Its license is making me easier to lecture about RE or forensics to others.

Q: Why are you working under FreeBSD in analyzing Linux?
A: Why not? All OS is good, my first desktop is Solaris Sparc (Ultra1), BSD is cool. OSX is awesome. Linux is good, it's very popular and used in many devices, which is good too, this is just a flip-side of that situation, that's it. Remember, do stuff that you think you're good at, that includes to use your own favorite OS. And radare2 can work perfect in any OS at any kind of CPU, so I really have no problem about OS.

Q: How to contact you or passing something to you for analysis?
A: For the contact. You can send DM to our account in twitter at @malwaremustd1e, try to explain first about yourself and the situation, we will try follow for the urgency with some advise or action. For the sample to analyze please use this URL to send your sample to us. We will try to take a look based on our priority.

Q: Do you have other more Linux security research to share in events?
A: Yes, I have one more open topics and two close/sensitive ones for Linux, feel free to contact if you want us to present on your events, we will pick a right subject. But I think I will not talk in conference about threat actor(s) anymore, but more into academia or educational sections, or to support RE, IR or DFIR works from now on.

Salutation and thank you

I firstly thank HackLU, CIRCL and LAC/LACERT for they are the one who are really paying the bills for this talk to happen. I can't thank enough to the friends in FIRST.org, it's a small world, and we really are doing wonderful things, and we'll see you over there soon.

Thank's to MMD core team and good people supported to the presentation materials, I can't mention them enough in here. Thank you radare2 for the great RE tool. To all friends in Luxembourg whom we finally met, it's a pleasure, and looking forward to see you again, for the old friends, thank you for your great support always, for bearing with me in the baby-foot game or power-point karaoke or several conversation difficulties, you guys rock!

Below is the kind words feedback received in twitter and I am planning to put it in a frame to hang on the wall to remember you guys always, thank you! :)

Please support our Linux security infosec sharing projects in MalwareMustDie, there are many more hot posts are coming to our blog. Please help so we can also "re-generate" our resources to next generation as per seen in the below diagram.

As we are also leaners for many aspects, please do not hesitate to teach us on how to secure our sharing further.

For all of the blue-teamers out there, you're not alone, what we're doing is important and no matter how hard it is (see the picture below) just take it easy! Please hang in there with faith.

Thank you very much for your reading time! Stay safe!

#MalwareMustDie!

Saturday, September 28, 2019

MMD-0064-2019 - Linux/AirDropBot

Prologue

There are a lot of botnet aiming multiple architecture of Linux basis internet of thing, and this story is just one of them, but I haven't seen the one was coded like this before.

Like the most of other posts of our analysis reports in MalwareMustDie blog, this post has been started from a friend's request to take a look at a certain Linux executable malicious binary that was having a low (or no) detection, and at that time the binary hasn't been categorized into a correct threat ID.

This time I decided to write the report along with my style on how to reverse engineering this sample, which is compiled in the MIPS processor architecture.

So I was sent with this MIPS 32bit binary ..


cloudbot-mips: ELF 32-bit MSB executable, MIPS, MIPS-I 
version 1 (SYSV), statically linked, stripped

..and according to its detection report in the Virus Total hash it is supposed to be a "Mirai-like" or Mirai variant malware, (thank's to good people for uploading the sample to VirusTotal). But the fact after my analysis is saying differently, these are not Mirai, Remaiten, GafGyt (Qbot/Torlus base), Hajime, Luabots, nor China series DDoS binaries or Kaiten (or STD like). It is a newly coded Linux malware picking up several idea and codes from other known malware, including Mirai.

This sample is just one of a series of badness, my honeypots, OSINT and a given information was leading me into 26 types of samples that are meant to pwned series of internet of thing (IoT) devices running on Linux OS, and this MIPS-32 ELF binary one I received is just one of the flocks.

If you see the filenames you can guess some of those binaries are meant to aim specific IoT/router platforms and not only for several randomly cross-compiled architecture supported result. This type of binaries seem to be started appearing in the early August, 2019, in the internet.

Below is the additional list of the compiled binaries meant to run on several non-Intel CPU running Linux operating systems, they can affect network devices like routers, bridges, switches, and other the small internet of things that we may already use on daily basis:

m68k-68xxx.cloudbot:   32-bit MSB Motorola m68k, 68020, version 1 (SYSV), statically linked
hnios2.cloudbot:       32-bit LSB Altera Nios II, version 1 (SYSV), dynamically linked
hriscv64.cloudbot:     64-bit LSB UCB RISC-V, version 1 (SYSV), dynamically linked
microblazebe.cloudbot: 32-bit MSB Xilinx MicroBlaze 32-bit RISC, version 1 (SYSV), statically linked
microblazeel.cloudbot: 32-bit LSB version 1 (SYSV), statically linked,
sh-sh4.cloudbot:       32-bit LSB Renesas SH, version 1 (SYSV), statically linked.
xtensa.cloudbot:       32-bit LSB Tensilica Xtensa, version 1 (SYSV), dynamically linked.
arcle-750d.cloudbot:   32-bit LSB ARC Cores Tangent-A5, version 1 (SYSV), statically linked.
arc.cloudbot:          32-bit LSB ARC Cores Tangent-A5, version 1 (SYSV), dynamically linked.

(The hashes are all recorded in the "Hashes" section of this post)

Binary Analysis

Since I was asked to look into the MIPS sample so I started with it. The binary analysis is showing a symbol striping result, but we can still get some executable section's information, compiler setting/trace that's showing how it should be run, and some information regarding of the size for the section/program headers, but it's all just too few isn't it? Still this analysis is good for getting information we need for supporting dynamic analysis (if needed) afterward. I personally love to solve malware stuff as statically as possible.

I don't think I will get much information on the early stage (binary analysis) with this ELF binary, except what had already known, such as cross-compiling result, not packed, and headers and entry0 are in place, so I'm good for conducting the next analysis step.

For file attributes I extracted them using forensics tools included in Tsurugi Linux commands, which are also not showing special result too, except of what has been recorded from the infected box. So I was taking several checks further I run some several ELF pattern signatures I know, with running it against my collection of Yara rules and ClamAV signature to match it to previous threat database that I have, and this is only to make me understand why several false-positive results came up in other Anti Virus product's detection. The malware yet is having several interesting strings but they are still too generic to be processed to identify the threat without reading its assembly further.

So my "practical binary analysis" result for this MIPS binary is going to be it, nothing much.

Some methods on MIPS-32 static analysis to dissect this sample with radare2:)

So this is the fun part, the binary analysis with radare2 ;). no cutter GUI, no fancy huds, just an old-schooler way with command line, visual mode and graph in a r2shell.

I think there is really no such precise step by step "cookbook" on how to to use radare2 during analyzing something, and basically radare2 is enriched in design coded by several coders for any kind of users to use it freely with many flavor and options or purpose in binary analysis, once you get into it you'll just get use to use it since radare2 will eventually adapting to your methods, and before you know it you are using it forever.

My line of work from day one is UNIX operating systems, I use radare2 since the name is "radare" compiled from FreeBSD ports in between years of 2006 to 2007, and I mostly use command line basis on every radare shell on my VT100x/VT200x terminal emulation variants I use afterwards, this is kind of building my reversing forms with radare2 until now. The command line base.

But first, let's make sure you are setting"mips" and "32" in radare2 environment of assembly architecture (arc) and bits for this binary, then try to recognize the "main function", which is in "0x4016a0" at the pattern/location that's different than Intel basis assembly like shown in the picture below:

Next, I may just run following commands to be sure that it can be reversed well. It is a simple command for only showing how many Linux syscall is used, and this will work after the radare2 parse and analyze the binary to the analysis database.

PS: If you know what you're doing, an simpler/easier way for the MIPS 32bit to seek where the syscall codes placed is by grepping the assembly code with the hex value of "0x0000000c" like below, the same result should come up:

In my case on dealing with Linux or UNIX binaries, I have to know first what syscalls are used (that kernel uses for making basic operations), "syscall" is used to request a service from kernel. Any good or bad program are using those (if they need to run on that OS), so syscalls have to be there. For me, the syscalls is important and its amount will tell you how big the work load will be, ..then the rest is up to you and radare2 to extract them, the more of those syscalls, the merrier our RE life will be, without knowing these syscalls there's no way we can solve such stripped binary :)

In a Linux MIPS architecture, where assembly and register (reduced registers due to small space) is different than PC's Intel ones (MISP is RISC, Intel is CISC, RISC is for a CPU that is designed based on simple orders to act fast, many networking devices are on RISC for this reason). Linux OS in some MIPS platform can be configured to run either in big or in little endian mode too, you have to be careful about the endianness in reversing MIPS, like this MIPS binary is using big endian, also binaries for SGI machines, but some machines like Loongson 3 are just like Intel or PPC works in little endian, several Linux OS is differing their package for supporting each endianness with "mips" (big) or "mipsel" (little) in their MIPS port. Information on the target machines for each sample can help to recognize the endianness used.

In MIPS the way "syscall" used is also have its own uniqueness. Basically, a designated service code for a syscall must be passed in $v0 register, and arguments are passed in other registers. A simple way in assembly code to recognize a syscall is as per below snipped code:

li $v0, 0x1
add $a0, $t0, $zero
syscall

Explanation: The "0x1" is stored in the "$v0" register (it doesn't have to be assembly command "li" but any command in MIPS assembly in example "addliu", etc, can be used for the same effect), which means the service code used to print integer. The next line is to perform a copy value from the register "$t0" to "$a0" (register where argument is usually saved).
Finally (the third line) the syscall code is there, with these components altogether one "syscall" can be executed.

We can apply the above concept in the previously grep syscall result. The objective is to recognize the address of its syscall wrapper function for this stripped binary analysis purpose. For example, at the second result at "0x004019d0" there's a syscall number, and by radare2 you go to that location with seek (s) command and using visual mode we can figure the function name in no time. I will show you how.

Let's fix the screen for it as per below so we can be at the same page:
I marked the line where it is assigning "0xfa2" value to "$v0", and "0xfa2" is the number registered for "fork" syscall in Linux MIPS 32bit OS, that's also saying 0xfa2 is syscall number of sys_fork (system call for fork comnmand), if you scroll up a bit you can see the function name "fcn.004019a0", which is the "wrapper function" for this "syscall fork" or "sys_fork". The syscall command will accept the passed syscall number stored in "$v0" to be translated in the syscall table to pass it through the OS specific registered syscall name alongside with the arguments needed to perform the further desired syscall operation.

Noted that the syscall number can always be confirmed in designated Linux OS in the file with the below formula, and more information on register assignment on MIPS architecture that explains syscalls calling conventions can be read in ==>[link].

/usr/include/{YOUR_ARCH}/asm/unistd_{YOUR_BIT}.h

The manual of syscall [link] is a good reference explaining syscall wrapper in libc. Quoted:

"Usually, system calls are not invoked directly: 
instead, most system calls have corresponding C library wrapper 
functions which perform the steps required (e.g., trapping to kernel 
mode) in order to invoke the system call.  

Thus, making a system call looks the same as invoking a
normal library function.

In many cases, the C library wrapper function does nothing more than:

*  copying arguments and the unique system call number to the
   registers where the kernel expects them;

*  trapping to kernel mode, at which point the kernel does the real
   work of the system call;

*  setting errno if the system call returns an error number when the
   kernel returns the CPU to user mode.

However, in a few cases, a wrapper function may do rather more than
this, for example, performing some preprocessing of the arguments
before trapping to kernel mode, or postprocessing of values returned
by the system call.  Where this is the case, the manual pages in
Section 2 generally try to note the details of both the (usually GNU)
C library API interface and the raw system call.  Most commonly, the
main DESCRIPTION will focus on the C library interface, and
differences for the system call are covered in the NOTES section."

Using this method, in no time you'll get the full list of the syscall function's used by this malware as per following table that I made for myself during this analysis:

The rest is up to you on how to make it easy to name the strings for each "syscall" for your purpose, I go by the above strings naming since it is fit to my RE platform, I suggest you refer to Linux syscall base on naming them [link].

The next step is, you may need to change all function name in radare2 according to this "syscall table". Using the visual mode and analyze function name (afn) command is the faster way to do it manually, or you can script that too, radare2 can be used with varied of methods, anything will do as long as we can get the job's done. In my case I like to use these radare2 shell macro based on table I made for myself:

    :
s 0x0402060; af; afn ____connect; pdf |head 
s 0x0401CF0; af; afn ____write; pdf |head 
s 0x04019B0; af; afn ____fork; pdf |head 
    :

The result is as per seen in the below screenshot:

Up to this way, we'll have all of the syscalls back in place :) Don't worry, you'll do this faster if you get used to it.

The result looks cool enough for me to read the radare2 graph on examining how this MIPS binary further goes..

The next step is a generic way on reversing a stripped binary, by defining the functions that is not part of Libc but likely coded by malware coder. For this task, you have to check the rest of the function and seek whether the XREF doesn't go to any of syscall wrapper functions, make sure that function itself is not the main() function, init_proc() nor init_term() functions, and that goes to the below leftover list, just naming it to anything you think it is fit with to what it does.

In my case I named them this way:

Then we can put the correct function name into the binary using the same macro I showed you previously, then we are pretty much completed in making this binary so readable... hold on, but read it from where? Where to start?

To pick a good place to start to start reversing, this command will help you to pick some juicy spots, all the extractable strings will be dumped and we can pick one interesting one to start, and go up to build the big picture.:)

Actually symbols are giving us much better options, but right now we don't have anything else that is readable enough to start..

You can start to trace this binary from these text address reference and then go up to the call in the main function that supports it. For example, by using the visual mode you can seek the XREF of each text to see how it is called from which function and you can trail them further after that. This isn't going to be difficult to read since you have all functions back in place.

The picture below is showing how the "air dropping" is referred to the caller function.

That's it. These methods I shared are useful methodology in analyzing Linux MIPS-32 binary especially stripped ones like the one I have now. I think you're good enough to go to complete your own analysis by yourself too. Please just tried those methods if you don't have any other better ways and don't be afraid if other RE tools can't make you read the MIPS-32 binary well, just fire the radare2 with the tips written above, and everything should be okay :)

We go on with the malware analysis of this binary and its threat then..

What does this MIPS-32 binary do?

Practically. the MIPS binary is bot that is having a mission to infect the host it was dropped into (note: so it needs a dropping scheme to go to the infected host beforehand), making a malicious process called "cloudprocess", send message of "airdopping clouds" through the standard output (that can be piped later on). It is recording its "PID" and fork its process for the further step. The message of "airdropping clouds" is the reason why I called this malware as "AirDropBot" eventhough the coder prefer to use "Cloudbot", which there is also a legitimate good software that uses that name too as their brand.

Upon successful forking it will extract the what the coder so-called "encrypted array", it's ala Mirai table crypted keywords in its concept, but it is different in implementation., I must guess that it could be originally coded to avoid XOR operation which is the worst Mirai bug in the history :) but this "encrypt_array" is just ending up to an encoded obfuscation function :) - Anyhow the value from this "decrypted" coded is used for further malware process.

Then the malware tries to connect to the C2 which its IP address is hard-coded in the binary, on a success connection attempt to C2 server, it will parse the commands sent by the C2 to perform three weaponized functions on the binary to perform TCP, and UDP DDoS attack with either using the specific hex-coded payload, or the latter on is using a custom pattern so-called "hex-attack" that sends DoS packet in a hex escape strings format to the targeted host.

I will break it down to more details in its specific functions in the next sections.

The "encryption" (aka the obfuscation)

The challenge was the "encryption" part, it was I used radare2 with ESIL to see the "encrypted" variables, as per snipped below as PoC:

The decryption is by [shift-1] as per shown in the cascade loop shown in every encoded strings.

If we want to translate this decryoter scheme, it may look something like this (below), I break it up in 3 functions but in assembly it is all in a function and cascaded to each strings to be decoded:

int encrypt_array()
{ 
  array_splitter("xxxx");
  array_splitter("yyyy");
   :
}
int array_splitter(char *src)
{
  strcpy(var_char_buffer, src);
  char_decrypter(var_char_buffer);
    array_counter++
  return;
}
int char_decrypter(char *src2)
{
  int i; strcpy(dstring, src2);
  for ( i = 0; strlen(dstring) > i; ++i )
   // {redacted shift -1 logic to dstring} //
  strcpy(j, dstring);
  return j++
}

The result for the "decryption" can be shown as per below, using ESIL with the fake stack can be used to emulate this with the same result, so you don't need to get into the debug mode:

The last four strings:

/proc/
/maps
/cmdline 
/status 
/exe
...are used for taking information (process name) from the infected Linux box, that will be used for the malware other functions like "killing" processes, etc. The other decrypted strings are used for infecting purpose (known credentials for telnet operation), and also for other botnet operation related.

Understanding the "decrypter" logic used is important because the same decrypter is used again to decode the C2 sent commands to the active bots before parsed and executed.

The C2, its commands and bot offensive activity

What happened after decryption (encrypt_array) of these strings is, the binary gets into the loop to call the "connecting" function per 5 seconds. If I try to write C code based on this stage it's going to be like below snipcode:

Within each loop, when it calls "connecting" function it will try to connect the C2 which is defined a struct sockaddr "addr", pointing to port number (htons) 455 (0x1c7) and IP: "179.43.149[.]189".

When connected to C2, it will listen and receive the data sent by C2, to perform decryption and then to send its decryption result (as per previous logic) to the "command parsing" function, that's having "cmd_parse" sub-function inside. The "command parsing" is delimiting received command with the white space " " for the "cmd_parse" to grep three possible keywords of "udp", "tcp", and "hex", which in next paragraph those keywords will be explained further.

Below is the loop when the command from C2 is received (listened) inside the "connecting" function in radare2:

Now we come into the offensive capability of this bot binary. The "udp" keyword will trigger the execution of "udpattack" function, "tcp" will execute "tcpattack" and so does the "hex" for executing the "hexattack" function. Each of the trigger keywords are followed by arguments that are passed to its related attack function, it emphasizes that a textual basis DoS attack command line starting with udp, tcp or hex, following by the targets or optional attack parameters are pushed from the C2 to the AirDropBots. Based on experience, the C2 CLI interface of recent DDoS botnets is having such interface matched to this criteria.

TCP and UDP is having the same payload packet in binary is as per below:

...that is sent from tcpattack() and udpattack() in TCP and UDP different socket connection from the target sent by C2.

The hexattack is having a different payload that looks like this:

One last command is is "killyourself" (taken from decrypted table that was saved in a var) that will stop the scanning function fork with the flow more or less like this:

result = strstr(var_parsed_cmd, "killyourself");
  if ( result )
   { kill(scanner_fork_PID, 9);
     exit(0);
   }
return result;

..and the kill function above is executing "kill -9" by calling int kill(__pid_t pid, int sig).

As additional, in the older version, there is also another C2 command called: "http" that will execute "httpattack" function that is using HTTP to perform L7 DoS attack using the combination of User-Agents, but in this sample series I don't see such function.

Is there any difference between MIPS and other binaries?

Oh yes it has. The Intel and ARM version (or to binary that is having a scanner function) is interestingly having more functions. If I go to details on each functions for Intel binary maybe I will not stop writing this post, so I will summary them below with a pseudo code snips if necessary.

1. The "array_kill_list" function

This function is used to kill process that matched to these strings:

It seems this is how the bot herder gets rid of the competitor if they're in the same infected Linux box.
This "array_kill_list" is accessed from killer() function that is being executed before going to "connecting" loop in the main for Intel version.

The killer function is having multiple capability to stop unwanted processes too, it will be too long to describe it one by one but in simple C code and comments as per picture below will be enough to get the idea:

2. The scanner, the spreader via exploit

The bot herder is aiming Lynksys tmUnblock.cgi of a known router's brand, the vulnerability that has to be patched since published 5 years ago. For this purpose, in intel and ARM binaries right after killer() function it runs scanner() function, targeting randomized formed IP addresses, using a hard-coded "payload" data, spoofed its origin by faking the HTTP request headers (for "tcp" or "http" flood), which is aiming TCP port 8080 with the code translated from assembly to simplified C code looks like below:

This scanner is having four pattern of payloads which I quickly paste it below for your reference if you are either receiving or researching this attack:

Maybe one of the thing that I may suggest for this bot's scanner functionality is what it seems like a spoof capability. I examined into low level for code generation of about this part and found what the send syscall performed when AirDrop bot make scanning with exploit is interesting :) please take a look yourself of what has been recorded as per below snipcodes:

On those "scanner" function supported binary, the spreading scheme is executed with targeting random generated IP addresses by calling sub-function "get_random_ip" right after the the C2 has been attempted to call, and is using the same socket for multiple effort to infect Linksys CGI vulnerability. Below is the record in re-production this activity:

3. The "singleInstance" function

This is a code to make sure that there is no duplication of "cloudprocess" process that runs after a device getting infected. It's a simple code to kill -KILL the PID of detected double instance. You can easily reverse and examine it by yourself.

Below is the example ARM-32 assembly code for this function with my comments in it just in case:

for the right side of code, if I write that in C it's going to be something like this, more or less:

BONUS: AirDropBot and the custom ELF packer case

As per other ELF badness produced by botnet adversaries in the internet, the AirDropBot is having binary that is packed with custom packer too.

The below file [link] is one good real example of AirDropBot ELF in packed mode, the VirusTotal detection is like below:

This sample is spotted in the wild a while ago on trying to infect one of my honeytraps. The "file" result looks like this:

x86.cloudbot: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, stripped

The binary is packed and by reading the assembly flow in the packer codes we can tell it is a UPX-like packer. It looks like this:

If you follow my presentation in R2CON2018 in the last part (the main course) about unpacking with radare2 for an unknown packer, the same method can be applied for you to get the OEP by implementing several "bp" on the unpacker processes. There are slides and video for that, use this link for some more information: [link]
That is exactly the method I applied to unpack this ELF.

Then next, after you bp to part where packed code copied to the base memory defined in the LOAD0 section, I will share "my way to" easily extract the unpacked ELF afterward:

ELF file headers is having enough information to be rebuilt, let's use it, assuming the header table is the last part of the ELF the below formula is more or less describing the size of the unpacked object:

// formula:

e_shoff + ( e_shentsize * e_shnum ) = +/- file_size

// math way:

0x00013af8 + ( 0x0028 * 0x0013 ) = file_size

// radare2 way:

? (0x0028 * 0x0013) + 0x00013af8|grep hex

And.. there you go, this is my unpacked file: [link]

Next, let's see the detection ratio of this packed binary in Virus Total after successfully unpacked (..well, at least it is two points higher than the packed one) :

And the binary after unpacked is very much readable now..and BOOM! the C2 of this packed ELF is in 185.244.25[.]200, 185.244.25[.]201, and 185.244.25[.]202 are revealed! :)) Now we know why the adversary wanted to pack their binary that bad.

For the addition, nowadays IoT botnet adversaries are not only packing the Intel binaries, but the embedded platform's (some are RISC cpu too) Linux binary are often seen packed also with the custom packers. Like in this similar threat report I made [link], with the ELF binary for MIPS cpu (noted: big endian one), sample that was actually spotted inside of the house of a victim (in his MIPS IoT daily used device, I won't disclose it further). I analyzed and unpacked it, to find that is not only "UPX!" bytes tampering that has been replaced.

Let me quote it in here too about my suggested unpacking methods for embedded Linux binaries I wrote in the linked post, as follows:

"There are other radare2 ways also for unpacking and extracting 
unpacked sample manually too.

The "dmda" is also useful to dump but it's maybe a bit hard effort to 
run it on embedded system, or, you can fix the load0 and load1 that can 
also be done after you grab "OEP", or, you can also break it in the exact
rewriting process to the base address, but either ways, should be able 
to unpack it. 

First ones will consume workspace in the memory for performing it.. I 
don't think RISC systems has much luxury in space for that purpose, 
but the latter one in some circumstance can be performed in ESIL mode."

The thing is you should master all of those methods, and only by that most of binary packing possibility in Linux can be solved manually without depending on UPX or any automation tools.

"So don't worry, just fire your radare2, and everything will be just Okay!" :D (my favorite motto)

In a short summary as the conclusion

This binaries are a DoS bot clients, a part of a DDoS botnet. It spread as a worm with currently aiming Lynksys tmUnblock.cgi routers derived by non MIPS built binaries that infects machines to act as payload spreader too. I must warn you that I did not check the details in every 26 binaries came up during this investigation, but I think the general aspect is covered.

These are malware for Linux platform, it has backdoor, bot functions and are having infection capability with aiming vulnerability in routers CGI or telnet. The malware is coded with many originality intact, again, it is a newly coded, it is not using codes from Mirai-like, GafGyt (Qbot/Torlus base), or Kaiten (or STD like), but I can tell that the development is not mature yet. I was about to name it as "Cloudbot" but it looks like there is a legitimate software already using it so I switched to the "Airdropbot" instead due to the hardcoded message printed on a success infection. This is a new strain of various library of IoT botnet, I hope that other security entities and law enforcer aware of what has just been occurred here, before it is making bigger damage like Mirai botnet did before.

Detection methods

Binary detection

For the binary signature method of detection. The unpacked version will hit just fine. But since the AirDropBot was developed to support many embed platform from various CPU and "endianness" type, to detect it precisely you may need to code several signatures. However, if you see the typical functions of their binary carefully, so it is yes, one generic rule can be generated and applied. For that I PoC'ed it myself to develop a bit complex Yara rules to detect them all and to recognize which binary that is having the scanner and not.

The snippet code and scan example is as per screenshot below.

Traffic detection

For the traffic detection, there are two methods that you can apply as detection: (1) The Initial Connection and activities of AirDropBot does right after the success infection, or (2) the DoS traffic, I am explaining both as follows.

The Initial connection detection is related to the nature of this malware, which is connecting to C2 and performing scanning for vulnerabilities aiming random IP in 8080. I can suggest a nice Suricata or Snort rule can be coded for connection that's aiming TCP/455 (C2 connection port), but the C2 port can be changed by the adversaries too on their next campaign, but that's not going to be easy for them to prepare all of those varied binaries and C2 port changes immediately (smile). The other way is to focus on the scanner payloads as per described in some of pictures above, the Surucata rules to detect them will last longer IF the same vulnerability is still being aimed.

The other detection is by using the AirDropBot's hardcoded flood packets, which I was in purpose whoring them in the attached pictures above too. This way you may be able to recognize the DoS traffic activity performed by this threat in the future DDoS incidents. Sucicata and Snort rules are supported for this purpose.

The bad actors and his gang are still at large and reading this blog post too :) , I am sorry I can not share the generic scanning code I made in here, but the screenshots I provided are enough for fellow reversers to recognize and implement these detection methods to filter these series of AirdropBot activities. The rest is OpSec.

Hashes and IOC information

The hashes are listed as per below and IOC has been posted to MISP and OTX for all blue-teamer community to be noticed.

../bins/aarch64be.cloudbot    | 417151777eaaccfc62f778d33fd183ff
../bins/arc.cloudbot          | d31f047c125deb4c2f879d88b083b9d5
../bins/arcle-750d.cloudbot   | ff1eb225f31e5c29dde47c147f40627e
../bins/arcle-hs38.cloudbot   | f3aed39202b51afdd1354adc8362d6bf
../bins/arm.cloudbot          | 083a5f463cb84f7ae8868cb2eb6a22eb
../bins/arm5.cloudbot         | 9ce4decd27c303a44ab2e187625934f3
../bins/arm6.cloudbot         | b6c6c1b2e89de81db8633144f4cb4b7d
../bins/arm7.cloudbot         | abd5008522f69cca92f8eefeb5f160e2
../bins/fritzbox.cloudbot     | a84bbf660ace4f0159f3d13e058235e9
../bins/haarch64.cloudbot     | 5fec65455bd8c842d672171d475460b6
../bins/hnios2.cloudbot       | 4d3cab2d0c51081e509ad25fbd7ff596
../bins/hopenrisc.cloudbot    | 252e2dfdf04290e7e9fc3c4d61bb3529
../bins/hriscv64.cloudbot     | 5dcdace449052a596bce05328bd23a3b
../bins/linksys.cloudbot      | 9c66fbe776a97a8613bfa983c7dca149
../bins/m68k-68xxx.cloudbot   | 59af44a74873ac034bd24ca1c3275af5
../bins/microblazebe.cloudbot | 9642b8aff1fda24baa6abe0aa8c8b173
../bins/microblazeel.cloudbot | e56cec6001f2f6efc0ad7c2fb840aceb
../bins/mips.cloudbot         | 54d93673f9539f1914008cfe8fd2bbdd
../bins/mips2.cloudbot        | a84bbf660ace4f0159f3d13e058235e9
../bins/mpsl.cloudbot         | 9c66fbe776a97a8613bfa983c7dca149
../bins/ppc.cloudbot          | 6d202084d4f25a0aa2225589dab536e7
../bins/sh-sh4.cloudbot       | cfbf1bd882ae7b87d4b04122d2ab42cb
../bins/sh4.cloudbot          | b02af5bd329e19d7e4e2006c9c172713
../bins/x86.cloudbot          | 85a8aad8d938c44c3f3f51089a60ec16
../bins/x86_64.cloudbot       | 2c0afe7b13cdd642336ccc7b3e952d8d
../bins/xtensa.cloudbot       | 94b8337a2d217286775bcc36d9c862d2

Salutation & Epilogue

I would like to thank to @0xrb for his persistence trying to convince me that this binary is interesting. It is interesting indeed, and as promised, this is the analysis I did after work, writing this in 8hours more non-stop. Thank's also for other readers who keep on supporting MMD, and as team, we appreciate your patience in waiting for our new post.

Thank you pancake and Radare2 teams who keep on making radare2 the best RE tools for UNIX (All of the radare2 reversing was done in FreeBSD OS, thank you for your great support to FreeBSD!), and also I thank Tsurugi DFIR team for your great forensics tools. For these open source security frameworks I still keep on helping with tests and bug reports.

Okay, I will rest and will wordsmith some miserable jargon parts of the post later, maybe I will add detail that I didn't have much time to write it now, or, to correct some minor stuff. In the mean time, enjoy the writing, please share with mention or using #MalwareMustDie hashtag. This post is a start for more posts to come.

A tribute to the newborn radare2 community in Japan "r2jp", that we established in 2013 together with "pancake" on AVTokyo workshop in Tokyo, Japan.

This technical analysis and its contents is an original work and firstly published in the current MalwareMustDie Blog post (this site), the analysis and writing is made by @unixfreaxjp.

The research contents is bound to our legal disclaimer guide line in sharing of MalwareMustDie NPO research material.

Malware Must Die!

Saturday, September 21, 2019

MMD-0063-2019 - Summary of 3 years MMD research (Sept 2016-Sept 2019)

Hello, it's unixfreaxjp here. It has been a while since I wrote our own blog, and it is good to be back. Thank you for your patience for all of this time. If you want to see what we were doing during all of our silence time just click this link

The background / TLDR

It was in September 2016 when we decided to move our blog and since then myself and the team had a lot of fun in learning and experimenting with "Jekyll" (based on "Poole") and "BlackDoc", and I just convert all posts statically into "Markdown" and all syntax highlighter into "Rouge" highlighter with templates coded in "Liquid", and I was seriously dealing with Ruby codes on FreeBSD for it. Wasn't easy, but with helps from the community and the team, we did that, and I have learned about several blog systems a lot.
I assure you it absolutely need a great effort to move or convert a blog as big as this, but at that time, the reason why we had to move was principally way much more crucial than such resource matter. And now we have capability to move our blog to anywhere, at anytime we want too.

Then, during the process, the MMD research has been going on as usual. On posting my research I moved along to try out several platforms, it's good to actually know that we don't have to depend only into one platform, and 3 (three) fun years out there was making us learning a lot about other reliable services here and there. What myself and the mates have learned is, in using any services, either it's your own or other party's ones, they all are obviously having their pro's and con's. And frankly speaking, you won't know those con's unless you literally go there and try them yourself.

So, here we are, back to service where we first started to do MalwareMustDie blog. And I met with several cool guys found that the Blogger environment is way nicer and more in privacy efforts than before, thank you Google for doing the hard work in satisfying and securing blog users. So I just set it up and switched all access to HTTPS and hopefully the broken-links effect are minimum. For the unnoticed broken links occurs during this transition please adjust the URL's subdomain from blog.malwemustdie.org to blog2.malwaremustdie.org, this should fix that up. For those who previously had problem with broken RSS this HTTPS effort may be a good news for you. And, you can still access the MMD (MalwareMustDie) blog under sub-domain of "blog2" with HTTP, yet I won't add more posts in there though, and I will minimize its service.

The flip side of all of these 3years of adventure is, now I have my research materials scattering around all over the internet(smile). Oh yes, the activity has been actively going on as usual and we're learning a better OpSec too. The security awareness is also blooming better than before, which is happy things to see..not like what we had before in 2012, Even now I am still hanging out with our friends and we're still on to dissecting cyber threat or malware.. Linux or not.. Intel CPU or not, and to be noted: I am still a great fan of radare2 and FreeBSD!

I think some followers may not know what we've been doing all of these three years, or maybe they can't track well our activities on our security research, so I decided to list some links for you to catch up with for the public related threat only. Some of those reports are just screenshots with comments (security related pictures really paint thousand words), some are just text posts or analysis comments, but all contains important information.
Does this means I am posting analysis blog again? Well, you're going to find that out too :)

Here's the list of what's been done during these three years, enjoy:
(For the previous Linux Malware Research list can be seen in here [link])

1. Windows related malware posts

Raccoon stealer infection in the wild

Dissecting on memory post exploitation powershell beacon w/ radare2

Intel POPSS Vulnerability PoC Reversed

Win32/TelegramSpyBot

Win32/WaRAT

Win32/Bayrob

"FHAPPI attack" : FreeHosting APT PowerSploit Poison Ivy

2. Linux related malware posts

Honda Car's Panel's Rootkit from China

Linux/SystemTen

Linux/Httpsd

Linux/SS(Shark)

Linux/DDoSTF today

GoARM.Bot + static strip ARM ELF by ChinaZ

Linux/ChinaZ Edition 2

Linux/CarpeDiem

Linux/Haiduc (bruter/memo)

Linux/Vulcan

Linux/HelloBot

Linux/Cayosin

Linux/DDoSMan

Linux/Mirai-Miori

Linux/Mandibule (Process Injector)

So Many Mirai..Mirai on the wall)

Today's Kaiten and PerlDDoS

Linux/STD bot

Linux/Kaiten (modded ver) in Google clouds

Linux/Qbot or GafGyt ..in Kansas city?

ChinaZ gang is back to shellshock drops Elknot abuses USA networks

3. Mac OSX related malware posts

OSX/MugTheSec

OSX/MachO-PUP (a quickie)

4. Other malware reports

Webshell/r57shell, and..

I also posted either in VirusTotal comments, or previously posted some on kernelmode(not anymore), or sometimes making several posts or notes in reddit.

We also has opened the public twitter with handle of @MalwareMustD1e, a lot of analysis screenshots as awareness are posted in there too along with several news of forensics tools for education and development matters, feel free to follow or check the time line. Again, the previous Linux Malware Research list is also available.

5. My talks on security conference

About my presentation of: "Unpacking the non-unpackable" (ELF packers talk) in R2CON2018

Epilogue

I may edit/change my posts to adjust or brush up their contents along with this post on transitioning the services, so there will be addition or changes.

Please stay safe, don't code/use bad stuff, and enjoy the summary!

#MalwareMustDie!