It’s often desirable for an attacker to cover their tracks and hide their actions. This is often accomplished by randomizing any combination of bytes and strings, order of contact, or time delays. While this can be effective in certain scenarios, a trained eye will still be suspicious of anomalous data traveling across the network. Take as a prime example the recent trend of attackers downloading shellcode stages for payloads over HTTP(S). Accessing a stage over HTTP(S) is ideal for an attacker because the traffic will likely look more legitimate than a raw TCP connection and can optionally use any configured proxies. The data in the response to the web request, however, is generally raw machine code (optionally encoded if Metasploit’s EnableStageEncoding option is set) and will look more suspicious than more common requests for HTML, CSS, JavaScript and image resources.
Enter the polyglot. A polyglot is data which is valid under multiple interpretations. The term is often used to refer to a source file which is valid in more than one language.
It’s possible to take an existing Bitmap image file and modify it in such a way that it will also be valid x86 assembly with space suitable for shellcode. Bitmap files can vary slightly in their use of headers and in how image and color data is stored. There are two headers in the file: the Bitmap file header and the DIB header. The DIB header comes in 7 different variations, distinguished by the size field at its start.

The Bitmap file header is what makes a Bitmap image an ideal image format to use as a shellcode polyglot. The Bitmap header starts with the signature BM, which identifies a Windows bitmap; for this technique the DIB header that follows must be in the 40-byte BITMAPINFOHEADER format. Following the BM, at offset 0x02 in the file, is the 4-byte file size. The ASCII bytes BM (0x42 0x4D), when interpreted as 32-bit x86 code, disassemble as inc edx; dec ebp. These instructions execute without raising an exception regardless of the state of the registers, because they neither read from nor write to memory locations that may be invalid. The 4-byte size can be artificially increased to a value that is valid in the Bitmap context and also encodes a valid x86 JMP instruction, which can be used to skip over the remaining bytes in the Bitmap header and the entire DIB header.
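The header fields described above can be read with a few lines of Python. This is a minimal sketch, not code from the Metasploit module: the function names parse_bmp_headers and is_suitable_base_image are hypothetical, and the field layout assumed is the standard 14-byte Bitmap file header followed by a BITMAPINFOHEADER.

```python
import struct

# Bitmap file header: magic, file size, reserved1, reserved2, data offset (14 bytes)
FILE_HEADER = struct.Struct("<2sIHHI")
# Start of the DIB header: header size, width, height, planes, bits per pixel
DIB_PREFIX = struct.Struct("<IiiHH")


def parse_bmp_headers(data: bytes) -> dict:
    """Return the header fields that matter for the polyglot technique."""
    if len(data) < FILE_HEADER.size + DIB_PREFIX.size:
        raise ValueError("data is too short to contain both headers")
    magic, file_size, _res1, _res2, data_offset = FILE_HEADER.unpack_from(data, 0)
    dib_size, width, height, _planes, bpp = DIB_PREFIX.unpack_from(data, FILE_HEADER.size)
    return {
        "magic": magic,              # b"BM" is 0x42 0x4D: inc edx; dec ebp
        "file_size": file_size,      # 4-byte field at offset 0x02
        "data_offset": data_offset,  # where the image data starts
        "dib_size": dib_size,        # 40 for BITMAPINFOHEADER
        "width": width,
        "height": height,
        "bits_per_pixel": bpp,
    }


def is_suitable_base_image(info: dict) -> bool:
    """Check the criteria this post relies on (hypothetical helper)."""
    return (
        info["magic"] == b"BM"
        and info["dib_size"] == 40
        and info["bits_per_pixel"] == 24
    )
```

Passing the raw bytes of a candidate file to parse_bmp_headers and then to is_suitable_base_image gives a quick check of whether the image has the expected signature, DIB header format and pixel depth.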
After inflating the Bitmap header size field, additional space can be occupied between the end of the DIB header and the start of the image data. It’s important to note that in certain Bitmap formats an additional field storing color data may be present here if the bits-per-pixel field of the DIB header is less than 24. The size should be updated to the smallest value that is also a relative x86 JMP instruction that skips to the end of the DIB header, which is where the shellcode can be stored. When calculating the JMP instruction, either the long (near) or short variant can be used.
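The size calculation can be illustrated with the short JMP variant only. The sketch below is an assumption about one way to do the search, not the algorithm used by the Metasploit module. It relies on two facts: the size field is stored little-endian at file offset 2, so its lowest byte is the first byte executed after BM, and a short JMP is the opcode 0xEB followed by a signed 8-bit displacement measured from the end of the instruction. The function names and the 54-byte header length (14-byte file header plus 40-byte DIB header) are stated assumptions.

```python
import struct

JMP_SHORT_OPCODE = 0xEB   # x86 "jmp rel8"
SIZE_FIELD_OFFSET = 2     # the size field follows the two BM bytes
HEADERS_END = 14 + 40     # file header + BITMAPINFOHEADER


def jmp_landing_offset(size_value: int) -> int:
    """File offset the short JMP encoded in size_value lands on."""
    rel8 = (size_value >> 8) & 0xFF
    # The displacement is relative to the end of the 2-byte instruction.
    return SIZE_FIELD_OFFSET + 2 + rel8


def smallest_short_jmp_size(min_size: int) -> int:
    """Smallest 32-bit value >= min_size that also encodes a short JMP
    landing at or beyond the end of the DIB header."""
    # First candidate at or above min_size whose low byte is the opcode.
    value = (min_size & ~0xFF) | JMP_SHORT_OPCODE
    if value < min_size:
        value += 0x100
    while value <= 0xFFFFFFFF:
        rel8 = (value >> 8) & 0xFF
        # rel8 must be a forward (positive) displacement, so at most 0x7F.
        if rel8 <= 0x7F and jmp_landing_offset(value) >= HEADERS_END:
            return value
        value += 0x100
    raise ValueError("no suitable size value fits in 32 bits")


if __name__ == "__main__":
    size = smallest_short_jmp_size(66)
    print(hex(size), struct.pack("<I", size).hex(), jmp_landing_offset(size))
```

Because the displacement byte can be larger than strictly needed, the jump may land some distance past the end of the DIB header, which is why padding of variable length is needed before the code that follows.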
Once the size has been updated, the delta between the new and old sizes must be added to the data offset field of the Bitmap header in order for the file to still be valid. Shellcode can safely be stored in the space between the end of the DIB header and the start of the image data. Most modern payloads require at least a few hundred bytes (as is the case with Metasploit stagers), if not much more space. Because of this size, and the fact that the payload would exist essentially unobfuscated within the Bitmap image file, it is desirable for an attacker to combine it with the image data itself, which can be accomplished using a basic steganography technique. An attacker can substitute the least significant bits (LSBs) of the image data with the raw shellcode. Once the shellcode and image data have been combined, a much smaller assembly stub can extract the original shellcode from the image data for execution at run time. The result is a stub of ~53 bytes that can extract a larger shellcode blob from an image while maintaining its original appearance.
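The LSB substitution itself is simple to express in a high-level language. The following is an illustrative Python sketch operating on arbitrary bytes; it is not the module's code or its assembly stub, and the function names and the most-significant-chunk-first bit order are assumptions made for the example.

```python
def embed_lsb(image_data: bytes, payload: bytes, bits: int = 1) -> bytes:
    """Overwrite the low `bits` bits of successive image bytes with payload bits."""
    if bits not in (1, 2, 4):
        raise ValueError("bits must be 1, 2 or 4")
    chunks_per_byte = 8 // bits
    if len(payload) * chunks_per_byte > len(image_data):
        raise ValueError("payload does not fit in the image data")
    mask = (1 << bits) - 1
    out = bytearray(image_data)
    index = 0
    for byte in payload:
        # Assumed order: most significant chunk of each payload byte first.
        for shift in range(8 - bits, -1, -bits):
            out[index] = (out[index] & ~mask & 0xFF) | ((byte >> shift) & mask)
            index += 1
    return bytes(out)


def extract_lsb(image_data: bytes, length: int, bits: int = 1) -> bytes:
    """Rebuild `length` payload bytes from the low bits of the image bytes."""
    if bits not in (1, 2, 4):
        raise ValueError("bits must be 1, 2 or 4")
    chunks_per_byte = 8 // bits
    mask = (1 << bits) - 1
    out = bytearray()
    for n in range(length):
        value = 0
        for k in range(chunks_per_byte):
            value = (value << bits) | (image_data[n * chunks_per_byte + k] & mask)
        out.append(value)
    return bytes(out)
```

The extraction loop is the high-level equivalent of what a decoder stub has to do at run time: read a fixed number of image bytes, mask off the low bits, and shift them together into one output byte.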

Figure 2 illustrates the layout of the modified image file. The pads before and after the decoder stub are of a dynamic size and will change based on the size (and thus the x86 JMP) included in the Bitmap header.

The size of the image becomes an important factor to consider when selecting a suitable Bitmap file for use with this technique. Ideally only the single least significant bit of each byte of image data would be modified, which would not alter the image enough for a viewer to notice. However, if the shellcode to be embedded is larger than len(image_data) / 8 bytes, then the 2 least significant bits can be altered, and so on until the shellcode fits. From experimentation, using 4 LSBs starts to introduce some noticeable changes to the image. Using more than 4 LSBs should be avoided in favor of selecting a larger Bitmap image file. For compatibility with Metasploit as a stage encoder, an image size of 3.5 MB to 4 MB should be selected. It’s also important to note that, at the time of this writing, some of Metasploit’s stagers will exit the process if the stage is 4 MB or larger.
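The capacity rule can be written down directly. This short Python sketch assumes the LSBs are taken from every byte of image data and that only 1, 2 or 4 bits per byte are considered; the function name choose_lsb_count is hypothetical.

```python
def choose_lsb_count(image_len: int, payload_len: int) -> int:
    """Return the fewest LSBs per image byte (1, 2 or 4) that hold the payload."""
    for bits in (1, 2, 4):
        # Each image byte contributes `bits` bits, so capacity in whole bytes is:
        capacity = image_len * bits // 8
        if payload_len <= capacity:
            return bits
    raise ValueError("payload does not fit with 4 LSBs; select a larger image")


if __name__ == "__main__":
    # Example: 4,000,000 bytes of image data and a 300,000-byte payload.
    print(choose_lsb_count(4_000_000, 300_000))
```

Under these assumptions, 4,000,000 bytes of image data hold 500,000 payload bytes at 1 LSB, 1,000,000 at 2 LSBs and 2,000,000 at 4 LSBs.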
The code following this post is an Encoder module for the Metasploit Framework which demonstrates the previously described polyglot encoding technique. In contrast to other Metasploit encoders, this module makes no effort to remove invalid characters and will greatly increase the size of the original shellcode as it is placed within the selected image. The module allows a user to specify a base image to use by setting the BitmapFile option. The selected file needs to meet the previously outlined criteria (BM header start bytes, 40-byte BITMAPINFOHEADER header and 24 bits per pixel). For an additional layer of obfuscation, the 53-byte assembly stub used to recover the original shellcode from the image data is polymorphic. This is the same technique used in the popular x86/shikata_ga_nai encoder. The polymorphic decoding stub is also automatically set to use the smallest necessary number of LSBs (1, 2 or 4) to store the shellcode in the image data. When using the module to encode a stage, 4 LSBs will need to be used to accommodate the large size.
Perhaps the best use case for this module would be to encode a stageless HTTP Meterpreter into a Bitmap file that can then be served using any web server, ideally one that sets the MIME type correctly. An attacker could then request and execute this image with any number of techniques, including PowerShell. Alternatively, the module could be selected using Metasploit’s EnableStageEncoding and StageEncoder options. At this time, however, the Metasploit HTTP handler will not set the MIME type to correctly reflect the Bitmap image.
The above figures are all suitable Bitmap image files for use with this technique. Furthermore, figure #2 (bitmap_modified.bmp SHA-1 82ca3e260fb2c0aa6d76c4274b42a6f05e53fd79) is a product of the Metasploit module that, when executed as raw shellcode (such as with Syringe), will spawn a Windows bind shell on port 4444. When figure #1 (which is provided as an unmodified comparison) and figure #2 are downloaded, the Linux file utility identifies both as being in the PC bitmap, Windows 3.x format.
Further reading:
Shah, Saumil. “Weaponized Polyglots as Browser Exploits.” PoC or GTFO 8 (2015): 27. Web. 2 Jan. 2016.
Special thanks to Emily Gundry for creating the images.
