433 Central Ave., 4th Floor, St. Petersburg, FL 33701 | [email protected] | Office: (656) 236-3022

FLARE Script Series: Recovering Stackstrings Using Emulation with ironstrings

This blog post continues our Script Series where the FireEye Labs Advanced Reverse Engineering (FLARE) team shares tools to aid the malware analysis community. Today, we release ironstrings: a new IDAPython script to recover stackstrings from malware. The script leverages code emulation to overcome this common string obfuscation technique. More precisely, it makes use of our flare-emu tool, which combines IDA Pro and the Unicorn emulation engine. In this blog post, I explain how our new script uses flare-emu to recover stackstrings from malware. In addition, I discuss flare-emu’s event hooks and how you can use them to easily adapt the tool to your own analysis needs.

Introduction

Analyzing strings in binary files is an important part of malware analysis. Although simple, this reverse engineering technique can provide valuable information about a program’s use and its capabilities. This includes indicators of compromise like file paths and domain names. Especially during advanced analysis, strings are essential to understand a disassembled program’s functionality. Malware authors know this, and string obfuscation is one of the most common anti-analysis techniques reverse engineers encounter.

Due to the prevalence of obfuscated strings, the FLARE team has already developed and shared various tools and techniques to deal with them. In 2014 we published an IDA Pro plugin to automate the recovery of constructed strings in malware. In 2016, we released FLOSS; a standalone open-source tool to automatically identify and decode strings in malware.

Both solutions rely on vivisect, a Python-based program analysis and emulation framework. Although vivisect is a robust tool, it may fail to completely analyze an executable file or emulate its code correctly. And just like any tool, vivisect is susceptible to anti-analysis techniques. With missing, incomplete, or erroneous processing by vivisect, dependent tools cannot provide the best results. Moreover, vivisect does not provide an easy-to-use graphical interface to interactively change and enhance program analysis.

I encountered all these shortcomings recently, when I analyzed a GandCrab ransomware sample (version 5.0.4, SHA256 hash: 72CB1061A10353051DA6241343A7479F73CB81044019EC9A9DB72C41D3B3A2C7). The malware contains various anti-analysis techniques to hinder disassembly and control-flow analysis. Before I could perform any efficient reverse engineering in IDA Pro, I had to overcome these hurdles. I used IDAPython to remove various anti-analysis instruction patterns which then allowed the disassembler to successfully identify all functions in the binary. Many of the recovered functions contained obfuscated strings. Unfortunately, my changes did not propagate to vivisect, because it performs its own independent analysis on the original binary. Consequently, vivisect still failed to recognize most functions correctly and I couldn’t use one of our existing solutions to recover the obfuscated strings.

While I could have tried to feed my patches in IDA Pro back to vivisect or to create a modified binary, I instead created a new IDAPython script that does not depend on vivisect. Thus, circumventing the mentioned shortcomings. It uses IDA Pro’s program analysis and Unicorn’s emulation engine. The easy integration of these two tools is powered by flare-emu.

Using IDA Pro instead of vivisect resolves multiple limitations of our previous implementations. Now changes that users make in their IDB file, e.g. by patching instructions to manually enhance analysis, are immediately available during emulation. Moreover, the tool more robustly supports different architectures including x86, AMD64, and ARM.

Stackstrings: An Example

The disassembly listing in Figure 1 shows an example string obfuscation from the sample I analyzed. The malware creates a string at run-time by moving each character into adjacent stack addresses (gray highlights). Finally, the sample passes the string’s starting offset as an argument to the InternetOpen API call (blue highlight). Manually following these memory moves and restoring strings by hand is a very cumbersome process. Especially if malware complicates value assignments using additional instructions like illustrated below.


Figure 1: Disassembly listing showing stackstring creation and usage

Because malware often uses stack memory to create such strings, Jay Smith coined the term stackstrings for this anti-analysis technique. Note that malware can also construct strings in global memory. Our new script handles both cases; strings constructed on the stack and in global memory.

ironstrings: Stackstring Recovery Using flare-emu

The new IDAPython script is an evolution of our existing solutions. It combines FLOSS’s stackstring recovery algorithm and functionality from our IDA Pro plugin. The script relies on IDA Pro’s program analysis and emulates code using Unicorn. The combination of both tools is powered by flare-emuFe, short for flare-emu, is the chemical symbol for iron and hence the script is named ironstrings.

To recover stackstrings, ironstrings enumerates all disassembled functions in a program except for library and thunk functions as identified by IDA Pro. For each function, the script emulates various code paths through the function and searches for stackstrings based on two heuristics:

  1. Before all call instructions in the function. As stackstrings are often constructed and then passed to other functions, i.e. Windows APIs like CreateFile or InternetOpenUrl.
  2. At the end of a basic block containing more than five memory writes. The number of memory writes is configurable. This heuristic is helpful if the same memory buffer is used multiple times in a function and if the string construction spans multiple basic blocks.

If any of these conditions apply, the script searches the function’s current stack frame for printable ASCII and UTF-16 strings. To detect strings in global memory, the script additionally searches for strings in all memory locations that have been written to.

Using flare-emu Hooks to Recover Stackstrings

If you’re not already familiar with flare-emu, I recommend reading our previous blog post. It discusses some of the interfaces the tool provides. Other helpful resources are the examples and the project documentation available on the flare-emu GitHub.

The stackstrings script uses flare-emu’s iterateAllPaths API. The function iterates multiple code paths through a function. It first finds possible paths from function start to function end. The tool then forces the emulation down all identified code paths independent from the actual program state. This extensive code coverage allows ironstrings to recover strings constructed from many different emulation runs.

A key feature of flare-emu are the various hook functions that get triggered by different emulation events. These hooks, or callbacks, enable the development of very powerful automation tasks. The available hooks are a combination of Unicorn’s standard hooks, e.g., to hook memory access events, and multiple convenience hooks provided by flare-emu. The following section briefly describes the available callbacks in flare-emu and illustrates how the ironstrings script uses them to recover obfuscated strings.

  • instructionHook: This Unicorn standard hook is triggered before an instruction is emulated. ironstrings uses this hook to initiate the stackstrings extraction if a basic block contained enough memory writes, for example.
  • memAccessHook: This Unicorn standard hook is triggered when memory read or write events occur during emulation. In the stackstrings script this function stores data about all memory writes.
  • callHook: This flare-emu hook is activated before each function call. The hook’s return value is ignored. In the stackstrings script this hook triggers the extraction of stackstrings.
  • preEmuCallback: This flare-emu hook is called before each emulation run. It is only available in the iterate and the iterateAllPaths functions. The hook’s return value is ignored. ironstrings does not use this hook.
  • targetCallback: This flare-emu hook gets called whenever one of the specified target addresses is hit. It is only available in the iterate and the iterateAllPaths functions. The hook’s return value is ignored. The stackstrings script does not use this hook.

The code in Figure 2 shows the callback functions that flare-emu’s API currently supports, their signatures, and examples of how to use them. All callbacks receive an argument named hookData. This named dictionary allows the user to provide application specific data to use before, during, and after emulation. Often, this dictionary is named userData in the user-defined callbacks, as in the examples below, due to its naming in Unicorn. ironstrings uses this to access function analysis data and store recovered strings across its various hooks. The dictionary also provides access to the EmuHelper object and emulation meta data.


Figure 2: flare-emu example hook implementations

Installation

Download and install flare-emu as described at the GitHub installation pageironstrings is available, along with our other IDA Pro plugins and scripts, at our ironstrings GitHub page.

Note that both flare-emu and ironstrings were written using the new IDAPython API available in IDA Pro 7.0 and higher. They are not backwards compatible with previous program versions.

Usage and Options

To run the script in IDA Pro, go to File – Script File… (ALT+F7) and select ironstrings.py. The script runs automatically on all functions, prints its results to IDA Pro’s output window, and adds comments at the locations where it recovered stackstrings. Figure 3 shows the script’s output of the recovered stackstring locations from the GandCrab sample. Analysis of this malware takes the script about 15 seconds.


Figure 3: Deobfuscated stackstrings and locations where they were identified

Figure 4 shows the disassembly listing of the stackstring creation example discussed at the beginning of this post after running ironstrings.


Figure 4: Commented stackstring after running ironstrings

After analyzing a sample, the script provides a summary and a unique listing of all recovered strings. The output for the ransomware sample is shown in Figure 5. Here the tool failed to analyze two functions due to invalid memory operations during Unicorn’s code emulation.


Figure 5: Script summary and unique string listing

Note that you can modify various options to change the script’s behavior. For example, you can configure the output format at the top of the ironstrings.py file. The script’s README file explains the options in more detail.

Conclusion

This blog post explains how our new IDAPython script ironstrings works and how you can use it to automatically recover stackstrings in IDA Pro. Overcoming anti-analysis techniques is just one of many useful applications of code emulation for malware analysis. This post shows that flare-emu provides the ideal base for this by integrating IDA Pro and Unicorn. The detailed discussion of flare-emu’s hook functions will help you to write your own powerful automation scripts. Please reach out to us with questions, suggestions and feedback via the flare-emu and flare-ida GitHub issue trackers.

44 – Space Telcos, the Spectrum Crunch and Flying Dragons

Satellite operators are facing a real problem concerning how they establish and maintain real-time connectivity for commercial spacecraft. As more and more commercial satellites are set to launch this problem is becoming increasingly important. Ralph Ewig, CEO of Audacy Space shares his expertise about how real-time connectivity, addressing the spectrum crunch and building a communications backbone are all critical elements needed to support the new space economy.

43 -User Experience Design, Space System Standards and Enterprise Ground Services

Listen to Michal Anne Rogondino discuss user experience design for the Space and Missile Systems Center (SMC). Michal Anne explains design thinking and how it is being utilized to create a user experience for DoD space applications.  The CEO of Rocket Communications talks about the workarounds of designing for a secret environment without having a secret clearance (it is possible) and working for the government in general.  The stakes are high as there cannot be any ambiguity around an icon or a simple alarm.  Bringing the user experience to the space domain is different and there are challenges with automation and autonomy that demand the creation of new rules and guidelines. In parallel is a need for creating baseline and guidelines that are common which allows everything to be designed in a common way.  Space systems are complex and there will be an advantage to giving airmen or operators the tools and time to focus on what is important, their mission.

Siemens SIMATIC S7-1500 CPU

1. EXECUTIVE SUMMARY

  • CVSS v3 7.5

  • ATTENTION: Exploitable remotely/low skill level to exploit
  • Vendor: Siemens
  • Equipment: SIMATIC S7-1500 CPU
  • Vulnerabilities: Improper Input Validation

2. RISK EVALUATION

Successful exploitation of these vulnerabilities could allow a denial of service condition of the device.

3. TECHNICAL DETAILS

3.1 AFFECTED PRODUCTS

The following versions of SIMATIC S7-1500 CPU are affected:

  • SIMATIC S7-1500 CPU all versions v1.8.5 and prior, and
  • SIMATIC S7-1500 CPU all versions prior to v2.5 down to and including v2.0.

3.2 VULNERABILITY OVERVIEW

3.2.1    IMPROPER INPUT VALIDATION CWE-20

An unauthenticated attacker sending specially crafted network packets to Port 80/tcp or 443/tcp may cause a denial of service on the device.

CVE-2018-16558 has been assigned to this vulnerability. A CVSS v3 base score of 7.5 has been calculated; the CVSS vector string is (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H).

3.2.2    IMPROPER INPUT VALIDATION CWE-20

An unauthenticated attacker sending specially crafted network packets to Port 80/tcp or 443/tcp may cause a denial of service on the device.

CVE-2018-16559 has been assigned to this vulnerability. A CVSS v3 base score of 7.5 has been calculated; the CVSS vector string is (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H).

3.3 BACKGROUND

  • CRITICAL INFRASTRUCTURE SECTORS: Chemical, Critical Manufacturing, Energy, Food and Agriculture, Water and Wastewater Systems
  • COUNTRIES/AREAS DEPLOYED: Worldwide
  • COMPANY HEADQUARTERS LOCATION: Germany

3.4 RESEARCHER

Georgy Zaytsev, Dmitry Sklyarov, Druzhinin Evgeny, Ilya Karpov, and Maxim Goryachy of Positive Technologies reported these vulnerabilities to Siemens.

4. MITIGATIONS

Siemens recommends users upgrade to Version 2.5 or newer. Users who cannot upgrade because of hardware restrictions are recommended to apply the manual mitigations. Updates are available for download from the following link:

https://support.industry.siemens.com/cs/de/en/view/109478459

Siemens also recommends users apply the following manual mitigations:

  • Protect network access to Port 80/tcp and Port 443/tcp of affected devices.
  • Apply cell protection concept.
  • Apply defense-in-depth.

As a general security measure, Siemens strongly recommends protecting network access to devices with appropriate mechanisms. In order to operate the devices in a protected IT environment, Siemens recommends configuring the environment according to Siemens’ operational guidelines for Industrial Security (Download: https://www.siemens.com/cert/operational-guidelines-industrial-security) and following the recommendations in the product manuals.

Additional information on industrial security for Siemens devices can be found at: 

https://www.siemens.com/Industrialsecurity

For more information on these vulnerabilities and more detailed mitigation instructions, please see Siemens Security Advisory SSA-180635 at the following location:

http://www.siemens.com/cert/advisories

NCCIC recommends users take defensive measures to minimize the risk of exploitation of this vulnerability. Specifically, users should:

  • Minimize network exposure for all control system devices and/or systems, and ensure that they are not accessible from the Internet.
  • Locate control system networks and remote devices behind firewalls, and isolate them from the business network.
  • When remote access is required, use secure methods, such as Virtual Private Networks (VPNs), recognizing that VPNs may have vulnerabilities and should be updated to the most current version available. Also recognize that VPN is only as secure as the connected devices.

NCCIC reminds organizations to perform proper impact analysis and risk assessment prior to deploying defensive measures.

NCCIC also provides a section for control systems security recommended practices on the ICS-CERT web page. Several recommended practices are available for reading and download, including Improving Industrial Control Systems Cybersecurity with Defense-in-Depth Strategies.

Additional mitigation guidance and recommended practices are publicly available on the ICS-CERT website in the Technical Information Paper, ICS-TIP-12-146-01B–Targeted Cyber Intrusion Detection and Mitigation Strategies.

Organizations observing any suspected malicious activity should follow their established internal procedures and report their findings to NCCIC for tracking and correlation against other incidents.

No known public exploits specifically target these vulnerabilities.