WTF is Frida?

Written by Roi Cohen and Tomer Zayit, translated by Benjamin Baker

Frida is a Dynamic Instrumentation ToolKit

Preface

Frida (often compared to GreaseMonkey) is a reverse engineering tool developed by two brilliant people: Havard and Ole. Over time, Frida became popular and gained a group of contributors who keep it flexible, lightweight, and researcher/developer friendly.

Frida supports debugging on OSX, Windows, Linux, and QNX, and has an API available for a plethora of languages, including Python, C#, and C.

What's Dynamic Binary Instrumentation?

DBI (Dynamic Binary Instrumentation) is a technique for analyzing running processes. It uses a host of different methods, including code injection and module loading, and has proven wildly successful: giants like Microsoft and Intel have already developed their own implementations.

Although most DBI functionality can be achieved with a traditional debugger, DBI's light footprint and flexibility make it the superior method.

How it Works

Frida Python injects Google's V8 engine (Chrome's JavaScript engine) into a specific process. Frida Core (Python side) communicates with Frida Agent (process side). Frida Gum then uses the V8 engine to run the JavaScript code we supply and generate a dynamic hook.

The picture above shows communication with the injected code over a “NamedPipeline” (a named pipe).

Frida can be installed in a bunch of different ways. Here, we'll use Python's pip tool:

pip install frida

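Once installed, we'll use a tool called frida-trace, which lets us view function calls in real time. (On recent Frida releases the command-line tools ship in a separate package, so pip install frida-tools may also be needed.)

A quick frida-trace tutorial: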

-I (MODULE_NAME) : module to pay attention to.

-X (MODULE_NAME) : module to ignore.

-i (FUNCTION_NAME) : function to pay attention to.

-i CreateFileW (APPLICATION_NAME) : trace calls to CreateFileW in the named application (notepad.exe in our example).
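To make the flags above concrete, here is a minimal Python sketch that assembles a frida-trace command line from them. The helper name build_trace_command and its parameters are hypothetical and exist only for illustration; the flags themselves (-I, -X, -i) and the trailing target name are the ones listed above.

```python
import shlex


def build_trace_command(target, functions=(), include_modules=(), exclude_modules=()):
    """Assemble a frida-trace argument list (hypothetical helper, for illustration)."""
    argv = ["frida-trace"]
    for module in include_modules:
        argv += ["-I", module]      # module to pay attention to
    for module in exclude_modules:
        argv += ["-X", module]      # module to ignore
    for function in functions:
        argv += ["-i", function]    # function to pay attention to
    argv.append(target)             # the application to trace
    return argv


command = build_trace_command("notepad.exe", functions=["CreateFileW"])
# Show the command as it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be passed to subprocess.run as-is on a machine where frida-trace is installed and notepad.exe is running.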

After running the command above and attaching successfully to notepad (or a text editor of your choice), we'll see the output below.

notepad.exe output

Behind the scenes, the following things happened:

  • Validating that the target process exists (notepad.exe in this example).
  • Code injection.

Behind the scenes

  • Searching the export tables of the loaded modules for matches with the string we provided (two matches are evident: Kernel32.dll and KernelBase.dll).

Kernelbase.dll - Exports Table inside notepad.exe

Kernel32.dll - Exports Table inside notepad.exe

  • Loading the handler script for the hooked function (frida-trace creates it if it doesn’t exist) and initializing Frida's memory.

{
  onEnter: function (log, args, state) {
    log("CreateFileW()");
  },
  onLeave: function (log, retval, state) {
  }
}

  • The first few bytes of the target function are copied elsewhere in process memory and replaced with a jump to Frida's code.

Copy of bits
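The last step in the list above is the classic inline hook. As a rough illustration only, the Python sketch below simulates it on a plain bytearray standing in for process memory, assuming 32-bit x86 and its 5-byte "jmp rel32" instruction. The function name and the addresses are hypothetical, and Frida's real interceptor does considerably more (for example, relocating the displaced instructions so the original function can still be called).

```python
import struct

JMP_REL32_OPCODE = 0xE9  # x86 "jmp rel32"
JMP_SIZE = 5             # 1 opcode byte + 4-byte signed displacement


def install_inline_hook(memory, func_addr, hook_addr):
    """Overwrite the start of a function with a jump; return the displaced bytes."""
    saved = bytes(memory[func_addr:func_addr + JMP_SIZE])
    # The displacement is measured from the instruction that follows the jump.
    displacement = hook_addr - (func_addr + JMP_SIZE)
    memory[func_addr:func_addr + JMP_SIZE] = struct.pack(
        "<Bi", JMP_REL32_OPCODE, displacement
    )
    return saved


# Simulated address space: the "function" starts at offset 0, the hook code at offset 64.
memory = bytearray(range(128))
saved_prologue = install_inline_hook(memory, func_addr=0, hook_addr=64)
print(saved_prologue.hex(), memory[:JMP_SIZE].hex())
```

Keeping the saved bytes is what makes the hook reversible: writing them back over the jump restores the original function.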

The Frida API

The Frida API has two main parts: the Gum API (a.k.a. the JavaScript API) and the Bindings API.

The Gum API is most commonly used through JavaScript (hence the alias), and that's the kind we'll talk about.

Gum's main purpose is attaching hooks to a process. An additional benefit is the ability to run code in the process's context. For example, we could invoke a Java method ourselves before the application ever calls it.

Gum's different categories correspond to different operating systems and intentions: if I'm developing for Android and I need a hook, I'll use the Java class, but if I need an ARM-specific function I'll use classes that start with Arm (ArmWriter, ArmRelocator, etc.).

The Bindings API implementation depends on the language the Frida client was installed for. A Python version will require Python, and the Node.js version partners up with JavaScript.

The bindings' main purpose is automation: we can repeat an analysis done on a single case over and over, thanks to Frida Server.

For example:

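import json

import frida

# Attach to an already-running notepad.exe process.
session = frida.attach("notepad.exe")
# enumerate_modules() is the call from the frida-python release this article targets;
# newer releases may only expose module enumeration through the JavaScript API.
print(json.dumps([x.name for x in session.enumerate_modules()], indent=4))

This segment of code attaches to notepad.exe and prints the names of all the modules (DLLs) loaded in the process.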

Python to attach to notepad.exe

Instead of joining an existing process, the code in the screenshot above creates a process and then attaches to it.

Immediately afterward, Frida creates a script from JavaScript code that runs inside the process (note the send function). That code generates callback events (of Message and Event types) while the process continues running. The send function lets us transfer information from the process side (Gum API) to the automation side (Bindings API). Using this information, we can decide what to do when our hook sees a certain result.
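The screenshot's code is not reproduced in the text, so the following is only a sketch of the flow just described, not the authors' script: spawn a process, attach, load a small agent that calls send, and handle the resulting message in Python. The names AGENT_SOURCE, describe_message, and spawn_and_instrument are hypothetical. The frida import is deferred into the function so the message-handling helpers can be read and tried without Frida installed.

```python
# JavaScript that runs inside the target; send() posts a message to the Python side.
AGENT_SOURCE = """
send({ event: "agent-loaded", pid: Process.id });
"""


def describe_message(message, data=None):
    """Turn a Frida message dictionary into a printable line."""
    if message.get("type") == "send":
        return "payload: {!r}".format(message.get("payload"))
    if message.get("type") == "error":
        return "agent error: {}".format(message.get("description"))
    return "other message: {!r}".format(message)


def on_message(message, data):
    # Callback invoked by the bindings for every message the agent sends.
    print(describe_message(message, data))


def spawn_and_instrument(program):
    """Create the process suspended, attach, load the agent, then let it run."""
    import frida  # deferred import: only needed when actually instrumenting

    pid = frida.spawn([program])
    session = frida.attach(pid)
    script = session.create_script(AGENT_SOURCE)
    script.on("message", on_message)
    script.load()
    frida.resume(pid)
    return session
```

Calling spawn_and_instrument("notepad.exe") on a Windows machine with Frida installed should print the agent's payload once the script loads. The calling program has to stay alive (for example, by waiting on input) to keep receiving messages.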

Frida Native

Following Function Calls in Real Time

As mentioned above, the moment frida-trace runs, it loads a JavaScript file that defines how Frida should react upon entering and exiting a function. In this example, we'll use the default.

Function Calls

The above screenshot is filled to the brim with function calls. Each color represents a different thread.

At this stage, we want to examine the function's inputs, so we reconfigure Frida to print the relevant arguments. According to the MSDN documentation, the first parameter (lpFileName) is the path of the file or device that we want to open or create.

We slightly adjust our script:

{
  onEnter: function (log, args, state) {
    log("CreateFileW()");
    log(Memory.readUtf16String(args[0]));
  },
  onLeave: function (log, retval, state) {
  }
}

The only addition is:

log(Memory.readUtf16String(args[0]));

which reads the UTF-16 string that the first argument points to and logs it, giving us the correct file path.

Corrected file path
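Conceptually, reading the argument means following a pointer to a NUL-terminated UTF-16 string, which is how CreateFileW receives lpFileName. The Python sketch below imitates that on a bytes object standing in for process memory. It is an illustration of the idea, not Frida's implementation, and the function name and sample buffer are made up for the example.

```python
def read_utf16_string(memory, address):
    """Decode a NUL-terminated UTF-16LE string starting at `address`."""
    end = address
    # Walk two bytes at a time until the 16-bit terminator (or the end of the buffer).
    while memory[end:end + 2] not in (b"\x00\x00", b""):
        end += 2
    return memory[address:end].decode("utf-16-le")


# Hypothetical buffer: two unrelated bytes, the string, its terminator, then other data.
sample_memory = b"\xff\xff" + "C:\\Password1.txt".encode("utf-16-le") + b"\x00\x00" + b"junk"
print(read_utf16_string(sample_memory, 2))
```

Stepping in two-byte units matters: a single zero byte is common inside UTF-16 text (it is the high byte of every ASCII character), so only an aligned pair of zero bytes ends the string.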

Change Input in Real-Time

To spice things up, we’ll slightly adjust our script to change the way cmd behaves. That is, every time cmd tries to open “Password1.txt” (the real file), we’ll force it to open “Password2.txt” (FAKE NEWS).

CMD behavior

The above screenshot details regular cmd behavior.

{
  onEnter: function (log, args, state) {
    log("CreateFileW()");
    var fileName = Memory.readUtf16String(args[0]);
    log(fileName);
    // Swap the first "1" for a "2"; the new name has the same length, so it fits the original buffer.
    if (fileName.includes("1")) {
      var newFileName = fileName.replace("1", "2");
      Memory.writeUtf16String(args[0], newFileName);
    }
  },
  onLeave: function (log, retval, state) {
  }
}

Our tweaked script is shown above. It checks whether the string in memory contains a certain character (here “1”) and, if so, replaces it. Hence, every time CreateFileW is called, Frida changes lpFileName to our desired value.
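One caveat with the check in the script: a plain string replace swaps the first “1” anywhere in the path, which may sit in a folder name rather than in the file name. The Python sketch below contrasts that behavior with a stricter variant that only touches the file name. It models the string logic only (no Frida involved), and the function names and sample path are hypothetical.

```python
import ntpath  # Windows path rules, usable on any platform

REAL_NAME = "Password1.txt"
FAKE_NAME = "Password2.txt"


def naive_redirect(path):
    """Mirror the script's logic: replace the first "1" found anywhere in the path."""
    return path.replace("1", "2", 1) if "1" in path else path


def redirect(path):
    """Only redirect when the file name itself is the one we want to swap."""
    folder, name = ntpath.split(path)
    if name.lower() != REAL_NAME.lower():
        return path
    return ntpath.join(folder, FAKE_NAME)


sample_path = r"C:\Users\user1\Password1.txt"
print(naive_redirect(sample_path))
print(redirect(sample_path))
```

Both variants keep the path length unchanged, which matters because the hook overwrites the caller's existing buffer in place. A longer replacement would need freshly allocated memory instead.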

CMD output after our tomfoolery

Conclusion

What have we learned?

  • What DBI is, what it can do, and why it's better than your traditional debugger.
  • We met Frida, a wildly powerful tool that makes our lives much, much easier.
  • How to use Frida in all sorts of different situations.

Relevant Links

https://www.frida.re/
https://github.com/frida/frida/wiki/Comparison-of-function-hooking-libraries
https://github.com/frida/frida-python/blob/master/src/frida/tracer.py#L838
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
https://github.com/frida/frida-java/tree/master/test/re/frida


Credits

Tomer Zayit is an information security researcher at F5 Networks and an open-source programmer.

Originally posted on Digital Whisper