Assignments

Over the course of the semester, your instructor will give you many different kinds of assignments. Here, we briefly describe each type and the strategies you can use to solve them.

Deobfuscate Source Code

Here, you are given an obfuscated C source file and are asked to return another C source file where as much of the obfuscation as possible has been removed. In other words, you will be graded on how close your deobfuscated program is to the original, unobfuscated program. Below you can see how your instructor generates your challenge (obf.c), which you then try to deobfuscate into result.c.

The goal of this assignment is for you to get familiar with different kinds of obfuscating transformations, and how you might defeat them. Working on source first makes the assignment a bit easier, and prepares you for later assignments where you will encounter the same problem on binary code. Depending on how nice your instructor is, they can either keep the identifiers intact in obf.c or obfuscate them!

Since you're working on source code here, you have a lot of options! You can, for example, edit obf.c and add printf statements to trace the execution of the code. Or, you can compile it and run it in a debugger. Or, you can compile it and load it into a reverse engineering tool like Ghidra or Binary Ninja. Regardless, your first task should be to try to identify the obfuscation(s) that have likely been applied to the code, such as Flattening or Virtualization. Once you know that, you can proceed to work out the details of that transformation, such as which dispatch method has been used.
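
To get a feel for what you are looking for, here is a minimal hand-written sketch of flattening with switch dispatch. It is not actual Tigress output; the function names gcd_original and gcd_flattened, the block numbering, and the trace parameter are all made up for illustration. The telltale sign is a loop around a switch on a "next block" variable, and the trace line shows how a single print statement can reveal the order in which blocks execute.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical original function (illustration only). */
unsigned gcd_original(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* The same function, flattened by hand: every basic block becomes a
 * case, and the variable `next` decides which block runs next. */
unsigned gcd_flattened(unsigned a, unsigned b, int trace) {
    unsigned t = 0;
    int next = 0;
    for (;;) {
        if (trace)  /* the kind of printf you might add yourself */
            fprintf(stderr, "block %d: a=%u b=%u\n", next, a, b);
        switch (next) {
        case 0:  /* loop test */
            next = (b != 0) ? 1 : 2;
            break;
        case 1:  /* loop body */
            t = a % b;
            a = b;
            b = t;
            next = 0;
            break;
        case 2:  /* exit block */
            return a;
        default:
            assert(0 && "unreachable block");
            return 0;
        }
    }
}
```

Calling gcd_flattened(48, 18, 1) prints the sequence of blocks visited to stderr. From such a trace you can redraw the control flow graph and recover the original loop.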

Deobfuscate and Decompile Binary Code

Here, you are given an obfuscated executable file obf.exe and are asked to return a C source file result.c where as much of the obfuscation as possible has been removed, and you have decompiled the code to C source:

The goal of this assignment is for you to practice your assembly code skills, learning how to figure out what different instructions do, and how to translate them back to source. You can, of course, start by using a decompiler (such as the ones provided by Ghidra or IDA Pro), but, depending on the obfuscations your instructor has applied to orig.c, this may be more or less successful! Also, your instructor may have told gcc to dynamically link obf.exe (now all symbols are intact, which makes the task easier!) or they may have statically linked obf.exe and stripped off all symbols (now it is much harder to find the different functions!).

Tamper with Binary Code

Here, you are given an obfuscated executable file obf.exe and are asked to return another executable file result.exe which you have tampered with such that it no longer exhibits a certain behavior (such as crashing). For example, you may be asked to modify a function foo while, at the same time, ensuring that the checksum that protects foo isn't triggered:
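
To make the idea of a protecting checksum concrete, here is a minimal sketch of what such a guard might compute. It is an assumption for illustration, not the check any particular tool inserts: the names region_checksum and region_is_intact are hypothetical, the rotate-and-xor hash is chosen only for simplicity, and a plain byte buffer stands in for the machine code of foo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate-and-xor checksum over a byte region (illustrative only). */
uint32_t region_checksum(const unsigned char *region, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (sum << 1) | (sum >> 31);  /* rotate left by one bit */
        sum ^= region[i];
    }
    return sum;
}

/* Returns 1 if the region still hashes to the expected value. */
int region_is_intact(const unsigned char *region, size_t len,
                     uint32_t expected) {
    assert(region != NULL);
    return region_checksum(region, len) == expected;
}
```

Changing even one byte of the region changes the checksum, so a successful attack has to either locate and disable the comparison or patch the expected value to match the modified code.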

Extract a Secret Asset

Here, you are given an obfuscated C source file which contains an asset that you are asked to find. An asset is typically some data like a secret string, a cryptographic key, or a machine learning model. In this case you will turn in a text file (result.txt) with the extracted data. Note that this is different from many of the other assignment types, where you are asked to turn in source or binary code.

Depending on the nature of the data and the obfuscations applied, there are different ways you can attack this problem. For example, if you're asked to look for a secret string, and the program isn't heavily obfuscated, a first attack is simply to use Linux's "strings" command to print out all the strings in the program! If you're asked to extract a cryptographic key, a first attack is to look for sequences of bytes in the program with high entropy (randomness) - such random bytestrings may very well be the key you're looking for! A famous paper, "Playing Hide and Seek with Stored Keys", discusses such attacks.
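
The search for random-looking bytes can be sketched as a sliding window over the program's bytes. The version below is a simplified stand-in, not the method from the paper: instead of computing a true entropy estimate, it counts the distinct byte values in each window as a cheap proxy for randomness. The function names distinct_bytes and find_key_candidate and the choice of scoring are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of distinct byte values in buf[0..len). */
size_t distinct_bytes(const unsigned char *buf, size_t len) {
    unsigned char seen[256];
    size_t count = 0;
    memset(seen, 0, sizeof seen);
    for (size_t i = 0; i < len; i++) {
        if (!seen[buf[i]]) {
            seen[buf[i]] = 1;
            count++;
        }
    }
    return count;
}

/* Offset of the window with the most distinct byte values.
 * On ties, the earliest such window wins. */
size_t find_key_candidate(const unsigned char *data, size_t len,
                          size_t window) {
    assert(window > 0 && window <= len);
    size_t best_off = 0, best_score = 0;
    for (size_t off = 0; off + window <= len; off++) {
        size_t score = distinct_bytes(data + off, window);
        if (score > best_score) {
            best_score = score;
            best_off = off;
        }
    }
    return best_off;
}
```

In practice you would read the executable into a buffer, pick a window the size of the key you expect (for example, 16 bytes for a 128-bit key), and inspect the top-scoring offsets by hand. Compressed or encrypted data will also score high, so treat the results as candidates only.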

Analyze Malware

Here, you are given a set of malware programs malware_*.exe, each differently generated and obfuscated. The goal of the assignment is for you to analyze the executable files and find some invariant - some aspect of the files that does not change when they are obfuscated - such that you can write a set of rules (here, result.yara) that will discriminate malware from goodware! In other words, you're asked to produce pattern matching rules that could be input to a virus scanner.

Protect Software

Here, you are given an original program aes.c and are asked to produce a Tigress obfuscation script result.sh which will obfuscate and/or tamperproof aes.c in such a way that it reaches a certain level of software complexity, or resilience against particular attacks, while staying within a particular performance bound:

Handling Academic Integrity Issues

To make it harder for students to cheat on these assignments, your instructor may generate a different assignment for each student in the class. For example, they may use your email address as a seed into the assignment generation process. Below, there are 3 students in the class, and their email addresses are used as seeds both to generate a random hash function and to obfuscate that function. The result is that each student gets a unique C source file to work on:
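
One way such per-student generation could work is sketched below. This is purely an illustration of the idea, not how your instructor's tooling actually derives seeds: the names seed_from_email, hash_params, params_for_student, and student_hash are hypothetical, FNV-1a is simply a convenient string hash, and the generated "hash function" is a toy multiply-and-add.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of the email string, used as the seed. */
uint32_t seed_from_email(const char *email) {
    assert(email != 0);
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *email != '\0'; email++) {
        h ^= (unsigned char)*email;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Parameters of a toy per-student hash function. */
typedef struct {
    uint32_t mul;
    uint32_t add;
} hash_params;

/* Derive the parameters deterministically from the student's email. */
hash_params params_for_student(const char *email) {
    uint32_t s = seed_from_email(email);
    hash_params p;
    p.mul = (s * 1664525u + 1013904223u) | 1u;  /* force an odd multiplier */
    p.add = s ^ 0x9e3779b9u;
    return p;
}

/* The function each student would be asked to analyze. */
uint32_t student_hash(hash_params p, uint32_t x) {
    return x * p.mul + p.add;
}
```

Because the seed depends only on the email address, the instructor can regenerate any student's exact challenge later for grading, while two students with different addresses get different functions.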