3.3 Directory Traversal Vulnerability Exercise

1. Introduction

Directory traversal vulnerabilities allow an attacker to escape basic file-system sandboxing. In many situations, we wish to provide access only to resources within a specific directory. In the presence of a directory traversal vulnerability, an attacker can access resources outside of the "safe" directory. This is often accomplished using the "dot dot slash" (../) mechanism supported by most modern file systems. When present in a path to a file, this element tells the file system to go up to the parent directory. For example, the following paths refer to the same file on a Unix system where /bin is an ordinary directory rather than a symbolic link: /etc/passwd and /bin/../etc/passwd.
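To make the "dot dot slash" example concrete, the following minimal Java sketch collapses the relative element lexically. The class and method names are hypothetical and chosen only for illustration. Note that Path.normalize() works purely on the text of the path and never consults the file system, so it does not account for symbolic links.

```java
import java.nio.file.Paths;

class DotDotDemo {
    // Lexically collapse "." and ".." elements without touching the disk.
    static String collapse(String rawPath) {
        return Paths.get(rawPath).normalize().toString();
    }

    public static void main(String[] args) {
        // Both spellings collapse to the same string.
        System.out.println(collapse("/bin/../etc/passwd"));
        System.out.println(collapse("/etc/passwd"));
    }
}
```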

There are different character sequences that control moving between directories in a path name. Unix systems use "/" as a separator, while Windows uses "\" (and sometimes recognizes "/"). While ASCII has historically been the most common character encoding, Unicode is seeing increasing use. And when path names form part of a URL, we will also see percent encoding used, so the same traversal sequence can appear in several different spellings.
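As an illustration of why encodings matter, the sketch below shows a percent-encoded traversal sequence slipping past a naive substring check that runs before decoding. The class, the method names, and the file name secret.txt are hypothetical. It assumes Java 10 or later for the URLDecoder.decode(String, Charset) overload.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class EncodedTraversalDemo {
    // Decode percent-encoded characters (for example, %2e is "." and %2f is "/").
    static String decode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    // A deliberately naive filter: looks only for the literal "../" sequence.
    static boolean looksLikeTraversal(String s) {
        return s.contains("../");
    }

    public static void main(String[] args) {
        String raw = "%2e%2e%2fsecret.txt";
        System.out.println(looksLikeTraversal(raw));         // checked before decoding
        System.out.println(looksLikeTraversal(decode(raw))); // checked after decoding
    }
}
```

The point of the sketch is only that a check must be applied to the form of the path that actually reaches the file system.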

1.1 Exercise Description

As you can imagine, there are many situations in which directory traversal could be a problem. For example, a web site that loads user-selected images from a directory on the web server may be vulnerable. Another example is a naive file transfer implementation that restricts a user's working directory. For this exercise, we will consider a situation where a Java application must execute a third-party binary that is selected by a user. This could be present in a remote access mechanism where only the commands found in a certain directory are available. Your objective will be to input a string that executes a program outside of that directory.

We provide a simple Java program that executes a program found in a "safe" directory. The program takes a single argument: the name of the executable. It executes the program with that name found in the safe_programs directory, which is located in the working directory of the Java program. For now, the only two programs in that directory are binary implementations of the pwd and whoami commands.

1.2 Vulnerability Mitigation

To mitigate the vulnerability in this exercise, we will use Java's implementation of "real" paths to ensure that the file specified has the correct parent directory. For this purpose, we want canonical paths. A canonical path is one that contains no relative path elements (e.g., no dot dot slash) and expands all symbolic links (i.e., if part of a path is a link to another location, that part is replaced with the location it points to). The important property of canonical paths is that any two paths referring to the same file system object have identical canonical paths. This eliminates the complications of detecting relative path elements ourselves.
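The following sketch illustrates that property under stated assumptions: it builds a throwaway directory tree in the system temporary directory, spells the same file two different ways, and compares the results of Path.toRealPath(). All class, method, and file names are hypothetical. By default toRealPath() follows symbolic links, and it throws an IOException if the file does not exist.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RealPathDemo {
    // Create a temporary tree and return two spellings of one file:
    //   <tmp>/target.txt  and  <tmp>/sub/../target.txt
    static Path[] twoSpellings() {
        try {
            Path root = Files.createTempDirectory("realpath_demo");
            Path sub = Files.createDirectory(root.resolve("sub"));
            Path file = Files.createFile(root.resolve("target.txt"));
            return new Path[] { file, sub.resolve("..").resolve("target.txt") };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Compare two paths by their canonical ("real") form.
    static boolean sameRealPath(Path a, Path b) {
        try {
            return a.toRealPath().equals(b.toRealPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path[] p = twoSpellings();
        System.out.println(p[0].equals(p[1]));       // false: the spellings differ
        System.out.println(sameRealPath(p[0], p[1])); // true: same file on disk
    }
}
```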

2. Exercise Instructions

This exercise will be completed entirely on the command line terminal of the provided virtual machine. To open the terminal, right-click on the "EXERCISES" directory and select "Open in Terminal". Enter the following command to change into the exercise directory:

cd 3.6.1_directory_traversal

2.1 Compile the Program

We provide a Makefile that will compile the program. Every time you change the Main.java file, you must recompile the program before running it again. Enter the following command to compile the program:

make

2.2 Run the Program

We will run this Java program from the command line. You must provide an executable name to the program. Execute the program by entering the following command, where "exe_name" is the name of an executable in the safe_programs directory:

java Main exe_name

Note: If you somehow execute a program that gets stuck without terminating, you can press Ctrl+C to terminate it.

2.3 Inspect the Program Code

Now that you understand the basic behavior of the program, it's time to look at the implementation. This program is implemented in Main.java. Use your favorite text editor to open this file. Enter the following command to open it in Nano:

nano Main.java

Spend some time looking at the code to find our attack vector. How might our malicious input get to the file system?

2.4 Exploit the Vulnerability

As we noticed above, the program simply "resolves" our input filename against the safe directory. Using Path.resolve(...) and the information in the table above, craft an input that executes the unsafe_program in our working directory (one level above the safe directory). In a real attack, we could climb the directory tree all the way to root and execute any zero-argument program in the filesystem, but let's stick to the unsafe_program in our working directory. You should see the following output on successful exploitation:

UNSAFE PROGRAM OUTPUT

Program Exit Code: 0
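
For reference, the behavior of Path.resolve(...) that this section relies on can be sketched as follows. This is not the code from Main.java; the class name, the constant, and the sample input "tools/../pwd" are hypothetical, and the input is chosen so that it stays inside the safe directory. The sketch only shows that resolve() joins the input onto the base path exactly as typed, without collapsing or rejecting relative elements.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ResolveDemo {
    // Hypothetical stand-in for the exercise's safe directory.
    static final Path SAFE_DIR = Paths.get("safe_programs");

    // Join a user-supplied name onto the safe directory, exactly as typed.
    static Path target(String exeName) {
        return SAFE_DIR.resolve(exeName);
    }

    public static void main(String[] args) {
        // A plain name lands inside the safe directory.
        System.out.println(target("pwd"));
        // Relative elements are kept verbatim by resolve().
        System.out.println(target("tools/../pwd"));
        // normalize() collapses them lexically afterwards.
        System.out.println(target("tools/../pwd").normalize());
    }
}
```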

2.5 Mitigate the Vulnerability

As mentioned in the introduction, we will compare canonical paths to determine whether the specified file is within the correct directory. First, take a look at the Path API documentation (java.nio.file) to see what operations we can use. We'll need to use the "real" path mechanism to canonicalize both paths, then ensure that the executable's parent is exactly the same as our safe directory. For our path to truly be canonical, we must follow and replace links. Try to implement this using the Path API. If we get an input that does not satisfy this condition, print an error message and return a negative value. Remember to recompile your program before testing.
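If you get stuck, the parent-directory comparison described above might look something like the sketch below. It is one possible approach under stated assumptions, not the reference solution: the class, the helper names, and the temporary fixture (a safe_programs directory containing a file named ok, next to a file named unsafe_program) are all hypothetical and exist only so the example runs without the exercise files. Treating any IOException as a rejection is a design choice of this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SafeDirCheck {
    // True only if the named file exists and its real parent is the real safe directory.
    static boolean isDirectChild(Path safeDir, String exeName) {
        try {
            Path realSafe = safeDir.toRealPath();                 // follows links
            Path realExe = safeDir.resolve(exeName).toRealPath(); // follows links
            return realSafe.equals(realExe.getParent());
        } catch (IOException e) {
            return false; // missing or unreadable path: reject
        }
    }

    // Build a throwaway layout and return the path of its safe directory.
    static Path makeFixture() {
        try {
            Path root = Files.createTempDirectory("traversal_demo");
            Path safe = Files.createDirectory(root.resolve("safe_programs"));
            Files.createFile(safe.resolve("ok"));
            Files.createFile(root.resolve("unsafe_program"));
            return safe;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path safe = makeFixture();
        System.out.println(isDirectChild(safe, "ok"));                // true
        System.out.println(isDirectChild(safe, "../unsafe_program")); // false
        System.out.println(isDirectChild(safe, "missing"));           // false
    }
}
```

Because both sides are canonicalized before the comparison, a symbolic link placed inside the safe directory that points elsewhere is also rejected: its real parent is no longer the safe directory.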

Repeat the process of implementing the mitigation, compiling the program, and testing the behavior until you can confirm that the program's desired behavior is unchanged and that our exploits no longer work. Congratulations, you've successfully mitigated the directory traversal vulnerability!