Directory traversal vulnerabilities allow an attacker to escape basic
file-system sandboxing.
In many situations, we wish to provide access only to resources within
a specific directory.
In the presence of a directory traversal vulnerability, an attacker can
access resources outside of the "safe" directory.
This is often accomplished using the "dot dot slash" mechanism provided
by all modern file systems.
When present in a path to a file, this mechanism tells the file system
to go back up to the parent directory.
For example, the following paths refer to the same file on a Unix
system: /etc/passwd
and /bin/../etc/passwd
.
There are different character sequences that control moving between directories in a path name. Unix systems use the "/" as a separator while Windows uses the "\" (and sometimes recognizes the "/". While ASCII historically has been the most common character encoding, Unicode is seeing increasing use. And, when path names form part of a URL, we will also see percent encoding used.
As you can imagine, there are many situations in which directory traversal could be a problem. For example, a web site that allows for loading user-selected images from a directory on the web server may be vulnerable. Another example may be in a naive file transfer implementation that restricts user's working directory. For this exercise, we will consider a situation where some Java application must execute a third-party binary that is selected by a user. This could be present in some remote access mechanism where only the commands found in a certain directory are available. Your objective will be to input a string that executes a program outside of that directory.
We provide a simple Java program that executes a program found in a
"safe" directory.
The program takes a single argument: the name of the executable.
It executes the program with that name found in the
safe_programs
directory located in the working directory
of the program.
For now, the only two programs are binary implementations of Bash's
pwd
and whoami
commands.
To mitigate the vulnerability in this exercise, we will use Java's implementation of "real" paths to ensure that the file specified has the correct parent directory. For this purpose, we want canonical paths. A canonical path is one that contains no relative path elements (i.e., no dot dot slash) and expands all symbolic-type links (i.e., if a path links to another file, it replaces the prefix of that path to the pointed-to file). The important notion of canonical paths is that any two paths referring to the same file system object have identical canonical paths. This will eliminate the complications of detecting relative path elements.
This exercise will be completed entirely on the command line terminal of the provided virtual machine. To open the terminal, right-click on the "EXERCISES" directory and select "Open in Terminal". Enter the following command to change into the exercise directory:
cd 3.6.1_directory_traversal
We provide a Makefile that will compile the program. Every time you
change the Main.java
file, you must recompile the program
before running it again. Enter the following command to compile the
program:
make
We will run this Java program from the command line.
You must provide an executable name to the program.
Execute the program by entering the following command, where "exe_name"
is the name of an executable in the safe_programs
directory:
java Main exe_name
Note: If you somehow execute a program that gets stuck without
terminating, you can press ctrl + c
to terminate the
program.
Now that you understand the basic behavior of the program, it's time
to look at the implementation.
This program is implemented in Main.java
.
Use your favorite text editor
to open this file. Enter the following command to open it in Nano:
nano Main.java
Spend some time looking at the code to find our attack vector. How might our malicious input get to the file system?
As we noticed above, the program simply "resolves" the program name
against our input filename.
Using
Path.resolve(...)
and the information in the table above, craft an input that executes
the "unsafe program" in our working directory (one level above the
safe directory).
In a real attack, we could climb the directory tree all the way to
root and execute any zero-argument program in the filesystem, but
let's stick to the unsafe_program
in our working
directory.
You should see the following output on successful exploitation:
UNSAFE PROGRAM OUTPUT
Program Exit Code: 0
As mentioned in the introduction, we will be comparing canonical paths to determine that the specified file is within the correct directory. First, take a look at the Paths API documentation to see what operations we can use. We'll need to use the "real" path mechanism to canonicalize both paths, then ensure that the executable's parent is exactly the same as our safe directory. For our path to truly be canonical, we must follow and replace links. Try to implement this using the Paths API. If we get an input that does not satisfy this condition, print an error message and return a negative value. Remember to compile your program before testing.
Repeat the process of implementing the mitigation, compiling the program, and testing the behavior until you can confirm that the program's desired behavior is unchanged and that our exploits no longer work. Congratulations, you've successfully mitigated the directory traversal vulnerability!