
Auditing Closed-Source Applications

Using reverse engineering in a security context
Speech Outline:
1. Different Approaches to auditing binaries
2. How to spot common programming mistakes in the
binary
3. Writing a small script that automates the task of
searching for suspicious coding constructs
4. Example of using this script to find a buffer
overflow in a major web server application.

HalVar Flake

White Hat vs Black Hat auditing


White Hat Auditing:
Trying to ensure application security by auditing every line of code in a
given application, hopefully fixing all problems and leaving the program
in a secure and stable condition

All code has to be audited

Continues after a vulnerability has been found

Has to be repeated upon every upgrade

Black Hat Auditing


Trying to find a single vulnerable condition through
which the security of an application can be compromised


Only audits suspicious parts of the code


Only one vulnerable condition is needed
Will only be repeated if all old problems
have been fixed.

Closed-Source Auditing Approaches


1. Stress Testing with Junk Input
Long strings of data are more or less randomly generated and sent to the application,
usually trying to overflow every single string that gets parsed by a certain protocol.
Pros:

Stress testing tools are re-usable for a given protocol


Will work automatically with little to no supervision
Do not require specialized personnel to use

Cons:


The analyzed protocol needs to be known in advance


Complex problems involving several conditions at once
will be missed
Undocumented options and backdoors will be missed
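A minimal sketch of the junk-input idea, assuming a hypothetical line-based protocol of the form "VERB argument\r\n". The helper name make_junk_request and the request layout are illustrative only, and a fixed fill character stands in for the more or less random data mentioned above.

```c
#include <stddef.h>
#include <string.h>

/* Build "<verb> <len x fill>\r\n" into out (capacity cap).
 * Returns the number of bytes written (excluding the NUL),
 * or 0 if the request would not fit. Hypothetical helper. */
size_t make_junk_request(char *out, size_t cap,
                         const char *verb, size_t len, char fill)
{
    size_t vlen = strlen(verb);
    size_t total = vlen + 1 + len + 2;   /* verb, space, junk, CRLF */

    if (total + 1 > cap)
        return 0;
    memcpy(out, verb, vlen);
    out[vlen] = ' ';
    memset(out + vlen + 1, fill, len);   /* the over-long string under test */
    memcpy(out + vlen + 1 + len, "\r\n", 2);
    out[total] = '\0';
    return total;
}
```

A stress tester would call this in a loop with growing len for every string field the protocol parses and watch the target for crashes.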

Closed-Source Auditing Approaches


2. Manual Reverse Engineering
A reverse engineer carefully reads the disassembly of the program, tediously reconstructing the program flow and spotting programming errors. This was the
approach Joey__ demonstrated at BlackHat Singapore.
Pros:

Even the most complex issues can be spotted

Cons:


The process involved is incredibly time-consuming


A highly skilled and specialized auditor is needed
There is an inherent danger that the auditor will burn out
and thus miss obvious problems

Closed-Source Auditing Approaches


3. Looking at suspicious code constructs
A reverse engineer audits calls to functions which are known to be the source of common
programming errors. The auditor looks through these calls and decides which ones to read more
closely.
Pros:

Reasonable depth: Even relatively complex issues can
be uncovered
In comparison to a complete manual audit, this approach
saves quite a bit of time
The process of looking for suspicious constructs can be
automated to a certain degree

Cons:

Not all problems will be uncovered
Needs highly specialized auditor
If nothing is found, the auditor is back to approach Nr. 2

The right tool for the task:


IDA Pro by Ilfak Guilfanov
www.datarescue.com


Can disassemble x86, SPARC, MIPS and much more ...


Includes a powerful scripting language
Can recognize statically linked library calls
Features a powerful plug-in interface

Dynamically linked library calls:


Diagram of program flow: the Application Code in the Executable Image calls through the Dynamic Linkage Table into the Dynamic Library, which holds the code for strcpy( ), sprintf( ), strcat( ), ...

Statically linked library calls :


Diagram of program flow: the Executable Image contains both the Application Code and the code for strcpy( ), strcat( ), ...

Assembly recap: Passing arguments


A simple example
void *memcpy(void *dest, void *src, size_t n);
Assembly representation:
    push    4
    mov     eax, unkn_40D278
    push    eax
    lea     eax, [ebp+var_458]
    push    eax
    call    _memcpy
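For orientation, a C call that could plausibly compile to the listing above. This is a sketch under assumptions: unkn_40D278 and var_458 are the disassembler's labels, and their types and the buffer size used here are guesses. Arguments are pushed right to left, so n (4) is pushed first and dest last.

```c
#include <string.h>

/* Stand-ins for the disassembler's labels; types and sizes are assumptions. */
static const char source_data[4] = { 'a', 'b', 'c', 'd' };
static const void *unkn_40D278 = source_data;   /* global holding a pointer */

void copy_example(char *out4)
{
    char var_458[0x458];    /* local at [ebp+var_458]; size assumed */

    /* push 4                  -> n    (third argument, pushed first)
     * mov eax, unkn_40D278
     * push eax                -> src  (second argument)
     * lea eax, [ebp+var_458]
     * push eax                -> dest (first argument, pushed last) */
    memcpy(var_458, unkn_40D278, 4);

    memcpy(out4, var_458, 4);   /* hand the bytes back for the demo */
}
```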

Dangerous Programming Constructs


The classical strcpy/strcat

The source is variable, not a static string


This call targets a stack buffer


Dangerous Programming Constructs


The classical strcpy/strcat

Criteria for suspicious strcpy/strcat calls:


Does the call target a stack or heap buffer of fixed size ?
Is the source buffer dynamic and not a fixed string ?
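At source level, a call meeting both criteria might look like the following sketch. The function format_user and its buffer size are hypothetical; the first strcpy has a static source and would not be flagged, the strcat has a caller-controlled source and would.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler: builds "user=<name>" in a fixed stack buffer. */
void format_user(const char *user, char *out, size_t out_size)
{
    char line[32];              /* fixed-size stack buffer */

    strcpy(line, "user=");      /* static source, length known: not suspicious */
    strcat(line, user);         /* dynamic source: overflows once strlen(user) > 26 */

    snprintf(out, out_size, "%s", line);   /* hand the result back for the demo */
}
```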


Dangerous Programming Constructs


sprintf( ) targeting fixed buffers

Expanded strings are not static and not fixed in length


Format string containing %s
Target buffer is a stack buffer

Dangerous Programming Constructs


sprintf( ) targeting fixed buffers

Criteria for suspicious sprintf( ) calls:


Does the call target a stack or heap buffer of fixed size ?
Does the format string contain a %s ?
Is the expanded string of non-fixed length ?
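The criteria above, seen from the source side, in a short sketch. Both functions are hypothetical: build_path matches all three criteria, while build_id targets a fixed buffer but expands to a fixed length and would not be of interest.

```c
#include <stddef.h>
#include <stdio.h>

/* Suspicious: fixed stack buffer, %s in the format, caller-controlled name. */
int build_path(const char *name, char *out, size_t out_size)
{
    char path[256];

    sprintf(path, "/docs/%s.html", name);   /* overflows once strlen(name) > 244 */
    return snprintf(out, out_size, "%s", path);
}

/* Not suspicious: the expanded string always has the same length. */
int build_id(unsigned value, char *out, size_t out_size)
{
    char id[9];

    sprintf(id, "%08x", value & 0xffffffffu);   /* always 8 characters + NUL */
    return snprintf(out, out_size, "%s", id);
}
```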


Dangerous Programming Constructs


*scanf( ) parsing untrusted input

Data is parsed into stack buffers


Format string contains %s

Dangerous Programming Constructs


*scanf( ) parsing untrusted input

Criteria for suspicious *scanf( ) calls:


Does the format string contain a %s ?
Does the call parse a string into a fixed buffer ?
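A sketch of what the two criteria separate, using a hypothetical "USER name" line and hypothetical function names. A bare %s stores a token of any length; a %s with a field width is bounded. Note that a plain substring search for "%s" in the format string would not match "%63s".

```c
#include <stdio.h>

/* Suspicious shape: "%s" without a width stores an arbitrarily long token. */
int parse_user_unbounded(const char *line, char name[64])
{
    return sscanf(line, "USER %s", name) == 1 ? 0 : -1;
}

/* "%63s" stores at most 63 characters plus the terminating NUL. */
int parse_user_bounded(const char *line, char name[64])
{
    return sscanf(line, "USER %63s", name) == 1 ? 0 : -1;
}
```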


Dangerous Programming Constructs


strncat/strncpy failing to null-terminate

Copying data into a stack buffer again ...


If the source is n bytes (here 4000) or longer,
no terminating NUL will be appended

Dangerous Programming Constructs


strncat/strncpy failing to null-terminate

The target buffer is only n bytes long

Dangerous Programming Constructs


strncat/strncpy failing to null-terminate
Criteria for suspicious strncat/strncpy( ) calls:
Is the length n the same size as or bigger than the targeted
buffer ?
Is the source buffer dynamic and not a fixed string ?
Does the call target a stack or heap buffer of fixed size ?
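The termination problem can be shown in a few lines. This is a sketch with hypothetical helper names and an 8-byte buffer chosen for brevity: when n equals the buffer size, strncpy leaves no room for the terminator once the source is n bytes or longer.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if a NUL occurs within the first n bytes of buf. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Suspicious: n is the full buffer size, so a source of 8 or more
 * bytes fills dst completely and no NUL is written. */
void copy_suspicious(char dst[8], const char *src)
{
    strncpy(dst, src, 8);
}

/* Copy one byte less and terminate explicitly. */
void copy_terminated(char dst[8], const char *src)
{
    strncpy(dst, src, 7);
    dst[7] = '\0';
}
```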


Dangerous Programming Constructs


format-string vulnerabilities

Format string is a dynamic variable


Argument deficiency


Dangerous Programming Constructs


format-string vulnerabilities

Criteria for suspicious *printf-calls


Does the call suffer from an argument deficiency ?
Is the format string dynamic instead of a static string ?
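Both criteria in one sketch, with hypothetical function names. In as_format the caller-supplied string is the format itself and no further arguments are passed (the argument deficiency); in as_data the format is static. The demo input only uses "%%", which consumes no arguments, so the difference is visible without undefined behavior.

```c
#include <stddef.h>
#include <stdio.h>

/* Suspicious: dynamic format string, no arguments after it. */
int as_format(char *out, size_t n, const char *user)
{
    return snprintf(out, n, user);
}

/* Static format string, user data passed as an argument. */
int as_data(char *out, size_t n, const char *user)
{
    return snprintf(out, n, "%s", user);
}
```

With the input "100%%", as_format interprets the directive and produces "100%", while as_data copies the text unchanged. Directives such as %x or %n in the same position would read or write through arguments that were never passed.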


Dangerous Programming Constructs


Cast-screwups
void func(char *dnslabel)
{
    char buffer[256];
    char *indx = dnslabel;
    int count;

    /* Where char is signed (as on typical x86 compilers), a length
       byte with the high bit set is sign-extended to a negative int. */
    count = *indx;
    buffer[0] = '\x00';
    while (count != 0 && (count + strlen (buffer)) < sizeof (buffer) - 1)
    {
        strncat (buffer, indx, count);
        indx += count;
        count = *indx;
    }
}

Dangerous Programming Constructs


Cast-screwups
Criteria for suspicious size_t utilization
Does the function copy memory with a size_t as length ?
Is the size_t a dynamic value instead of a hardwired one ?
Is the size_t subtracted from immediately before the call ?
Is the size_t at any point written after it has been sign-extended with the movsx-mnemonic ?
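A sketch of why the sign extension matters, with hypothetical helper names. A length byte of 0x80 or above read through a signed char becomes a negative int; once that int meets a size_t in the bounds check, it is converted to a very large unsigned value and the sum can wrap around to a small one.

```c
#include <stddef.h>

/* Length byte read through a signed char, as movsx would load it. */
int length_sign_extended(const char *p)
{
    return *(const signed char *)p;
}

/* Same byte read through an unsigned char, as movzx would load it. */
int length_zero_extended(const char *p)
{
    return *(const unsigned char *)p;
}

/* The bounds check from func(): count is converted to size_t for the
 * addition, so a negative count can wrap the sum to a small value. */
int check_passes(int count, size_t used)
{
    return (count + used) < (size_t)255;
}
```

With 128 bytes already in the buffer and a length byte of 0x80, the sum wraps to 0, the check passes, and strncat receives the negative count as a huge size_t.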


Automating the boring parts:


Hands on: A simple sprintf( ) analyzing script
Things to check for when analyzing a sprintf()-call:
Does the sprintf( ) target a static buffer ?
Does the format string contain an %s ?
Does the call suffer from an argument deficiency ?
If so, is the format string static or dynamic ?


Automating the boring parts:


Hands on: A simple sprintf( ) analyzing script

static GetStackCorr(lpCall)
{
    // Trace the code further until an add esp, somevalue is found
    while((GetMnem(lpCall) != "add")||(GetOpnd(lpCall, 0) != "esp"))
        lpCall = Rfirst(lpCall);
    // Convert the somevalue to a number and return it
    return(xtol(GetOpnd(lpCall, 1)));
}

Automating the boring parts:


Hands on: A simple sprintf( ) analyzing script
static GetBinString(eaString)
{
    auto strTemp, chr;
    // Zero the string
    strTemp = "";
    // Get a byte
    chr = Byte(eaString);
    // Until either a NULL or a 0xFF is found, append one byte at
    // a time to the string, then return the string.
    while((chr != 0)&&(chr != 0xFF))
    {
        strTemp = form("%s%c", strTemp, chr);
        eaString = eaString + 1;
        chr = Byte(eaString);
    }
    return(strTemp);
}

Automating the boring parts:


Hands on: A simple sprintf( ) analyzing script
Steps to take to retrieve argument n of a call:
1. Locate the n-th push before the function call
2. If an immediate offset is pushed, return this value
3. If a register was pushed, trace back until the
instruction is found which loaded the register
and return the value it was loaded with
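The three steps can be modelled outside the disassembler. The following is a simplified sketch in C, not the IDC script: instructions are a plain array instead of cross-references, the struct insn type and the register list are assumptions, and, as a small deviation, pushes are skipped when searching for the instruction that loaded the register.

```c
#include <stddef.h>
#include <string.h>

struct insn { const char *mnem; const char *op0; const char *op1; };

static int is_register(const char *op)
{
    static const char *regs[] = { "eax", "ebx", "ecx", "edx",
                                  "esi", "edi", "ebp", "esp" };
    size_t i;

    for (i = 0; i < sizeof regs / sizeof regs[0]; i++)
        if (strcmp(op, regs[i]) == 0)
            return 1;
    return 0;
}

/* Operand text of argument n (1-based) of the call at code[call_idx],
 * or NULL if it cannot be found. */
const char *get_arg(const struct insn *code, int call_idx, int n)
{
    int i = call_idx;
    const char *reg;

    /* 1. Locate the n-th push before the call */
    while (n > 0) {
        if (--i < 0)
            return NULL;
        if (strcmp(code[i].mnem, "push") == 0)
            n--;
    }
    /* 2. An immediate or offset was pushed: return it */
    if (!is_register(code[i].op0))
        return code[i].op0;
    /* 3. A register was pushed: find the instruction that loaded it */
    reg = code[i].op0;
    while (--i >= 0)
        if (strcmp(code[i].mnem, "push") != 0 &&
            strcmp(code[i].op0, reg) == 0)
            return code[i].op1;
    return NULL;
}
```

Applied to the memcpy listing from the assembly recap, argument 1 resolves to [ebp+var_458], argument 2 to unkn_40D278 and argument 3 to 4.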


static GetArg(lpCall, n)
{
    auto TempReg;
    // Trace back until the n-th push is found
    while(n > 0)
    {
        lpCall = RfirstB(lpCall);
        if(GetMnem(lpCall) == "push")
            n = n-1;
    }
    // Is the pushed operand a register ?
    if(GetOpType(lpCall, 0) == 1)
    {
        TempReg = GetOpnd(lpCall, 0);
        // Find where the register was last accessed ...
        lpCall = RfirstB(lpCall);
        while(GetOpnd(lpCall, 0) != TempReg)
            lpCall = RfirstB(lpCall);
        // ... and return the value which was pushed ...
        return(GetOpnd(lpCall, 1));
    }
    else return(GetOpnd(lpCall, 0));
}

static AuditSprintf(lpCall)
{
    auto fString, fStrAddr, buffTarget;

    buffTarget = GetArg(lpCall, 1);
    fString = GetArg(lpCall, 2);
    // Clean up the arguments
    if(strstr(fString, "offset") != -1)
        fString = substr(fString, 7, -1);
    fStrAddr = LocByName(fString);
    fString = GetBinString(fStrAddr);
    // Check for argument deficiency
    if(GetStackCorr(lpCall) < 12)
        // Check for a dynamic format string
        if(strlen(fString) < 2)
            Message("%lx --> Format String Problem ?\n", lpCall);
    // Scan the format string for %s
    if(strstr(fString, "%s") != -1)
        // Check if the target is a stack variable
        if(strstr(buffTarget, "var_") != -1)
            Message("%lx --> Overflow problem ? \"%s\"\n", lpCall, fString);
}
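The decision logic of the audit function, restated as a small C sketch so it can be tried on sample values. The function name and flag constants are hypothetical. The threshold of 12 presumably reflects three 4-byte arguments: a stack correction below 12 means only the target and the format string were pushed. A recovered format string shorter than 2 characters is taken here to mean that no static string was found at that address.

```c
#include <string.h>

enum { FLAG_FORMAT = 1, FLAG_OVERFLOW = 2 };

/* target:     operand text of argument 1, e.g. "[ebp+var_100]"
 * fmt:        recovered format string, "" if none was found
 * stack_corr: value of the add esp, N that follows the call */
int audit_sprintf(const char *target, const char *fmt, int stack_corr)
{
    int flags = 0;

    /* Argument deficiency together with a non-static format string */
    if (stack_corr < 12 && strlen(fmt) < 2)
        flags |= FLAG_FORMAT;
    /* %s expanded into a stack variable */
    if (strstr(fmt, "%s") != NULL && strstr(target, "var_") != NULL)
        flags |= FLAG_OVERFLOW;
    return flags;
}
```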



static main()
{
    auto FuncAddr, xref;
    // Ask auditor to enter the address of the sprintf( )
    FuncAddr = AskAddr(-1, "Enter address:");
    // Walk the code cross-references to that address
    xref = RfirstB(FuncAddr);
    while(xref != -1)
    {
        // Call the auditing function once for each call to sprintf( )
        if(GetMnem(xref) == "call")
            AuditSprintf(xref);
        xref = RnextB(FuncAddr, xref);
    }
    // Repeat for all indirect calls
    xref = DfirstB(FuncAddr);
    while(xref != -1)
    {
        if(GetMnem(xref) == "call")
            AuditSprintf(xref);
        xref = DnextB(FuncAddr, xref);
    }
}

Hands on: Seeing the script in action


Running it against iWS 4.1 SHTML.DLL

We feed our script the number 0x10007068


The result:

This looks as if we can supply a very long string here ...



Hands on: Seeing the script in action


Running it against iWS 4.1 SHTML.DLL

The suspicious call ...


... the target buffer push ...
... and the corresponding register load.

Hands on: Seeing the script in action


Running it against iWS 4.1 SHTML.DLL

... and a target buffer of only 256 bytes ...


Happy End
Why doesn't the webserver respond any more ?
