Jpeg Exploitation

Background

As promised, I looked into why the JPEG format could, for a while, be used to exploit Windows and various applications. The answer turned out to be pretty interesting. (If you aren’t already familiar with the JPEG format, take a look at my previous post first.)

Intro

As you probably remember (or just read), JPEG files consist of various markers that come before the actual image data. The interesting thing about these markers is the length of the data that follows each one, which is stored as 2 bytes. The important part is that this value includes the two length bytes themselves (for example, a length of 6 means 2 length bytes and 4 bytes of data). This means that a length of 0 or 1 is technically invalid, and parsing should halt immediately.
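
To make the length rule concrete, here is a minimal Python sketch that pulls one segment out of a byte string. The helper name `read_segment` and the sample bytes are made up for illustration, and it assumes a marker that actually carries a length field. Note how the 2-byte big-endian length covers itself, so the payload is `length - 2` bytes.

```python
import struct

def read_segment(data, offset):
    """Return (marker, payload, next_offset) for the segment at offset.

    Hypothetical helper for illustration only; assumes the marker at
    this offset is followed by a 2-byte big-endian length field.
    """
    marker, length = struct.unpack_from(">HH", data, offset)
    if marker >> 8 != 0xFF:
        raise ValueError("not a marker")
    if length < 2:
        # The length counts its own two bytes, so 0 and 1 are impossible.
        raise ValueError("invalid segment length %d" % length)
    end = offset + 2 + length          # marker (2) + length field and data
    payload = data[offset + 4:end]     # skip the marker and the length bytes
    return marker, payload, end

# A comment segment (0xFFFE) with length 6: 2 length bytes + 4 data bytes.
segment = b"\xff\xfe\x00\x06ABCD"
marker, payload, next_offset = read_segment(segment, 0)
print(hex(marker), payload, next_offset)   # 0xfffe b'ABCD' 8
```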

Explanation and Impact

The problem arose because the library that decoded JPEG files would read in a length and then subtract 2 from it to work out how many more bytes to read. For a length of 0 or 1, that gives -2 or -1. It turns out that, before any verification, the result was treated as an unsigned 32-bit integer. That turns -2 into roughly 4GB, and the subsequent read would trigger a buffer overflow.
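
The arithmetic is easy to reproduce. The sketch below is not the library's code; it only mimics what happens when `length - 2` ends up in an unsigned 32-bit integer (the function name `bytes_to_copy` is my own).

```python
def bytes_to_copy(declared_length):
    """Mimic 'length - 2' stored in an unsigned 32-bit integer."""
    return (declared_length - 2) & 0xFFFFFFFF

for declared in (6, 1, 0):
    print(declared, "->", bytes_to_copy(declared))

# 6 -> 4            (a normal segment: 4 bytes of data)
# 1 -> 4294967295   (-1 wraps around to just under 4GB)
# 0 -> 4294967294   (-2 wraps around to just under 4GB)
```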

So how severe was the problem? The vulnerable code resided in GDI+ (the Graphics Device Interface library), which Microsoft provides for many programs to use. Therefore, every program relying on this library was vulnerable, and Microsoft had a fairly long list of programs using it.

Detection and Removal

Now that the exploit was proven, how would one detect this attack? It is fairly straightforward once you know all of the facts: parse the markers as usual and look for a length of less than 2. An example script demonstrating this can be found at the end of this post. If you decide to test this, be warned: if you create a JPEG file with a manipulated marker length, all of the common anti-virus products seem to detect it. The AV will probably remove your image, and either way it gets reported in the AV stats… so I wouldn’t test it at work (like I did… that’s a fun one to explain).

The usual anti-virus removal seems a little harsh, as I would be rather upset if my personal images were deleted. Instead, let’s investigate whether we can remove the infection. For a JPEG, what we truly care about is the ability to show the picture. Since any header other than the SOS (Start of Scan, 0xFFDA) marker can be exploited, it is possible that the compressed image data is unmodified. So, we need to determine which markers we actually need in order to decompress an image. They are as follows:

  • The main marker (usually the JFIF marker: 0xFFE0)
  • Quantization table marker (0xFFDB)
  • Huffman table markers (0xFFC4)
  • Start of Frame marker (0xFFC0)
  • Start of Scan marker (0xFFDA)

If all of these markers are present and non-malicious, then remove the others and save the file. In that case the malicious data is most likely in the comment section of the JPEG, which is used primarily by image editors like Photoshop to store other information about the file.
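
A rough Python sketch of that clean-up step follows. It is not how any AV product works; the names `strip_nonessential` and `next_known_marker` are my own, and it keeps only the markers listed above (a real file may need others, for example an Exif 0xFFE1 header or a restart interval segment). One wrinkle: a segment with a bad length gives no clue where it ends, so the sketch resynchronises by searching for the next marker it recognises. That search is a heuristic and can be fooled by payload bytes that happen to look like a marker.

```python
import struct

SOS = 0xFFDA
# Markers the post lists as required: APP0 (JFIF), DQT, DHT, SOF0.
ESSENTIAL = {0xFFE0, 0xFFDB, 0xFFC4, 0xFFC0}

def next_known_marker(data, start):
    """Heuristic resync: offset of the next essential or SOS marker, or -1."""
    hits = [data.find(struct.pack(">H", m), start) for m in ESSENTIAL | {SOS}]
    hits = [h for h in hits if h != -1]
    return min(hits) if hits else -1

def strip_nonessential(data):
    """Return the JPEG bytes with every non-essential segment dropped."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG")
    out = bytearray(data[:2])
    seen = set()
    offset = 2
    while offset + 4 <= len(data):
        marker, length = struct.unpack_from(">HH", data, offset)
        if marker >> 8 != 0xFF:
            raise ValueError("could not find marker")
        if marker == SOS:
            if ESSENTIAL - seen:
                raise ValueError("essential segment missing")
            # Everything from SOS onward is copied untouched.
            return bytes(out) + data[offset:]
        if length < 2:
            if marker in ESSENTIAL:
                raise ValueError("essential segment is malicious")
            # Bad length in a disposable segment: its real size is unknown,
            # so skip ahead to the next marker we recognise.
            offset = next_known_marker(data, offset + 4)
            if offset == -1:
                raise ValueError("could not resynchronise")
            continue
        end = offset + 2 + length
        if marker in ESSENTIAL:
            seen.add(marker)
            out += data[offset:end]
        offset = end
    raise ValueError("ran out of data before Start of Scan")
```

Calling `strip_nonessential(open("photo.jpg", "rb").read())` (the file name is just a placeholder) returns the cleaned bytes, or raises `ValueError` when one of the essential segments is missing or damaged, which is the tricky case discussed next.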

If any of these markers are NOT present or are malicious, then it becomes tricky. What happens next will probably depend on whether the image came straight from a digital camera or was processed using an image editor. The reason lies in the structure of JPEG compression. I will elaborate on this in another post, but basically JPEG compression uses binary trees to assign shorter bit strings to common bytes. This is called Huffman coding, and if you don’t want to wait for my explanation there are plenty on the net. The important fact is that the tree used is then stored in the JPEG file. Since digital cameras do not have a lot of processing power, they often use a hard-coded Huffman tree for compression. This still shrinks the file, but the compression is not optimal. In contrast, Photoshop builds its tables from the image data. This gives optimal compression, but it uses different trees for each file.

So what does this all mean? If you have two images from the same digital camera (taken with the same compression settings), then in all likelihood you can use the tables from the intact image to fix the corrupted one. This would have a pretty good chance of success but, in all fairness to the AV companies, it would be hard to automate. If I get a chance, I’ll try to show an example in a later post. For now, I’ll leave you with the code I wrote that verifies a JPEG file is not malicious. If you want more information about the vulnerability and its exploitation then have a look here, as it is an excellent writeup of all the very technical details of the problem.
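
As a taste of what the table transplant might involve, here is a minimal Python sketch of the first half only: pulling the quantization and Huffman table segments out of a healthy donor image. The function name `extract_tables` is hypothetical, and splicing the tables into the damaged file is deliberately left out, since where they go depends on what is wrong with that file. It only helps if the donor really was written by the same camera with the same settings.

```python
import struct

DQT, DHT, SOS = 0xFFDB, 0xFFC4, 0xFFDA

def extract_tables(donor):
    """Return the raw DQT and DHT segments found before Start of Scan."""
    if donor[:2] != b"\xff\xd8":
        raise ValueError("donor is not a JPEG")
    tables = []
    offset = 2
    while offset + 4 <= len(donor):
        marker, length = struct.unpack_from(">HH", donor, offset)
        if marker >> 8 != 0xFF or length < 2:
            raise ValueError("donor is damaged too")
        if marker == SOS:
            break                      # compressed data starts here
        end = offset + 2 + length
        if marker in (DQT, DHT):
            # Keep the whole segment: marker, length and table data.
            tables.append(donor[offset:end])
        offset = end
    return tables
```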

Code

NOTE: this is not optimized, nor should it be used in production environments. I just wrote this to give you a quick look into how jpeg files are structured and how simple it is to detect a malicious one:

[EDIT] The first program is in Perl and I have added a second in C# below it. The C# program does not do as much checking as the Perl version, but it will still detect malicious JPEG files.

use warnings;
my $buffer;
my $numRead;
my %verified;
#APPn markers run from 0xFFE0 through 0xFFEF
my %signatures = ( "APPS"=> "^\xFF[\xE0-\xEF]", "QT"=>"^\xFF\xDB", "Huff"=>"^\xFF\xC4","SOF"=>"^\xFF\xC0","SOS"=>"^\xFF\xDA");

#quit if user did not provide a file name or the file doesn't exist
die "You must provide a filename" if(!$ARGV[0] || !(-f $ARGV[0]));

#open the file (three-argument form) and stop if it cannot be opened
open(my $IN, "<", $ARGV[0]) or die "Could not open $ARGV[0]: $!";

#use binary reading
binmode($IN);

#read in the first two bytes
$numRead = read($IN,$buffer,2);

#Die if it doesn't look like a jpeg file
die "File is not a JPEG" if!($buffer =~ /^\xFF\xD8/);

#loop forever
while(1)
{
	$numRead = read($IN,$buffer,4);

	#if there is nothing else to read
	if($numRead < 4)
	{
		#exit the program
		determineSuccess();
		last;
	}

	#exit if we are at the compressed data segment
	if($buffer =~ /^\xFF\xDA/)
	{
		$verified{"SOS"} = 1;
		#exit the program
		determineSuccess();
		last;
	}

	#if the header does not start with \xFF then it's corrupt
	if(!($buffer =~ /^\xFF/))
	{
		print "Could not find marker, file appears corrupt\n";
		last;
	}

	#check to see if we're at a known header
	foreach $key (keys(%signatures))
	{
		if($buffer =~ /$signatures{$key}/)
		{
			$verified{$key} = 1;
		}
	}

	#unpack the length as an unsigned short in big-endian
	$len = unpack("n",substr($buffer,2,2));

	#check to see if the length is bad
	if($len < 2)
	{
		print "Jpeg appears to have malicious content\n";
		last;
	}

	#skip the header data
	$numRead = read($IN,$buffer,$len-2);
}

sub determineSuccess
{
	#Success if all of the necessary segments were found
	if($verified{"APPS"} && $verified{"QT"} && $verified{"Huff"} && $verified{"SOF"} && $verified{"SOS"})
	{
		print "Verified! - Jpeg appears to have no malicious content\n";
	}
	else
	{
		print "Jpeg appears to be missing a segment(s)\n";
	}
}

And an example in C#

using System;
using System.IO;
namespace JpegVerifier
{
	/// <summary>
	/// This class will verify the jpeg sent to it
	/// </summary>
	class Class1
	{
		/// <summary>
		/// The main entry point for the application.
		/// </summary>
		[STAThread]
		static void Main(string[] args)
		{
			byte[] header;
			int length;
			BinaryReader br;

			if(args.Length < 1)
			{
				return;
			}

			br = new BinaryReader(File.OpenRead(args[0]));

			try
			{
				header = br.ReadBytes(2);

				if(header[0] != '\xff' ||
					header[1] != '\xd8')
				{
					System.Console.WriteLine("File is not a JPG");
					return;
				}

				while(true)
				{
					//if the header is 0xffda then we're at the image data
					header = br.ReadBytes(2);
					if(header[0] == '\xff' &&
						header[1] == '\xda')
					{
						//Assume jpg is well formed
						System.Console.WriteLine("Jpeg File is Clean");
						return;
					}

					//if the header does not start with 0xff then it's not a header
					if(header[0] != '\xff')
					{
						System.Console.WriteLine("Jpeg Marker is corrupt");
						return;
					}

					//assume marker is valid for our purposes
					//read in length
					length = br.ReadUInt16();

					//the file stores the length big-endian but ReadUInt16 reads
					//little-endian, so swap the two bytes
					length = ((0x00FF & length) << 8 | (0xFF00 & length) >> 8 );

					if(length < 2)
					{
						System.Console.WriteLine("Jpeg is MALICIOUS!");
						return;
					}

					//skip the marker data
					br.ReadBytes(length-2);
				}
			}
			catch(Exception e)
			{
				System.Console.WriteLine(e.Message);
			}
		}
	}
}

~ by imbeyondboredom on July 16, 2009.

2 Responses to “Jpeg Exploitation”

  1. Excellent work! I would like to know if you can provide some examples in C/C++ or even C# of this detection technique, since I’m not familiar with the code provided above. I started exploring JPEG exploits today and I would be very grateful if you could help me and point out some other techniques for creating/preventing this exploit.

    Thank You very much in advance.

    Best regards,

    Civa

    • The above code is in Perl (I will edit to state that later) but I can whip something up in C++ or C# soon. I’m about to go on vacation but I’ll respond as soon as I get a chance.
      [EDIT] C# code is now included and the post has been edited to clearly state that it is in perl. Thanks for the comment, let me know if you need more help
