Tuesday, July 8, 2014

Windows 8 - Unsuitable for Network Processing?

Hi all - long time no see!

I've spent the last 2 weeks tearing my hair out at work, wrestling with what I originally believed to be network infrastructure issues. As part of our first big push to virtualize our forensic environment, we have purchased and begun experimenting with ESXi 5.5. During my preliminary evaluation of OSes for our forensic environment (we have historically used Windows 7), I immediately gravitated towards Windows 8/8.1 as I've had excellent personal experiences with it (the UI is questionable but has gotten a lot better with 8.1) and all of the core kernel improvements (especially SMB 3.0).

With that in mind, I recreated our forensic environment with Windows 8 as the base and began putting it through its paces, using it for daily forensic tasks. Note (background important to the story): we run a Netapp FAS2240-4 HA with roughly 140TB (raw) worth of storage for evidence and case work. It is configured with 10GbE mezzanine cards and attached directly to our ESXi box via SFP+ in an effort to avoid putting traffic generated from evidence processing onto our core (1GbE) network, where round trips are longer and where the switching equipment is older and less reliable.

Both ESXi and 10GbE were new to our environment, each with their own pitfalls and learning curves, so when I first witnessed issues with evidence on our NAS not verifying properly in Encase or FTK Imager, I immediately assumed it was some sort of configuration issue. Evidence verification jobs were returning unbelievable numbers of bad sectors and segment CRC errors. Attempting to hash files in Encase gave very unpredictable results - not what you want out of a forensic tool. Most files would return valid (looking) hash values. However, during my review of the results, something caught my eye - groups of clearly dissimilar files (with distinctly different file sizes) were returning '93b885adfe0da089cdf634904fd59f71' (the MD5 of a single null byte) as their hash value. If I hadn't been looking closely for exactly these kinds of issues, I might never have noticed - unnerving to say the least. The fact that these dubious hashes were occurring in runs of 50-100 files at a time made me think that Encase was having trouble reading parts of the image over the network (maybe losing connection temporarily), much the same way it appeared to be having trouble reading image segments during verification, hence the large number of sector errors indicative of bad image segments. In FTK Imager (versions 3.1.4 and 3.2), verification speeds would start high (70+MB/s) and quickly drop off, slowly bleeding down to ~15MB/s by the end and returning 'mismatch' for the verification result.
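For anyone who wants to screen their own hash reports for the same symptom, here is a minimal sketch. It assumes you can export results as (path, size in bytes, MD5) tuples; the function name and the tuple layout are my own and are not part of Encase or FTK Imager.

```python
import hashlib

# MD5 of exactly one 0x00 byte - the value that kept showing up in the bad runs.
NULL_BYTE_MD5 = hashlib.md5(b"\x00").hexdigest()


def suspicious_hashes(results):
    """Return the entries whose MD5 is the null-byte hash but whose size is not 1 byte.

    `results` is an iterable of (path, size_in_bytes, md5_hex) tuples
    (a hypothetical export format, adjust to whatever your tool produces).
    """
    flagged = []
    for path, size, md5_hex in results:
        if md5_hex.lower() == NULL_BYTE_MD5 and size != 1:
            flagged.append((path, size, md5_hex))
    return flagged
```

A file that really is one null byte long is left alone; anything larger that reports this hash deserves a second look.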

Could it be an issue with IP-hash load balancing over Etherchannel/LACP? Maybe our new virtualized domain controller was timing out intermittently and user authentication to the NAS was being lost temporarily - some sort of Kerberos time source differential problem? Maybe 802.3x (flow control) was dropping frames? Was our 2240 silently corrupting data? Did Cryptolocker somehow make it onto our network and start overwriting various E01 segments?

I spent about a week working through all of the aforementioned issues with no resolution and was starting to lose hope - not a good feeling. But as part of any good troubleshooting process (and probably where I should have started), I thought - maybe it has something to do with Windows 8? Microsoft reworked the network sharing protocol (SMB) a lot between W7 and W8, so maybe something got knocked loose. However, as of the time of writing, Netapp's most recent ONTAP version (8.2.1) doesn't even support SMB3.0 and will only negotiate client sessions to SMB2.1, which has proven fine for us for a while now, so I wasn't readily considering it as a suspect. But as I exhausted other avenues, I had to accept that Windows 8 was the last remaining 'unknown' in our environment, one that I had not considered as a possible suspect in this hellride - why would W8 make a difference if it's still using SMB2.1 at the heart of its file sharing? Did the Windows file APIs change somehow?

I don't know, but recreating our forensic environment (exact same tools, exact same steps) with Windows 7 at the heart fixed the issue immediately. I am relieved to say the least, but still nervous - why would this be the case? I am going to continue following up on this - I'm too invested at this point not to have an answer. I thought maybe Netapp was the issue - I evaluated the 'cifs stat' output from our filers but saw nothing unusual. At the time of writing, I am attempting to recreate the issues and pinpoint Windows 8 as the culprit by obtaining similar results in a setup completely removed from our environment.

Update (07/10/14): I forgot to share some of the things I tried during my research, with no success:
  • Setting HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters\RequireSecureNegotiate to 0 (via http://support.microsoft.com/kb/2686098/en-us)
  • Setting HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation\Parameters\SessTimeout to 60 (via http://blogs.msdn.com/b/openspecification/archive/2013/03/27/smb-2-x-and-smb-3-0-timeouts-in-windows.aspx) - was 60s starting in Vista, changed to 20s in W8
I hope someone in the forensic community is able to benefit from this confusing journey. Please share any thoughts or experiences you might have had - I'm dying to get resolution on this!

Wednesday, February 20, 2013

Ghost in the Shellcode 2013 - Imgception Writeup

Back at it with UTDCSG (http://csg.utdallas.edu), we competed in Ghost in the Shellcode 2013. We had a great turnout Friday night with lots of old and new members working together shoulder-to-shoulder in ECSS 4.619!

Still relatively new to CTFs but armed with new techniques, new tools and a sharpened approach, I set off to tackle question 9 - 'Imgception', a forensics problem worth 150 points. Given the name of the problem it's probably safe to assume there are multiple images buried inside of each other.

We are given 'imgception-ce4fae066ffabd57aeb4a4d29faa1de1cf4c988f.png' and simply told 'find the key'. Since the file format is a PNG (a CTF favorite), of course the first thought is steganography. I gave it a quick run through my de facto stego tools with no luck, along with exiftool, foremost, scalpel and some others.

OK, let's take a look at it through pngcheck with the -v (verbose) flag set.

Nothing stands out at first glance, at least as far as tEXt or tEXt-esque chunks are concerned. Let's take a moment to review the PNG specification (http://en.wikipedia.org/wiki/Portable_Network_Graphics). The first thing that caught my eye were the 'unknown private, ancillary' chunks. According to the specification, PNGs can contain a variety of data 'chunks' that are optional (non-critical) as far as rendering is concerned.

Let's take an inventory of this file's chunk types - we have IHDR, sRGB, pHYs, giTs, IDAT and IEND. All of these seem pretty typical, but let's compare them to a list of standard PNG chunk types.
  • bKGD gives the default background color. It is intended for use when there is no better choice available, such as in standalone image viewers (but not web browsers; see below for more details)
  • cHRM gives the chromaticity coordinates of the display primaries and white point
  • gAMA specifies gamma
  • hIST can store the histogram, or total amount of each color in the image
  • iCCP is an ICC color profile
  • iTXt contains UTF-8 text, compressed or not, with an optional language tag. iTXt chunk with the keyword
  • pHYs holds the intended pixel size and/or aspect ratio of the image
  • sBIT (significant bits) indicates the color-accuracy of the source data
  • sPLT suggests a palette to use if the full range of colors is unavailable
  • sRGB indicates that the standard sRGB color space is used
  • sTER stereo-image indicator chunk for stereoscopic images
  • tEXt can store text that can be represented in ISO/IEC 8859-1, with one name=value pair for each chunk
  • tIME stores the time that the image was last changed
  • tRNS contains transparency information. For indexed images, it stores alpha channel values for one or more palette entries. For truecolor and grayscale images, it stores a single pixel value that is to be regarded as fully transparent
  • zTXt contains compressed text with the same limits as tEXt
The one that stands out as not being in the list is giTs, and while it seems obvious after making the connection, giTs = Ghost in the Shellcode. OK, that's encouraging, let's look closer. Opening the file up in WinHex, we can search for the giTs chunk header.
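As a side note on why pngcheck labels these chunks 'unknown private, ancillary': the PNG specification encodes four property flags in the letter case of the chunk type itself. The helper below is a small sketch of that rule; the function name is my own.

```python
def chunk_properties(chunk_type: str) -> dict:
    """Decode the PNG chunk naming convention from the case of each letter.

    Per the PNG spec, bit 5 (the lowercase bit) of each of the four
    type bytes carries one property flag.
    """
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk types are exactly four ASCII letters")
    first, second, third, fourth = chunk_type
    return {
        "ancillary": first.islower(),        # lowercase = not needed to render
        "private": second.islower(),         # lowercase = not a registered public chunk
        "reserved_bit_ok": third.isupper(),  # must be uppercase in current PNG versions
        "safe_to_copy": fourth.islower(),    # lowercase = editors may copy it blindly
    }
```

For 'giTs' this gives ancillary and private, which lines up with the pngcheck description, while 'IHDR' comes back as critical and public.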

If we examine the other giTs headers closely, we notice something interesting - the standard 4 byte chunk header (giTs) is immediately succeeded by 3 ASCII characters and then some spaces - a clue?

Let's take a closer look at what all of those letters are by pulling out the 8 byte chunk header and stripping out 'giTs' as it's a constant. Note: I originally performed this by copying and pasting, but since I wrote a similar script to pull out data later in this problem, I re-purposed it a bit for use here - feel free to use it for similar tasks, unless you're PPP in which case you probably have something way better kicking around.
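A rough sketch of that kind of script follows. It is a reconstruction, not the original code: it walks the standard PNG chunk layout (4-byte length, 4-byte type, data, 4-byte CRC) and assumes the three-letter tag is the first three bytes of each giTs chunk's data. The function names are my own.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png: bytes):
    """Yield (chunk_type, chunk_data) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        # Big-endian 4-byte length followed by the 4-byte chunk type.
        length, chunk_type = struct.unpack(">I4s", png[pos:pos + 8])
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 8 + length + 4  # header + data + CRC (the CRC is not verified here)


def gits_tags(png: bytes, tag_len: int = 3):
    """Return the leading tag of every giTs chunk, in file order."""
    return [
        data[:tag_len].decode("latin-1")
        for chunk_type, data in iter_chunks(png)
        if chunk_type == b"giTs"
    ]
```

Usage would be something like `print(" ".join(gits_tags(open("imgception.png", "rb").read())))`, with the file name shortened here for readability.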

Checking the program's output, we are left with the following text:

ger low ndT The edA heD uns heG ed  lin lac Man ese ssT kFl rtA Fol cro InB

No way that's a mistake - let's rearrange and see what we come up with:

The Man In Black Fled Across The Desert And The Gunslinger Followed
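The unscrambling can also be checked mechanically. The sketch below takes the tags exactly as printed above (including the trailing space on 'ed ') and works out which position in the file each piece of the quote came from; the variable and function names are my own.

```python
# Tags in the order they appear in the file (copied from the output above).
TAGS = ["ger", "low", "ndT", "The", "edA", "heD", "uns", "heG", "ed ", "lin",
        "lac", "Man", "ese", "ssT", "kFl", "rtA", "Fol", "cro", "InB"]
QUOTE = "The Man In Black Fled Across The Desert And The Gunslinger Followed"


def chunk_order(tags, quote, tag_len=3):
    """Return file-order indexes of the tags, arranged in quote order."""
    stream = quote.replace(" ", "")
    stream += " " * (-len(stream) % tag_len)  # pad the last tag with spaces
    wanted = [stream[i:i + tag_len] for i in range(0, len(stream), tag_len)]
    return [tags.index(tag) for tag in wanted]


order = chunk_order(TAGS, QUOTE)
```

Every tag is unique, so `tags.index` is unambiguous here; `order[0]` is the position in the file of the chunk that belongs first in the reassembled data.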

Apparently this is a quote from a famous Stephen King novel as pointed out to us by one of the CSG folks (and it explains the original PNG we were given). OK, so now what? During our initial recon, one of our team members ran Foremost against the original image and we were greeted with something that looked like this:

We can make out what looks to be a cat but the data is clearly malformed. It looks like there are sections that are out of order (particularly interesting considering BMPs are more or less just a raw stream of pixel data) and the color channels are out of whack. We know that each of the giTs chunks is directly related to a section of the resulting rearranged quote - what if we rearranged each of the data chunks (32356 bytes) to match the order that they appear in the quote?

I didn't script this part and I'm not going to (though it would be good practice) - it was accomplished by hand pretty easily with the help of TweakPNG (http://entropymine.com/jason/tweakpng/) which allowed us to quickly export the chunks, stripping the footer bytes in the process (header bytes remain). Assign each of the chunks a name based on their position in the sentence (eg: chunk1, chunk2) and then just:

#: cat chunk1 > reassembled.bin
#: cat chunk2 >> reassembled.bin

and repeat until they're all lined up and arranged properly. Strip the 8 header bytes out (simple find/replace RegEx for 'giTs....') and we are left with a proper looking 24bit Windows BMP file!

Now what? This part stumped us - we had lots of eyes on it but unfortunately ran out of time before we discovered the solution. Come to find out, the BMP file format (http://en.wikipedia.org/wiki/BMP_file_format#Pixel_array_.28bitmap_data.29) allows for padding at the end of each row:
Padding bytes (not necessarily 0) must be appended to the end of the rows in order to bring up the length of the rows to a multiple of four bytes. When the pixel array is loaded into memory, each row must begin at a memory address that is a multiple of 4. This address/offset restriction is mandatory only for Pixel Arrays loaded in memory. For file storage purposes, only the size of each row must be a multiple of 4 bytes while the file offset can be arbitrary.
A 24-bit bitmap with Width=1, would have 3 bytes of data per row (blue, green, red) and 1 byte of padding, while Width=2 would have 2 bytes of padding, Width=3 would have 3 bytes of padding, and Width=4 would not have any padding at all.
So let's take a look at our resultant bitmap (cat_rearranged.bmp) in WinHex:

According to the file format spec, we can examine byte 0xA (highlighted in yellow) to determine where the actual image data starts (highlighted in blue). We know the image width is 319px and the bit-depth is 24, so we know that each row is 319*(24/8) or 957 bytes in length. So starting at offset 0x36 and skipping ahead 957 bytes, we have skipped the row pixel data and arrive at the padding.
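The row arithmetic is easy to sanity-check in a couple of lines. This is just the padding rule from the spec written out; the function name is my own.

```python
def bmp_row_layout(width_px: int, bits_per_pixel: int = 24):
    """Return (pixel_bytes_per_row, padding_bytes_per_row) for an uncompressed BMP."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8  # bytes of real pixel data
    stride = (row_bytes + 3) // 4 * 4                 # rounded up to a multiple of 4
    return row_bytes, stride - row_bytes
```

For our 319px wide, 24-bit image this gives 957 bytes of pixels and 3 bytes of padding per row, so each row occupies 960 bytes in the file.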

Ahhh, good old 0xFFD8! Forensics professionals and general purpose nerds will recognize this pattern instantly - JPEG file header! You only live once, that's the motto... ÿØÿà....

Examining the padding from the next few rows, it looks like a JPEG image has been stuffed into the 3 padding bytes of each row - let's write a Python program to traverse the file and extract it.
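A minimal sketch of such a program is below. It is a reconstruction rather than the original script, and it assumes an uncompressed BMP with a standard header (pixel data offset at 0x0A, width at 0x12, height at 0x16, bit depth at 0x1C) and that the hidden bytes are meant to be read in the order the rows are stored in the file.

```python
import struct


def extract_row_padding(bmp: bytes) -> bytes:
    """Concatenate the padding bytes found at the end of every pixel row."""
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")
    (pixel_offset,) = struct.unpack_from("<I", bmp, 0x0A)
    width, height = struct.unpack_from("<ii", bmp, 0x12)
    (bits_per_pixel,) = struct.unpack_from("<H", bmp, 0x1C)

    row_bytes = (width * bits_per_pixel + 7) // 8  # real pixel data per row
    stride = (row_bytes + 3) & ~3                  # row length including padding

    hidden = bytearray()
    for row in range(abs(height)):                 # height is negative for top-down BMPs
        start = pixel_offset + row * stride
        hidden += bmp[start + row_bytes:start + stride]
    return bytes(hidden)
```

Usage would be along the lines of `open("hidden.jpg", "wb").write(extract_row_padding(open("cat_rearranged.bmp", "rb").read()))`, where hidden.jpg is just a name picked for the output.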

Run it and we get the following output - we have our key and the answer to Imgception for 150 points.

Thursday, February 7, 2013

Welcome CSG!

Hello to any visitors that may have stumbled in here from UTDCSG!

We just wrapped up with week 2 of forensics, my slides can be found here: http://csg.utdallas.edu/?attachment_id=272

I'll be doing a short writeup in the next few days for Nullcon 2013 FOR400, which was demoed during last night's presentation.

Monday, December 3, 2012

Cellebrite and Geohot, sitting in a tree...

As part of a recent case, I am reviewing debug logs from Cellebrite Physical Analyzer (v3.5) because I'm having some trouble parsing out e-mail. I've always wondered how much new development Cellebrite performs with regard to the exploits they leverage during iOS physical extractions. Well, here's a little insight:
[INFO ] Loading payload files.
[INFO ] Progress report: [Loading forensic program to device] [Step 1/19] Connecting to device
[INFO ] Progress report: [Loading forensic program to device] [Step 2/19] Enabling code execution (part 1)
[DEBUG] Following is the limera1n log:
limera1n params: 8402b001 0002c000 8403bf9c
Initializing control file upload
Sending pattern buffer
Sending several padding buffers
Sending main payload buffer
Triggering exploit

Sunday, October 7, 2012

CSAW 2012 - Net300 Writeup

Last weekend I competed in my first CTF together with the UTDCSG (http://utdcsg.org/) - it was a really great experience and I'm very much looking forward to competing in future CTFs. I attempted to help out on a number of challenges for CSAW 2012 but Net300 was definitely my most worthwhile contribution.

Net300 was interesting because while I am familiar with Wireshark's overall layout and capabilities, I do not use it on a regular basis and it required a lot of supplemental research. And considering that the challenge wasn't really a 'network' challenge in the traditional sense, that added another layer of difficulty.

We were provided with a file 'dongle.pcap', and it became immediately apparent upon opening the pcap in Wireshark that we were not dealing with 802.3 Ethernet traffic.

OK - so this is a USB traffic capture of some sort. My immediate thought (which turned out to be pretty spot-on) was that "this is probably a capture of USB keyboard traffic; the key was typed in and is subsequently buried in the traffic".

In a supremely disorganized fashion, I began doing my homework and researching the basics of USB communication. I started with some of the official Wireshark USB documentation (http://wiki.wireshark.org/USB) and found my way to a few others. I decided the easiest thing to try would be replaying the pcap file in my VM with the aid of usbreplay.py (https://bitbucket.org/dwaley/usb-reverse-engineering). I am relatively new to Linux and was experiencing serious issues with this method (possibly due to PyUSB - not entirely sure) so I gave up and moved on to decoding the traffic manually.

Calling upon my existing understanding of USB input devices (I am not a USB protocol expert so if you are reading this and feel I'm wrong about anything, please contact me with corrections), the host 'polls' at regular intervals looking for interrupt signals from devices. We can build a simple filter in Wireshark and narrow our capture file to this interrupt-related traffic (which also happens to make up the majority of the pcap file) with 'usb.transfer_type == 0x01'.

Looking at the capture traffic, we notice a lot of communication between endpoint 0x83 and the host, so let's take a closer look there. Just observing we see what looks to be a two-way conversation, alternating communication back and forth between a device (26) and the host. Looking closer, frames that list '26.3' as their source and 'host' as the destination have a length of 72 bytes (versus 64 bytes for communication going the other way) - an extra 8 bytes worth of data seems like a good number.

Skipping around the web and reading about USB reveals that there are four basic modes of transfer for USB: control, interrupt, isochronous, and bulk transfers (http://www.beyondlogic.org/usbnutshell/usb4.shtml#Interrupt). For interrupt transfers, the link also states "the maximum data payload size for low-speed devices is 8 bytes" which is encouraging. We can pretty much assume at this point that it's using interrupt because a) that's just how keyboards work and b) it flat out says so in the frame capture. Let's look closely at the highlighted lines:

From the frame capture above we can determine the following:
  • It's an HID-related communication [bInterfaceClass: HID (0x03)]
  • It's using interrupt for its transfer mode [URB transfer type: URB_INTERRUPT (0x01)]
  • There appears to be data at byte 0x42 (value of 12h in this screenshot) that changes in value
Ok, let's build a filter that only looks at 72 byte frames (carrying data) from 26.3 to host and filter out the frames with empty data: '((usb.transfer_type == 0x01) && (frame.len == 72)) && !(usb.capdata == 00:00:00:00:00:00:00:00)'. Right-click the 'Leftover Capture Data' field (this is our 8 bytes of data, the description basically means that Wireshark doesn't know how to interpret the data and thus doesn't know what to call it) and 'Apply as Column' so we can get an overview of the range of values.

Well that's interesting - the values mostly range from 0x04 to 0x30 - ASCII data perhaps? Looking around the official USB site lands us two really, really clutch documents
Looking at page 54 of the second document gives us what we need, something to translate captured USB data into actual keystrokes - bingo! Finding this chart was the big breakthrough moment in the problem for me where I knew I was finally getting close:

Now we just need to whip up a quick Python program to map our raw extracted data values (data.txt) with the corresponding letters from the chart above:
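Here is a sketch of what such a program can look like. It is a reconstruction, not the original script: it assumes data.txt holds one 8-byte report per line in Wireshark's colon-separated hex form (for example 00:00:12:00:00:00:00:00), that byte 0 is the modifier bitmap and byte 2 is the first key code (the standard boot keyboard report layout), and it only covers the part of the usage table between 0x04 and 0x30.

```python
# Usage ID -> (unshifted, shifted) character, from the HID keyboard usage page.
HID_KEYS = {}
for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz"):
    HID_KEYS[0x04 + i] = (ch, ch.upper())
for i, (plain, shifted) in enumerate(zip("1234567890", "!@#$%^&*()")):
    HID_KEYS[0x1E + i] = (plain, shifted)
HID_KEYS.update({
    0x28: ("\n", "\n"),  # Enter
    0x2C: (" ", " "),    # Space
    0x2D: ("-", "_"),
    0x2E: ("=", "+"),
    0x2F: ("[", "{"),
    0x30: ("]", "}"),
})

SHIFT_MASK = 0x22  # left shift (0x02) or right shift (0x20) in the modifier byte


def decode_reports(lines):
    """Translate captured keyboard reports into the text that was typed."""
    typed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        report = bytes.fromhex(line.replace(":", ""))
        if len(report) != 8 or report[2] == 0:
            continue  # malformed line, or a key-release report
        modifiers, keycode = report[0], report[2]
        plain, shifted = HID_KEYS.get(keycode, ("?", "?"))
        typed.append(shifted if modifiers & SHIFT_MASK else plain)
    return "".join(typed)
```

Usage would be `print(decode_reports(open("data.txt")))`. Unlike a plain lookup table, this version honours the shift modifier, so shifted '=' decodes as '+'.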

When we run the program we receive the following back:

We have our key!

Or do we?

Looking closely, you'll see that it's creating multiple xterm windows with specific offset coordinates. I understood what it was doing but neglected to look at it closely, which turned out to be a huge mistake and a monumental waste of time that would later prevent me from solving Web500 as I ran up against the 6pm deadline (I blame lack of sleep and overall inexperience). I kept trying different combinations of the key 'C48BA993D35C3A' and it kept returning as bad. My initial thought was that the key was actually case sensitive and that I needed to look for instances of the shift key being used (and figure out how it worked as a 'modifier' key). I actually found some, which turned out to be irrelevant, and I went through all of my steps again, banging my head against the wall for a while, until I looked at the output again and realized that two of the windows printed on the lower line are out of order based on the value of their x-axis offset.

  1. RXTERM -GEOMETRY 12X1=300=40
  2. ECHO 5
  3. RXTERM -GEOMETRY 12X1=450=40
  4. ECHO C
  5. RXTERM -GEOMETRY 12X1=375=40
  6. ECHO 3
  7. RXTERM -GEOMETRY 12X1=525=40
  8. ECHO A
  9. RXTERM -GEOMETRY 12X1=600=40

If we rearrange the two out-of-place letters according to where they are supposed to be drawn on the screen we see that the key changes to 'C48BA993D353CA' - it works and we have our solution for Net300.
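In hindsight, sorting the windows by their x offset would have saved me a lot of time. The sketch below does that for the excerpt shown above (list numbering stripped); it assumes all of these windows sit on the same row (y = 40), and the function name is my own.

```python
import re

# The decoded excerpt from above, one command per line.
DECODED = [
    "RXTERM -GEOMETRY 12X1=300=40",
    "ECHO 5",
    "RXTERM -GEOMETRY 12X1=450=40",
    "ECHO C",
    "RXTERM -GEOMETRY 12X1=375=40",
    "ECHO 3",
    "RXTERM -GEOMETRY 12X1=525=40",
    "ECHO A",
    "RXTERM -GEOMETRY 12X1=600=40",
]


def key_fragment(lines):
    """Pair each window's x offset with the character echoed into it, then sort by x."""
    windows = []
    x_offset = None
    for line in lines:
        match = re.search(r"GEOMETRY \d+X\d+[=+](\d+)[=+](\d+)", line)
        if match:
            x_offset = int(match.group(1))
        elif line.startswith("ECHO ") and x_offset is not None:
            windows.append((x_offset, line.split(" ", 1)[1]))
            x_offset = None
    return "".join(char for _, char in sorted(windows))
```

Typing order gives '5C3A'; screen order gives '53CA', which matches the tail of the working key.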

Wednesday, October 3, 2012

Libewf Installation

I ran into a few issues installing the latest libewf release (libewf-experimental-20120809.tar.gz) yesterday on a fresh Ubuntu 11.04 x64 VM. Before you say anything, yes I know that 12.04 is the latest Ubuntu release, however Log2Timeline does not officially support it yet (according to https://code.google.com/p/log2timeline/wiki/Installation) and trying to fix all the dependencies by hand sucks - BAD.

If you download the latest version of libewf and try to do a simple ./configure, make, make install, you may be greeted with the following message when you go to use ewfmount:
Unable to open EWF file(s).
libewf_decompress_data: missing support for deflate compression
libewf_section_compressed_string_read: unable to decompress string.
libewf_handle_open_read_section_data: unable to read header file object string.
libewf_handle_open_read_segment_files: unable to read section data from segment file: 1.
libewf_handle_open_file_io_pool: unable to read segment files.
libewf_handle_open: unable to open handle using a file IO pool.
I exchanged e-mails with the project's developer who responded in less than 12 hours and suggested I check the output of ./configure for missing packages (apologies in advance if this is obvious knowledge). I found it was missing multiple:
ADLER32 checksum support:                       NO
DEFLATE compression support:                    NO
BZIP2 compression support:                      NO
FUSE support:                                   NO
Well, the FUSE one is obvious - Joachim clearly states in his documentation that it's required (http://code.google.com/p/libewf/wiki/Mounting). The other three are not so obvious.

Running the following should resolve those dependencies:
apt-get update
apt-get install zlib1g-dev
apt-get install libbz2-dev
apt-get install libfuse-dev
The last line may be unnecessary. After this, you should be able to navigate to your unTARed libewf directory and perform a ./configure, make, make install (followed by ldconfig or you will receive messages about a missing libewf.so.2 dependency!) and begin mounting E01s.

Again this may be common knowledge but I hope this helps some other investigators out there.

Happy forensicating!

Saturday, July 28, 2012


I am a 20-something year-old computer forensic examiner living in the Dallas, TX area. I have been working in the computer forensics (CF) field for ~2.5 years at the time of writing and I am constantly striving to improve my knowledge and skill sets. I have some minor programming experience and a huge appreciation for those that spend their free time coding tools for the community. I am highly interested in the hardware side of things (something about SMCs and copper traces really does it for me) as well as the security side; I'm curious by nature and have no problem asking questions so if you have the patience to explain a concept, you have my full attention.

This is my first attempt at maintaining a 'professional' blog. I cannot guarantee that I will update this with any sort of regularity (in fact, I joke that realistically there is a 90% chance that I will never post to this again), but I will try to drop in to share any thoughts I think that other people working in the community may benefit from. The amount of transparency and willingness to share information is in my opinion one of the CF community's strongest assets and one of its most interesting qualities; this blog will serve as my personal attempt at giving back to the community. Of course there is a fair amount of self-promotion (what do you think this is?) and commercialization that goes on within the field but those are natural and even quite beneficial; companies like Guidance Software, AccessData and Cellebrite certainly did not become industry leaders by giving their code away... though many would make the case that we (the examiners) should be the ones getting paid to use these products at times.

I will preface any further posts by saying that I am committed to respecting my employer's investment in me and have no desire to cause a conflict of interest. I will apologize ahead of time and say that I cannot (and will not) share highly proprietary processes or findings if they are developed on company time unless I have explicit approval, but I will make a strong effort to share what I can, when I can. Much of the CF community is comprised of law enforcement (LE) agencies which aren't particularly susceptible to conflict of interest situations as LE is fundamentally a public service (though depending on your level of cynicism you may feel otherwise).

Thanks for stopping by and hope to see you again sometime.