Sunday, October 7, 2012

CSAW 2012 - Net300 Writeup

Last weekend I competed in my first CTF together with the UTDCSG - it was a really great experience and I'm very much looking forward to competing in future CTFs. I attempted to help out on a number of challenges for CSAW 2012, but Net300 was definitely my most worthwhile contribution.

Net300 was interesting because, while I am familiar with Wireshark's overall layout and capabilities, I do not use it on a regular basis and the challenge required a lot of supplemental research. The fact that it wasn't really a 'network' challenge in the traditional sense added another layer of difficulty.

We were provided with a file 'dongle.pcap', and it became immediately apparent upon opening the pcap in Wireshark that we were not dealing with 802.3 Ethernet traffic.

OK - so this is a USB traffic capture of some sort. My immediate thought (which turned out to be pretty spot-on) was that "this is probably a capture of USB keyboard traffic; the key was typed in and is subsequently buried in the traffic".

In a supremely disorganized fashion, I began doing my homework and researching the basics of USB communication. I started with some of the official Wireshark USB documentation and found my way to a few other sources. I decided the easiest thing to try would be replaying the pcap file in my VM with the aid of a replay tool. I am relatively new to Linux and was experiencing serious issues with this method (possibly due to PyUSB - not entirely sure), so I gave up and moved on to decoding the traffic manually.

As I understand USB input devices (I am not a USB protocol expert, so if you are reading this and feel I'm wrong about anything, please contact me with corrections), the host 'polls' each device at regular intervals to see whether it has interrupt data waiting. We can build a simple filter in Wireshark and narrow our capture file to this interrupt-related traffic (which also happens to make up the majority of the pcap file) with 'usb.transfer_type == 0x01'.
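For anyone who would rather do the same narrowing outside of Wireshark, here is a rough sketch of the idea in Python. It assumes the capture uses the 64-byte Linux usbmon pseudo-header in little-endian byte order (an assumption on my part, suggested by the 64-byte frames that carry no data), and the names USBMON_HEADER, parse_frame and interrupt_frames are made up for illustration.

```python
import struct

# ASSUMPTION: 64-byte Linux usbmon ("mmapped") pseudo-header, little-endian.
# Field order: id, type, xfer_type, epnum, devnum, busnum, flag_setup,
# flag_data, ts_sec, ts_usec, status, length, len_cap, setup (8 bytes),
# interval, start_frame, xfer_flags, ndesc.
USBMON_HEADER = struct.Struct("<QBBBBHBBqiiII8siiII")

URB_INTERRUPT = 0x01  # same value as in the Wireshark filter


def parse_frame(frame: bytes) -> dict:
    """Pull a few header fields out of one frame and keep the rest as payload."""
    fields = USBMON_HEADER.unpack_from(frame)
    return {
        "transfer_type": fields[2],
        "endpoint": fields[3],
        "device": fields[4],
        "payload": frame[USBMON_HEADER.size:],
    }


def interrupt_frames(frames):
    """Keep only the frames whose URB transfer type is interrupt."""
    parsed = (parse_frame(f) for f in frames)
    return [p for p in parsed if p["transfer_type"] == URB_INTERRUPT]
```

Feeding it the raw frame bytes from the pcap should give roughly the same set of frames as the display filter, provided the header assumption holds.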

Looking at the capture traffic, we notice a lot of communication between endpoint 0x83 and the host, so let's take a closer look there. Just observing we see what looks to be a two-way conversation, alternating communication back and forth between a device (26) and the host. Looking closer, frames that list '26.3' as their source and 'host' as the destination have a length of 72 bytes (versus 64 bytes for communication going the other way) - an extra 8 bytes worth of data seems like a good number.
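The '0x83' and '26.3' notations refer to the same endpoint, which a few lines of Python can show. In a USB endpoint address the top bit is the direction (set means device-to-host, or IN) and the low four bits are the endpoint number. The function name decode_endpoint is just something I picked for this sketch.

```python
def decode_endpoint(address: int):
    """Split a USB endpoint address into (endpoint number, direction)."""
    direction = "IN" if address & 0x80 else "OUT"  # bit 7: 1 = device to host
    number = address & 0x0F                        # bits 0-3: endpoint number
    return number, direction


print(decode_endpoint(0x83))  # (3, 'IN')

# The size difference between the two directions is the data we care about.
print(72 - 64)  # 8
```

So 0x83 is endpoint 3 in the IN direction, which matches frames going from '26.3' to the host.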

Skipping around the web and reading about USB reveals that there are four basic modes of transfer for USB: control, interrupt, isochronous, and bulk transfers. For interrupt transfers, the same source also states that "the maximum data payload size for low-speed devices is 8 bytes", which is encouraging. We can pretty much assume at this point that it's using interrupt transfers because a) that's just how keyboards work and b) it flat out says so in the frame capture. Let's look closely at the highlighted lines:

From the frame capture above we can determine the following:
  • It's an HID-related communication [bInterfaceClass: HID (0x03)]
  • It's using interrupt for its transfer mode [URB transfer type: URB_INTERRUPT (0x01)]
  • There appears to be data at byte 0x42 (value of 12h in this screenshot) that changes in value

Ok, let's build a filter that only looks at 72-byte frames (the ones carrying data) from 26.3 to host and filters out the frames with empty data: '((usb.transfer_type == 0x01) && (frame.len == 72)) && !(usb.capdata == 00:00:00:00:00:00:00:00)'. Right-click the 'Leftover Capture Data' field (this is our 8 bytes of data; the description basically means that Wireshark doesn't know how to interpret the data and thus doesn't know what to call it) and choose 'Apply as Column' so we can get an overview of the range of values.
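To make the 8 bytes a little less mysterious, here is a small sketch that splits one 'Leftover Capture Data' value into its parts. It assumes the keyboard sends the standard HID boot keyboard report (byte 0 is a modifier bitmask, byte 1 is reserved, bytes 2 to 7 hold up to six keycodes), which fits the changing value sitting at offset 0x42, i.e. the third data byte after a 64-byte header. The function name parse_report and the sample values are hypothetical.

```python
SHIFT_MASK = 0x02 | 0x20  # left shift and right shift bits of the modifier byte


def parse_report(capdata: str):
    """Split a colon-separated 8-byte report into (modifiers, shift, keycodes)."""
    raw = bytes.fromhex(capdata.replace(":", ""))
    if len(raw) != 8:
        raise ValueError("expected an 8-byte keyboard report")
    modifiers = raw[0]                           # bitmask: ctrl, shift, alt, GUI
    keycodes = [k for k in raw[2:] if k != 0]    # byte 1 is reserved, 0 = no key
    return modifiers, bool(modifiers & SHIFT_MASK), keycodes


print(parse_report("00:00:12:00:00:00:00:00"))  # (0, False, [18])
```

The shift flag matters later on, since the same keycode is reported whether or not shift is held.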

Well that's interesting - the values mostly range from 0x04 to 0x30 - ASCII data perhaps? Looking around the official USB site lands us two really, really clutch documents.

Looking at page 54 of the second document gives us what we need, something to translate captured USB data into actual keystrokes - bingo! Finding this chart was the big breakthrough moment in the problem for me, where I knew I was finally getting close:

Now we just need to whip up a quick Python program to map our raw extracted data values (data.txt) to the corresponding letters from the chart above.
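A minimal sketch of such a mapping program (not the original script) might look like the following. It assumes data.txt holds one hex keycode per line, ignores the shift modifier entirely, and prints letters in upper case. The names KEYMAP, decode and load_values are illustrative, and the table only covers the keys in the 0x04 to 0x30 range seen in the capture plus Enter and Space.

```python
import string

# HID keyboard usage IDs: 0x04-0x1D are A-Z, 0x1E-0x27 are 1-9 then 0.
KEYMAP = {0x04 + i: c for i, c in enumerate(string.ascii_uppercase)}
KEYMAP.update({0x1E + i: c for i, c in enumerate("1234567890")})
KEYMAP.update({0x28: "\n", 0x2C: " ", 0x2D: "-", 0x2E: "=", 0x2F: "[", 0x30: "]"})


def decode(values):
    """Translate keycodes to text, ignoring shift; unknown codes become '?'."""
    return "".join(KEYMAP.get(v, "?") for v in values)


def load_values(path):
    """Read one hex keycode per line (ASSUMED file format)."""
    with open(path) as fh:
        return [int(line, 16) for line in fh if line.strip()]


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(decode(load_values(sys.argv[1])))
```

Because shift is ignored, a typed '+' comes out as '=' and lower-case letters come out in upper case.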

When we run the program we receive the following back:

We have our key!

Or do we?

Looking closely, you'll see that the typed commands create multiple xterm windows with specific offset coordinates. I understood what they were doing but neglected to look at them closely, which turned out to be a huge mistake and a monumental waste of time that would later prevent me from solving Web500 as I ran up against the 6pm deadline (I blame lack of sleep and overall inexperience). I kept trying different combinations of the key 'C48BA993D35C3A' and it kept coming back as bad. My initial thought was that the key was actually case sensitive and that I needed to look for instances of the shift key being used (and figure out how it worked as a 'modifier' key). I actually found some, which turned out to be irrelevant, and I went through all of my steps again, banging my head against the wall for a while, until I looked at the output again and realized that two of the windows printed on the lower line are out of order based on the value of their x-axis offset.

  1. RXTERM -GEOMETRY 12X1=300=40
  2. ECHO 5
  3. RXTERM -GEOMETRY 12X1=450=40
  4. ECHO C
  5. RXTERM -GEOMETRY 12X1=375=40
  6. ECHO 3
  7. RXTERM -GEOMETRY 12X1=525=40
  8. ECHO A
  9. RXTERM -GEOMETRY 12X1=600=40

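Reading each '=' as the '+' it would be with shift held, the first number after '12X1' is the window's x offset. If we rearrange the two out-of-place letters according to where they are supposed to be drawn on the screen, we see that the key changes to 'C48BA993D353CA' - it works and we have our solution for Net300.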
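The reordering is easy to do by eye, but it can also be sketched in a few lines of Python. This uses only the excerpt of decoded output listed above (the last window has no ECHO in the excerpt, so it is skipped), pairs each geometry line with the ECHO that follows it, and sorts by y offset and then x offset. The function name characters_in_screen_order is my own.

```python
import re

OUTPUT = """\
RXTERM -GEOMETRY 12X1=300=40
ECHO 5
RXTERM -GEOMETRY 12X1=450=40
ECHO C
RXTERM -GEOMETRY 12X1=375=40
ECHO 3
RXTERM -GEOMETRY 12X1=525=40
ECHO A
RXTERM -GEOMETRY 12X1=600=40
"""


def characters_in_screen_order(text):
    """Return the echoed characters sorted by window position (row, then column)."""
    windows = []
    position = None
    for line in text.splitlines():
        geometry = re.search(r"GEOMETRY \d+X\d+=(\d+)=(\d+)", line)
        if geometry:
            x, y = int(geometry.group(1)), int(geometry.group(2))
            position = (y, x)
        elif line.startswith("ECHO ") and position is not None:
            windows.append((position, line[len("ECHO "):]))
            position = None
    return "".join(ch for _, ch in sorted(windows))


print(characters_in_screen_order(OUTPUT))  # 53CA
```

Typed order gives '5C3A', while screen order gives '53CA', which is exactly the difference between the bad key and the good one.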
