02-25-2024 05:19 AM
I've been exploring the capabilities of various VR systems and their approaches to tracking accuracy and user interaction. Specifically, I've been intrigued by the Quest 3's use of controllers with built-in cameras, primarily aimed at navigating and interacting within the virtual environment. However, this got me thinking about the potential for these cameras to be utilized not just for their intended purpose, but also for enhancing hand tracking capabilities, similar to how the Valve Index leverages its base stations for precise tracking.
Given that the Quest 3 controllers are already equipped with cameras, has there been any consideration or experimental development towards using these controllers as a sort of "mobile base station"? Essentially, this would not replace the standard tracking functionalities but could potentially offer developers and users an optional, more precise tracking method when needed.
I understand that the primary function of these cameras is not for tracking in the same capacity as base stations. However, considering the flexibility and innovation in VR development, the prospect of harnessing these cameras for dual purposes seems like an intriguing possibility. This could potentially reduce the need for external hardware in environments where precision is key but the setup of traditional base stations is impractical.
Are there any technical limitations or developmental considerations that would prevent the Quest 3 controllers' cameras from being utilized in this dual capacity? Could software updates or developer tools enable this functionality, or is it more a matter of hardware limitations?
I'm looking forward to hearing your thoughts and insights on this possibility. The idea of expanding the functionality of existing hardware for enhanced user experience and developer flexibility is exciting and could open new avenues for VR interactions and immersion.
04-03-2024 09:40 AM
Hi Rocha,
The Quest 3 controllers do not use cameras. The controllers you are referring to are the Quest Pro controllers, which can be purchased separately (I have them) and can be paired with a Quest 2 or Quest 3.
I haven't looked at an SDK since the controllers were released, so I can't be 100% sure, but since no one else has replied yet, I'll make a guesstimate that Meta would have to do quite a bit of coding to make this possible. I don't think the controllers deliver the precision Meta had hoped for. I personally feel like my Quest 3 controllers track more accurately and respond much more quickly than the Pro controllers do. I keep my Pro controllers paired to one of my Quest 2 headsets.
I would love to see an external tracking module, though, to help with precision. Something you could place at the front of your play space, facing you. The problem with inside-out tracking will always be occlusion, and because of that it will never match the precision of base stations. In all fairness, hand tracking can't be done with Vive Base Station 1.0 or 2.0 either.

Current inside-out tracking uses cameras on the headset, and there's virtually no latency when aggregating each camera's data into a single space. The algorithms are established and mostly mature, and the distances and angles between the cameras are known, which allows the aggregation to be optimized. Things get more complicated once you start collecting data from an external device: you have to track that device's position and orientation relative to the headset, collect object positions and orientations from it, transform that data into place based on the relative position AND orientation of the external device versus the headset, align the external data with the on-device data, and use it to fill in the blanks (see the sketch below).

The main additional complexity is the latency between the external device and the headset. The external data is already aging by the time it reaches the headset. It might not be much, but at a minimum of 120 updates per second you have to calculate and draw an update every 8.3 ms, and the headset now has to wait for the external data to be collected and sent over before it can aggregate, process, and render the next frame within that same 8.3 ms. You're already dealing in microseconds.
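To make the "moving that data into place" step concrete, here's a minimal sketch of re-expressing a pose reported by a hypothetical external tracker into the headset's world space by chaining relative transforms. This is not Meta's actual pipeline; the frame names, numbers, and the external tracker itself are assumptions for illustration only.

```python
# Sketch: composing poses from an assumed external tracker into world space.
# tracker frame -> headset frame -> world frame.
import numpy as np
from scipy.spatial.transform import Rotation as R

def compose(parent_pos, parent_rot, child_pos, child_rot):
    """Express a child pose (given in the parent's frame) in the parent's
    own coordinate space: world_pose = parent_pose * child_pose."""
    world_pos = parent_pos + parent_rot.apply(child_pos)
    world_rot = parent_rot * child_rot
    return world_pos, world_rot

# Headset pose in world space (what inside-out tracking already provides).
hmd_pos = np.array([0.0, 1.6, 0.0])
hmd_rot = R.from_euler("y", 15, degrees=True)

# External tracker's pose relative to the headset (it must itself be tracked).
trk_pos_rel = np.array([0.0, -0.3, -1.5])
trk_rot_rel = R.from_euler("y", 180, degrees=True)

# A hand pose as seen by the external tracker, in the tracker's own frame.
hand_pos_trk = np.array([0.1, 0.2, 0.8])
hand_rot_trk = R.identity()

# Chain the transforms so the external observation lands in world space,
# where it can be aligned and merged with the headset's own data.
trk_pos, trk_rot = compose(hmd_pos, hmd_rot, trk_pos_rel, trk_rot_rel)
hand_pos, hand_rot = compose(trk_pos, trk_rot, hand_pos_trk, hand_rot_trk)
print("hand position in world space:", hand_pos)
```

Note that every error in the tracker-to-headset pose propagates into the merged result, which is part of why the alignment step matters so much.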
The external device would need to run at the same refresh rate (120 Hz). Interpolating the data would cost too much processing and be less accurate than waiting for the external data to arrive on the headset, knowing it's synchronized with the headset's camera data. So the headset has to wait until the data arrives, cutting into that 8.3 ms, and then the calculations are more involved than when using only the headset's data. Let's assume the runtime processes the data and presents it to the application the same way it already does, just with the previously occluded data filled in. The application then has less time to use and process that data, and it still has to stay within the time constraints required to hand a frame to the renderer to draw.
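Here's the back-of-the-envelope budget I'm describing, with made-up latency numbers purely for illustration, showing how waiting on external data eats into the per-frame window:

```python
# Rough per-frame budget math; all latency figures are assumptions.
refresh_hz = 120
frame_budget_ms = 1000.0 / refresh_hz             # ~8.33 ms per frame

external_link_ms = 1.5       # assumed time to collect and transmit external data
alignment_ms = 1.0           # assumed cost of aligning/merging the two data sets
on_device_tracking_ms = 2.0  # assumed cost of the existing inside-out pipeline

remaining_ms = frame_budget_ms - (
    external_link_ms + alignment_ms + on_device_tracking_ms
)
print(f"frame budget at 120 Hz: {frame_budget_ms:.2f} ms")
print(f"left for app logic + render: {remaining_ms:.2f} ms")

# Dropping the refresh rate buys back time (~11.1 ms per frame at 90 Hz).
print(f"frame budget at 90 Hz: {1000.0 / 90:.2f} ms")
```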
It's definitely doable, and limiting the refresh rate to 90 Hz would give even more headroom, but it's a much more complex process than what the headset currently does. I'm sure we'll eventually see it, but given the underwhelming performance of the Pro controllers today, I don't think we're there yet.