Beyond the Blue Dot: The Rise of Visual Positioning Systems (VPS)
Decoding VPS: The foundational tech for the next generation of AR and robotics to navigate the world.
GPS is 50 years old. While it’s the reason you can find a latte in a new city, it was never designed for the precision required by the next generation of technology. If you’ve ever seen your blue dot drift aimlessly while walking between skyscrapers, you’ve experienced the limits of satellite-based positioning.
As someone who spent six years building this tech at Google, I’ve seen firsthand how we’re moving from satellites in the sky to pixels on the ground.
Check out my latest video for a deep dive into the necessity of VPS and a look at the tools shaping this future:
VPS: The Precision Layer for the 3D World
The solution to GPS’s coarse accuracy is VPS (Visual Positioning System). Instead of relying on timestamps from satellites 12,000 miles away, VPS uses the camera on your device to look at the world and match it against a pre-existing 3D model.
Sub-meter Positioning: While GPS is accurate to a few meters, VPS enables centimeter-level accuracy.
Rotational Accuracy: VPS eliminates the “which way am I facing?” problem, providing precise orientation that’s critical for AR and robotics.
The Google & Apple Advantage: By leveraging over a decade of Street View data, Google can provide high-fidelity localization across 100+ countries, and Apple can use imagery from its “look around” feature in 80+ cities.
Why This Matters: Robots, Drones, and AR
We are moving into an era of autonomous agents and spatial computing where “close enough” isn’t gonna cut it.
Robotics: A delivery robot needs to know the difference between the street and the sidewalk. VPS provides the map-to-world alignment to make that possible.
Contested Environments: In areas where GPS signals are jammed or denied, drones can use VPS and a camera to navigate perfectly.
Persistent AR: Imagine leaving a digital sticky note on a specific piece of equipment for a colleague. VPS allows 3D annotations to stay locked to the physical world, even when you walk away.
The Emerging Ecosystem
It’s not just the tech giants. A new wave of startups like Multiset.AI is making it possible for anyone to scan a street with an iPhone and create a localized 3D model in real-time. We’re even seeing high-end capture devices like XGrids’ PortalCam using Gaussian Splatting to turn LiDAR and images into photorealistic, navigable scenes.
The Big Picture: Spatial Intelligence 🔭
VPS is the “where,” but the next frontier is the “what.” The real magic happens when we combine visual positioning with AI that understands the semantic meaning of the environment—a field known as Spatial Intelligence. We are moving toward world models that don’t just map space but understand it, opening the door for robots to step inside.
If you found this useful, share it with fellow reality mappers. The future’s too interesting to navigate alone.
Cheers, Bilawal Sidhu
bilawal.ai




