Uber’s Real‑Time Location Architecture


YouTube video ID: gHIs0Mdow8M

Source: YouTube video by Philipp Lackner


Introduction

When you open the Uber app, enter your pickup and drop‑off points, and tap “Request a ride,” two things happen instantly: your location is sent to Uber, and the positions of nearby drivers are sent back to you. Making this interaction feel seamless for hundreds of millions of users, even on unreliable networks, requires a sophisticated backend architecture.

The Original Polling Solution

  • Uber initially used a polling‑based approach.
  • The client repeatedly asked the server, “Are there new locations for nearby drivers?” in a tight loop.
  • Problems that emerged:
  • Unnecessary server load – most polling requests returned no new data.
  • Battery drain on the device.
  • Extra overhead from request headers.
  • At one point, 80 % of network requests from the app were polling calls.
  • Cold‑start time increased dramatically because many concurrent polling calls competed for resources, delaying UI rendering.
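The wastefulness of polling is easy to see in a sketch. Everything here (the `fetch` callback, the simulated responses) is illustrative, not Uber's client code; a real client would also sleep between rounds:

```python
def poll_for_updates(fetch, rounds=5):
    """Naive polling loop: ask the server on every round whether there
    is new data, regardless of whether anything changed."""
    updates, wasted = [], 0
    for _ in range(rounds):
        data = fetch()        # one full request/response cycle per call
        if data is None:
            wasted += 1       # most polls return nothing new
        else:
            updates.append(data)
    return updates, wasted

# Simulated server: new data arrives only on the 4th of 5 polls,
# so 4 of the 5 round trips are pure overhead.
responses = iter([None, None, None,
                  {"driver": "d1", "lat": 52.52, "lon": 13.40}, None])
updates, wasted = poll_for_updates(lambda: next(responses))
```

At Uber's scale, that overhead ratio is what showed up as 80 % of the app's network requests.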

Moving to a Push‑Based Architecture

To eliminate the inefficiencies of polling, Uber switched to a push‑based communication model built around a system called Ramen (Realtime Asynchronous Messaging Network).

Key Components

  1. Fireball (microservice) – decides when to push data. It listens to events such as:
     • A rider requesting a ride.
     • A driver accepting a ride.
     • New location updates for riders or drivers.
     It may compare a new location with the previous one and push only if the change is significant.

  2. API Gateway – receives the minimal push payload from Fireball, enriches it with additional context (e.g., user locale, operating system), and forwards it to the client.

  3. Ramen Server – delivers the final payload directly to the client.
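The "push only on significant change" idea can be sketched with a plain haversine distance check. The 25 m threshold and function names are assumptions for illustration, not Uber's actual logic:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS points."""
    r = 6_371_000  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def should_push(prev, curr, threshold_m=25.0):
    """Push an update only if the driver moved more than threshold_m."""
    if prev is None:
        return True  # first fix for this driver: always push
    return haversine_m(prev[0], prev[1], curr[0], curr[1]) > threshold_m

# ~111 m per 0.001 degree of latitude, so this move exceeds 25 m:
moved = should_push((52.5200, 13.4050), (52.5210, 13.4050))
```

Filtering at this stage keeps insignificant GPS jitter from ever reaching the gateway or the client.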

How Ramen Works Internally

  • Initially built on Server‑Sent Events (SSE), guaranteeing at‑least‑once delivery of pushed events.
  • Later migrated to gRPC, enabling bidirectional streaming so the client can also send data to the server over the same channel.
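At-least-once delivery means the server may resend an event, so the client must tolerate duplicates. A minimal sketch of client-side deduplication by sequence number (the class and field names are invented for illustration):

```python
class AtLeastOnceReceiver:
    """Drops redelivered events by tracking the highest sequence
    number seen so far (assumes in-order sequence assignment)."""

    def __init__(self):
        self.last_seq = -1
        self.delivered = []

    def on_event(self, seq, payload):
        if seq <= self.last_seq:
            return False  # duplicate redelivery: drop it
        self.last_seq = seq
        self.delivered.append(payload)
        return True

rx = AtLeastOnceReceiver()
# Event 1 is delivered twice by the server; the client keeps one copy.
for seq, payload in [(0, "a"), (1, "b"), (1, "b"), (2, "c")]:
    rx.on_event(seq, payload)
```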

Spatial Partitioning for Efficient Driver Queries

Uber processes real‑time location updates for millions of drivers each minute. Sending every driver’s location to every rider would be infeasible at that scale.

Naïve Distance Calculation (Rejected)

  • Calculating the distance from a rider’s position to all drivers would create an unmanageable server load at scale.

The Spatial Partitioning Solution

  • Uber divides the world into smaller geographic regions, mapping each driver’s position to a specific region.
  • This avoids per‑driver distance calculations; the server only checks which drivers reside in the rider’s region and neighboring regions.
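A toy version of region-based lookup, using a square grid and a Python dict as the index. Cell size, names, and coordinates are illustrative (Uber's production index is H3, covered next):

```python
from collections import defaultdict

CELL_DEG = 0.01  # ~1.1 km of latitude per cell in this toy grid

def cell_of(lat, lon):
    """Map a GPS coordinate to a (row, col) grid cell."""
    return (int(lat // CELL_DEG), int(lon // CELL_DEG))

# Index: cell -> set of driver ids currently in that cell.
index = defaultdict(set)

def update_driver(driver_id, lat, lon):
    index[cell_of(lat, lon)].add(driver_id)

def nearby_drivers(lat, lon):
    """Scan only the rider's cell and its 8 neighbors,
    never the global driver set."""
    row, col = cell_of(lat, lon)
    found = set()
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            found |= index.get((row + dr, col + dc), set())
    return found

update_driver("d1", 52.5200, 13.4050)  # Berlin, near the rider
update_driver("d2", 52.5205, 13.4049)  # Berlin, near the rider
update_driver("d3", 48.1371, 11.5754)  # Munich: far away, never scanned
near = nearby_drivers(52.5201, 13.4050)
```

The work per query now depends on how many drivers are in the nine scanned cells, not on the total fleet size.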

From Squares to Hexagons – H3

  • Square grids suffer from corner bias: a cell’s diagonal neighbors are farther from its center than its edge‑adjacent neighbors, so a fixed neighborhood is not a uniform search radius.
  • Uber adopted H3, an open‑source hexagonal spatial index:
    • The globe is tiled with hexagons; each cell’s neighbors are all equidistant from its center.
    • Any GPS coordinate is mapped to an H3 index (the hexagon it falls in).
    • Queries use a K‑ring: all hexagons within K steps of the rider’s hexagon.
      • K = 1 → 7 hexagons (the central one plus its six immediate neighbors).
      • K = 2 → adds the second ring of neighbors, 19 hexagons in total.
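The ring sizes follow a closed form: ring i contains 6i hexagons, so a K-ring holds 3K² + 3K + 1 cells. A quick sketch verifying the counts above (the h3 library exposes the actual K-ring query; this is just the arithmetic):

```python
def kring_size(k):
    """Number of hexagons within k steps of a center hexagon:
    1 center cell plus 6*i cells in each ring i = 1..k,
    which sums to 3k^2 + 3k + 1."""
    return 1 + sum(6 * i for i in range(1, k + 1))

sizes = [kring_size(k) for k in range(3)]  # K = 0, 1, 2
```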

Complexity Reduction

  • The algorithm’s time complexity drops from O(N) (where N could be millions of active drivers) to O(K² + M), where K is the radius (number of hex rings) and M is the number of nearby drivers.
  • Example: If 100 drivers are nearby, Uber iterates over those 100 instead of millions.

Additional Uses of the Spatial Index

  • Creating dynamic pricing zones.
  • Predicting estimated time of arrival (ETA).
  • Demand forecasting.

Optimizing Network Latency with Edge Servers

Mobile users often rely on cellular connections, which can be unstable. Uber reduces latency by deploying edge servers:

  • Hundreds or thousands of servers positioned globally, each serving nearby users.
  • A rider in Berlin talks to a server in Frankfurt rather than one in California.
  • Edge servers also cache relevant data, speeding up access.
  • On a typical 4G connection, this can shave ≈ 100 ms off request times compared with a distant server.
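Edge selection can be approximated by geographic proximity; real deployments typically rely on anycast routing and latency probes. A toy picker with made-up points of presence:

```python
# Hypothetical points of presence: name -> (lat, lon).
EDGES = {
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
    "singapore": (1.35, 103.82),
}

def closest_edge(lat, lon):
    """Pick the geographically nearest edge server. Squared-degree
    distance is crude (it ignores longitude compression) but is
    enough to rank candidates this far apart."""
    return min(EDGES, key=lambda name: (EDGES[name][0] - lat) ** 2
                                       + (EDGES[name][1] - lon) ** 2)

edge = closest_edge(52.52, 13.40)  # rider in Berlin
```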

Smoothing Location Updates on Unreliable Networks

Even with edge servers, location updates may arrive irregularly. Uber ensures the map marker moves smoothly by:

  1. Dead reckoning – predicts the driver’s next position from the last known speed and direction.
  2. Kalman filters – blend the predicted coordinates with actual measured coordinates.

The combination prevents abrupt jumps when a new measured point differs significantly from the prediction, resulting in fluid motion of the car marker.
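A minimal sketch of the two steps, using a fixed blend gain in place of a real Kalman gain (which would be computed from the prediction and measurement uncertainties); all numbers are illustrative:

```python
def dead_reckon(pos, velocity, dt):
    """Predict the next position from the last known speed/direction."""
    return (pos[0] + velocity[0] * dt, pos[1] + velocity[1] * dt)

def smooth_position(predicted, measured, gain=0.3):
    """One Kalman-style correction step: move the prediction a
    fraction of the way toward the noisy measurement, so a large
    measurement jump never becomes a large marker jump."""
    return tuple(p + gain * (m - p) for p, m in zip(predicted, measured))

pos = (0.0, 0.0)          # last known position (meters, local frame)
velocity = (10.0, 0.0)    # 10 m/s heading east
predicted = dead_reckon(pos, velocity, dt=1.0)   # expect (10.0, 0.0)
measured = (14.0, 2.0)                           # noisy GPS fix
smoothed = smooth_position(predicted, measured)
```

Between fixes the marker keeps moving along the predicted path, and each new fix pulls it gently back toward the measured position.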

The “Phantom Cars” Rumor

  • Some users claim Uber shows non‑existent “phantom” cars to make the map look populated and discourage switching to competitors.
  • Uber denies these allegations.
  • The discussion highlights how seemingly simple UI elements can mask a highly complex, scalable backend.

Conclusion

Uber’s journey from a naïve polling system to a sophisticated push‑based architecture illustrates the challenges of delivering real‑time location data at massive scale. By introducing services like Fireball, adopting Ramen with gRPC, leveraging hexagonal spatial indexing (H3), deploying edge servers, and applying dead reckoning with Kalman filters, Uber provides a responsive and fluid experience even under poor network conditions. The system’s intricacy underscores why modern mobile apps often rely on elaborate backend engineering to meet user expectations.

Uber transformed its architecture by replacing a heavy polling model with a push‑based system that uses Fireball, an API Gateway, and Ramen, first over SSE and later gRPC, dramatically cutting unnecessary traffic and latency. Spatial partitioning with the H3 hexagonal index reduces driver‑search complexity from linear to a small, region‑based computation, enabling fast nearby‑driver queries and supporting features like dynamic pricing and ETA estimation. Deploying edge servers close to users trims round‑trip times by roughly a tenth of a second, improving responsiveness on cellular networks. Dead reckoning combined with Kalman filtering smooths driver location updates despite irregular network delivery, ensuring fluid map animations. Together, these engineering choices illustrate how Uber’s backend handles massive real‑time location data while delivering a seamless user experience.

  Takeaways

  • Polling accounted for 80 % of network requests and caused unnecessary server load, battery drain, and increased cold‑start time.
  • Uber replaced polling with a push‑based model that uses Fireball, an API Gateway, and Ramen, initially built on Server‑Sent Events and later on gRPC for bidirectional streaming.
  • Spatial partitioning with the H3 hexagonal index reduces driver query complexity from O(N) to O(K² + M), allowing efficient nearby‑driver searches.
  • Edge servers positioned near users reduce latency by about 100 ms compared with distant data centers.
  • Dead reckoning and Kalman filters are used together to predict and smooth driver location updates on unreliable networks.
  • Uber denies the existence of “phantom” cars, highlighting the gap between UI perception and backend complexity.

