The client itself decides which server to talk to. It gets a list of available servers from a directory (called a service registry) and picks one. No middleman needed.
The client keeps this list as a local cache of server addresses.
Step A: Registration & Heartbeat
When a "Server" instance (the service providing the data) boots up, it sends a REST call to the Service Registry to register itself. It then sends a "heartbeat" (a tiny ping) every few seconds. If the heartbeat stops, the Registry removes that server from the list.
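As a rough illustration of the registration and heartbeat bookkeeping, here is a minimal in-memory sketch. The class name HeartbeatRegistry, its methods, and the timeout value are all hypothetical; a real registry would expose these operations over REST and track many services, while this sketch tracks a single service and takes timestamps as plain arguments.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry for a single service (illustration only).
class HeartbeatRegistry {
    private final long timeoutMillis;
    // instance address -> time of its last heartbeat, in milliseconds
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called once when an instance boots; registering counts as its first heartbeat.
    void register(String address, long nowMillis) {
        lastSeen.put(address, nowMillis);
    }

    // Called every few seconds by a live instance.
    // Returns false if the instance is unknown (for example, already evicted).
    boolean heartbeat(String address, long nowMillis) {
        return lastSeen.computeIfPresent(address, (key, old) -> nowMillis) != null;
    }

    // Removes every instance whose last heartbeat is older than the timeout.
    void evictExpired(long nowMillis) {
        lastSeen.entrySet().removeIf(e -> nowMillis - e.getValue() > timeoutMillis);
    }

    // The list handed out to clients, sorted for stable output.
    List<String> healthyInstances() {
        List<String> result = new ArrayList<>(lastSeen.keySet());
        Collections.sort(result);
        return result;
    }
}
```

In this sketch an instance that registers at time 0 and never pings again is dropped by the first evictExpired call made more than timeoutMillis later, while an instance that keeps calling heartbeat stays in the list.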
Step B: Discovery (The Pull)
When the "Client" (the service needing the data) starts up, it reaches out to the Service Registry and says: "Give me the current list of all healthy instances for 'payment-service'." It saves this list locally.
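The pull-and-cache step might look roughly like the sketch below. DiscoveryCache and its methods are hypothetical names, and the registry is represented by a plain function standing in for the REST call. Keeping the previous list when the registry cannot be reached is an assumption of this sketch, not something the steps above specify.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical client-side cache of registry results (illustration only).
class DiscoveryCache {
    // Stand-in for the registry's REST endpoint: service name -> healthy addresses.
    private final Function<String, List<String>> registryLookup;
    // Local copy: service name -> last list successfully pulled from the registry.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    DiscoveryCache(Function<String, List<String>> registryLookup) {
        this.registryLookup = registryLookup;
    }

    // Pulls the current list and stores an immutable snapshot of it.
    // On failure the previous snapshot is kept (an assumption of this sketch).
    boolean refresh(String serviceName) {
        try {
            cache.put(serviceName, List.copyOf(registryLookup.apply(serviceName)));
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Reads only the local copy; it never contacts the registry.
    List<String> instances(String serviceName) {
        return cache.getOrDefault(serviceName, List.of());
    }
}
```

A client would call refresh("payment-service") at startup and then read instances("payment-service") on every request, so the registry is not on the request path.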
Step C: Selection (The Logic)
When your code actually executes a call (e.g., restTemplate.getForObject("http://payment-service/pay", String.class)), the load balancer library intercepts the request. It looks at its local cache and sees three IPs:
10.0.0.1
10.0.0.2
10.0.0.3
It applies an algorithm—usually Round Robin (cycling through them) or Random Selection—to pick one.
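The two selection strategies mentioned above can be sketched in a few lines. InstanceSelector is a hypothetical class written for illustration; it is not the API of any particular load balancer library.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical selector over a cached list of instance addresses (illustration only).
class InstanceSelector {
    // Shared counter so that successive calls cycle through the list.
    private final AtomicInteger counter = new AtomicInteger();

    // Round Robin: first call returns index 0, the next index 1, and so on, wrapping around.
    String roundRobin(List<String> instances) {
        requireNonEmpty(instances);
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    // Random Selection: every instance is equally likely on each call.
    String random(List<String> instances) {
        requireNonEmpty(instances);
        return instances.get(ThreadLocalRandom.current().nextInt(instances.size()));
    }

    private static void requireNonEmpty(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no healthy instances in the local cache");
        }
    }
}
```

With the three IPs above, four consecutive roundRobin calls on a fresh selector return 10.0.0.1, 10.0.0.2, 10.0.0.3, and then 10.0.0.1 again.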
Step D: The Direct Call
The client swaps the service name (payment-service) for the real IP address (10.0.0.2) and sends the request directly to that server. No middleman is involved in the actual data transfer.
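The name-for-address swap could be sketched as follows. ServiceUrlResolver is a hypothetical helper, not what any specific library does internally. It keeps the scheme, port, path, and query of the logical URL and replaces only the host; user info and fragments are ignored for brevity.

```java
import java.net.URI;

// Hypothetical helper that rewrites a logical service URL to a concrete instance URL.
class ServiceUrlResolver {
    // Replaces the host (the service name) with the chosen instance address.
    static URI resolve(URI logical, String instanceHost) {
        StringBuilder url = new StringBuilder();
        url.append(logical.getScheme()).append("://").append(instanceHost);
        if (logical.getPort() != -1) {
            // Keep an explicit port if the logical URL carried one.
            url.append(':').append(logical.getPort());
        }
        if (logical.getRawPath() != null) {
            url.append(logical.getRawPath());
        }
        if (logical.getRawQuery() != null) {
            // Raw accessors preserve percent-encoding exactly as written.
            url.append('?').append(logical.getRawQuery());
        }
        return URI.create(url.toString());
    }
}
```

For example, resolve(URI.create("http://payment-service/pay"), "10.0.0.2") yields http://10.0.0.2/pay, which the client would then pass to its HTTP client to reach that server directly.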