I won’t pretend to understand how homomorphic encryption works or how it fits into this system, but here’s the relevant passage from the article.
With some server optimization metadata and the help of Apple’s private nearest neighbor search (PNNS), the relevant Apple server shard receives a homomorphically encrypted embedding from the device and performs the aforementioned encrypted computations on that data to find a landmark match from a database and return the result to the client device, without providing identifying information to Apple or its OHTTP partner Cloudflare.
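To make that flow concrete, here’s a toy sketch of the shape being described: the client encrypts its embedding, the server computes similarity scores directly on ciphertexts it cannot read, and only the client can decrypt the result. This substitutes the Paillier cryptosystem (additively homomorphic, easy to show in a few lines) for whatever lattice-based scheme Apple actually uses, and the tiny primes, vectors, and landmark names are all made up for illustration.

```python
import math
import random

# --- Paillier keypair (tiny primes for demo; real use needs ~2048-bit) ---
p, q = 1000003, 1000033
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # valid because we fix g = n + 1

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    # g^m * r^n mod n^2, with g = n + 1 so g^m = 1 + m*n (mod n^2)
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(c: int) -> int:
    return (pow(c, lam, n2) - 1) // n * mu % n

# --- Client: encrypt the query embedding componentwise ---
query = [3, 1, 4, 1, 5, 9, 2, 6]  # quantized photo embedding (made up)
enc_query = [encrypt(v) for v in query]

# --- Server: encrypted dot product against each plaintext db vector ---
# Enc(a)^b = Enc(a*b) and Enc(a)*Enc(b) = Enc(a+b), so the server can
# compute Enc(sum_i query[i] * db[i]) without ever decrypting anything.
database = {
    "Eiffel Tower": [3, 1, 4, 1, 5, 9, 2, 6],
    "Golden Gate":  [2, 7, 1, 8, 2, 8, 1, 8],
    "Space Needle": [1, 4, 1, 4, 2, 1, 3, 5],
}

def encrypted_score(enc_vec, db_vec):
    acc = encrypt(0)
    for c, w in zip(enc_vec, db_vec):
        acc = acc * pow(c, w, n2) % n2
    return acc

enc_scores = {name: encrypted_score(enc_query, vec)
              for name, vec in database.items()}

# --- Client: decrypt the scores and pick the best candidate locally ---
scores = {name: decrypt(c) for name, c in enc_scores.items()}
print(max(scores, key=scores.get), scores)
```

The key property is in the server section: it only ever multiplies and exponentiates ciphertexts, so it learns neither the query embedding nor which landmark scored highest.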
There’s a more technical write-up here. It appears the final match happens on the device, not on the server.
The client decrypts the reply to its PNNS query, which may contain multiple candidate landmarks. A specialized, lightweight on-device reranking model then predicts the best candidate by using high-level multimodal feature descriptors, including visual similarity scores; locally stored geo-signals; popularity; and index coverage of landmarks (to debias candidate overweighting). When the model has identified the match, the photo’s local metadata is updated with the landmark label, and the user can easily find the photo when searching their device for the landmark’s name.
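The quoted passage lists the signals the reranker combines, so here’s a minimal sketch of that final on-device step. The feature names, weights, and linear scoring are hypothetical stand-ins; Apple describes a learned reranking model, not a hand-tuned formula like this.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    visual_similarity: float  # from the decrypted PNNS reply
    geo_affinity: float       # locally stored geo-signals
    popularity: float         # landmark popularity prior
    index_coverage: float     # how over-represented the landmark is
                              # in the index (used for debiasing)

def rerank(candidates: list[Candidate]) -> Candidate:
    def score(c: Candidate) -> float:
        # Down-weight landmarks that dominate the index so they don't
        # win on coverage alone (the "debias candidate overweighting"
        # idea from the quote); all weights here are illustrative.
        return (0.6 * c.visual_similarity
                + 0.2 * c.geo_affinity
                + 0.2 * c.popularity
                - 0.1 * c.index_coverage)
    return max(candidates, key=score)

best = rerank([
    Candidate("Eiffel Tower", 0.92, 0.10, 0.99, 0.90),
    Candidate("Tokyo Tower",  0.88, 0.95, 0.80, 0.40),
])
# The winner's name would then be written into the photo's local
# metadata so on-device search for the landmark finds this photo.
print(best.name)
```

In this toy example the geo-signal tips the result toward Tokyo Tower despite a slightly lower visual score, which is the point of combining multimodal features rather than trusting visual similarity alone.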
Might be an iOS 18.1 or 18.2 feature.