Project frozen · 2026-05-26 snapshot

Hong Kong grocery price intelligence

Two million grocery prices. One honest wall.

My wife noticed no single Hong Kong shop carries everything she wants. I noticed foodpanda often charges more than the shop's own website. So I read the publicly listed price of nearly everything, eleven retailers deep, and let the numbers talk.

Then I parked it, on purpose. This whole dashboard runs inside your browser tab. No server. No database humming in a closet. Just SQL on Parquet, right here. Scroll down and poke it.

2,027,565 price observations
1,147 stores
11 retail sources
156,433 product families
110,508 barcodes

The convenience tax

Walk, or let foodpanda walk for you?

Same product. Same shop. Two prices. One for your feet, one for your couch. For every product we could match between a chain's own website and its foodpanda delivery page, we measured the gap. That gap is the channel premium: how much more the same item costs on foodpanda delivery than it does direct from the retailer's own website, measured per matched product, then summarised per chain.

Bars = the average markup and the painful 90th-percentile markup, per chain. The couch is rarely free.
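A minimal sketch of how those two bars could be computed, assuming the premium is expressed as a fraction of the direct price and that each chain arrives as a list of (direct, foodpanda) price pairs. The function names and the input shape are illustrative, not the project's actual code.

```python
from statistics import mean, quantiles


def channel_premium(direct: float, delivery: float) -> float:
    """Fractional markup of the delivery price over the direct price."""
    return (delivery - direct) / direct


def summarise_chain(pairs):
    """Summarise one chain from (direct_price, foodpanda_price) pairs.

    Assumption: one pair per matched product, at least two pairs per chain.
    """
    premiums = [channel_premium(direct, delivery) for direct, delivery in pairs]
    # quantiles(n=10) returns nine cut points; the last one is the 90th percentile.
    p90 = quantiles(premiums, n=10, method="inclusive")[-1]
    return {
        "avg": mean(premiums),
        "p90": p90,
        # Share of matched items that cost more on delivery than direct.
        "share_dearer": sum(p > 0 for p in premiums) / len(premiums),
    }
```

A chain that adds a flat fifth to everything would come out with avg and p90 both at 0.20, while a chain with a modest average and a spiky tail shows a p90 far above its avg.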

Delivery markup by chain

foodpanda price vs chain-direct price, per matched product family

+20%
city'super adds a flat fifth on delivery. Same trolley, +20%, across ~3,007 matched items.
+0%
Marks & Spencer barely blinks. 97% of 2,660 items sit at exact parity online and delivered.
+13.2%
Wellcome's average. But the top tenth of items? +50%. Read the small print.

The shop scoreboard

Start here. Which chains lean on the couch tax, ranked by average markup. The shop is the story; the products are the evidence.

Shop · Avg markup · Worst 10% · Items dearer on delivery · Products compared

city'super and M&S bridge by exact product id (not name), so their rows are dead reliable: city'super tacks a flat fifth onto almost everything; M&S charges you the same whether you walk or wait.

…and the products that prove it

Biggest single-item gaps we found (direct shelf price vs foodpanda median, same retailer, matched products)

Product · Chain · Category · Direct · foodpanda · Markup
[Visual explainer placeholder: Why delivery costs more]

The hunt

Why you end up shopping in three places

It started with my wife. No single Hong Kong shop carries her whole list, sure. But the deeper trap is this: it's not that she can't find a thing. It's that the cheapest version of each thing keeps landing in a different shop, on a different week. That is not bad luck. That is Hi-Lo pricing doing its job: keep list prices high, then run frequent deep promotions on a rotating handful of items. The deals are real, but they move. You have to chase them. (Hermann Simon, Confessions of the Pricing Man.)

She isn't fussy. The strategy makes hunting the only way to win. Here's the treadmill they put you on.

One basket, three shops, no winner

how a real weekly shop actually goes

1 Wellcome
✓ grab: Black Thunder choc $16 (was $63)
✗ but: the 24-pack water is dearer here
2 Mannings
✓ grab: the vitamins, half off
✗ but: tap “deliver” on the shampoo and it's +220%
3 city'super
✓ grab: the good cheese
✗ but: a flat +20% on every basic you add

Every shop is cheap on something and dear on the rest. You feel like you win each time you grab the marked-down thing. The basket disagrees. Chasing each deal to a different shop isn't saving — it's paying, in time and bus fare, to keep playing their game. That's not noise in the data. That's the strategy working.

Today's bait

real markdowns pulled from the snapshot — the hooks that get you through the door

Black Thunder Mini Choc 139g · was $63 · now $16
Campbell's Trolley Cart · was $399 · now $50
Wah Yuen Butter Egg Rolls 18pc · was $129 · now $35
Bonaqua Water 24×770ml · was $192 · now $59
Black Thunder Almond & Hazelnut · was $63 · now $16
Headache capsules 60pc (Mannings) · was $199 · now $50

The evidence: every chain has a personality

x = price level vs market · y = how much prices wobble (CV) · bubble = SKUs compared

CV (coefficient of variation): standard deviation ÷ mean. Low = steady, predictable prices. High = constant promotions and reversals, i.e. Hi-Lo.

CV 2.30
Mannings, the wildest swings in town. You genuinely cannot guess if today's price is a gift or a gouge. So you check. Every time.
CV 0.47
Wellcome keeps you guessing too — steady on the surface, spiky underneath.
flat
A handful of shops barely move. Boring is a feature: you can trust the price without a spreadsheet.
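For reference, the coefficient of variation behind the y-axis is a one-liner. This sketch assumes the population standard deviation over one price series; the page does not say which variant was used or how per-product values were rolled up to a per-chain figure.

```python
from statistics import mean, pstdev


def coefficient_of_variation(prices):
    """Standard deviation divided by mean for one series of prices.

    Assumption: population standard deviation (pstdev), not the sample one.
    """
    avg = mean(prices)
    if avg == 0:
        raise ValueError("CV is undefined when the mean price is zero")
    return pstdev(prices) / avg
```

A price that never moves gives 0. A price that flips between $5 and $15 gives 0.5, which is already in "keeps you guessing" territory.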
[Visual explainer placeholder: The basket you can never fill cheaply in one place]

Promo intensity

Who is always “on sale”?

A big red SALE sticker on half the shelf is a strategy, not a coincidence. Share of each chain's catalogue carrying a discount on snapshot day.

Share of catalogue on promotion

percent of priced SKUs marked down · snapshot 2026-05-26
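A rough sketch of that percentage, assuming each SKU carries a list price and a current price and that "marked down" means the current price sits below the list price. The tuple layout is hypothetical.

```python
def promo_share(skus):
    """Percent of priced SKUs whose current price is below list price.

    skus: iterable of (list_price, current_price); either may be None.
    Assumption: a SKU with no current price is not "priced" and is skipped.
    """
    priced = [(lp, cp) for lp, cp in skus if cp is not None]
    if not priced:
        return 0.0
    on_promo = sum(1 for lp, cp in priced if lp is not None and cp < lp)
    return 100.0 * on_promo / len(priced)
```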

The tell

Read the last digit, read the shop

A price ending in .90 is a small trick played on your eye: you read “$9 and change,” not “$10.” (Hermann Simon: the further right a digit sits, the less your brain weighs it.) So how a shop ends its prices is a quiet confession of who it's pretending to be. This is charm pricing: setting a price just under a round number ($9.90, not $10) so it feels cheaper. A round ending ($10.00) signals the opposite: quality, confidence, no gimmick. Hong Kong skips the Western .99 and splits between .90 and fully round.

Two camps fall out of the data, and they're exactly who you'd guess.

Playing cheap
Everything ends in .90 — Donki 87%, AEON 71%. The whole shop whispers “look how affordable.”
Pretending premium
Clean round numbers — city'super 92%, Wellcome 74%. The flex is refusing the trick at all.

How prices end, by chain

share of prices ending in .90 (deal-feel), round .00 (premium), and the rare .99
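Bucketing a price by its ending is simple enough to sketch. This version assumes prices arrive as decimal strings and uses Decimal to avoid floating-point surprises on the cents; the bucket labels mirror the chart legend, and everything else is illustrative.

```python
from collections import Counter
from decimal import Decimal


def ending(price: str) -> str:
    """Classify a price by its last two decimal digits."""
    cents = int((Decimal(price) * 100) % 100)
    if cents == 90:
        return ".90"
    if cents == 0:
        return ".00"
    if cents == 99:
        return ".99"
    return "other"


def ending_shares(prices):
    """Fraction of prices falling in each ending bucket."""
    counts = Counter(ending(p) for p in prices)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

Run per chain, this yields the kind of split quoted above, such as the share of a catalogue ending in .90.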

[Visual explainer placeholder: Why $9.90 lands differently than $10.00]

Ghosts in the data

When a 2-litre bottle of water costs $9,999

Sift two million prices and you meet the ghosts. 26 listings priced at exactly HK$9,999. Nobody is paying ten thousand dollars for two litres of Kirin water. It's a placeholder — a price typed by a human who didn't want anyone to click buy.

HK$9,999
Kirin Japan Soft Water 2L · AEON
The lazy out-of-stock. Marking an item unavailable on these platforms is fiddly clicks; typing a number no human will ever pay is one. So the water “exists,” technically, at a price designed to be ignored. AEON does this 17 times; JHC Japan Home, 9.
HK$9,999
Georgia The Black coffee 500ml · AEON
Or it's a plain fat-finger: someone meant $19.90 and a zero ran away. Either way, a naive “average price” would swallow these whole and lie to you. It's exactly why the charts above lean on robust statistics instead of a plain mean. Robust statistics are methods (median, modified z-score, Qn scale) that ignore a handful of absurd values instead of letting them drag the average. A single $9,999 ghost can't move a median.
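To make the ghost-proofing concrete, here is a small sketch of the modified z-score, one of the robust methods named above. It uses the conventional 0.6745 scaling constant and the commonly cited 3.5 cutoff; the threshold and the sample prices are assumptions, not values taken from the project.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All central values are identical; anything different is infinitely far.
        return [0.0 if v == med else float("inf") for v in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Return the values whose absolute modified z-score exceeds the threshold."""
    scores = modified_z_scores(values)
    return [v for v, z in zip(values, scores) if abs(z) > threshold]
```

With four ordinary water prices and one HK$9,999 placeholder, the mean lands above $2,000 while the median stays put, and only the placeholder gets flagged.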

A shop that plays fair

The Donki exception

After all the markups and the moving deals, one of the biggest catalogues on the platform turns out to be the straightest shooter we found.

Don Don Donki

37,431
products
CV 0.11
barely wobbles (not Hi-Lo)
87%
prices end in .90
~$33
median price

Donki plays the cheap game hard — nearly everything ends in .90. But it doesn't run the Hi-Lo treadmill (its prices barely move) and it has no separate website to mark up, so foodpanda is its only digital shelf. No walk-versus-deliver gouge, because there's nothing to gouge against. The result is a shop that's loud about being cheap and, as far as the data can see, actually keeps its word. Affordable, consistent, fair. You still have to like the chaos of the store.

Volatility by aisle

Where prices swing the most

Toilet paper: boring, everyone charges about the same. Hot pot and fresh seafood: a casino. The price spread within a category tells you where it's worth shopping around. Price spread is how far apart the cheapest and most expensive store are for the same product, as a percentage. Big spread = shopping around actually pays.

Median price spread by category

how much the same item varies store-to-store · higher = shop around
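One way to sketch the calculation, assuming the spread is measured relative to the cheapest store (the page says "as a percentage" without naming the base) and that rows arrive as (category, product, store, price) tuples. Both choices are assumptions.

```python
from collections import defaultdict
from statistics import median


def price_spread_pct(store_prices):
    """Gap between dearest and cheapest store, as a percent of the cheapest."""
    lo, hi = min(store_prices), max(store_prices)
    return 100.0 * (hi - lo) / lo


def median_spread_by_category(rows):
    """rows: iterable of (category, product, store, price)."""
    by_product = defaultdict(list)
    for category, product, _store, price in rows:
        by_product[(category, product)].append(price)

    by_category = defaultdict(list)
    for (category, _product), prices in by_product.items():
        if len(prices) >= 2:  # a spread needs at least two stores
            by_category[category].append(price_spread_pct(prices))

    return {category: median(spreads) for category, spreads in by_category.items()}
```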

The whole haystack

Two million rows. In this tab. Right now.

The charts above ran on tidy little summaries. This runs on everything: 2,027,565 raw price observations, a single 33 MB Parquet file sitting on object storage. Parquet is a columnar file format: it stores data by column instead of by row, which compresses far better and lets a reader grab just the columns and chunks it needs. Our 10 GB of raw listings became one 290 MB Parquet, then 33 MB once trimmed.

Your browser reaches in and grabs only the bytes it needs, using HTTP range requests: asking a server for just bytes 1,000–2,000 of a file instead of the whole thing. DuckDB uses the Parquet file's index to fetch only the chunks a query touches, so a query over 2M rows might download a few MB, not 33.

This is DuckDB-WASM doing real analytical SQL with nothing behind it: the DuckDB analytical database compiled to WebAssembly (~3.5 MB), running entirely in your browser. Real SQL, no backend, no API. This is the 2026 way to ship a data product with zero servers. Edit the query. Hit run. Watch a laptop-grade database chew through two million rows from a static file.

First run downloads the DuckDB engine and the file's index, then streams only what each query needs.
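For the curious, this is roughly what a range request looks like on the wire. The sketch only builds the request and does not send it; the URL is a placeholder, and the exact byte ranges DuckDB asks for are not documented on this page.

```python
from urllib.request import Request


def range_request(url: str, start: int, end: int) -> Request:
    """Build a GET request for bytes start..end (both inclusive) of a file."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    return Request(url, headers={"Range": f"bytes={start}-{end}"})


def bytes_requested(start: int, end: int) -> int:
    """HTTP byte ranges are inclusive at both ends."""
    return end - start + 1


# Hypothetical URL, for illustration only.
req = range_request("https://example.com/prices.parquet", 1000, 2000)
```

A server that supports ranges answers with status 206 Partial Content and just that slice, which is why a static file on object storage can act like a queryable database.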

Why it's frozen

The honest part: where I stopped

Comparing the same product across different companies needs a shared barcode (EAN / SKU): the 13-digit GS1 number on a product, and the only bulletproof way to know two listings are literally the same item across different retailers. Most chains don't hand that out. So I matched by normalised name instead, using a family_id: a fingerprint built from a cleaned-up product name, so ‘Yakult 5x100ml’ and ‘Yakult LT 500ml [random delivery]’ collapse into one comparable family even without a barcode. That works, until you notice almost every cross-chain match is the same parent company wearing a different hat.
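The real normalisation rules are not shown here, so the following is only a sketch of the idea: clean the name, fold pack sizes into a total, and hash what is left. Every rule in it is an assumption. In particular it has no rule for variant tokens such as "LT", so it would not collapse the exact pair quoted above without more work.

```python
import hashlib
import re


def normalise_name(name: str) -> str:
    """Reduce a product name to a canonical, order-insensitive token string."""
    s = name.lower()
    # Drop bracketed merchant notes such as "[random delivery]".
    s = re.sub(r"\[[^\]]*\]|\([^)]*\)", " ", s)

    # Fold "5x100ml" or "5 x 100 ml" into a total size: "500ml".
    def fold_pack(m):
        return f"{int(m.group(1)) * int(m.group(2))}{m.group(3)}"

    s = re.sub(r"(\d+)\s*[x×]\s*(\d+)\s*(ml|g)\b", fold_pack, s)
    # Join a number to its unit: "500 ml" becomes "500ml".
    s = re.sub(r"(\d+)\s+(ml|g)\b", r"\1\2", s)
    # Keep only letters and digits, then sort tokens so word order is ignored.
    s = re.sub(r"[^a-z0-9]+", " ", s)
    return " ".join(sorted(s.split()))


def family_id(name: str) -> str:
    """Short, stable fingerprint of the normalised name (illustrative)."""
    return hashlib.sha1(normalise_name(name).encode("utf-8")).hexdigest()[:12]
```

Under these rules, "Yakult 5x100ml" and "YAKULT 500 ML [random delivery]" share a family_id, while a different brand or size does not.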

The match funnel

families that survive each honesty filter

All product families: 156,433
Appear at 2+ chains (by name): 14,325
Genuinely cross-company (3+ rivals): 131

Wellcome, Mannings and Market Place by Jasons are all DFI. PARKnSHOP and Watsons are both AS Watson. Strip the same-parent banners out and a 14,325-family mountain becomes a 131-family hill. To climb past it you need the loyalty-app price feeds (yuu, MoneyBack). That's a different, much bigger project. I chose to stop. Knowing where the diminishing returns start is the actual skill.
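The funnel logic can be sketched in a few lines. The parent mapping below comes straight from the paragraph above; reading "3+ rivals" as "at least three distinct parent companies" is an assumption, as is the input shape.

```python
# Banner -> parent company, as stated in the text. Unlisted chains stand alone.
PARENT = {
    "Wellcome": "DFI",
    "Mannings": "DFI",
    "Market Place by Jasons": "DFI",
    "PARKnSHOP": "AS Watson",
    "Watsons": "AS Watson",
}


def match_funnel(family_chains, min_parents=3):
    """family_chains: dict of family_id -> set of chain names listing it.

    Assumption: "genuinely cross-company" means min_parents distinct parents.
    """
    multi_chain = {f for f, chains in family_chains.items() if len(chains) >= 2}
    cross_company = {
        f
        for f in multi_chain
        if len({PARENT.get(chain, chain) for chain in family_chains[f]}) >= min_parents
    }
    return {
        "all": len(family_chains),
        "multi_chain": len(multi_chain),
        "cross_company": len(cross_company),
    }
```

A family sold at Wellcome, Mannings and Market Place by Jasons passes the first filter and fails the second, because all three banners resolve to one parent.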

The local model earned its keep

messy merchant categories, mapped once, on my own laptop

Thousands of junk category strings (“$10 Flash Sale”, “3:15 PM Tea Break”) refused to map. Instead of paying an API per row, qwen2.5:7b on Ollama classified each distinct string exactly once into a durable map (632 of them), then never ran again. A second local model, bge-m3, matched Chinese names to their English twins where barcodes were missing. Two million rows, normalised privately, for the price of electricity.
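The "classify once, never again" pattern looks roughly like this. The classify argument is a stand-in for whatever calls the local model; no Ollama API is shown or assumed, and the JSON file layout is hypothetical.

```python
import json
from pathlib import Path


def build_category_map(raw_categories, classify, map_path):
    """Classify each distinct raw category string once and persist the result.

    classify: any callable str -> str (hypothetical stand-in for the model call).
    map_path: JSON file that acts as the durable map between runs.
    """
    path = Path(map_path)
    mapping = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}

    for raw in sorted(set(raw_categories)):  # distinct strings only
        if raw not in mapping:  # already-mapped strings never hit the model again
            mapping[raw] = classify(raw)

    path.write_text(json.dumps(mapping, ensure_ascii=False, indent=2), encoding="utf-8")
    return mapping
```

The cost then scales with the number of distinct strings rather than the number of rows, which is what makes a 7B model on a laptop workable for two million observations.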

[Visual explainer placeholder: Same-parent banners vs real rivals]