How LAGeSo Berlin Screens 1M+ Reviews to Catch Food-Borne Illness
A health agency listening to a million reviews

Inside the agency, the Infektionsschutz unit handles infection protection. Its job is to spot outbreaks early, trace them, and stop them spreading. Food-borne illness is one of the hardest signals to catch in time.
The reason is simple. By the time a food-poisoning case reaches a hospital and a lab report lands on an epidemiologist's desk, days have passed. The meal is long gone, the kitchen has turned over, and the trail is cold.
But people don't wait for a lab to talk about a bad meal. They post about it the same day, on Google Maps, in public.
So the question LAGeSo asked was a data problem, not a medical one: how do you find the rare genuine illness reports hidden inside a million restaurant reviews a month?
The signal is public, but it's buried
Berlin has thousands of restaurants and a constant stream of Google Maps reviews, mostly in German and English.

Plenty of them code-switch: German peppered with Turkish or Arabic from the city's döner and shawarma spots, or written in Kiezdeutsch, the local multiethnic street slang.
Somewhere in that flood are reviews describing real food-borne illness: abdominal pain after a meal, a night in the ER, a severe allergic reaction. These are the cases an epidemiologist would want to see.
The problem is ratio. Genuine illness reports are a fraction of a percent of all reviews. Everything else is noise: slow service, cold fries, a rude waiter, slang like "zum Kotzen" that means "awful", not "vomiting".
Reading a million reviews by hand is impossible. Keyword filters drown in false positives.
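To make the false-positive problem concrete, here is a minimal sketch of a naive substring filter. The keyword list and the third review are invented for illustration; the source does not say which keywords, if any, were actually tried.

```python
# Hypothetical keyword list, chosen only to illustrate the failure mode.
ILLNESS_KEYWORDS = ["kotzen", "krank", "sick", "vomit"]

def keyword_flag(text: str) -> bool:
    """Flag a review if any keyword appears as a case-insensitive substring."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in ILLNESS_KEYWORDS)

reviews = [
    "Der Service war zum Kotzen, nie wieder.",                  # idiom for "awful"
    "Nach dem Essen Bauchschmerzen, im Krankenhaus gelandet.",  # real illness report
    "Sick burger, best in Berlin!",                             # slang for "great"
]

flags = [keyword_flag(review) for review in reviews]
print(flags)  # all three are flagged, but only the second describes illness
```

The filter cannot tell an idiom from a symptom, which is why the classification step needs more context than a word list provides.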
To make public reviews usable as a surveillance signal, LAGeSo needed two things: a reliable way to collect reviews at scale every month, and a way to classify them with enough nuance to separate real illness from figures of speech.

Collecting 1M+ Google Maps reviews a month

The requirement was high-volume, recurring collection, not a one-off pull. A surveillance system has to run on a schedule, return clean data in a consistent format, and do it without the agency managing scraping infrastructure, proxies, or blocks itself.
lobstr.io handles the collection layer end to end. Reviews come back structured and consistent (review text, star rating, language, and date), ready to feed straight into a screening pipeline. The same run repeats every month, so the dataset stays current and the agency works with fresh signal rather than a stale snapshot.
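As a rough illustration of what "structured and consistent" can mean downstream, the sketch below normalizes one raw record into a typed object. It covers only the four fields named above; the raw key names (`text`, `rating`, `language`, `date`) and the ISO date format are assumptions, not lobstr.io's actual export format.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Review:
    text: str
    rating: int
    language: str
    posted_on: date

def normalize(raw: dict) -> Review:
    """Coerce one raw record (hypothetical key names) into a Review."""
    rating = int(raw["rating"])
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    return Review(
        text=(raw.get("text") or "").strip(),
        rating=rating,
        language=(raw.get("language") or "").lower(),
        posted_on=date.fromisoformat(raw["date"]),  # assumes YYYY-MM-DD
    )

# Invented sample record, for illustration only.
example = normalize(
    {"text": " Kalte Pommes ", "rating": "2", "language": "DE", "date": "2024-05-01"}
)
print(example)
```

Validating types once at the boundary means every later stage can rely on the rating being an integer and the date being a real date.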

That gives LAGeSo a dependable monthly feed of a million reviews. The next problem is turning that million into something a human can act on.
From a million reviews to a ranked triage queue
On top of the lobstr.io feed, LAGeSo built a screening pipeline (with help from lobstr.io) that reads every review and decides whether it describes food-borne illness.

It runs in clear stages:
- Collect — lobstr.io scrapes 1,000,000+ Google Maps reviews for Berlin restaurants and normalizes them to a minimal 9-field schema.
- Pre-filter — reviews with a star rating of 4 or 5, or with no text, are dropped. They almost never describe illness. This alone cuts the corpus from 1,000,000 to roughly 103,019 reviews in scope.
- Classify — each remaining review is read by an LLM using a 1,000+ token prompt with six worked examples. The examples teach it to catch real illness reports, route allergic reactions to a separate pathway, and ignore false positives like slang and idioms across languages.
- Structure — every review gets a JSON verdict with controlled-vocabulary fields: symptoms, onset timing, severity, people affected, restaurant attribution, and a confidence score.
- Deliver — results are written to a database (raw responses kept for audit) and exported as two CSVs in LAGeSo's delivery format... one with everything screened, one with flagged cases only.
The result is a funnel. A million raw reviews narrow to a ranked, prioritized queue an epidemiologist can actually work through.
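The pre-filter stage is simple enough to sketch directly. This is a minimal version of the rule as stated (drop 4- and 5-star reviews and reviews with no text); treating whitespace-only text as "no text" is an assumption, and the sample reviews are invented.

```python
from typing import Optional

def in_scope(rating: int, text: Optional[str]) -> bool:
    """Return True if a review survives the pre-filter.

    Keeps reviews rated 3 stars or lower that have non-empty text.
    """
    has_text = bool(text and text.strip())
    return rating <= 3 and has_text

# Hypothetical sample as (rating, text) pairs.
sample = [
    (5, "Bestes Schnitzel der Stadt!"),
    (1, "Nach dem Essen Bauchschmerzen."),
    (2, ""),
    (3, "Cold fries, slow service."),
    (1, None),
]

kept = [(rating, text) for rating, text in sample if in_scope(rating, text)]
print(f"{len(kept)} of {len(sample)} reviews in scope")
```

Running a cheap rule like this before the LLM means the expensive classification step only sees the roughly one in ten reviews that could plausibly describe illness.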
The deliverable
Each flagged review becomes a structured record. Restaurant and dish names are redacted here for publication.
A real flagged case, anonymized:
"Nach dem Essen dolle Bauchschmerzen, anschließend im Krankenhaus gelandet." (Severe stomach pain after eating, then ended up in hospital.)
The pipeline turns that into:
{ "restaurant": "[redacted]", "rating": 1, "language": "de", "category": "illness", "symptoms": "abdominal pain", "onset": "immediate", "severity": "hospitalized", "people_affected": "self", "attributed_to_restaurant": true, "confidence": "high" }
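Because the verdict comes from an LLM, it is worth checking each response before it reaches the database. The sketch below parses a verdict and checks it against a controlled vocabulary. The field names come from the record above; the allowed values beyond those shown in the record ("illness", "high") are assumptions for illustration, as is the choice to validate only two fields.

```python
import json

# Field names taken from the example record above.
REQUIRED_FIELDS = (
    "restaurant", "rating", "language", "category", "symptoms", "onset",
    "severity", "people_affected", "attributed_to_restaurant", "confidence",
)

# Assumed vocabularies; the real lists are not given in the source.
ALLOWED_VALUES = {
    "category": {"illness", "allergic_reaction", "none"},
    "confidence": {"high", "medium", "low"},
}

def parse_verdict(raw: str) -> dict:
    """Parse a JSON verdict and reject missing or out-of-vocabulary fields."""
    verdict = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in verdict]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name, allowed in ALLOWED_VALUES.items():
        if verdict[name] not in allowed:
            raise ValueError(f"{name}={verdict[name]!r} is not in the vocabulary")
    return verdict

raw_verdict = (
    '{ "restaurant": "[redacted]", "rating": 1, "language": "de", '
    '"category": "illness", "symptoms": "abdominal pain", "onset": "immediate", '
    '"severity": "hospitalized", "people_affected": "self", '
    '"attributed_to_restaurant": true, "confidence": "high" }'
)
verdict = parse_verdict(raw_verdict)
print(verdict["category"], verdict["severity"], verdict["confidence"])
```

Rejecting malformed verdicts at this point keeps free-text drift out of the counts that later stages aggregate.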
In a single screening run, the funnel looks like this:
| Stage | Reviews |
|---|---|
| Scraped and screened | 1,000,000 |
| In scope after pre-filter | ~103,019 |
| Flagged cases | 2,410 |
| Food-borne illness (restaurant-attributed) | 2,267 |
| Allergic reactions | 143 |
| High / medium-confidence cases | 2,363 |
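The numbers in the table can be cross-checked with a few lines of arithmetic. The figures below are copied from the table; reading the remainder after high and medium confidence as "low confidence" is an inference, not something the source states.

```python
funnel = {
    "screened": 1_000_000,
    "in_scope": 103_019,
    "flagged": 2_410,
    "illness": 2_267,
    "allergic": 143,
    "high_or_medium": 2_363,
}

# The two flagged categories should add up to the flagged total.
categories_sum = funnel["illness"] + funnel["allergic"]

in_scope_rate = funnel["in_scope"] / funnel["screened"]
flag_rate_in_scope = funnel["flagged"] / funnel["in_scope"]
flag_rate_overall = funnel["flagged"] / funnel["screened"]

# Inferred, not stated in the source: flagged cases outside high/medium.
remaining_cases = funnel["flagged"] - funnel["high_or_medium"]

print(f"in scope after pre-filter: {in_scope_rate:.1%}")
print(f"flagged, share of in-scope: {flag_rate_in_scope:.2%}")
print(f"flagged, share of all reviews: {flag_rate_overall:.3%}")
print(f"flagged cases below medium confidence: {remaining_cases}")
```

The overall flag rate works out to about a quarter of a percent, consistent with the earlier point that genuine reports are a fraction of a percent of all reviews.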
Because every record carries controlled-vocabulary fields, a single run of flagged reviews turns into a picture an epidemiologist can read at a glance.
The flagged records break down along four dimensions: what symptoms people are describing, how severe the reported cases are, when symptoms set in relative to the meal, and how many people each case affected.

The most severe cases surface automatically, so they rise to the top of the triage queue rather than getting lost in the volume.
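The breakdowns described above amount to frequency counts over the controlled-vocabulary fields. A minimal sketch, assuming each flagged record is a dict with the field names from the example record; all sample values other than those in that record are invented.

```python
from collections import Counter

def tally(records, field):
    """Count how often each value of a controlled-vocabulary field occurs."""
    return Counter(record.get(field, "unknown") for record in records)

# Hypothetical flagged records; only the first mirrors the published example.
flagged = [
    {"symptoms": "abdominal pain", "severity": "hospitalized",
     "onset": "immediate", "people_affected": "self"},
    {"symptoms": "vomiting", "severity": "mild",
     "onset": "same day", "people_affected": "group"},
    {"symptoms": "abdominal pain", "severity": "mild",
     "onset": "next day", "people_affected": "self"},
]

for field in ("symptoms", "severity", "onset", "people_affected"):
    print(field, tally(flagged, field).most_common())
```

This only works because the fields are constrained to a fixed vocabulary; free-text answers would scatter the same symptom across many spellings.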
Results
In a single screening run, LAGeSo moved from an unusable flood of reviews to a focused surveillance queue.
The model scores its own certainty on every flagged case, so the queue can be ranked before a human ever looks at it.
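One way to turn severity and confidence into a ranked queue is a two-part sort key. This is a sketch, not the agency's actual ranking rule: the ordering (severity first, then confidence) and every vocabulary value other than "hospitalized", "high", and "medium" are assumptions.

```python
# Lower rank sorts first. Values and ordering are assumed for illustration.
SEVERITY_RANK = {"hospitalized": 0, "moderate": 1, "mild": 2}
CONFIDENCE_RANK = {"high": 0, "medium": 1, "low": 2}

def triage_key(case: dict) -> tuple:
    """Sort by severity first, then confidence; unknown values go last."""
    return (
        SEVERITY_RANK.get(case["severity"], len(SEVERITY_RANK)),
        CONFIDENCE_RANK.get(case["confidence"], len(CONFIDENCE_RANK)),
    )

# Hypothetical flagged cases.
cases = [
    {"id": "a", "severity": "mild", "confidence": "high"},
    {"id": "b", "severity": "hospitalized", "confidence": "medium"},
    {"id": "c", "severity": "hospitalized", "confidence": "high"},
    {"id": "d", "severity": "moderate", "confidence": "low"},
]

queue = sorted(cases, key=triage_key)
order = [case["id"] for case in queue]
print(order)  # ['c', 'b', 'd', 'a']
```

Putting severity ahead of confidence means a medium-confidence hospitalization still outranks a high-confidence mild case, which suits a queue meant to surface leads rather than confirm them.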

And because each review carries a date, the flagged cases can be placed on a timeline. They cluster heavily in the most recent months.
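Placing cases on a timeline can be as simple as bucketing review dates by month. A minimal sketch with invented dates; the real distribution is not reproduced here.

```python
from collections import Counter
from datetime import date

def monthly_counts(dates):
    """Count flagged cases per calendar month, in chronological order."""
    counts = Counter(d.strftime("%Y-%m") for d in dates)
    return dict(sorted(counts.items()))

# Hypothetical review dates for flagged cases.
flagged_dates = [
    date(2024, 1, 14),
    date(2024, 3, 2),
    date(2024, 3, 20),
    date(2024, 3, 28),
]

timeline = monthly_counts(flagged_dates)
print(timeline)  # {'2024-01': 1, '2024-03': 3}
```

A sudden rise in one month's bucket is the kind of pattern worth a closer look, though it can also reflect review volume rather than illness.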

Outcomes at LAGeSo's scale:
- 1,000,000 reviews screened in a single run
- 2,410 cases flagged for food-borne illness or allergic reaction
- 2,267 restaurant-attributed illness reports surfaced from the noise
- 143 allergic reactions routed to a dedicated pathway
- 2,363 cases ranked at high or medium confidence for triage
- 16 cases severe enough to involve hospitalization
- A recurring, structured signal where there was none before

The agency is honest about the limits. Public reviews are biased toward people motivated to complain, auto-translation can shift meaning, and a single review is a signal to investigate, not proof. The pipeline is built to flag leads for human epidemiologists, not to replace them.
But the core problem is solved. The signal was always there in public, and LAGeSo can now find it.