Open Food Facts
Crowdsourced food product database, ODbL-licensed.
A meaningful portion of our food and beverage catalog comes from Open Food Facts (OFF), a community-edited dataset with millions of products worldwide. We ingest the daily dump, filtered to products relevant to Gulf shelves, and we query the OFF API live when a barcode you scan isn't yet in our cache.
License: ODbL (Open Database License) — share-alike for any derivative database. We keep OFF-derived rows in their own provenance tier so the share-alike obligation stays scoped to that slice of the catalog.
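The cache-first lookup with a live OFF fallback can be sketched roughly as follows. This is a minimal illustration, not our production code: the `lookup` and `fetch_off` names, the row shape, the `"off"` provenance tag, and the User-Agent string are all assumptions made for the example. The URL is the public OFF v2 product endpoint, and treating an HTTP 404 as "not found" is a defensive guess.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional

# Public Open Food Facts v2 product endpoint.
OFF_URL = "https://world.openfoodfacts.org/api/v2/product/{barcode}.json"


def fetch_off(barcode: str) -> dict:
    """Fetch one product from the live OFF API and return the parsed JSON."""
    request = urllib.request.Request(
        OFF_URL.format(barcode=barcode),
        # Hypothetical User-Agent; OFF asks clients to identify themselves.
        headers={"User-Agent": "example-scanner/0.1 (contact@example.com)"},
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Assumption: treat a 404 the same as a "status: 0" payload.
            return {"status": 0}
        raise


def lookup(
    barcode: str,
    cache: dict,
    fetch: Callable[[str], dict] = fetch_off,
) -> Optional[dict]:
    """Return a catalog row from the cache, falling back to the live API."""
    if barcode in cache:
        return cache[barcode]
    payload = fetch(barcode)
    if payload.get("status") != 1:  # OFF reports 1 when the product exists
        return None
    product = payload.get("product", {})
    row = {
        "barcode": barcode,
        "name": product.get("product_name"),
        "ingredients_text": product.get("ingredients_text"),
        # Hypothetical tag that keeps OFF-derived rows in their own tier.
        "provenance": "off",
    }
    cache[barcode] = row
    return row
```

Passing `fetch` as a parameter keeps the fallback easy to exercise without network access, and tagging the row at write time is one way to keep the ODbL share-alike scope traceable per row.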
Open Beauty Facts
Cosmetic catalog, same community, same license.
Cosmetics in our catalog (shampoos, sunscreens, skin care, oral care) come from Open Beauty Facts (OBF) — the cosmetics-side sibling of OFF. Same dump structure, same ODbL terms.
Substance dictionary
Regulator-cited harm flags, never inferred.
Whenever an ingredient on a scanned panel is flagged for harm — endocrine disruption, allergen, carcinogen, irritant, restricted use — that flag points at a real document. The current sources we cite, in order of regional authority:
- GSO 1943:2016 — Gulf Standardization Organization cosmetic safety requirements; mirrors EU 1223/2009. Regional authority.
- EU CosIng / EU 1223/2009 — European cosmetic ingredient registry and the regulation that drives Annex II (banned) and Annex III (restricted).
- EU SCCS — Scientific Committee on Consumer Safety opinions, the layer beneath EU 1223/2009 that explains why a substance is restricted.
- IARC monographs — International Agency for Research on Cancer carcinogen classifications (Group 1, 2A, 2B, 3).
- EFSA / JECFA — European Food Safety Authority and the Joint FAO/WHO Expert Committee on Food Additives, primary sources for acceptable daily intakes (ADIs) of food additives.
- FDA — US Food and Drug Administration, cited where the FDA position diverges from or pre-dates the EU/GSO line.
Every harm flag in our database carries the source organisation, the document reference (e.g. 'GSO 1943:2016 Annex II item 1339'), the date the source was issued, and a URL. We never use an LLM or a heuristic to infer harm: a flag either cites a real regulator document or it doesn't exist.
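One way to enforce "cite a regulator or don't exist" is to make the citation fields mandatory at construction time. The sketch below is illustrative only: the `HarmFlag` class and its field names are hypothetical, the category set is taken from the harm types listed above, and the substance, date, and URL in the usage example are placeholders rather than real catalog data.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlparse

# Harm categories named in this section.
CATEGORIES = {
    "endocrine disruption",
    "allergen",
    "carcinogen",
    "irritant",
    "restricted use",
}


@dataclass(frozen=True)
class HarmFlag:
    """A harm flag that cannot be created without a full citation."""

    substance: str
    category: str
    source_org: str    # e.g. "GSO", "IARC"
    document_ref: str  # e.g. "GSO 1943:2016 Annex II item 1339"
    issued: date       # date the source document was issued
    url: str

    def __post_init__(self) -> None:
        # Reject blank citation fields instead of storing a partial flag.
        for field_name in ("substance", "source_org", "document_ref"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("url must be an absolute http(s) URL")


# Placeholder values: the substance, date, and URL are not real data.
flag = HarmFlag(
    substance="example-substance",
    category="irritant",
    source_org="GSO",
    document_ref="GSO 1943:2016 Annex II item 1339",
    issued=date(2016, 1, 1),
    url="https://example.org/placeholder",
)
```

Because the dataclass is frozen and validates itself, an uncited flag fails loudly at insert time instead of surfacing later in a verdict.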
Crowdsourced submissions
Your scans grow the long tail.
When a barcode isn't in our cache and isn't in OFF/OBF either — typically a Khaleeji or Omani local brand — the app prompts you to submit a photo of the ingredient panel. A moderator reviews the OCR output and promotes the row. These crowdsourced rows are tagged 'crowdsource' and downweighted relative to regulator and GS1 GCC feeds.
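The promotion and downweighting steps above might look something like this. It is a sketch under assumptions: only the 'crowdsource' tag comes from the text, while the other tier names, the numeric weights, the submission fields, and the placement of OFF/OBF between the GS1 GCC and crowdsource tiers are hypothetical.

```python
from typing import Optional

# Hypothetical weights; the only property taken from the text is that
# "crowdsource" ranks below the regulator and GS1 GCC feeds.
TIER_WEIGHT = {
    "regulator": 1.0,
    "gs1_gcc": 0.9,
    "off": 0.7,
    "obf": 0.7,
    "crowdsource": 0.4,
}


def promote(submission: dict, approved: bool) -> Optional[dict]:
    """Turn a moderator-approved submission into a tagged catalog row."""
    if not approved:
        return None  # rejected submissions never reach the catalog
    return {
        "barcode": submission["barcode"],
        "ingredients_text": submission["ocr_text"],
        "provenance": "crowdsource",
    }


def best_row(rows: list) -> dict:
    """Pick the row from the highest-weighted provenance tier."""
    return max(rows, key=lambda row: TIER_WEIGHT[row["provenance"]])
```

With this shape, a crowdsourced row is served only until a higher-tier feed supplies the same barcode.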
What we don't use
No retail scraping. No LLM-fabricated ingredients.
We don't scrape Noon, Lulu, Carrefour, or any other retailer. We don't use a language model to invent ingredient lists for products whose label we haven't seen. Both practices are explicitly forbidden in our build rules. The substance dictionary is hand-curated from regulator documents, and the verdict engine is a deterministic rules engine, with no ML in the verdict.
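To make "deterministic" concrete, here is a minimal sketch of a rules-only verdict: exact dictionary matches after normalization, no scoring model, and sorted output so the same panel always yields the same result. The function names, the dictionary shape, and the 'flagged' / 'no_flags' labels are assumptions for illustration, and the dictionary entry in the usage example is a placeholder.

```python
def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so matching is exact but tidy."""
    return " ".join(name.lower().split())


def verdict(ingredients: list, dictionary: dict) -> dict:
    """Match ingredients against a curated dictionary; no inference."""
    hits = set()
    for ingredient in ingredients:
        key = normalize(ingredient)
        # Only exact dictionary entries count; unknown names are ignored.
        for category in dictionary.get(key, ()):
            hits.add((key, category))
    ordered = sorted(hits)  # stable order makes the output reproducible
    return {"verdict": "flagged" if ordered else "no_flags", "hits": ordered}


# Placeholder dictionary entry, not real regulatory data.
example_dictionary = {"example substance": ["irritant"]}
result = verdict(["  Aqua", "EXAMPLE   Substance"], example_dictionary)
```

Since the function depends only on its two inputs, a verdict can be reproduced later from the stored ingredient list and the dictionary version in use at the time.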