Language codes - Dialects and interop

Good morning,
We are getting some questions from trusts about the language codes used in PDS.

The communication language is recorded as an ISO 639-1 code.
Some languages do not have ISO 639-1 codes because the standard was initially designed to represent major and primary national languages with well-established terminologies and lexicography.
The big example here is that Cantonese and Mandarin are not listed; instead we only have the Chinese macrolanguage, despite these being very different spoken dialects (as I understand it). If the primary purpose is to inform translator choice, is this correct?
Other examples we’ve had people ask about include Flemish and Dari.

This then hits further issues when we start looking at ECDS reporting, which uses SNOMED codes that differentiate between these dialects.

There are further issues again when we start using the reasonable adjustments flags API, which uses "translator required" SNOMED codes (the flags API does indicate we should start ignoring the PDS language when it’s in use, but we then have an issue where all our existing data is in a different standard).

Is there any overarching guidance on how language and translator needs should be recorded as part of a patient record, and on how the different requirements work together, that we could reference to add clarity here?

Thanks,

Liam

ISO 639-1 intentionally collapses “Chinese” into a single macrolanguage code, zh, and does not provide separate codes for Cantonese or Mandarin, as you’ve highlighted. That’s by design: the standard is meant for high‑level cataloging.

What to do:

  • For interpreter choice, don’t rely on the ISO 639-1 code in PDS. Record the specific spoken variety with SNOMED CT (e.g., Cantonese vs Mandarin) and prefer the reasonable adjustment (RA) flags where present.

  • If you need language tags outside SNOMED, use ISO 639-3/BCP 47: yue for Cantonese, cmn for Mandarin. When a system only accepts 639-1, downcast both to zh.

  • The same pattern applies to Dari: use SNOMED CT; outside SNOMED, the BCP 47 tag is prs (or fa-AF); the PDS fallback is fa (Persian).
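
To make the downcast in the bullets above concrete, here is a minimal sketch in Python. The function name downcast_to_639_1 and the lookup table are hypothetical, the table covers only the three languages mentioned here, and none of this is taken from PDS or the flags API, so treat it as an illustration rather than a reference implementation.

```python
from typing import Optional

# Illustrative subset only (an assumption for this sketch): ISO 639-3 codes
# mapped to the ISO 639-1 code of their macrolanguage. A real system would
# load the full mapping from the ISO 639-3 / IANA registries.
MACROLANGUAGE_639_1 = {
    "yue": "zh",  # Cantonese -> Chinese
    "cmn": "zh",  # Mandarin  -> Chinese
    "prs": "fa",  # Dari      -> Persian
}


def downcast_to_639_1(tag: str) -> Optional[str]:
    """Return a two-letter code for a BCP 47 style tag, or None if unknown.

    Only the primary language subtag is inspected; region and script
    subtags (e.g. the "AF" in "fa-AF") are dropped.
    """
    primary = tag.split("-")[0].lower()
    # A two-letter primary subtag is assumed to already be an ISO 639-1 code;
    # this sketch does not validate it against the standard.
    if len(primary) == 2 and primary.isalpha():
        return primary
    return MACROLANGUAGE_639_1.get(primary)
```

For example, downcast_to_639_1("yue") and downcast_to_639_1("cmn") both give "zh", and "fa-AF" gives "fa". The mapping is lossy in one direction only, so it is worth storing the specific tag and deriving the 639-1 value when a system asks for it, rather than the other way round.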

Hope that helps.