Hi to all in the Developer Community. I’m @marcusbaw and I’m one of the maintainers of the NHS Number Python Package, which is linked from the NHS API Catalogue.
We now get about 15k downloads of the library per month, according to PyPI, so I thought I should check in with the people using it about future changes.
We are looking to make a few breaking changes in 2.0, which are detailed in a recent Issue on the repo. In particular, one of them will improve strict CHI Number validity checking - finally adding DOB checking - but could break downstream applications that aren’t ready for the change.
opened 07:49PM - 27 Apr 26 UTC
## Summary
Implement the feature foreshadowed by the TODO at [`validate.py:68`]… (https://github.com/uk-fci/nhs-number/blob/9dd7b14/nhs_number/validate.py#L72):
```python
# Additional checks for Scotland CHI number DOB validity will go here
```
CHI numbers are structured as `DDMMYYsssC`: the first 6 digits are a date of birth. Today the library treats the nine digits before the check digit as opaque and only checks the mod-11 checksum and (optionally) the range. A number like `3102990053` (claiming "31st February 1999") with a valid checksum currently passes `is_valid(..., for_region=REGION_SCOTLAND)`.
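To make the gap concrete, here is a minimal standalone sketch of the standard mod-11 check-digit calculation (weights 10 down to 2 over the first nine digits). It is not the library's code: `mod11_check_digit` and `has_valid_checksum` are hypothetical names used only for illustration, and the prefix in the usage lines is an arbitrary one chosen because its date segment reads as 30 February 1985.

```python
from typing import Optional


def mod11_check_digit(first_nine: str) -> Optional[int]:
    """Return the mod-11 check digit for nine digits, or None if none exists."""
    if len(first_nine) != 9 or not (first_nine.isascii() and first_nine.isdigit()):
        raise ValueError("expected exactly nine ASCII digits")
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return None  # no valid number can start with this prefix
    return check


def has_valid_checksum(number: str) -> bool:
    """True if a 10-digit string ends with the correct mod-11 check digit."""
    if len(number) != 10 or not (number.isascii() and number.isdigit()):
        return False
    return mod11_check_digit(number[:9]) == int(number[9])


# The checksum knows nothing about dates: an impossible DDMMYY prefix
# can still be completed with a perfectly good check digit.
prefix = "300285001"  # arbitrary illustrative prefix: "30 February 1985"
digit = mod11_check_digit(prefix)
if digit is not None:
    print(prefix + str(digit), has_valid_checksum(prefix + str(digit)))
```

Because the check digit is a function of the nine preceding digits only, any date-of-birth validation has to be a separate step on top of it.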
This issue proposes adding an **opt-in** `strict_chi_dob` keyword to `is_valid` and `NhsNumber` that additionally requires the date segment to be a real, non-future calendar date.
## Design spec
Full design sketch lives at [`spec/strict-chi-dob-validation.md`](https://github.com/uk-fci/nhs-number/blob/9dd7b14/spec/strict-chi-dob-validation.md). Headlines:
- **Default behaviour is unchanged.** Existing callers see no difference until they pass `strict_chi_dob=True`.
- **Scope is intuitive**: the flag only affects numbers in the Scotland CHI range. Non-CHI inputs are unaffected, so the flag never *adds* CHI semantics to a non-CHI number.
- **No new top-level functions** — adds one keyword to two existing surfaces (`is_valid`, `NhsNumber`).
- **2-digit year rule**: resolve to the most recent past century (so today, `27` resolves to 1927 because 2027 is in the future). Time-dependent, documented clearly.
- **Out of scope**: parsing/exposing the DOB or sex digit as structured attributes (separate, larger feature).
The spec includes a worked edge-case table covering leap years, day/month bounds, and the century-rollover case.
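As a rough illustration of the date rule described above, the sketch below resolves the `DDMMYY` segment to a real, non-future date. It is not the implementation from the spec: `resolve_chi_dob` and `has_valid_chi_dob` are hypothetical helper names, the checksum and range checks are deliberately left out, and it assumes the century rule is applied to the full date (the most recent century in which the date exists and is not after `today`), which the spec may define differently.

```python
from datetime import date
from typing import Optional


def resolve_chi_dob(chi_number: str, today: Optional[date] = None) -> Optional[date]:
    """Resolve the DDMMYY segment to a real, non-future date, or None.

    Hypothetical helper: does NOT check the checksum or the Scotland range.
    """
    today = today or date.today()
    if len(chi_number) != 10 or not (chi_number.isascii() and chi_number.isdigit()):
        return None
    day = int(chi_number[0:2])
    month = int(chi_number[2:4])
    yy = int(chi_number[4:6])
    # Assumption: try the current century first, then the previous one,
    # and accept the first candidate that exists and is not in the future.
    century = today.year - today.year % 100
    for year in (century + yy, century - 100 + yy):
        try:
            candidate = date(year, month, day)
        except ValueError:
            continue  # not a real calendar date in this year (e.g. 31 February)
        if candidate <= today:
            return candidate
    return None


def has_valid_chi_dob(chi_number: str, today: Optional[date] = None) -> bool:
    """True if the date segment resolves to a real, non-future date."""
    return resolve_chi_dob(chi_number, today) is not None


# `today` is passed in explicitly so the examples do not decay over time.
# The trailing digits below are placeholders, not valid check digits.
fixed_today = date(2026, 4, 27)
print(resolve_chi_dob("0101270000", fixed_today))    # two-digit year 27
print(has_valid_chi_dob("3102990000", fixed_today))  # 31 February
```

Passing `today` as a parameter is the same idea as the "injected/parametrized" date in the acceptance list: the century rule is time-dependent, so tests need a fixed reference date.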
## Open questions (highlighted in the spec)
1. Should `generate(for_region=REGION_SCOTLAND)` produce numbers with valid DOBs? Today it doesn't — leaning toward documenting that explicitly rather than changing `generate`.
2. Should `NhsNumber` expose the parsed DOB? Tempting; deferred.
3. Should the century cutoff be configurable? YAGNI for now.
## Why opt-in
Some callers may already be relying on the current "checksum and range only" behaviour — for example, validating CHI numbers from historical data sets where the date segment is known to be unreliable. An opt-in flag preserves that, while letting strict callers (e.g. front-line data entry validation) pick up the extra check.
## Acceptance
- [ ] `is_valid(nhs_number, strict_chi_dob=True)` and `NhsNumber(nhs_number, strict_chi_dob=True)` implemented per spec
- [ ] Tests for every row of the edge-case table in the spec
- [ ] `today` injected/parametrized in tests so they don't decay
- [ ] All existing tests pass unchanged (default behaviour pinned)
- [ ] TODO at `validate.py:68` removed
- [ ] Spec updated with the resolved open questions
If this is important to you, we’d appreciate any feedback you have directly on the Issue.
If you have any other thoughts about the package, feel free to comment here or on the Issue.