bolster.data_sources.nisra.claimant_count

NISRA Claimant Count Statistics Module.

This module provides access to the Northern Ireland Statistics and Research Agency (NISRA) monthly Claimant Count statistics, covering Universal Credit (UC) and Jobseeker’s Allowance (JSA) claimants.

The Claimant Count is an experimental statistic measuring the number of people claiming benefits principally because they are unemployed. Data is published monthly for Northern Ireland, with geographic breakdowns by Local Government District, Parliamentary Constituency Area, Travel-to-Work Area, and Super Output Area.

Data Source:

Publication page pattern: https://www.nisra.gov.uk/publications/labour-market-report-{month_name}-{year}

The module scrapes the monthly Labour Market Report publication page to find the lmr-claimant-count-tables-*.xlsx Excel file link, falling back to direct URL construction if scraping fails.
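The link-finding step described above can be sketched as follows. This is a minimal illustration, not the module's actual code: it assumes the Excel file is exposed as an ordinary `<a href>` on the publication page, and `find_claimant_count_link` is a hypothetical helper name.

```python
import posixpath
from fnmatch import fnmatch
from html.parser import HTMLParser
from typing import List, Optional
from urllib.parse import urljoin, urlparse

# Filename pattern quoted in the module description
FILE_GLOB = "lmr-claimant-count-tables-*.xlsx"


class _LinkCollector(HTMLParser):
    """Collect the href of every anchor tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_claimant_count_link(html: str, page_url: str) -> Optional[str]:
    """Return the first absolute link whose filename matches FILE_GLOB, else None.

    Hypothetical helper: assumes the file is linked with a plain href.
    """
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        filename = posixpath.basename(urlparse(absolute).path)
        if fnmatch(filename.lower(), FILE_GLOB):
            return absolute
    return None
```

A `None` result is the point at which a caller would move on to the direct URL construction fallback.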

Update Frequency: Monthly, approximately 2–3 weeks after the reference month.

Sheets parsed:
  • Headline: NI total by sex, seasonally adjusted and non-seasonally adjusted, full time series from April 1997.

  • Age: NI total by age band (16–24, 25–49, 50+), from January 2013.

  • LGD_11: Current-month snapshot for 11 Local Government Districts.

  • PCA: Current-month snapshot for 18 Westminster Parliamentary Constituency Areas.

  • TTWA: Current-month snapshot for 10 Travel-to-Work Areas.

  • SOA: 889 Super Output Areas, wide-format time series from October 2017, melted to long format.

Notes

Claimant Count is an experimental statistic. The claimant rate is a percentage whose denominator is claimant count plus workforce jobs. Five-week months are annotated with [2], revised data with (r), and provisional data with (p); these annotation markers are stripped before date parsing.
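As an illustration of the marker stripping, the sketch below removes the three annotation styles named above and then parses the remaining label. It assumes date labels look like "March 2024" (full English month name plus year), which is a guess about the sheet layout; `parse_month_label` is a hypothetical name, not part of the module.

```python
import re
from datetime import datetime

# Markers named in the notes: [2] (five-week month), (r) revised, (p) provisional.
# The pattern accepts any bracketed number, which is slightly broader than [2].
ANNOTATION_RE = re.compile(r"\s*(?:\[\d+\]|\([rp]\))")


def parse_month_label(label: str) -> datetime:
    """Strip annotation markers, then parse an assumed 'Month YYYY' label."""
    cleaned = ANNOTATION_RE.sub("", label).strip()
    return datetime.strptime(cleaned, "%B %Y")  # day defaults to 1
```

Stripping first keeps the date format string simple: the parser only ever sees the month and year.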

SOA data has a methodology break at January 2026 (transition from COA2011 to DZ2021 geographies).

Usage:
>>> from bolster.data_sources.nisra import claimant_count
>>> df = claimant_count.get_latest_claimant_count("headline")
>>> "claimants_000s" in df.columns
True
>>> lgd_df = claimant_count.get_latest_claimant_count("lgd")
>>> "claimants_total" in lgd_df.columns
True

Example

>>> from bolster.data_sources.nisra import claimant_count
>>> df = claimant_count.get_latest_claimant_count("headline")
>>> df[df["sex"] == "all_people"].sort_values("date").tail(1)["claimants_000s"].values[0] > 0
True

Author: Claude Code

Attributes

logger

Functions

get_latest_publication_url()

Discover the URL of the most recent claimant count Excel file.

parse_headline(file_path)

Parse the Headline sheet: NI total claimant count by sex.

parse_age(file_path)

Parse the Age sheet: NI claimant count by age band.

parse_geography(file_path, sheet)

Parse a geographic breakdown sheet (LGD_11, PCA, or TTWA).

parse_soa(file_path)

Parse the SOA sheet: Super Output Area time series.

get_latest_claimant_count([breakdown, force_refresh])

Download and parse the latest NISRA claimant count data.

validate_claimant_count(df, breakdown)

Validate the integrity of a claimant count DataFrame.

Module Contents

bolster.data_sources.nisra.claimant_count.logger
bolster.data_sources.nisra.claimant_count.get_latest_publication_url()

Discover the URL of the most recent claimant count Excel file.

Scrapes the NISRA Labour Market Report publication page for the current month, falling back to previous months if needed, then falls back to direct URL construction.
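The month-by-month fallback might look roughly like the sketch below, which generates candidate publication page URLs starting from the current month and walking backwards. It assumes the `{month_name}` slot takes a lower-case English month name and that three months of look-back is enough; both are assumptions, and `candidate_page_urls` is a hypothetical helper rather than the module's own function.

```python
from datetime import date
from typing import Iterator

PAGE_PATTERN = (
    "https://www.nisra.gov.uk/publications/"
    "labour-market-report-{month_name}-{year}"
)

# Explicit names avoid any dependence on the process locale
MONTHS = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)


def candidate_page_urls(today: date, max_months_back: int = 3) -> Iterator[str]:
    """Yield page URLs for the current month, then each earlier month in turn."""
    year, month = today.year, today.month
    for _ in range(max_months_back + 1):
        yield PAGE_PATTERN.format(month_name=MONTHS[month - 1], year=year)
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
```

A caller would try each URL in order and stop at the first page that yields a usable Excel link.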

Returns:

Full URL to the latest claimant count Excel file.

Raises:

NISRADataNotFoundError – If no publication can be found.

Return type:

str

Example

>>> url = get_latest_publication_url()
>>> url.endswith(".xlsx")
True
bolster.data_sources.nisra.claimant_count.parse_headline(file_path)

Parse the Headline sheet: NI total claimant count by sex.

The Headline sheet contains two side-by-side tables:

  • Table 1a: Seasonally adjusted claimant count by sex

  • Table 1b: Non-seasonally adjusted claimant count by sex

Both tables share the same date column structure with men, women and all people counts (thousands) and rates.
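To make the reshaping concrete, here is a small sketch that turns one row spanning both tables into long-format records. It assumes the six counts arrive in the order men, women, all people for Table 1a followed by the same for Table 1b; that ordering, the function name `row_to_long`, and any values passed in are illustrative only. Rates are left out for brevity.

```python
from datetime import date
from typing import Dict, List, Sequence

SEXES = ("men", "women", "all_people")
ADJUSTMENTS = ("seasonally_adjusted", "non_seasonally_adjusted")


def row_to_long(row_date: date, counts: Sequence[float]) -> List[Dict[str, object]]:
    """Expand one side-by-side row into one record per (adjusted, sex) pair.

    Assumed column order: Table 1a (men, women, all), then Table 1b (same).
    """
    if len(counts) != len(SEXES) * len(ADJUSTMENTS):
        raise ValueError("expected six counts: three per table")
    records = []
    for table_index, adjusted in enumerate(ADJUSTMENTS):
        for sex_index, sex in enumerate(SEXES):
            records.append({
                "date": row_date,
                "adjusted": adjusted,
                "sex": sex,
                "claimants_000s": counts[table_index * len(SEXES) + sex_index],
            })
    return records
```

Each input row therefore contributes six output rows, which is why the returned DataFrame carries both an `adjusted` and a `sex` column.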

Parameters:

file_path (str | pathlib.Path) – Path to the claimant count Excel file.

Returns:

DataFrame with columns:

  • date: pandas Timestamp (monthly, day=1)

  • adjusted: "seasonally_adjusted" or "non_seasonally_adjusted"

  • sex: "men", "women", or "all_people"

  • claimants_000s: Claimant count in thousands (float)

  • claimant_rate: Claimant rate as percentage (float)

Return type:

pandas.DataFrame

Raises:

NISRADataNotFoundError – If the Headline sheet is not found.

Example

>>> df = parse_headline("/tmp/claimant_count.xlsx")
>>> sorted(df["sex"].unique())
['all_people', 'men', 'women']
>>> sorted(df["adjusted"].unique())
['non_seasonally_adjusted', 'seasonally_adjusted']
bolster.data_sources.nisra.claimant_count.parse_age(file_path)

Parse the Age sheet: NI claimant count by age band.

Contains a single table of non-seasonally adjusted claimant counts broken down into three age bands: 16–24, 25–49, 50+. Data runs from January 2013.

Parameters:

file_path (str | pathlib.Path) – Path to the claimant count Excel file.

Returns:

DataFrame with columns:

  • date: pandas Timestamp (monthly, day=1)

  • age_group: One of "16-24", "25-49", "50+".

  • claimants: Claimant count (integer, rounded to nearest 5).

Return type:

pandas.DataFrame

Raises:

NISRADataNotFoundError – If the Age sheet is not found.

Example

>>> df = parse_age("/tmp/claimant_count.xlsx")
>>> sorted(df["age_group"].unique())
['16-24', '25-49', '50+']
bolster.data_sources.nisra.claimant_count.parse_geography(file_path, sheet)

Parse a geographic breakdown sheet (LGD_11, PCA, or TTWA).

Each sheet contains a current-month snapshot with columns for male, female, and total claimant numbers, the corresponding working-age rates, and the change over the month and over the year.

Parameters:
  • file_path (str | pathlib.Path) – Path to the claimant count Excel file.

  • sheet (str) – Sheet name — one of "LGD_11", "PCA", or "TTWA".

Returns:

DataFrame with columns:

  • date: pandas Timestamp (extracted from the Excel filename)

  • geography: Area name (e.g., "Belfast")

  • geography_type: Sheet type identifier (e.g., "LGD_11")

  • claimants_male: Number of male claimants (int)

  • claimants_female: Number of female claimants (int)

  • claimants_total: Total claimants (int)

  • claimant_rate_male_pct: Male working-age claimant rate (float)

  • claimant_rate_female_pct: Female working-age claimant rate (float)

  • claimant_rate_total_pct: Total working-age claimant rate (float)

  • change_over_month_number: Change vs previous month (int)

  • change_over_year_number: Change vs same month last year (int)

Return type:

pandas.DataFrame

Raises:
  • NISRADataNotFoundError – If the requested sheet is not found.

  • ValueError – If sheet is not one of the supported values.

Example

>>> df = parse_geography("/tmp/claimant_count.xlsx", "LGD_11")
>>> len(df["geography"].unique()) >= 11
True
>>> "claimants_total" in df.columns
True
bolster.data_sources.nisra.claimant_count.parse_soa(file_path)

Parse the SOA sheet: Super Output Area time series.

The SOA sheet is wide-format with 889 Super Output Areas as rows and monthly dates as columns from October 2017. This function melts it to long format.

Note

There is a methodology break at January 2026 where geography codes transition from COA2011 to DZ2021. Both series are included in the output.
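The wide-to-long step can be illustrated with `pandas.DataFrame.melt`. The sketch below uses a tiny made-up frame: the column headers ("October 2017", "November 2017"), the second area label, and all counts are placeholders rather than real NISRA data, and the actual sheet layout may differ. It does not attempt to handle the January 2026 methodology break.

```python
import pandas as pd

# Made-up wide frame: one row per area, one column per month (placeholder values)
wide = pd.DataFrame({
    "soa_code": ["95AA01S1 : Aldergrove_1", "XXXXXXXX : Example_2"],
    "October 2017": [10, 5],
    "November 2017": [15, 0],
})

# Melt month columns into rows: one record per (area, month)
long_df = wide.melt(id_vars=["soa_code"], var_name="date", value_name="claimants")

# Convert the former column headers into month-start timestamps
long_df["date"] = pd.to_datetime(long_df["date"], format="%B %Y")

long_df = long_df.sort_values(["soa_code", "date"]).reset_index(drop=True)
```

With 889 areas, each month in the source sheet adds 889 rows to the long-format result, so the output grows quickly with the length of the series.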

Parameters:

file_path (str | pathlib.Path) – Path to the claimant count Excel file.

Returns:

DataFrame with columns:

  • soa_code: Super Output Area code and name (e.g., "95AA01S1 : Aldergrove_1")

  • date: pandas Timestamp (monthly, day=1)

  • claimants: Claimant count (int, rounded to nearest 5)

Return type:

pandas.DataFrame

Raises:

NISRADataNotFoundError – If the SOA sheet is not found.

Example

>>> df = parse_soa("/tmp/claimant_count.xlsx")
>>> "soa_code" in df.columns
True
>>> df["date"].min().year <= 2018
True
bolster.data_sources.nisra.claimant_count.get_latest_claimant_count(breakdown='headline', force_refresh=False)

Download and parse the latest NISRA claimant count data.

Automatically discovers and downloads the most recent monthly publication, then returns the requested breakdown.
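One plausible way to route a breakdown value to the right parser and sheet is a lookup table, sketched below. The sheet names come from the "Sheets parsed" list on this page; the table itself and `resolve_breakdown` are hypothetical and say nothing about how the module is actually organised internally.

```python
from typing import Dict, Optional, Tuple

# breakdown -> (parser function name, sheet argument if the parser needs one)
BREAKDOWNS: Dict[str, Tuple[str, Optional[str]]] = {
    "headline": ("parse_headline", None),
    "age": ("parse_age", None),
    "lgd": ("parse_geography", "LGD_11"),
    "pca": ("parse_geography", "PCA"),
    "ttwa": ("parse_geography", "TTWA"),
    "soa": ("parse_soa", None),
}


def resolve_breakdown(breakdown: str) -> Tuple[str, Optional[str]]:
    """Return (parser name, sheet) or raise ValueError for unknown values."""
    try:
        return BREAKDOWNS[breakdown]
    except KeyError:
        supported = ", ".join(sorted(BREAKDOWNS))
        raise ValueError(
            f"Unsupported breakdown {breakdown!r}; expected one of: {supported}"
        ) from None
```

Keeping the mapping in one place means the three geographic breakdowns share a single parser and differ only in the sheet they read.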

Parameters:
  • breakdown (str) – One of:

      • "headline": NI total by sex, SA and non-SA (default)

      • "age": NI total by age band (16–24, 25–49, 50+)

      • "lgd": 11 Local Government Districts (current month)

      • "pca": 18 Parliamentary Constituency Areas (current month)

      • "ttwa": 10 Travel-to-Work Areas (current month)

      • "soa": 889 Super Output Areas, long-format time series

  • force_refresh (bool) – If True, bypass cache and download fresh data.

Returns:

DataFrame for the requested breakdown. See individual parse_* functions for column documentation.

Raises:
  • ValueError – If breakdown is not a supported value.

  • NISRADataNotFoundError – If the data cannot be downloaded.

Return type:

pandas.DataFrame

Example

>>> df = get_latest_claimant_count("headline")
>>> "claimants_000s" in df.columns
True
>>> df_lgd = get_latest_claimant_count("lgd")
>>> len(df_lgd) >= 11
True
bolster.data_sources.nisra.claimant_count.validate_claimant_count(df, breakdown)

Validate the integrity of a claimant count DataFrame.

Checks that required columns are present, values are in plausible ranges, and the DataFrame is non-empty.
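The three kinds of check can be sketched as below. The required columns are taken from the parse_* column lists on this page, trimmed to a representative subset for the geographic sheets; treating "plausible" as "no negative counts" and returning False for an unknown breakdown are assumptions made for the example, and `sketch_validate` is a hypothetical stand-in rather than the real function.

```python
import pandas as pd

_GEO_COLUMNS = {"date", "geography", "geography_type", "claimants_total"}

# Required columns per breakdown (subset, based on the documented outputs)
REQUIRED_COLUMNS = {
    "headline": {"date", "adjusted", "sex", "claimants_000s", "claimant_rate"},
    "age": {"date", "age_group", "claimants"},
    "lgd": _GEO_COLUMNS,
    "pca": _GEO_COLUMNS,
    "ttwa": _GEO_COLUMNS,
    "soa": {"soa_code", "date", "claimants"},
}


def sketch_validate(df: pd.DataFrame, breakdown: str) -> bool:
    """Return True if df is non-empty, has the columns, and has no negative counts."""
    required = REQUIRED_COLUMNS.get(breakdown)
    if required is None or df.empty:
        return False
    if not required.issubset(df.columns):
        return False
    # Assumption: any column whose name starts with "claimants" holds a count
    count_columns = [c for c in df.columns if str(c).startswith("claimants")]
    return bool((df[count_columns].fillna(0) >= 0).all().all())
```

Returning a boolean rather than raising keeps the check easy to use in an assertion or a pipeline guard.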

Parameters:
  • df (pandas.DataFrame) – DataFrame returned by get_latest_claimant_count or a parse_* function.

  • breakdown (str) – The breakdown type that produced the DataFrame. One of "headline", "age", "lgd", "pca", "ttwa", "soa".

Returns:

True if validation passes, False otherwise.

Return type:

bool

Example

>>> import pandas as pd
>>> validate_claimant_count(pd.DataFrame(), "headline")
False