bolster.data_sources.nisra.population

NISRA Mid-Year Population Estimates Data Source.

Provides access to mid-year population estimates for Northern Ireland with breakdowns by:
  • Geography (Northern Ireland, Parliamentary Constituencies, Health and Social Care Trusts)

  • Sex (All persons, Males, Females)

  • Age (5-year age bands: 00-04, 05-09, …, 85-89, 90+)

  • Year (1971-present for NI overall, 2021-present for sub-geographies)
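
The age band labels follow a regular pattern: zero-padded 5-year ranges up to 85-89, then an open-ended 90+ band. As a minimal sketch of that labelling scheme, the helper below maps a single year of age to its band. `age_to_band` and `AGE_BANDS` are hypothetical names used for illustration and are not part of this module.

```python
def age_to_band(age: int) -> str:
    """Map a single year of age to a 5-year band label such as '05-09' or '90+'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= 90:
        # Everyone aged 90 or over falls into the open-ended top band
        return "90+"
    lower = (age // 5) * 5
    return f"{lower:02d}-{lower + 4:02d}"


# All band labels in order: 18 closed bands plus the open-ended '90+'
AGE_BANDS = [age_to_band(a) for a in range(0, 95, 5)]
```

Because the closed bands are zero-padded, sorting the labels as plain strings keeps them in age order, which is convenient when plotting.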

Mid-year estimates are referenced to June 30th of each year.

Data Source:

Mother Page: https://www.nisra.gov.uk/statistics/people-and-communities/population

This page lists all population statistics publications in reverse chronological order (newest first). The module automatically scrapes this page to find the latest “Mid-Year Population Estimates for Small Geographical Areas” publication, then downloads the age bands Excel file from that publication’s detail page.

The files contain complete time series data in a pre-processed “Flat” format, making this one of the most analysis-ready NISRA datasets.

Update Frequency: Annual (published ~6 months after reference date)
Geographic Coverage: Northern Ireland
Reference Date: June 30th of each year

Example

>>> from bolster.data_sources.nisra import population
>>> # Get latest population estimates for all geographies
>>> df = population.get_latest_population()
>>> 'population' in df.columns
True
>>> # Get only Northern Ireland overall
>>> ni_df = population.get_latest_population(area='Northern Ireland')
>>> len(ni_df) > 0
True

Attributes

logger

POPULATION_BASE_URL

Functions

get_latest_population_publication_url()

Scrape NISRA population mother page to find the latest MYE age bands file.

parse_population_file(file_path[, area])

Parse NISRA mid-year population estimates Excel file.

get_latest_population([area, force_refresh])

Get the latest mid-year population estimates.

validate_population_totals(df)

Validate that Males + Females population equals All persons for each group.

get_population_by_year(df, year[, sex])

Filter population data for a specific year and optional sex.

get_population_pyramid_data(df, year[, area_name])

Prepare data for population pyramid visualization.

Module Contents

bolster.data_sources.nisra.population.logger

bolster.data_sources.nisra.population.POPULATION_BASE_URL = 'https://www.nisra.gov.uk/statistics/population/mid-year-population-estimates'

bolster.data_sources.nisra.population.get_latest_population_publication_url()

Scrape NISRA population mother page to find the latest MYE age bands file.

Navigates the publication structure:
  1. Scrapes the mother page for the latest “Mid-Year Population Estimates” publication

  2. Follows the link to the publication detail page

  3. Finds the age bands Excel file

Returns:

Tuple of (excel_file_url, year)

Raises:

NISRADataNotFoundError – If publication or file not found

Return type:

tuple[str, int]
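
The three navigation steps can be illustrated offline. The sketch below is not the module's implementation: it uses only the standard library, runs against two made-up HTML snippets on a placeholder `https://example.org` base, and raises `LookupError` where the real function raises `NISRADataNotFoundError`. `LinkCollector`, `first_link`, the link texts, and the assumption that the year appears in the publication's link text are all hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def first_link(html, base_url, needle):
    """Return (absolute_url, link_text) for the first link whose text contains needle."""
    collector = LinkCollector()
    collector.feed(html)
    for href, text in collector.links:
        if needle.lower() in text.lower():
            return urljoin(base_url, href), text
    raise LookupError(f"no link matching {needle!r}")


# Made-up pages: the listing is newest first, so the first match is the latest
BASE = "https://example.org"
MOTHER_HTML = """
<ul>
  <li><a href="/publications/example-mye-2024">2024 Mid-Year Population Estimates for Small Geographical Areas</a></li>
  <li><a href="/publications/example-mye-2023">2023 Mid-Year Population Estimates for Small Geographical Areas</a></li>
</ul>
"""
DETAIL_HTML = """
<a href="/files/example-sya.xlsx">Single year of age (Excel)</a>
<a href="/files/example-age-bands.xlsx">Age bands (Excel)</a>
"""

# Steps 1 and 2: find the latest publication and resolve its detail page URL
detail_url, title = first_link(MOTHER_HTML, BASE, "Mid-Year Population Estimates")
# Step 3: find the age bands Excel file on the detail page
excel_url, _ = first_link(DETAIL_HTML, detail_url, "age bands")
# Assumed here: the reference year can be read from the publication title
year = int(re.search(r"\b(?:19|20)\d{2}\b", title).group())
result = (excel_url, year)
```

Relying on the first match only works because the mother page is listed newest first; if that ordering changed, the publications would need to be compared by year instead.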

bolster.data_sources.nisra.population.parse_population_file(file_path, area='all')

Parse NISRA mid-year population estimates Excel file.

The population file contains a “Flat” sheet with pre-processed long-format data, making this one of the easiest NISRA datasets to work with.

Parameters:
  • file_path (str | pathlib.Path) – Path to the population Excel file

  • area (Literal['all', 'Northern Ireland', 'Parliamentary Constituencies (2024)', 'Health and Social Care Trusts', 'Parliamentary Constituencies (2008)'] | None) – Which geographic area(s) to return:
      - “all”: All geographic breakdowns
      - “Northern Ireland”: NI overall only (1971-present)
      - “Parliamentary Constituencies (2024)”: 2024 constituencies (2021-present)
      - “Health and Social Care Trusts”: HSC Trusts (2021-present)
      - “Parliamentary Constituencies (2008)”: 2008 constituencies (2021-present)

Returns:

DataFrame with columns:
  • area: str (e.g., “1. Northern Ireland”)

  • area_code: str (ONS geography code)

  • area_name: str (full area name)

  • year: int (reference year)

  • sex: str (“All persons”, “Males”, “Females”)

  • age_5: str (5-year age band: “00-04”, “05-09”, …, “90+”)

  • age_band: str (custom age band)

  • age_broad: str (broad age band: “00-15”, “16-39”, “40-64”, “65+”)

  • population: int (mid-year estimate)

Return type:

pandas.DataFrame

Raises:

NISRAValidationError – If file structure is unexpected
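
The `area` column example above (“1. Northern Ireland”) carries a numeric prefix, while the `area` argument does not. How the module reconciles the two is not documented here, so the following is only a sketch of one plausible approach: strip the prefix, then compare. `strip_prefix`, `filter_by_area`, and the “2.” label in the sample rows are hypothetical.

```python
import re


def strip_prefix(label: str) -> str:
    """Drop a leading 'N. ' ordering prefix, e.g. '1. Northern Ireland'."""
    return re.sub(r"^\d+\.\s*", "", label)


def filter_by_area(rows, area="all"):
    """Keep rows whose un-prefixed area label equals `area`; 'all' keeps everything."""
    if area == "all":
        return list(rows)
    return [row for row in rows if strip_prefix(row["area"]) == area]


# Made-up rows in the documented long format (only the relevant columns)
rows = [
    {"area": "1. Northern Ireland", "year": 1971},
    {"area": "1. Northern Ireland", "year": 2024},
    {"area": "2. Parliamentary Constituencies (2024)", "year": 2024},  # prefix assumed
]

ni_only = filter_by_area(rows, "Northern Ireland")
everything = filter_by_area(rows)
```

Since sub-geographies only start in 2021, filtering to “Northern Ireland” is what gives access to the long 1971-present series.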

bolster.data_sources.nisra.population.get_latest_population(area='all', force_refresh=False)

Get the latest mid-year population estimates.

Automatically discovers and downloads the most recent population estimates from the NISRA website.

Parameters:
  • area (Literal['all', 'Northern Ireland', 'Parliamentary Constituencies (2024)', 'Health and Social Care Trusts', 'Parliamentary Constituencies (2008)'] | None) – Which geographic area(s) to return (default: “all”)

  • force_refresh (bool) – If True, bypass cache and download fresh data

Returns:

DataFrame with columns:
  • area, area_code, area_name: Geographic identifiers

  • year: Reference year

  • sex: “All persons”, “Males”, or “Females”

  • age_5: 5-year age band

  • age_band, age_broad: Alternative age groupings

  • population: Mid-year estimate

Return type:

pandas.DataFrame

Raises:
  • NISRADataNotFoundError – If latest publication cannot be found

  • NISRAValidationError – If file structure is unexpected

Example

>>> df = get_latest_population()
>>> 'population' in df.columns
True
>>> ni_df = get_latest_population(area='Northern Ireland')
>>> sorted(df.columns.tolist())
['age_5', 'age_band', 'age_broad', 'area', 'area_code', 'area_name', 'population', 'sex', 'year']

bolster.data_sources.nisra.population.validate_population_totals(df)

Validate that Males + Females population equals All persons for each group.

Parameters:

df (pandas.DataFrame) – DataFrame from parse_population_file or get_latest_population

Returns:

True if validation passes

Raises:

NISRAValidationError – If validation fails

Return type:

bool
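
The check itself is simple arithmetic: within each group, Males plus Females should equal All persons. The sketch below shows that logic on plain dictionaries rather than a DataFrame. It assumes a group is identified by (`area_name`, `year`, `age_5`), which this page does not confirm, and it returns the mismatching groups instead of raising `NISRAValidationError`. `find_total_mismatches` and the figures are made up for illustration.

```python
from collections import defaultdict


def find_total_mismatches(rows):
    """Return the group keys where Males + Females != All persons."""
    sums = defaultdict(lambda: {"All persons": 0, "Males": 0, "Females": 0})
    for row in rows:
        # Grouping key is an assumption: one total per area, year and age band
        key = (row["area_name"], row["year"], row["age_5"])
        sums[key][row["sex"]] += row["population"]
    return [
        key
        for key, by_sex in sums.items()
        if by_sex["Males"] + by_sex["Females"] != by_sex["All persons"]
    ]


def make_row(age_5, sex, population):
    return {"area_name": "NORTHERN IRELAND", "year": 2024,
            "age_5": age_5, "sex": sex, "population": population}


# Made-up figures: the 00-04 band is consistent, the 05-09 band is off by one
rows = [
    make_row("00-04", "All persons", 100),
    make_row("00-04", "Males", 51),
    make_row("00-04", "Females", 49),
    make_row("05-09", "All persons", 90),
    make_row("05-09", "Males", 45),
    make_row("05-09", "Females", 44),
]

mismatches = find_total_mismatches(rows)
```

Returning the offending keys rather than a bare boolean makes it easier to see which group failed when a check like this is used during debugging.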

bolster.data_sources.nisra.population.get_population_by_year(df, year, sex='All persons')

Filter population data for a specific year and optional sex.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_latest_population()

  • year (int) – Year to filter

  • sex (Literal['All persons', 'Males', 'Females'] | None) – Sex category to filter (default: “All persons”)

Returns:

Filtered DataFrame

Return type:

pandas.DataFrame

Example

>>> df = get_latest_population(area='Northern Ireland')
>>> pop_2024 = get_population_by_year(df, 2024)
>>> total = pop_2024['population'].sum()
>>> bool(total > 0)
True
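
The default `sex='All persons'` matters when summing: the long format stores All persons, Males, and Females as separate rows, so a total taken across all three counts everyone twice. The sketch below demonstrates this with made-up figures on plain dictionaries. `population_by_year` is a hypothetical stand-in, and treating `sex=None` as "no sex filter" is an assumption about the real function.

```python
def population_by_year(rows, year, sex="All persons"):
    """Keep rows for `year`; when `sex` is None (assumed), keep every sex category."""
    return [
        row for row in rows
        if row["year"] == year and (sex is None or row["sex"] == sex)
    ]


# Made-up figures for two age bands in 2024, plus one row from another year
rows = [
    {"year": 2024, "age_5": "00-04", "sex": "All persons", "population": 100},
    {"year": 2024, "age_5": "00-04", "sex": "Males", "population": 51},
    {"year": 2024, "age_5": "00-04", "sex": "Females", "population": 49},
    {"year": 2024, "age_5": "05-09", "sex": "All persons", "population": 90},
    {"year": 2024, "age_5": "05-09", "sex": "Males", "population": 46},
    {"year": 2024, "age_5": "05-09", "sex": "Females", "population": 44},
    {"year": 2023, "age_5": "00-04", "sex": "All persons", "population": 99},
]

total = sum(r["population"] for r in population_by_year(rows, 2024))
double_counted = sum(r["population"] for r in population_by_year(rows, 2024, sex=None))
```

Here `total` is the correct sum over age bands, while `double_counted` is exactly twice as large because it adds the Males and Females rows on top of All persons.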

bolster.data_sources.nisra.population.get_population_pyramid_data(df, year, area_name='NORTHERN IRELAND')

Prepare data for population pyramid visualization.

Returns males and females by age band for a specific year and area, formatted for easy pyramid plotting.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_latest_population()

  • year (int) – Year to visualize

  • area_name (str | None) – Area name to filter (default: “NORTHERN IRELAND”)

Returns:

DataFrame with columns:
  • age_5: Age band

  • males: Male population (positive values)

  • females: Female population (negative values for pyramid)

Return type:

pandas.DataFrame

Example

>>> df = get_latest_population(area='Northern Ireland')
>>> pyramid = get_population_pyramid_data(df, 2024)
>>> sorted(pyramid.columns.tolist())
['age_5', 'females', 'males']
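
The reshaping described above (one row per age band, males positive, females negated) can be sketched without pandas. This is an illustration of the output shape, not the module's code: `pyramid_rows` is a hypothetical name, the figures are made up, and a band with no Females row is assumed to default to 0.

```python
def pyramid_rows(rows, year, area_name="NORTHERN IRELAND"):
    """Build one record per age band with males positive and females negative."""
    males, females = {}, {}
    for row in rows:
        if row["year"] != year or row["area_name"] != area_name:
            continue
        if row["sex"] == "Males":
            males[row["age_5"]] = row["population"]
        elif row["sex"] == "Females":
            # Negate so female bars extend to the left of the axis
            females[row["age_5"]] = -row["population"]
    # Zero-padded band labels sort correctly as plain strings
    return [
        {"age_5": band, "males": males[band], "females": females.get(band, 0)}
        for band in sorted(males)
    ]


def make_row(age_5, sex, population):
    return {"area_name": "NORTHERN IRELAND", "year": 2024,
            "age_5": age_5, "sex": sex, "population": population}


# Made-up figures; 'All persons' rows are ignored by the pyramid
rows = [
    make_row("05-09", "Males", 46),
    make_row("05-09", "Females", 44),
    make_row("00-04", "All persons", 100),
    make_row("00-04", "Males", 51),
    make_row("00-04", "Females", 49),
]

pyramid = pyramid_rows(rows, 2024)
```

With data in this shape, a pyramid is two horizontal bar series sharing the `age_5` axis: one for `males` and one for `females`, whose negative values place the bars on the opposite side.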