bolster.data_sources.nisra.population
NISRA Mid-Year Population Estimates Data Source.
Provides access to mid-year population estimates for Northern Ireland with breakdowns by:
- Geography (Northern Ireland, Parliamentary Constituencies, Health and Social Care Trusts)
- Sex (All persons, Males, Females)
- Age (5-year age bands: 00-04, 05-09, …, 85-89, 90+)
- Year (1971-present for NI overall, 2021-present for sub-geographies)
Mid-year estimates are referenced to June 30th of each year.
- Data Source:
Mother Page: https://www.nisra.gov.uk/statistics/people-and-communities/population
This page lists all population statistics publications in reverse chronological order (newest first). The module automatically scrapes this page to find the latest “Mid-Year Population Estimates for Small Geographical Areas” publication, then downloads the age bands Excel file from that publication’s detail page.
The files contain complete time series data in a pre-processed “Flat” format, making this one of the most analysis-ready NISRA datasets.
Update Frequency: Annual (published ~6 months after reference date)
Geographic Coverage: Northern Ireland
Reference Date: June 30th of each year
Example
>>> from bolster.data_sources.nisra import population
>>> # Get latest population estimates for all geographies
>>> df = population.get_latest_population()
>>> 'population' in df.columns
True
>>> # Get only Northern Ireland overall
>>> ni_df = population.get_latest_population(area='Northern Ireland')
>>> len(ni_df) > 0
True
Attributes
- POPULATION_BASE_URL
Functions
- get_latest_population_publication_url: Scrape NISRA population mother page to find the latest MYE age bands file.
- parse_population_file: Parse NISRA mid-year population estimates Excel file.
- get_latest_population: Get the latest mid-year population estimates.
- validate_population_totals: Validate that Males + Females population equals All persons for each group.
- get_population_by_year: Filter population data for a specific year and optional sex.
- get_population_pyramid_data: Prepare data for population pyramid visualization.
Module Contents
- bolster.data_sources.nisra.population.POPULATION_BASE_URL = 'https://www.nisra.gov.uk/statistics/population/mid-year-population-estimates'
- bolster.data_sources.nisra.population.get_latest_population_publication_url()
Scrape NISRA population mother page to find the latest MYE age bands file.
Navigates the publication structure:
1. Scrapes mother page for latest “Mid-Year Population Estimates” publication
2. Follows link to publication detail page
3. Finds age bands Excel file
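The three navigation steps above can be illustrated with a minimal, standard-library sketch. It is not the module's actual implementation: the HTML snippets, the `example.org` URLs, and the names `LinkCollector` and `first_link_matching` are all hypothetical, and the real NISRA pages may be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def first_link_matching(html, base_url, predicate):
    """Return the absolute URL of the first link accepted by predicate, else None."""
    parser = LinkCollector()
    parser.feed(html)
    for href, text in parser.links:
        if predicate(href, text):
            return urljoin(base_url, href)
    return None


# Hypothetical pages standing in for the mother page and a publication detail page.
MOTHER_HTML = """
<ul>
  <li><a href="/publications/mye-2024">2024 Mid-Year Population Estimates</a></li>
  <li><a href="/publications/mye-2023">2023 Mid-Year Population Estimates</a></li>
</ul>
"""
DETAIL_HTML = """
<a href="/files/mye-report.pdf">Statistical report</a>
<a href="/files/mye-age-bands.xlsx">Age bands (Excel)</a>
"""

# Steps 1 and 2: the listing is newest first, so the first match is the latest publication.
publication_url = first_link_matching(
    MOTHER_HTML,
    "https://example.org",
    lambda href, text: "mid-year population estimates" in text.lower(),
)

# Step 3: on the detail page, pick the Excel link whose text mentions age bands.
file_url = first_link_matching(
    DETAIL_HTML,
    publication_url,
    lambda href, text: href.lower().endswith(".xlsx") and "age" in text.lower(),
)
```

Relying on document order only works because the mother page is described as listing publications newest first; a scraper written this way would silently pick the wrong publication if that ordering changed.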
- bolster.data_sources.nisra.population.parse_population_file(file_path, area='all')
Parse NISRA mid-year population estimates Excel file.
The population file contains a “Flat” sheet with pre-processed long-format data, making this one of the easiest NISRA datasets to work with.
- Parameters:
file_path (str | pathlib.Path) – Path to the population Excel file
area (Literal['all', 'Northern Ireland', 'Parliamentary Constituencies (2024)', 'Health and Social Care Trusts', 'Parliamentary Constituencies (2008)'] | None) – Which geographic area(s) to return:
- “all”: All geographic breakdowns
- “Northern Ireland”: NI overall only (1971-present)
- “Parliamentary Constituencies (2024)”: 2024 constituencies (2021-present)
- “Health and Social Care Trusts”: HSC Trusts (2021-present)
- “Parliamentary Constituencies (2008)”: 2008 constituencies (2021-present)
- Returns:
DataFrame with columns:
area: str (e.g., “1. Northern Ireland”)
area_code: str (ONS geography code)
area_name: str (full area name)
year: int (reference year)
sex: str (“All persons”, “Males”, “Females”)
age_5: str (5-year age band: “00-04”, “05-09”, …, “90+”)
age_band: str (custom age band)
age_broad: str (broad age band: “00-15”, “16-39”, “40-64”, “65+”)
population: int (mid-year estimate)
- Raises:
NISRAValidationError – If file structure is unexpected
- bolster.data_sources.nisra.population.get_latest_population(area='all', force_refresh=False)
Get the latest mid-year population estimates.
Automatically discovers and downloads the most recent population estimates from the NISRA website.
- Parameters:
area (Literal['all', 'Northern Ireland', 'Parliamentary Constituencies (2024)', 'Health and Social Care Trusts', 'Parliamentary Constituencies (2008)'] | None) – Which geographic area(s) to return (default: “all”)
force_refresh (bool) – If True, bypass cache and download fresh data
- Returns:
DataFrame with columns:
area, area_code, area_name: Geographic identifiers
year: Reference year
sex: “All persons”, “Males”, or “Females”
age_5: 5-year age band
age_band, age_broad: Alternative age groupings
population: Mid-year estimate
- Raises:
NISRADataNotFoundError – If latest publication cannot be found
NISRAValidationError – If file structure is unexpected
Example
>>> df = get_latest_population()
>>> 'population' in df.columns
True
>>> ni_df = get_latest_population(area='Northern Ireland')
>>> sorted(ni_df.columns.tolist())
['age_5', 'age_band', 'age_broad', 'area', 'area_code', 'area_name', 'population', 'sex', 'year']
- bolster.data_sources.nisra.population.validate_population_totals(df)
Validate that Males + Females population equals All persons for each group.
- Parameters:
df (pandas.DataFrame) – DataFrame from parse_population_file or get_latest_population
- Returns:
True if validation passes
- Raises:
NISRAValidationError – If validation fails
- Return type:
bool
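The check described here amounts to pivoting the long-format data so that each sex becomes a column, then comparing the sums. The following is a minimal sketch under stated assumptions, not the library's code: the helper name `males_plus_females_match`, the grouping keys, the `tolerance` parameter, the area code `X1`, and the population figures are all made up for illustration. Unlike the documented function, it returns False instead of raising NISRAValidationError.

```python
import pandas as pd


def males_plus_females_match(df, tolerance=0):
    """Return True if Males + Females equals All persons in every group."""
    # Assumed grouping keys: one group per area, year and 5-year age band.
    keys = ["area_code", "year", "age_5"]
    wide = df.pivot_table(
        index=keys, columns="sex", values="population", aggfunc="sum"
    )
    diff = (wide["Males"] + wide["Females"] - wide["All persons"]).abs()
    # A group with a missing sex yields NaN, which fails the comparison.
    return bool((diff <= tolerance).all())


# Synthetic rows with made-up numbers, shaped like the documented columns.
rows = [
    ("X1", 2024, "00-04", "All persons", 100),
    ("X1", 2024, "00-04", "Males", 51),
    ("X1", 2024, "00-04", "Females", 49),
    ("X1", 2024, "05-09", "All persons", 110),
    ("X1", 2024, "05-09", "Males", 56),
    ("X1", 2024, "05-09", "Females", 54),
]
sample = pd.DataFrame(
    rows, columns=["area_code", "year", "age_5", "sex", "population"]
)

consistent = males_plus_females_match(sample)

# Inflate every Males row by one person to break the identity.
broken = sample.copy()
broken.loc[broken["sex"] == "Males", "population"] += 1
inconsistent = males_plus_females_match(broken)
```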
- bolster.data_sources.nisra.population.get_population_by_year(df, year, sex='All persons')
Filter population data for a specific year and optional sex.
- Parameters:
df (pandas.DataFrame) – DataFrame from get_latest_population()
year (int) – Year to filter
sex (Literal['All persons', 'Males', 'Females'] | None) – Sex category to filter (default: “All persons”)
- Returns:
Filtered DataFrame
- Return type:
pandas.DataFrame
Example
>>> df = get_latest_population(area='Northern Ireland')
>>> pop_2024 = get_population_by_year(df, 2024)
>>> total = pop_2024['population'].sum()
>>> bool(total > 0)
True
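For readers without network access, the same filter can be tried on a synthetic frame. This is a rough sketch of the likely logic, not the library's implementation: `filter_year_sex` is a hypothetical name, the figures are made up, and treating `sex=None` as "keep all sex categories" is an assumption based on the parameter being described as optional.

```python
import pandas as pd


def filter_year_sex(df, year, sex="All persons"):
    """Keep rows for one reference year and, optionally, one sex category."""
    mask = df["year"] == year
    if sex is not None:  # assumption: None disables the sex filter
        mask = mask & (df["sex"] == sex)
    return df[mask].reset_index(drop=True)


# Synthetic long-format rows with made-up numbers.
sample = pd.DataFrame(
    {
        "year": [2023, 2023, 2023, 2024, 2024, 2024],
        "sex": ["All persons", "Males", "Females"] * 2,
        "age_5": ["00-04"] * 6,
        "population": [98, 50, 48, 100, 51, 49],
    }
)

all_2024 = filter_year_sex(sample, 2024)
every_sex_2024 = filter_year_sex(sample, 2024, sex=None)
```

Summing `population` is only meaningful after filtering to a single sex category; with `sex=None` the "All persons" rows would be counted on top of the Males and Females rows.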
- bolster.data_sources.nisra.population.get_population_pyramid_data(df, year, area_name='NORTHERN IRELAND')
Prepare data for population pyramid visualization.
Returns males and females by age band for a specific year and area, formatted for easy pyramid plotting.
- Parameters:
df (pandas.DataFrame) – DataFrame from get_latest_population()
year (int) – Year to visualize
area_name (str | None) – Area name to filter (default: “NORTHERN IRELAND”)
- Returns:
DataFrame with columns:
age_5: Age band
males: Male population (positive values)
females: Female population (negative values for pyramid)
Example
>>> df = get_latest_population(area='Northern Ireland')
>>> pyramid = get_population_pyramid_data(df, 2024)
>>> sorted(pyramid.columns.tolist())
['age_5', 'females', 'males']
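The documented output shape (one row per age band, positive males, negative females) can be reproduced on synthetic data as follows. This is a sketch under assumptions rather than the library's code: `pyramid_frame` is a hypothetical name and the population figures are made up.

```python
import pandas as pd


def pyramid_frame(df, year, area_name="NORTHERN IRELAND"):
    """Reshape long-format rows into age_5 / males / females columns."""
    subset = df[
        (df["year"] == year)
        & (df["area_name"] == area_name)
        & (df["sex"].isin(["Males", "Females"]))
    ]
    wide = subset.pivot_table(
        index="age_5", columns="sex", values="population", aggfunc="sum"
    )
    out = pd.DataFrame(
        {
            "age_5": wide.index,
            "males": wide["Males"].to_numpy(),
            # Negated so female bars extend to the left of the axis.
            "females": -wide["Females"].to_numpy(),
        }
    )
    # Zero-padded labels ("00-04", "05-09", ..., "90+") sort correctly as strings.
    return out.sort_values("age_5").reset_index(drop=True)


# Synthetic rows with made-up numbers.
sample = pd.DataFrame(
    {
        "area_name": ["NORTHERN IRELAND"] * 4,
        "year": [2024] * 4,
        "sex": ["Males", "Females", "Males", "Females"],
        "age_5": ["05-09", "05-09", "00-04", "00-04"],
        "population": [56, 54, 51, 49],
    }
)

pyramid = pyramid_frame(sample, 2024)
```

With this shape, a horizontal bar chart of `males` and `females` against `age_5` draws the two sides of the pyramid on a shared axis; tick labels would need their sign removed for display.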