bolster.data_sources.nisra.population_projections
=================================================

.. py:module:: bolster.data_sources.nisra.population_projections

.. autoapi-nested-parse::

   NISRA Population Projections for Northern Ireland.

   Provides access to official NISRA population projections with demographic breakdowns
   by year, age, sex, and geography. Projections extend from the base year (e.g., 2022)
   into the future (typically 50 years) to support demographic planning and policy analysis.

   Data includes:

   - Projected population by single year of age (0-90+)
   - Breakdowns by sex (Males, Females, All Persons)
   - Geographic coverage (Northern Ireland overall, and by Local Government District)
   - Multiple projection variants (principal, high, low population scenarios)

   Data Source:
       **Principal Projection**: https://www.nisra.gov.uk/publications/2024-based-population-projections-northern-ireland

       **Variant Projections**: https://www.nisra.gov.uk/publications/2024-based-population-projections-northern-ireland-variant-projections

       **LGD Sub-area Projections**: https://www.nisra.gov.uk/publications/2022-based-population-projections-areas-within-northern-ireland

       The principal projection publication provides 2 Excel files:

       - NPP24_ppp_age_sex.xlsx: Population by age and sex (RECOMMENDED - uses "Flat File" sheet)
       - NPP24_ppp_coc.xlsx: Components of change summary

       The LGD projections file (SNPP22_SYA_Age_Bands.xlsx) uses the "Flat" sheet,
       already in long format. Covers all 11 LGDs plus NI-total, 2022-2047.

   Update Frequency: Biennial (NI-level); LGD projections published alongside the 2022-based series

   Geographic Coverage: Northern Ireland (NI-level and 11 Local Government Districts)

   Projection Horizon: Typically 50 years (NI); 2022-2047 (LGD)

   .. rubric:: Example

   >>> from bolster.data_sources.nisra import population_projections
   >>> df = population_projections.get_latest_projections()
   >>> 'population' in df.columns
   True

   >>> df_decade = population_projections.get_latest_projections(
   ...     area='Northern Ireland',
   ...     start_year=2025,
   ...     end_year=2035
   ... )
   >>> len(df_decade) > 0
   True

   >>> lgd_df = population_projections.get_lgd_projections()
   >>> 'lgd_name' in lgd_df.columns
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.nisra.population_projections.logger
   bolster.data_sources.nisra.population_projections.PROJECTIONS_INDEX_URL
   bolster.data_sources.nisra.population_projections.LGD_PROJECTIONS_PUB_URL
   bolster.data_sources.nisra.population_projections.LGD_PROJECTIONS_BASE_YEAR


Functions
---------

.. autoapisummary::

   bolster.data_sources.nisra.population_projections.get_latest_projections_publication_url
   bolster.data_sources.nisra.population_projections.parse_projections_file
   bolster.data_sources.nisra.population_projections.validate_projections_totals
   bolster.data_sources.nisra.population_projections.validate_projection_coverage
   bolster.data_sources.nisra.population_projections.get_latest_projections
   bolster.data_sources.nisra.population_projections.get_lgd_projections_url
   bolster.data_sources.nisra.population_projections.parse_lgd_projections_file
   bolster.data_sources.nisra.population_projections.validate_lgd_projections
   bolster.data_sources.nisra.population_projections.get_lgd_projections


Module Contents
---------------

.. py:data:: logger

.. py:data:: PROJECTIONS_INDEX_URL
   :value: 'https://www.nisra.gov.uk/statistics/people-and-communities/population'


.. py:function:: get_latest_projections_publication_url(variant = 'principal')

   Discover latest population projections publication URL and base year.

   Auto-discovers the current projection series from the NISRA statistics index
   so the module stays current when NISRA publishes a new biennial vintage.

   :param variant: Projection variant ('principal', 'hhh', 'lll', etc.). Default: 'principal'

   :returns: Tuple of (excel_file_url, base_year)

   :raises NISRADataNotFoundError: If publication cannot be found


.. py:function:: parse_projections_file(file_path, variant = 'principal', base_year = 2024)

   Parse downloaded projections Excel file into long-format DataFrame.

   For principal projection, uses the "Flat File" sheet which is already in
   perfect long format requiring no transformation.

   :param file_path: Path to downloaded projections Excel file
   :param variant: Projection variant ('principal' or variant code)
   :param base_year: Base year of the projection vintage (e.g. 2024). Used to populate
                     the base_year column since the file itself doesn't record it.

   :returns:

             - year: int (projection year)
             - base_year: int (base year for projection, e.g., 2024)
             - age_group: str (5-year age band, e.g., "00-04", "05-09", "90+")
             - sex: str ("Males", "Females", "All Persons")
             - area: str (geographic area, typically "Northern Ireland")
             - population: int (projected population count)
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file format unexpected or data invalid


.. py:function:: validate_projections_totals(df)

   Validate that All Persons = Males + Females for each year/age/area.

   :param df: DataFrame from get_latest_projections()

   :returns: True if validation passes

   :raises NISRAValidationError: If totals don't match


.. py:function:: validate_projection_coverage(df)

   Validate that projections cover expected year range.

   :param df: DataFrame from get_latest_projections()

   :returns: True if validation passes

   :raises NISRAValidationError: If year range is incomplete or suspicious


.. py:function:: get_latest_projections(area = None, start_year = None, end_year = None, variant = 'principal', force_refresh = False)

   Get latest NISRA population projections with optional filtering.

   :param area: Filter to specific geographic area (e.g., "Northern Ireland"). Default: no filter
   :param start_year: Filter projections >= this year. Default: no filter
   :param end_year: Filter projections <= this year. Default: no filter
   :param variant: Projection variant ('principal', 'hhh', 'lll', etc.). Default: 'principal'
   :param force_refresh: If True, bypass cache and download fresh data

   :returns: DataFrame with population projections

   :raises NISRADataNotFoundError: If publication cannot be found
   :raises NISRAValidationError: If data fails integrity checks

   .. rubric:: Example

   >>> df = get_latest_projections()
   >>> sorted(df.columns.tolist())
   ['age_group', 'area', 'base_year', 'population', 'sex', 'year']

   >>> df_ni_2030s = get_latest_projections(
   ...     area='Northern Ireland',
   ...     start_year=2030,
   ...     end_year=2039
   ... )
   >>> len(df_ni_2030s) > 0
   True


.. py:data:: LGD_PROJECTIONS_PUB_URL
   :value: 'https://www.nisra.gov.uk/publications/2022-based-population-projections-areas-within-northern-ireland'


.. py:data:: LGD_PROJECTIONS_BASE_YEAR
   :value: 2022


.. py:function:: get_lgd_projections_url()

   Scrape the LGD projections publication page to find the age/sex Excel file.

   :returns: URL of SNPP22_SYA_Age_Bands.xlsx (or equivalent)

   :raises NISRADataNotFoundError: If the file link cannot be found


.. py:function:: parse_lgd_projections_file(file_path)

   Parse the LGD sub-area projections Excel file (Flat sheet).

   :param file_path: Path to downloaded SNPP22_SYA_Age_Bands.xlsx

   :returns:

             - lgd_name: str (e.g. "Belfast")
             - lgd_code: str (e.g. "N09000003")
             - year: int (2022-2047)
             - base_year: int (2022)
             - sex: str ("All persons", "Male", "Female")
             - age: int (single year of age, 0-90+)
             - age_group: str (5-year band, e.g. "00-04")
             - population: int
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file structure is unexpected


.. py:function:: validate_lgd_projections(df)

   Validate LGD projections DataFrame for basic integrity.

   :param df: DataFrame from get_lgd_projections()

   :returns: True if validation passes

   :raises NISRAValidationError: If validation fails


.. py:function:: get_lgd_projections(lgd = None, start_year = None, end_year = None, force_refresh = False)

   Get 2022-based population projections for NI Local Government Districts.

   Covers all 11 LGDs from 2022 to 2047, with single-year-of-age and sex breakdowns.

   :param lgd: Filter to a specific LGD name (e.g. "Belfast") or code (e.g. "N09000003").
               Default: all 11 LGDs.
   :param start_year: Filter projections >= this year. Default: no filter
   :param end_year: Filter projections <= this year. Default: no filter
   :param force_refresh: If True, bypass cache and download fresh data

   :returns: lgd_name, lgd_code, year, base_year, sex, age, age_group, population
   :rtype: DataFrame with columns

   :raises NISRADataNotFoundError: If publication cannot be found
   :raises NISRAValidationError: If data fails integrity checks

   .. rubric:: Example

   >>> df = get_lgd_projections()
   >>> 'lgd_name' in df.columns
   True
   >>> sorted(df['lgd_name'].unique())[:3]
   ['Antrim and Newtownabbey', 'Ards and North Down', 'Armagh City, Banbridge and Craigavon']
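
The filter semantics documented for ``get_lgd_projections`` (match on LGD name or code, with inclusive ``start_year`` and ``end_year`` bounds) can be illustrated with a minimal sketch. This is not the module's implementation: it works on plain dictionaries rather than a DataFrame so that it runs without any dependencies, the helper names ``filter_rows`` and ``total_by_year`` are hypothetical, and the population figures are invented purely for illustration. Only the column names and sex labels are taken from the documentation above.

```python
# Illustrative rows using the documented LGD columns.
# All population values are made up; they are NOT NISRA figures.
ROWS = [
    {"lgd_name": "Belfast", "lgd_code": "N09000003", "year": 2022,
     "sex": "All persons", "age": 0, "population": 100},
    {"lgd_name": "Belfast", "lgd_code": "N09000003", "year": 2022,
     "sex": "All persons", "age": 1, "population": 110},
    {"lgd_name": "Belfast", "lgd_code": "N09000003", "year": 2022,
     "sex": "Male", "age": 0, "population": 51},
    {"lgd_name": "Belfast", "lgd_code": "N09000003", "year": 2023,
     "sex": "All persons", "age": 0, "population": 101},
    {"lgd_name": "Belfast", "lgd_code": "N09000003", "year": 2030,
     "sex": "All persons", "age": 0, "population": 95},
    {"lgd_name": "Antrim and Newtownabbey", "lgd_code": "N09000001", "year": 2022,
     "sex": "All persons", "age": 0, "population": 40},
]


def filter_rows(rows, lgd=None, start_year=None, end_year=None):
    """Hypothetical helper: keep rows for one LGD (name or code) in an inclusive year range."""
    selected = []
    for row in rows:
        # The lgd argument may match either the name or the code.
        if lgd is not None and lgd not in (row["lgd_name"], row["lgd_code"]):
            continue
        # Both year bounds are inclusive (>= start_year, <= end_year).
        if start_year is not None and row["year"] < start_year:
            continue
        if end_year is not None and row["year"] > end_year:
            continue
        selected.append(row)
    return selected


def total_by_year(rows, sex="All persons"):
    """Hypothetical helper: sum single-year-of-age rows into one total per year."""
    totals = {}
    for row in rows:
        if row["sex"] == sex:
            totals[row["year"]] = totals.get(row["year"], 0) + row["population"]
    return totals


belfast = filter_rows(ROWS, lgd="N09000003", start_year=2022, end_year=2023)
print(total_by_year(belfast))
```

With a real DataFrame the same selection would normally be written as boolean masks on the ``lgd_name``, ``lgd_code`` and ``year`` columns, followed by a group-by on ``year``. Restricting the sum to one ``sex`` value avoids double counting, since the documented data carries both the per-sex rows and an "All persons" row.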