bolster.data_sources.nisra.migration

NISRA Migration Estimates - Official and Derived.

This module provides access to NISRA migration data through two approaches:

  1. Official Migration Statistics: Published NISRA long-term international migration estimates from administrative data and the International Passenger Survey (IPS).

  2. Derived Migration Estimates: Calculated from demographic components using the demographic accounting equation:

    Net Migration = Population Change - Natural Change
    Net Migration = ΔPopulation - (Births - Deaths)
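As a worked illustration of the equation above, the arithmetic for a single year can be sketched as follows. The figures are invented for the example and are not NISRA estimates.

```python
# Illustrative figures only; these are not NISRA estimates.
population_start = 1_900_000  # mid-year population, year t-1
population_end = 1_910_000    # mid-year population, year t
births = 21_000
deaths = 17_000

natural_change = births - deaths                        # 4,000
population_change = population_end - population_start   # 10,000
net_migration = population_change - natural_change      # 6,000

print(net_migration)  # 6000
```

A positive result indicates net immigration; a negative result indicates net emigration.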

Both approaches are useful:

  • Official statistics are authoritative but published with a lag

  • Derived estimates can be calculated for more recent periods

  • Comparing both validates the demographic equation approach

Data Sources:

Official Migration: https://www.nisra.gov.uk/statistics/population/long-term-international-migration-statistics

Derived Migration (combines three NISRA sources):

  • Population: https://www.nisra.gov.uk/statistics/people-and-communities/population

  • Births: https://www.nisra.gov.uk/statistics/births-deaths-and-marriages/births

  • Deaths: https://www.nisra.gov.uk/statistics/births-deaths-and-marriages/deaths

Update Frequency: Annual (both official and derived)

Geographic Coverage: Northern Ireland

Reference Period: Mid-year (July to June) for official; Calendar year for derived

Example

>>> from bolster.data_sources.nisra import migration
>>>
>>> # Get official NISRA migration statistics
>>> official = migration.get_official_migration()
>>> sorted(official.columns.tolist())
['date', 'net_migration', 'year']
>>> # Get derived migration estimates (from demographic equation)
>>> derived = migration.get_derived_migration()
>>> 'net_migration' in derived.columns
True
>>> # Compare official vs derived for validation
>>> comparison = migration.compare_official_vs_derived(official, derived)
>>> 'absolute_difference' in comparison.columns
True

Attributes

logger

MIGRATION_MOTHER_PAGE

get_derived_migration

Functions

calculate_annual_births(births_df)

Aggregate monthly births data to annual totals.

calculate_annual_deaths(deaths_df)

Aggregate weekly deaths data to annual totals.

calculate_annual_population(population_df)

Aggregate population data to annual totals for Northern Ireland.

derive_migration(population_df, births_df, deaths_df)

Derive net migration from demographic components.

get_latest_migration([force_refresh])

Get the latest derived migration estimates for Northern Ireland.

validate_demographic_equation(df[, tolerance])

Validate that the demographic accounting equation holds.

get_migration_by_year(df, year)

Filter migration data for a specific year.

get_migration_summary_statistics(df[, start_year, ...])

Calculate summary statistics for migration data.

get_official_migration_publication_url()

Scrape NISRA migration mother page to find latest Official estimates file.

parse_official_migration_file(file_path)

Parse downloaded official migration Excel file into DataFrame.

validate_official_migration(df)

Validate official migration data quality.

get_official_migration([force_refresh])

Get the latest official NISRA migration statistics.

compare_official_vs_derived(official_df, derived_df[, ...])

Compare official migration data with derived estimates for validation.

Module Contents

bolster.data_sources.nisra.migration.logger

bolster.data_sources.nisra.migration.calculate_annual_births(births_df)

Aggregate monthly births data to annual totals.

Parameters:

births_df (pandas.DataFrame) – DataFrame from births.get_latest_births(event_type='occurrence')

Returns:

DataFrame with columns:

  • year: int

  • births: int (total births in year)

Return type:

pandas.DataFrame
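A minimal sketch of this kind of monthly-to-annual aggregation is shown below. The input column names `month` and `births` are assumptions made for the example; the actual schema returned by births.get_latest_births() is not documented in this section.

```python
import pandas as pd

# Hypothetical monthly input; the column names `month` and `births`
# are assumptions, and the counts are made up.
monthly = pd.DataFrame(
    {
        "month": pd.date_range("2022-01-01", periods=24, freq="MS"),
        "births": [1700] * 12 + [1650] * 12,
    }
)

# Label each month with its calendar year, then sum within each year.
annual = (
    monthly.assign(year=monthly["month"].dt.year)
    .groupby("year", as_index=False)["births"]
    .sum()
)
print(annual)
```

The same pattern would apply to weekly deaths data, given a date column from which a year can be taken.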

bolster.data_sources.nisra.migration.calculate_annual_deaths(deaths_df)

Aggregate weekly deaths data to annual totals.

Parameters:

deaths_df (pandas.DataFrame) – DataFrame from deaths.get_historical_deaths()

Returns:

DataFrame with columns:

  • year: int

  • deaths: int (total deaths in year)

Return type:

pandas.DataFrame

bolster.data_sources.nisra.migration.calculate_annual_population(population_df)

Aggregate population data to annual totals for Northern Ireland.

Parameters:

population_df (pandas.DataFrame) – DataFrame from population.get_latest_population(area='Northern Ireland')

Returns:

DataFrame with columns:

  • year: int

  • population: int (mid-year population estimate)

Return type:

pandas.DataFrame

bolster.data_sources.nisra.migration.derive_migration(population_df, births_df, deaths_df)

Derive net migration from demographic components.

Uses the demographic accounting equation:

Net Migration = ΔPopulation - (Births - Deaths)

Parameters:
  • population_df (pandas.DataFrame) – DataFrame from population.get_latest_population()

  • births_df (pandas.DataFrame) – DataFrame from births.get_latest_births(event_type='occurrence')

  • deaths_df (pandas.DataFrame) – DataFrame from deaths.get_latest_deaths()

Returns:

DataFrame with columns:

  • year: int

  • population_start: int (population at start of year, June 30 t-1)

  • population_end: int (population at end of year, June 30 t)

  • births: int (births in calendar year)

  • deaths: int (deaths in calendar year)

  • natural_change: int (births - deaths)

  • population_change: int (population_end - population_start)

  • net_migration: int (derived migration estimate)

  • migration_rate: float (per 1,000 population)

Return type:

pandas.DataFrame

Raises:

NISRAValidationError – If data sources cannot be aligned
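The derivation can be sketched with pandas as follows. This is an illustration under stated assumptions, not the module's implementation: the inputs are already aggregated to one row per year, the figures are invented, and the rate is assumed to be expressed per 1,000 of the start-of-year population.

```python
import pandas as pd

# Illustrative annual inputs; these figures are made up.
years = [2020, 2021, 2022]
population = pd.DataFrame({"year": years, "population": [1_895_000, 1_905_000, 1_911_000]})
births = pd.DataFrame({"year": years, "births": [21_000, 22_000, 21_000]})
deaths = pd.DataFrame({"year": years, "deaths": [17_500, 17_000, 17_000]})

# Align the three sources on year.
df = population.merge(births, on="year").merge(deaths, on="year").sort_values("year")

df["population_end"] = df["population"]
df["population_start"] = df["population"].shift(1)  # previous mid-year estimate
df = df.dropna(subset=["population_start"])         # first year has no start value

df["natural_change"] = df["births"] - df["deaths"]
df["population_change"] = df["population_end"] - df["population_start"]
df["net_migration"] = df["population_change"] - df["natural_change"]

# Assumption: rate per 1,000 of the start-of-year population.
df["migration_rate"] = df["net_migration"] / df["population_start"] * 1000

print(df[["year", "net_migration", "migration_rate"]])
```

Because each row needs the previous year's population, the result has one row fewer than the population input.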

bolster.data_sources.nisra.migration.get_latest_migration(force_refresh=False)

Get the latest derived migration estimates for Northern Ireland.

Automatically downloads the most recent population, births, and deaths data, then calculates net migration using the demographic accounting equation.

Parameters:

force_refresh (bool) – If True, bypass cache and download fresh data for all sources

Returns:

DataFrame with columns:

  • year: int

  • population_start, population_end: int (mid-year estimates)

  • births, deaths: int (annual totals)

  • natural_change: int (births - deaths)

  • population_change: int (year-over-year change)

  • net_migration: int (derived estimate)

  • migration_rate: float (per 1,000 population)

Return type:

pandas.DataFrame

Example

>>> df = get_latest_migration()
>>> 'net_migration' in df.columns
True
>>> len(df) > 0
True

bolster.data_sources.nisra.migration.validate_demographic_equation(df, tolerance=100)

Validate that the demographic accounting equation holds.

Checks that:

Population Change = Natural Change + Net Migration

Parameters:
  • df (pandas.DataFrame) – DataFrame from derive_migration() or get_latest_migration()

  • tolerance (int) – Allowable difference due to rounding/measurement error (default: 100)

Returns:

True if validation passes

Raises:

NISRAValidationError – If equation doesn’t hold within tolerance

Return type:

bool
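A check of this kind might look like the sketch below. The function name `check_demographic_equation` is hypothetical, and ValueError stands in for NISRAValidationError so that the example stays self-contained.

```python
import pandas as pd


def check_demographic_equation(df: pd.DataFrame, tolerance: int = 100) -> bool:
    """Hypothetical stand-in for the validation described above."""
    # Residual of: Population Change = Natural Change + Net Migration
    residual = df["population_change"] - (df["natural_change"] + df["net_migration"])
    failing_years = df.loc[residual.abs() > tolerance, "year"].tolist()
    if failing_years:
        # The module raises NISRAValidationError; ValueError is used here instead.
        raise ValueError(f"Equation does not hold for years: {failing_years}")
    return True


# Illustrative data: 2022 is off by 50, which is within the default tolerance.
sample = pd.DataFrame(
    {
        "year": [2021, 2022],
        "population_change": [10_000, 6_000],
        "natural_change": [5_000, 4_000],
        "net_migration": [5_000, 2_050],
    }
)
print(check_demographic_equation(sample))  # True
```

Calling the same function with tolerance=10 would raise, since the 2022 residual of 50 exceeds it.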

bolster.data_sources.nisra.migration.get_migration_by_year(df, year)

Filter migration data for a specific year.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_latest_migration()

  • year (int) – Year to filter

Returns:

Filtered DataFrame

Return type:

pandas.DataFrame

Example

>>> df = get_latest_migration()
>>> df_2024 = get_migration_by_year(df, 2024)
>>> 'net_migration' in df_2024.columns
True

bolster.data_sources.nisra.migration.get_migration_summary_statistics(df, start_year=None, end_year=None)

Calculate summary statistics for migration data.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_latest_migration()

  • start_year (int | None) – Optional start year for analysis period

  • end_year (int | None) – Optional end year for analysis period

Returns:

Dictionary with summary statistics:

  • total_years: Number of years analyzed

  • avg_net_migration: Average annual net migration

  • avg_migration_rate: Average migration rate per 1,000

  • positive_years: Number of years with net immigration

  • negative_years: Number of years with net emigration

  • max_immigration_year: Year with highest immigration

  • max_immigration: Highest immigration value

  • max_emigration_year: Year with highest emigration

  • max_emigration: Highest emigration value (as negative)

Return type:

dict
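Several of the statistics listed above could be computed along the following lines. This is a hypothetical sketch, not the module's code: the function name `summarise` is invented, only a subset of the keys is produced, and the year of highest emigration is assumed to be the year with the lowest net migration.

```python
import pandas as pd


def summarise(df: pd.DataFrame, start_year=None, end_year=None) -> dict:
    """Hypothetical sketch covering a subset of the documented keys."""
    if start_year is not None:
        df = df[df["year"] >= start_year]
    if end_year is not None:
        df = df[df["year"] <= end_year]

    highest = df.loc[df["net_migration"].idxmax()]
    # Assumption: highest emigration corresponds to the lowest net migration.
    lowest = df.loc[df["net_migration"].idxmin()]

    return {
        "total_years": len(df),
        "avg_net_migration": float(df["net_migration"].mean()),
        "positive_years": int((df["net_migration"] > 0).sum()),
        "negative_years": int((df["net_migration"] < 0).sum()),
        "max_immigration_year": int(highest["year"]),
        "max_emigration_year": int(lowest["year"]),
    }


# Illustrative data only.
sample = pd.DataFrame(
    {"year": [2009, 2010, 2011, 2012], "net_migration": [3000, -1000, 2000, 5000]}
)
stats = summarise(sample, start_year=2010)
print(stats)
```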

Example

>>> df = get_latest_migration()
>>> stats = get_migration_summary_statistics(df, start_year=2010)
>>> 'avg_net_migration' in stats
True

bolster.data_sources.nisra.migration.MIGRATION_MOTHER_PAGE = 'https://www.nisra.gov.uk/statistics/population/long-term-international-migration-statistics'

bolster.data_sources.nisra.migration.get_official_migration_publication_url()

Scrape NISRA migration mother page to find latest Official estimates file.

Navigates the publication structure:

  1. Scrapes mother page for latest “Long-Term International Migration” publication

  2. Follows link to publication detail page

  3. Finds “Official” Excel file (Mig[YY][YY]-Official_1.xlsx)

Returns:

Tuple of (excel_file_url, publication_year)

Raises:

NISRADataNotFoundError – If publication or file not found

Return type:

tuple[str, int]
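The final step, picking the “Official” file out of a page's links, can be illustrated with a regular expression. The sketch assumes that the placeholder Mig[YY][YY] stands for two two-digit years (for example "Mig2223"); the hrefs below are hypothetical and no network request is made.

```python
import re

# Assumption: Mig[YY][YY] means two two-digit years, e.g. "Mig2223".
OFFICIAL_FILE = re.compile(r"Mig(\d{2})(\d{2})-Official_1\.xlsx$")

# Hypothetical hrefs, as might be collected from a publication detail page.
hrefs = [
    "/files/Mig2223-In.xlsx",
    "/files/Mig2223-Official_1.xlsx",
    "/files/Mig2223-Out.xlsx",
]

matches = [href for href in hrefs if OFFICIAL_FILE.search(href)]
print(matches)  # ['/files/Mig2223-Official_1.xlsx']
```

An empty `matches` list corresponds to the case in which the function raises NISRADataNotFoundError.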

bolster.data_sources.nisra.migration.parse_official_migration_file(file_path)

Parse downloaded official migration Excel file into DataFrame.

Extracts Table 1.1 (Net International Migration time series) from the Official estimates file and transforms it into long-format DataFrame.

Parameters:

file_path (pathlib.Path) – Path to downloaded Mig[YY][YY]-Official_1.xlsx file

Returns:

DataFrame with columns:

  • year: int (mid-year)

  • net_migration: int (net international migration)

  • date: pd.Timestamp (reference date, June 30 of end year)

Return type:

pandas.DataFrame

Raises:

NISRAValidationError – If file format is unexpected or parsing fails
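The reshaping into long format can be sketched as below. The wide layout used as input (one column per year) is an assumption made for illustration; the actual layout of Table 1.1 is not described in this section, and the values are invented.

```python
import pandas as pd

# Hypothetical wide layout: one column per mid-year; values are illustrative.
wide = pd.DataFrame({2021: [1500], 2022: [2300]}, index=["net_migration"])

# Transpose so that each year becomes a row, then expose the year as a column.
tidy = wide.T.rename_axis("year").reset_index()

# Reference date: June 30 of the given year.
tidy["date"] = pd.to_datetime(tidy["year"].astype(str) + "-06-30")

print(tidy)
```

The resulting frame has the three documented columns: year, net_migration, and date.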

bolster.data_sources.nisra.migration.validate_official_migration(df)

Validate official migration data quality.

Parameters:

df (pandas.DataFrame) – DataFrame from parse_official_migration_file() or get_official_migration()

Returns:

True if validation passes

Raises:

NISRAValidationError – If validation fails

Return type:

bool

bolster.data_sources.nisra.migration.get_official_migration(force_refresh=False)

Get the latest official NISRA migration statistics.

Automatically downloads the most recent official migration estimates from NISRA and parses them into a structured DataFrame.

Parameters:

force_refresh (bool) – If True, bypass cache and download fresh data

Returns:

DataFrame with columns:

  • year: int (mid-year)

  • net_migration: int (net international migration)

  • date: pd.Timestamp (reference date)

Return type:

pandas.DataFrame

Raises:
  • NISRADataNotFoundError – If publication cannot be found

  • NISRAValidationError – If data fails validation

Example

>>> official = get_official_migration()
>>> sorted(official.columns.tolist())
['date', 'net_migration', 'year']
>>> len(official) > 0
True

bolster.data_sources.nisra.migration.get_derived_migration

bolster.data_sources.nisra.migration.compare_official_vs_derived(official_df, derived_df, threshold=1000)

Compare official migration data with derived estimates for validation.

Parameters:
  • official_df (pandas.DataFrame) – DataFrame from get_official_migration()

  • derived_df (pandas.DataFrame) – DataFrame from get_derived_migration() / get_latest_migration()

  • threshold (int) – Absolute difference threshold for flagging discrepancies (default: 1000)

Returns:

DataFrame with columns:

  • year: int

  • official_net_migration: int

  • derived_net_migration: int

  • absolute_difference: int

  • percent_difference: float

  • exceeds_threshold: bool

Return type:

pandas.DataFrame

Example

>>> official = get_official_migration()
>>> derived = get_derived_migration()
>>> comparison = compare_official_vs_derived(official, derived)
>>> sorted(comparison.columns.tolist())
['absolute_difference', 'derived_net_migration', 'exceeds_threshold', 'official_net_migration', 'percent_difference', 'year']
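To make the comparison columns concrete, the sketch below builds them from two small invented frames. It rests on assumptions the documentation does not confirm: the two series are joined on year, the difference is taken as an absolute value of derived minus official, and the percentage is relative to the official figure.

```python
import pandas as pd

# Illustrative inputs; the figures are made up.
official = pd.DataFrame({"year": [2021, 2022], "net_migration": [4000, 2500]})
derived = pd.DataFrame({"year": [2021, 2022], "net_migration": [5000, 2000]})
threshold = 750

# Keep only years present in both sources.
comparison = official.merge(derived, on="year", suffixes=("_official", "_derived"))
comparison = comparison.rename(
    columns={
        "net_migration_official": "official_net_migration",
        "net_migration_derived": "derived_net_migration",
    }
)

# Assumptions: absolute value of (derived - official); percentage relative to official.
diff = comparison["derived_net_migration"] - comparison["official_net_migration"]
comparison["absolute_difference"] = diff.abs()
comparison["percent_difference"] = diff.abs() / comparison["official_net_migration"].abs() * 100
comparison["exceeds_threshold"] = comparison["absolute_difference"] > threshold

print(comparison)
```

With a threshold of 750, the 2021 row (difference 1,000) is flagged and the 2022 row (difference 500) is not.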