bolster.data_sources.nisra.migration
NISRA Migration Estimates - Official and Derived.
This module provides access to NISRA migration data through two approaches:
Official Migration Statistics: Published NISRA long-term international migration estimates from administrative data and the International Passenger Survey (IPS).
Derived Migration Estimates: Calculated from demographic components using the demographic accounting equation:
Net Migration = Population Change - Natural Change
Net Migration = ΔPopulation - (Births - Deaths)
Both approaches are useful:
- Official statistics are authoritative but published with a lag
- Derived estimates can be calculated for more recent periods
- Comparing both validates the demographic equation approach
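As a quick worked instance of the demographic accounting equation, the sketch below plugs in made-up figures. The numbers are purely illustrative and are not NISRA data; the variable names simply mirror the terms of the equation.

```python
# Hypothetical figures for illustration only; these are NOT NISRA data.
population_start = 1_900_000  # mid-year estimate for year t-1
population_end = 1_910_000    # mid-year estimate for year t
births = 21_000
deaths = 17_000

population_change = population_end - population_start  # ΔPopulation
natural_change = births - deaths                        # Births - Deaths
net_migration = population_change - natural_change      # residual attributed to migration

print(population_change, natural_change, net_migration)
```

With these inputs the population grew by 10,000, of which 4,000 is natural change, leaving 6,000 attributed to net migration.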
- Data Sources:
Official Migration: https://www.nisra.gov.uk/statistics/population/long-term-international-migration-statistics
Derived Migration (combines three NISRA sources):
- Population: https://www.nisra.gov.uk/statistics/people-and-communities/population
- Births: https://www.nisra.gov.uk/statistics/births-deaths-and-marriages/births
- Deaths: https://www.nisra.gov.uk/statistics/births-deaths-and-marriages/deaths
Update Frequency: Annual (both official and derived)
Geographic Coverage: Northern Ireland
Reference Period: Mid-year (July to June) for official; calendar year for derived
Example
>>> from bolster.data_sources.nisra import migration
>>>
>>> # Get official NISRA migration statistics
>>> official = migration.get_official_migration()
>>> sorted(official.columns.tolist())
['date', 'net_migration', 'year']
>>> # Get derived migration estimates (from demographic equation)
>>> derived = migration.get_derived_migration()
>>> 'net_migration' in derived.columns
True
>>> # Compare official vs derived for validation
>>> comparison = migration.compare_official_vs_derived(official, derived)
>>> 'absolute_difference' in comparison.columns
True
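Attributes
MIGRATION_MOTHER_PAGE
Functions
calculate_annual_births(births_df) | Aggregate monthly births data to annual totals.
calculate_annual_deaths(deaths_df) | Aggregate weekly deaths data to annual totals.
calculate_annual_population(population_df) | Aggregate population data to annual totals for Northern Ireland.
derive_migration(population_df, births_df, deaths_df) | Derive net migration from demographic components.
get_latest_migration(force_refresh=False) | Get the latest derived migration estimates for Northern Ireland.
validate_demographic_equation(df, tolerance=100) | Validate that the demographic accounting equation holds.
get_migration_by_year(df, year) | Filter migration data for a specific year.
get_migration_summary_statistics(df, start_year=None, end_year=None) | Calculate summary statistics for migration data.
get_official_migration_publication_url() | Scrape NISRA migration mother page to find latest Official estimates file.
parse_official_migration_file(file_path) | Parse downloaded official migration Excel file into DataFrame.
validate_official_migration(df) | Validate official migration data quality.
get_official_migration(force_refresh=False) | Get the latest official NISRA migration statistics.
compare_official_vs_derived(official_df, derived_df, threshold=1000) | Compare official migration data with derived estimates for validation.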
Module Contents
- bolster.data_sources.nisra.migration.calculate_annual_births(births_df)
Aggregate monthly births data to annual totals.
- Parameters:
births_df (pandas.DataFrame) – DataFrame from births.get_latest_births(event_type='occurrence')
- Returns:
DataFrame with columns:
year: int
births: int (total births in year)
- Return type:
pandas.DataFrame
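A minimal sketch of a monthly-to-annual aggregation of this kind is shown below. It is not the module's implementation: the input column names `month` and `births` are assumptions, and the figures are invented for illustration.

```python
import pandas as pd

# Hypothetical monthly input; the real births DataFrame may use other column names.
monthly = pd.DataFrame({
    "month": pd.date_range("2022-01-01", periods=24, freq="MS"),
    "births": [1700] * 12 + [1650] * 12,  # illustrative counts, not NISRA data
})

# Derive the calendar year from each month, then sum births within each year.
annual = (
    monthly.assign(year=monthly["month"].dt.year)
    .groupby("year", as_index=False)["births"]
    .sum()
)

print(annual)
```

The result has one row per year with `year` and `births` columns, matching the shape described in the Returns section.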
- bolster.data_sources.nisra.migration.calculate_annual_deaths(deaths_df)
Aggregate weekly deaths data to annual totals.
- Parameters:
deaths_df (pandas.DataFrame) – DataFrame from deaths.get_historical_deaths()
- Returns:
DataFrame with columns:
year: int
deaths: int (total deaths in year)
- Return type:
pandas.DataFrame
- bolster.data_sources.nisra.migration.calculate_annual_population(population_df)
Aggregate population data to annual totals for Northern Ireland.
- Parameters:
population_df (pandas.DataFrame) – DataFrame from population.get_latest_population(area='Northern Ireland')
- Returns:
DataFrame with columns:
year: int
population: int (mid-year population estimate)
- Return type:
pandas.DataFrame
- bolster.data_sources.nisra.migration.derive_migration(population_df, births_df, deaths_df)
Derive net migration from demographic components.
- Uses the demographic accounting equation:
Net Migration = ΔPopulation - (Births - Deaths)
- Parameters:
population_df (pandas.DataFrame) – DataFrame from population.get_latest_population()
births_df (pandas.DataFrame) – DataFrame from births.get_latest_births(event_type='occurrence')
deaths_df (pandas.DataFrame) – DataFrame from deaths.get_latest_deaths()
- Returns:
DataFrame with columns:
year: int
population_start: int (population at start of year, June 30 t-1)
population_end: int (population at end of year, June 30 t)
births: int (births in calendar year)
deaths: int (deaths in calendar year)
natural_change: int (births - deaths)
population_change: int (population_end - population_start)
net_migration: int (derived migration estimate)
migration_rate: float (per 1,000 population)
- Return type:
pandas.DataFrame
- Raises:
NISRAValidationError – If data sources cannot be aligned
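The derivation can be sketched with pandas as follows. This is an illustration under stated assumptions rather than the module's code: the inputs are already annual (shaped like the outputs of the calculate_annual_* helpers), the figures are invented, and the choice of start-of-year population as the denominator for `migration_rate` is an assumption.

```python
import pandas as pd

# Hypothetical annual inputs; figures are illustrative, not NISRA data.
population = pd.DataFrame({"year": [2021, 2022, 2023],
                           "population": [1_900_000, 1_908_000, 1_920_000]})
births = pd.DataFrame({"year": [2022, 2023], "births": [21_000, 20_500]})
deaths = pd.DataFrame({"year": [2022, 2023], "deaths": [17_000, 17_500]})

# Pair each mid-year estimate with the previous year's estimate.
df = population.rename(columns={"population": "population_end"}).sort_values("year")
df["population_start"] = df["population_end"].shift(1)
df = df.dropna(subset=["population_start"])  # first year has no predecessor

# Align the three sources on year; an inner join keeps only years present in all.
df = df.merge(births, on="year").merge(deaths, on="year")

df["natural_change"] = df["births"] - df["deaths"]
df["population_change"] = df["population_end"] - df["population_start"]
df["net_migration"] = (df["population_change"] - df["natural_change"]).astype(int)

# Assumption: rate per 1,000 of the start-of-year population.
df["migration_rate"] = df["net_migration"] / df["population_start"] * 1000

print(df[["year", "natural_change", "population_change", "net_migration"]])
```

Because the population figures are mid-year estimates while births and deaths are calendar-year totals, the reference periods do not line up exactly, which is one reason a derived figure can differ from the official one.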
- bolster.data_sources.nisra.migration.get_latest_migration(force_refresh=False)
Get the latest derived migration estimates for Northern Ireland.
Automatically downloads the most recent population, births, and deaths data, then calculates net migration using the demographic accounting equation.
- Parameters:
force_refresh (bool) – If True, bypass cache and download fresh data for all sources
- Returns:
DataFrame with columns:
year: int
population_start, population_end: int (mid-year estimates)
births, deaths: int (annual totals)
natural_change: int (births - deaths)
population_change: int (year-over-year change)
net_migration: int (derived estimate)
migration_rate: float (per 1,000 population)
- Return type:
pandas.DataFrame
Example
>>> df = get_latest_migration()
>>> 'net_migration' in df.columns
True
>>> len(df) > 0
True
- bolster.data_sources.nisra.migration.validate_demographic_equation(df, tolerance=100)
Validate that the demographic accounting equation holds.
- Checks that:
Population Change = Natural Change + Net Migration
- Parameters:
df (pandas.DataFrame) – DataFrame from derive_migration() or get_latest_migration()
tolerance (int) – Allowable difference due to rounding/measurement error (default: 100)
- Returns:
True if validation passes
- Raises:
NISRAValidationError – If equation doesn’t hold within tolerance
- Return type:
bool
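A tolerance check of this kind might look like the sketch below. The function name `equation_holds` is hypothetical, and unlike the documented function it returns False instead of raising NISRAValidationError, so that the example stays self-contained.

```python
import pandas as pd

def equation_holds(df: pd.DataFrame, tolerance: int = 100) -> bool:
    """Hypothetical stand-in: True if every row balances within `tolerance`."""
    # Residual of: Population Change = Natural Change + Net Migration
    residual = df["population_change"] - (df["natural_change"] + df["net_migration"])
    return bool((residual.abs() <= tolerance).all())

# Illustrative rows, not NISRA data.
balanced = pd.DataFrame({"population_change": [8000],
                         "natural_change": [4000],
                         "net_migration": [4000]})
unbalanced = balanced.assign(net_migration=[3500])  # residual of 500

print(equation_holds(balanced), equation_holds(unbalanced))
```

The second frame is off by 500, which exceeds the default tolerance of 100, so the check fails for it.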
- bolster.data_sources.nisra.migration.get_migration_by_year(df, year)
Filter migration data for a specific year.
- Parameters:
df (pandas.DataFrame) – DataFrame from get_latest_migration()
year (int) – Year to filter
- Returns:
Filtered DataFrame
- Return type:
pandas.DataFrame
Example
>>> df = get_latest_migration()
>>> df_2024 = get_migration_by_year(df, 2024)
>>> 'net_migration' in df_2024.columns
True
- bolster.data_sources.nisra.migration.get_migration_summary_statistics(df, start_year=None, end_year=None)
Calculate summary statistics for migration data.
- Parameters:
df (pandas.DataFrame) – DataFrame from get_latest_migration()
start_year (int | None) – Optional start year for analysis period
end_year (int | None) – Optional end year for analysis period
- Returns:
Dictionary with summary statistics:
total_years: Number of years analyzed
avg_net_migration: Average annual net migration
avg_migration_rate: Average migration rate per 1,000
positive_years: Number of years with net immigration
negative_years: Number of years with net emigration
max_immigration_year: Year with highest immigration
max_immigration: Highest immigration value
max_emigration_year: Year with highest emigration
max_emigration: Highest emigration value (as negative)
- Return type:
dict
Example
>>> df = get_latest_migration()
>>> stats = get_migration_summary_statistics(df, start_year=2010)
>>> 'avg_net_migration' in stats
True
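To make the dictionary keys concrete, here is one way such a summary could be computed. The function `summarise` is a hypothetical stand-in, not the module's implementation; it assumes `year`, `net_migration` and `migration_rate` columns, an inclusive year range, and a non-empty selection.

```python
import pandas as pd

def summarise(df, start_year=None, end_year=None):
    """Hypothetical sketch; assumes a non-empty selection and inclusive bounds."""
    if start_year is not None:
        df = df[df["year"] >= start_year]
    if end_year is not None:
        df = df[df["year"] <= end_year]
    highest = df.loc[df["net_migration"].idxmax()]  # row with the largest net inflow
    lowest = df.loc[df["net_migration"].idxmin()]   # row with the largest net outflow
    return {
        "total_years": len(df),
        "avg_net_migration": float(df["net_migration"].mean()),
        "avg_migration_rate": float(df["migration_rate"].mean()),
        "positive_years": int((df["net_migration"] > 0).sum()),
        "negative_years": int((df["net_migration"] < 0).sum()),
        "max_immigration_year": int(highest["year"]),
        "max_immigration": int(highest["net_migration"]),
        "max_emigration_year": int(lowest["year"]),
        "max_emigration": int(lowest["net_migration"]),
    }

# Illustrative data, not NISRA figures.
sample = pd.DataFrame({
    "year": [2019, 2020, 2021, 2022],
    "net_migration": [3000, -1000, 2000, 5000],
    "migration_rate": [1.6, -0.5, 1.0, 2.6],
})
stats = summarise(sample, start_year=2020)
print(stats)
```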
- bolster.data_sources.nisra.migration.MIGRATION_MOTHER_PAGE = 'https://www.nisra.gov.uk/statistics/population/long-term-international-migration-statistics'
- bolster.data_sources.nisra.migration.get_official_migration_publication_url()
Scrape NISRA migration mother page to find latest Official estimates file.
Navigates the publication structure:
1. Scrapes the mother page for the latest “Long-Term International Migration” publication
2. Follows the link to the publication detail page
3. Finds the “Official” Excel file (Mig[YY][YY]-Official_1.xlsx)
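The last step, picking the “Official” file out of a page's links, can be sketched with the standard library alone. How the module actually fetches and parses pages is not stated here, so everything below is an assumption: the helper `find_official_file`, the sample HTML, the file name `Mig2223-Official_1.xlsx` (one possible instance of the Mig[YY][YY] pattern) and the base URL are all hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches names of the form Mig[YY][YY]-Official_1.xlsx at the end of a link.
OFFICIAL_FILE = re.compile(r"Mig\d{4}-Official_1\.xlsx$", re.IGNORECASE)

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def find_official_file(html, base_url):
    """Hypothetical helper: return the absolute URL of the first matching link, or None."""
    parser = LinkCollector()
    parser.feed(html)
    for href in parser.links:
        if OFFICIAL_FILE.search(href):
            return urljoin(base_url, href)
    return None

# Made-up page fragment and base URL, for illustration only.
sample_html = (
    '<a href="/files/Mig2223-In.xlsx">Inflows</a>'
    '<a href="/files/Mig2223-Official_1.xlsx">Official</a>'
)
url = find_official_file(sample_html, "https://www.nisra.gov.uk/publications/example")
print(url)
```

Anchoring the pattern on the file name rather than on link text keeps the match independent of how the publication page labels its downloads.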
- bolster.data_sources.nisra.migration.parse_official_migration_file(file_path)
Parse downloaded official migration Excel file into DataFrame.
Extracts Table 1.1 (Net International Migration time series) from the Official estimates file and transforms it into a long-format DataFrame.
- Parameters:
file_path (pathlib.Path) – Path to downloaded Mig[YY][YY]-Official_1.xlsx file
- Returns:
DataFrame with columns:
year: int (mid-year)
net_migration: int (net international migration)
date: pd.Timestamp (reference date, June 30 of end year)
- Return type:
pandas.DataFrame
- Raises:
NISRAValidationError – If file format is unexpected or parsing fails
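The layout of Table 1.1 is not described here, so the sketch below covers only the final step: attaching a June 30 reference date to each mid-year row. The input frame and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical already-extracted rows; values are illustrative, not NISRA data.
parsed = pd.DataFrame({"year": [2021, 2022], "net_migration": [2500, 4100]})

# Reference date: June 30 of the year that each mid-year period ends in.
parsed["date"] = pd.to_datetime(parsed["year"].astype(str) + "-06-30")

print(parsed)
```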
- bolster.data_sources.nisra.migration.validate_official_migration(df)
Validate official migration data quality.
- Parameters:
df (pandas.DataFrame) – DataFrame from parse_official_migration_file() or get_official_migration()
- Returns:
True if validation passes
- Raises:
NISRAValidationError – If validation fails
- Return type:
bool
- bolster.data_sources.nisra.migration.get_official_migration(force_refresh=False)
Get the latest official NISRA migration statistics.
Automatically downloads the most recent official migration estimates from NISRA and parses them into a structured DataFrame.
- Parameters:
force_refresh (bool) – If True, bypass cache and download fresh data
- Returns:
DataFrame with columns:
year: int (mid-year)
net_migration: int (net international migration)
date: pd.Timestamp (reference date)
- Return type:
pandas.DataFrame
- Raises:
NISRADataNotFoundError – If publication cannot be found
NISRAValidationError – If data fails validation
Example
>>> official = get_official_migration()
>>> sorted(official.columns.tolist())
['date', 'net_migration', 'year']
>>> len(official) > 0
True
- bolster.data_sources.nisra.migration.compare_official_vs_derived(official_df, derived_df, threshold=1000)
Compare official migration data with derived estimates for validation.
- Parameters:
official_df (pandas.DataFrame) – DataFrame from get_official_migration()
derived_df (pandas.DataFrame) – DataFrame from get_derived_migration() / get_latest_migration()
threshold (int) – Absolute difference threshold for flagging discrepancies (default: 1000)
- Returns:
DataFrame with columns:
year: int
official_net_migration: int
derived_net_migration: int
absolute_difference: int
percent_difference: float
exceeds_threshold: bool
- Return type:
pandas.DataFrame
Example
>>> official = get_official_migration()
>>> derived = get_derived_migration()
>>> comparison = compare_official_vs_derived(official, derived)
>>> sorted(comparison.columns.tolist())
['absolute_difference', 'derived_net_migration', 'exceeds_threshold', 'official_net_migration', 'percent_difference', 'year']
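A comparison producing these columns could be built with a merge on `year`, as sketched below. This is not the module's code: the input figures are invented, and two details are assumptions, namely that the percentage is taken relative to the official figure and that a row is flagged only when the difference is strictly greater than the threshold.

```python
import pandas as pd

# Illustrative inputs, not NISRA data.
official = pd.DataFrame({"year": [2021, 2022], "net_migration": [2500, 4100]})
derived = pd.DataFrame({"year": [2021, 2022], "net_migration": [2900, 5600]})
threshold = 1000

# Keep only years present in both sources and label each estimate.
comparison = official[["year", "net_migration"]].merge(
    derived[["year", "net_migration"]],
    on="year",
    suffixes=("_official", "_derived"),
).rename(columns={
    "net_migration_official": "official_net_migration",
    "net_migration_derived": "derived_net_migration",
})

comparison["absolute_difference"] = (
    comparison["derived_net_migration"] - comparison["official_net_migration"]
).abs()

# Assumption: percentage is relative to the official estimate.
comparison["percent_difference"] = (
    comparison["absolute_difference"] / comparison["official_net_migration"].abs() * 100
)

# Assumption: strictly greater than the threshold counts as a discrepancy.
comparison["exceeds_threshold"] = comparison["absolute_difference"] > threshold

print(comparison)
```

Some gap between the two series is to be expected, since the official figures use a mid-year reference period while the derived estimates use calendar-year births and deaths.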