bolster.data_sources.nisra.ashe
===============================

.. py:module:: bolster.data_sources.nisra.ashe

.. autoapi-nested-parse::

   NISRA Annual Survey of Hours and Earnings (ASHE) Module.

   This module provides access to Northern Ireland's earnings statistics:

   - Median weekly, hourly, and annual earnings
   - Breakdowns by employment type, sector, geography, occupation, industry
   - Gender pay gap analysis
   - Historical timeseries from 1997 to present

   Data is published annually in October by NISRA's Economic & Labour Market Statistics Branch.

   Data Source: Northern Ireland Statistics and Research Agency provides Annual Survey of Hours
   and Earnings statistics through their Work, Pay and Benefits section at
   https://www.nisra.gov.uk/statistics/work-pay-and-benefits/annual-survey-hours-and-earnings.
   ASHE data covers employee earnings across all sectors based on a sample survey of payroll
   records from HMRC PAYE data, providing comprehensive earnings statistics for Northern Ireland.

   Update Frequency: Annual publications released in October each year, covering earnings data
   for the reference period of April. The dataset provides the most comprehensive and official
   source of employee earnings statistics for Northern Ireland, updated once per year with
   historical revisions as necessary.

   Data Coverage:

   - Weekly Earnings: 1997 - Present (annual, full-time/part-time/all)
   - Hourly Earnings: 1997 - Present (annual, excluding overtime)
   - Annual Earnings: 1999 - Present (annual, full-time/part-time/all)
   - Geographic: 11 Local Government Districts (workplace vs residence basis)
   - Sector: Public vs Private sector comparison (2005 - Present)

   .. rubric:: Examples

   >>> from bolster.data_sources.nisra import ashe
   >>> # Get latest weekly earnings timeseries
   >>> df = ashe.get_latest_ashe_timeseries(metric='weekly')
   >>> sorted(df.columns.tolist())
   ['median_weekly_earnings', 'work_pattern', 'year']
   >>> # Get geographic earnings by workplace
   >>> df_geo = ashe.get_latest_ashe_geography(basis='workplace')
   >>> 'median_weekly_earnings' in df_geo.columns
   True
   >>> # Get public vs private sector comparison
   >>> df_sector = ashe.get_latest_ashe_sector()
   >>> 'location' in df_sector.columns
   True

   Publication Details:

   - Frequency: Annual (October publication)
   - Reference period: April of each year
   - Published by: NISRA Economic & Labour Market Statistics Branch
   - Contact: economicstats@nisra.gov.uk
   - Base: Employee jobs in Northern Ireland (not self-employed)

Attributes
----------

.. autoapisummary::

   bolster.data_sources.nisra.ashe.logger
   bolster.data_sources.nisra.ashe.ASHE_BASE_URL

Functions
---------

.. autoapisummary::

   bolster.data_sources.nisra.ashe.get_latest_ashe_publication_url
   bolster.data_sources.nisra.ashe.get_ashe_file_url
   bolster.data_sources.nisra.ashe.parse_ashe_timeseries_weekly
   bolster.data_sources.nisra.ashe.parse_ashe_timeseries_hourly
   bolster.data_sources.nisra.ashe.parse_ashe_timeseries_annual
   bolster.data_sources.nisra.ashe.parse_ashe_geography
   bolster.data_sources.nisra.ashe.parse_ashe_sector
   bolster.data_sources.nisra.ashe.get_latest_ashe_timeseries
   bolster.data_sources.nisra.ashe.get_latest_ashe_geography
   bolster.data_sources.nisra.ashe.get_latest_ashe_sector
   bolster.data_sources.nisra.ashe.get_earnings_by_year
   bolster.data_sources.nisra.ashe.calculate_growth_rates
   bolster.data_sources.nisra.ashe.parse_ashe_gender_pay_gap
   bolster.data_sources.nisra.ashe.parse_ashe_hourly_earnings_by_sector_gender
   bolster.data_sources.nisra.ashe.parse_ashe_hourly_earnings_by_age_gender
   bolster.data_sources.nisra.ashe.parse_ashe_hourly_earnings_by_occupation_gender
   bolster.data_sources.nisra.ashe.parse_ashe_hourly_earnings_by_pattern_gender
   bolster.data_sources.nisra.ashe.parse_ashe_ni_uk_earnings_comparison
   bolster.data_sources.nisra.ashe.parse_ashe_uk_regional_pay_ratio
   bolster.data_sources.nisra.ashe.parse_ashe_hours_distribution
   bolster.data_sources.nisra.ashe.parse_ashe_working_pattern_pay_gap
   bolster.data_sources.nisra.ashe.parse_ashe_mean_hours_by_pattern_gender
   bolster.data_sources.nisra.ashe.get_gender_pay_gap
   bolster.data_sources.nisra.ashe.get_hourly_earnings_by_sector_gender
   bolster.data_sources.nisra.ashe.get_hourly_earnings_by_age_gender
   bolster.data_sources.nisra.ashe.get_hourly_earnings_by_occupation_gender
   bolster.data_sources.nisra.ashe.get_hourly_earnings_by_pattern_gender
   bolster.data_sources.nisra.ashe.get_ni_uk_earnings_comparison
   bolster.data_sources.nisra.ashe.get_uk_regional_pay_ratio
   bolster.data_sources.nisra.ashe.get_hours_distribution
   bolster.data_sources.nisra.ashe.get_working_pattern_pay_gap
   bolster.data_sources.nisra.ashe.get_mean_hours_by_pattern_gender
   bolster.data_sources.nisra.ashe.parse_ashe_real_earnings
   bolster.data_sources.nisra.ashe.parse_ashe_real_earnings_change_by_pattern
   bolster.data_sources.nisra.ashe.parse_ashe_real_earnings_index_by_sector
   bolster.data_sources.nisra.ashe.parse_ashe_annual_change_by_occupation
   bolster.data_sources.nisra.ashe.parse_ashe_annual_change_by_industry
   bolster.data_sources.nisra.ashe.parse_ashe_pay_distribution_timeseries
   bolster.data_sources.nisra.ashe.parse_ashe_pay_distribution_by_classification
   bolster.data_sources.nisra.ashe.get_real_earnings
   bolster.data_sources.nisra.ashe.get_real_earnings_change_by_pattern
   bolster.data_sources.nisra.ashe.get_real_earnings_index_by_sector
   bolster.data_sources.nisra.ashe.get_annual_change_by_occupation
   bolster.data_sources.nisra.ashe.get_annual_change_by_industry
   bolster.data_sources.nisra.ashe.get_pay_distribution_timeseries
   bolster.data_sources.nisra.ashe.get_pay_distribution_by_classification
   bolster.data_sources.nisra.ashe.validate_ashe_data

Module Contents
---------------

.. py:data:: logger

.. py:data:: ASHE_BASE_URL
   :value: 'https://www.nisra.gov.uk/statistics/work-pay-and-benefits/annual-survey-hours-and-earnings'

.. py:function:: get_latest_ashe_publication_url()

   Get the URL of the latest ASHE publication and its year.

   Scrapes the NISRA ASHE page to find the most recent publication.

   :returns: Tuple of (publication_url, year)
   :raises NISRADataNotFoundError: If unable to find the latest publication

   .. rubric:: Example

   >>> url, year = get_latest_ashe_publication_url()
   >>> url.startswith('https://')
   True

.. py:function:: get_ashe_file_url(year, file_type = 'timeseries')

   Construct URL for ASHE file based on year and file type.

   :param year: Publication year (e.g., 2025)
   :param file_type: Type of file - 'timeseries' or 'linked'
   :returns: URL to the Excel file

   .. rubric:: Example

   >>> url = get_ashe_file_url(2025, 'timeseries')
   >>> url.startswith('https://')
   True

.. py:function:: parse_ashe_timeseries_weekly(file_path)

   Parse ASHE weekly earnings timeseries.

   Extracts the weekly earnings data from the timeseries Excel file.

   :param file_path: Path to the ASHE timeseries Excel file
   :returns: - year: int
             - work_pattern: str ('Full-time', 'Part-time', 'All')
             - median_weekly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'timeseries'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_timeseries_weekly(path)
   >>> sorted(df.columns.tolist())
   ['median_weekly_earnings', 'work_pattern', 'year']

.. py:function:: parse_ashe_timeseries_hourly(file_path)

   Parse ASHE hourly earnings timeseries.

   Extracts the hourly earnings data (excluding overtime) from the timeseries Excel file.

   :param file_path: Path to the ASHE timeseries Excel file
   :returns: - year: int
             - work_pattern: str ('Full-time', 'Part-time', 'All')
             - median_hourly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'timeseries'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_timeseries_hourly(path)
   >>> sorted(df.columns.tolist())
   ['median_hourly_earnings', 'work_pattern', 'year']

.. py:function:: parse_ashe_timeseries_annual(file_path)

   Parse ASHE annual earnings timeseries.

   Extracts the annual earnings data from the timeseries Excel file.

   :param file_path: Path to the ASHE timeseries Excel file
   :returns: - year: int
             - work_pattern: str ('Full-time', 'Part-time', 'All')
             - median_annual_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'timeseries'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_timeseries_annual(path)
   >>> sorted(df.columns.tolist())
   ['median_annual_earnings', 'work_pattern', 'year']

.. py:function:: parse_ashe_geography(file_path, basis = 'workplace', year = None)

   Parse ASHE geographic earnings data.

   Extracts earnings by Local Government District from the linked tables file.

   :param file_path: Path to the ASHE linked tables Excel file
   :param basis: 'workplace' (MapA) or 'residence' (MapB)
   :param year: Year of the data (if not provided, will be extracted from file)
   :returns: - year: int
             - lgd: str (Local Government District name)
             - basis: str ('workplace' or 'residence')
             - median_weekly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_geography(path, basis='workplace', year=year)
   >>> 'median_weekly_earnings' in df.columns
   True

.. py:function:: parse_ashe_sector(file_path)

   Parse ASHE public vs private sector earnings.

   Extracts public and private sector earnings timeseries from the linked tables file.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - location: str ('Northern Ireland' or 'United Kingdom')
             - sector: str ('Public' or 'Private')
             - median_weekly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_sector(path)
   >>> 'location' in df.columns
   True

.. py:function:: get_latest_ashe_timeseries(metric = 'weekly', force_refresh = False)

   Get the latest ASHE timeseries data.

   Downloads and parses the most recent ASHE timeseries publication.
   Results are cached for 90 days unless force_refresh=True.

   :param metric: Type of earnings - 'weekly', 'hourly', or 'annual'
   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with timeseries earnings data (1997-present for weekly/hourly, 1999-present for annual)

   .. rubric:: Example

   >>> df = get_latest_ashe_timeseries(metric='weekly')
   >>> sorted(df.columns.tolist())
   ['median_weekly_earnings', 'work_pattern', 'year']

.. py:function:: get_latest_ashe_geography(basis = 'workplace', force_refresh = False)

   Get the latest ASHE geographic earnings data.

   Downloads and parses the most recent ASHE linked tables publication.
   Results are cached for 90 days unless force_refresh=True.

   :param basis: 'workplace' (where employees work) or 'residence' (where employees live)
   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with earnings by Local Government District

   .. rubric:: Example

   >>> df = get_latest_ashe_geography(basis='workplace')
   >>> 'median_weekly_earnings' in df.columns
   True

.. py:function:: get_latest_ashe_sector(force_refresh = False)

   Get the latest ASHE public vs private sector earnings data.

   Downloads and parses the most recent ASHE linked tables publication.
   Results are cached for 90 days unless force_refresh=True.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with public and private sector earnings timeseries (2005-present)

   .. rubric:: Example

   >>> df = get_latest_ashe_sector()
   >>> 'location' in df.columns
   True

.. py:function:: get_earnings_by_year(df, year)

   Filter earnings data for a specific year.

   :param df: ASHE DataFrame
   :param year: Year to filter for
   :returns: DataFrame with only the specified year's data

   .. rubric:: Example

   >>> df = get_latest_ashe_timeseries('weekly')
   >>> df_2025 = get_earnings_by_year(df, 2025)
   >>> 'median_weekly_earnings' in df_2025.columns
   True

.. py:function:: calculate_growth_rates(df, periods = 1)

   Calculate year-on-year growth rates for earnings.

   :param df: ASHE DataFrame with 'year' and earnings column
   :param periods: Number of years for comparison (default: 1 for YoY)
   :returns: DataFrame with additional growth rate column

   .. rubric:: Example

   >>> df = get_latest_ashe_timeseries('weekly')
   >>> df_growth = calculate_growth_rates(df)
   >>> 'earnings_yoy_growth' in df_growth.columns
   True

.. py:function:: parse_ashe_gender_pay_gap(file_path)

   Parse ASHE gender pay gap timeseries (NI and UK), any publication year.

   Extracts the NI and UK all-employee gender pay gap from 2005 to present.
   The gap is defined as the difference between male and female median hourly
   earnings as a percentage of male median hourly earnings (all employees,
   excluding overtime).

   Note: methodological changes occurred in 2006, 2011 and 2021. These are
   annotated in NISRA publications and should be considered when interpreting
   trend breaks.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - location: str ('Northern Ireland' or 'United Kingdom')
             - gender_pay_gap_pct: float, GPG as % of male earnings (positive = men paid more)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_gender_pay_gap(path)
   >>> sorted(df.columns.tolist())
   ['gender_pay_gap_pct', 'location', 'year']

.. py:function:: parse_ashe_hourly_earnings_by_sector_gender(file_path)

   Parse ASHE hourly earnings by sector and gender timeseries.

   Identified by column signature
   ['Year', 'Male public', 'Female public', 'Male private', 'Female private'],
   which is stable across all publication years.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - sector: str ('Public' or 'Private')
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£, excluding overtime)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_hourly_earnings_by_sector_gender(path)
   >>> sorted(df.columns.tolist())
   ['median_hourly_earnings', 'sector', 'sex', 'year']

.. py:function:: parse_ashe_hourly_earnings_by_age_gender(file_path)

   Parse ASHE hourly earnings by age group and gender, latest year snapshot.

   Identified by column signature ['Age group', 'Female', 'Male'] with subtitle
   containing 'age'. Present in all publication years, though in different
   figure slots.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - age_group: str (e.g. '18-21', '22-29', '30-39', '40-49', '50-59', '60+')
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£, excluding overtime)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_hourly_earnings_by_age_gender(path)
   >>> sorted(df.columns.tolist())
   ['age_group', 'median_hourly_earnings', 'sex']

.. py:function:: parse_ashe_hourly_earnings_by_occupation_gender(file_path)

   Parse ASHE hourly earnings by occupation and gender, latest year snapshot.

   Identified by column signature ['Occupation', 'Female', 'Male'] with subtitle
   containing 'occupation'. Present in all publication years.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - occupation: str (SOC major group label)
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£, excluding overtime)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_hourly_earnings_by_occupation_gender(path)
   >>> sorted(df.columns.tolist())
   ['median_hourly_earnings', 'occupation', 'sex']

.. py:function:: parse_ashe_hourly_earnings_by_pattern_gender(file_path)

   Parse ASHE hourly earnings by working pattern and gender, latest year snapshot.

   Identified by column signature ['Working pattern', 'Female', 'Male'] with
   subtitle containing 'working pattern' (excluding pay gap / hours tables which
   share similar columns). Present in all publication years.

   Note: in NI, women working part-time earn *more* per hour than men working
   part-time, a reversal of the full-time pattern, documented across 2022–2025.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - work_pattern: str ('Full-time', 'Part-time', 'All Employees')
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£, excluding overtime)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_hourly_earnings_by_pattern_gender(path)
   >>> sorted(df.columns.tolist())
   ['median_hourly_earnings', 'sex', 'work_pattern']

.. py:function:: parse_ashe_ni_uk_earnings_comparison(file_path)

   Parse ASHE NI vs UK full-time weekly earnings timeseries.

   Identified by column signature ['Year', 'UK', 'NI'] with subtitle containing
   'weekly' and 'full-time'. Stable across all publication years (always Figure 1).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - location: str ('NI' or 'UK')
             - median_weekly_earnings_fulltime: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_ni_uk_earnings_comparison(path)
   >>> sorted(df.columns.tolist())
   ['location', 'median_weekly_earnings_fulltime', 'year']

.. py:function:: parse_ashe_uk_regional_pay_ratio(file_path)

   Parse ASHE high-to-low pay ratio by UK region, latest year snapshot.

   Identified by column signature ['Region', 'Ratio']. Present in all years but
   in different figure slots (Figure 14 in 2022, Figure 16 in 2023, Figure 13 in
   2024–2025).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - region: str (UK region name)
             - ratio: float (high-paid / low-paid jobs ratio)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_uk_regional_pay_ratio(path)
   >>> sorted(df.columns.tolist())
   ['ratio', 'region']

.. py:function:: parse_ashe_hours_distribution(file_path)

   Parse ASHE distribution of total weekly paid hours, NI, latest year snapshot.

   Identified by column signature ['Paid hours worked', 'Percentage'].
   Present in all years but in different figure slots (Figure 3 in 2022–2023,
   Figure 9 in 2024–2025).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - paid_hours_worked: int (hours 0–80)
             - percentage: float (% of employees)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_hours_distribution(path)
   >>> sorted(df.columns.tolist())
   ['paid_hours_worked', 'percentage']

.. py:function:: parse_ashe_working_pattern_pay_gap(file_path)

   Parse ASHE working pattern pay gap timeseries, NI vs UK.

   Identified by column signature ['Year', 'UK', 'NI'] with subtitle containing
   'working pattern pay gap'. Present from 2023 onwards (Figure 23 in 2023,
   Figure 19 in 2024–2025).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - location: str ('NI' or 'UK')
             - working_pattern_pay_gap_pct: float (%)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_working_pattern_pay_gap(path)
   >>> sorted(df.columns.tolist())
   ['location', 'working_pattern_pay_gap_pct', 'year']

.. py:function:: parse_ashe_mean_hours_by_pattern_gender(file_path)

   Parse ASHE mean weekly paid hours by work pattern and gender, NI, latest year.

   Identified by column signature ['Working pattern', 'Males', 'Females', 'All'].
   Present in all years (Figure 21 in 2022–2023, Figure 20 in 2024–2025).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - work_pattern: str ('Part-time', 'Full-time', 'All Employees')
             - male_mean_hours: float
             - female_mean_hours: float
             - all_mean_hours: float
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_mean_hours_by_pattern_gender(path)
   >>> 'work_pattern' in df.columns
   True

.. py:function:: get_gender_pay_gap(force_refresh = False)

   Get ASHE gender pay gap timeseries for NI and the UK.

   Returns the population-level GPG derived from NISRA's ASHE survey: the
   difference between male and female median hourly earnings as a percentage of
   male earnings, for all employees.

   This is survey-based (HMRC PAYE sample) and covers the whole NI economy,
   complementing the mandatory employer-reported GPG data available via
   ``bolster.data_sources.gender_pay_gap`` (which covers named employers with
   250+ staff only).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: - year: int (2005–present)
             - location: str ('Northern Ireland' or 'United Kingdom')
             - gender_pay_gap_pct: float
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_gender_pay_gap()
   >>> 'gender_pay_gap_pct' in df.columns
   True

.. py:function:: get_hourly_earnings_by_sector_gender(force_refresh = False)

   Get ASHE hourly earnings by sector and gender timeseries for NI.

   Returns median gross hourly earnings (excl. overtime) for NI employees by
   public/private sector and sex, from 2005 to present.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: - year: int (2005–present)
             - sector: str ('Public' or 'Private')
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_hourly_earnings_by_sector_gender()
   >>> 'median_hourly_earnings' in df.columns
   True

.. py:function:: get_hourly_earnings_by_age_gender(force_refresh = False)

   Get ASHE hourly earnings by age group and gender for NI, latest year snapshot.

   Returns median gross hourly earnings (excl. overtime) for NI employees by
   age band and sex.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: - age_group: str
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_hourly_earnings_by_age_gender()
   >>> 'median_hourly_earnings' in df.columns
   True

.. py:function:: get_hourly_earnings_by_occupation_gender(force_refresh = False)

   Get ASHE hourly earnings by occupation and gender for NI, latest year snapshot.

   Returns median gross hourly earnings (excl. overtime) for NI employees by
   SOC major occupation group and sex.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: - occupation: str
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_hourly_earnings_by_occupation_gender()
   >>> 'median_hourly_earnings' in df.columns
   True

.. py:function:: get_hourly_earnings_by_pattern_gender(force_refresh = False)

   Get ASHE hourly earnings by working pattern and gender for NI, latest year snapshot.

   Returns median gross hourly earnings (excl. overtime) for NI employees by
   full-time/part-time and sex.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: - work_pattern: str ('Full-time', 'Part-time', 'All Employees')
             - sex: str ('Male' or 'Female')
             - median_hourly_earnings: float (£)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_hourly_earnings_by_pattern_gender()
   >>> 'median_hourly_earnings' in df.columns
   True

.. py:function:: get_ni_uk_earnings_comparison(force_refresh = False)

   Get NI vs UK full-time weekly earnings timeseries.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: year, location ('NI'/'UK'), median_weekly_earnings_fulltime
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_ni_uk_earnings_comparison()
   >>> 'median_weekly_earnings_fulltime' in df.columns
   True

.. py:function:: get_uk_regional_pay_ratio(force_refresh = False)

   Get high-to-low pay ratio by UK region, latest year snapshot.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: region, ratio
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_uk_regional_pay_ratio()
   >>> 'ratio' in df.columns
   True

.. py:function:: get_hours_distribution(force_refresh = False)

   Get distribution of total weekly paid hours for NI employees, latest year snapshot.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: paid_hours_worked, percentage
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_hours_distribution()
   >>> 'percentage' in df.columns
   True

.. py:function:: get_working_pattern_pay_gap(force_refresh = False)

   Get working pattern pay gap timeseries for NI vs UK.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: year, location ('NI'/'UK'), working_pattern_pay_gap_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_working_pattern_pay_gap()
   >>> 'working_pattern_pay_gap_pct' in df.columns
   True

.. py:function:: get_mean_hours_by_pattern_gender(force_refresh = False)

   Get mean weekly paid hours by work pattern and gender for NI, latest year snapshot.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: work_pattern, male_mean_hours, female_mean_hours, all_mean_hours
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_mean_hours_by_pattern_gender()
   >>> 'male_mean_hours' in df.columns
   True

.. py:function:: parse_ashe_real_earnings(file_path)

   Parse ASHE Figure 2: nominal vs real weekly earnings timeseries for NI.

   Identified by column signature ['Year', 'Nominal earnings', 'Real earnings'].
   Real earnings are inflation-adjusted to the latest publication year using
   NISRA's deflator. The series covers full-time employees, April 2005 onwards.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - nominal_weekly_earnings: float (£, in nominal terms)
             - real_weekly_earnings: float (£, inflation-adjusted to latest year)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_real_earnings(path)
   >>> sorted(df.columns.tolist())
   ['nominal_weekly_earnings', 'real_weekly_earnings', 'year']

.. py:function:: parse_ashe_real_earnings_change_by_pattern(file_path)

   Parse ASHE Figure 3: annual % change in weekly earnings by work pattern (NI).

   Snapshot for the latest reference year showing nominal vs real
   (inflation-adjusted) annual change for each working pattern.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - work_pattern: str ('Part-time', 'Full-time', 'All employees')
             - nominal_change_pct: float (annual % change in nominal terms)
             - real_change_pct: float (annual % change in real terms)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_real_earnings_change_by_pattern(path)
   >>> sorted(df.columns.tolist())
   ['nominal_change_pct', 'real_change_pct', 'work_pattern']

.. py:function:: parse_ashe_real_earnings_index_by_sector(file_path)

   Parse ASHE Figure 6: real earnings index (2019=100) by sector for NI.

   Identified by column signature ['Year', 'Public', 'Private'] with subtitle
   containing 'annual index'. Indexed real (inflation-adjusted) median weekly
   earnings for full-time employees, base year = six years before the latest
   publication year (2019 in the 2025 publication).

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - sector: str ('Public' or 'Private')
             - real_earnings_index: float (index, base year = 100)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_real_earnings_index_by_sector(path)
   >>> sorted(df.columns.tolist())
   ['real_earnings_index', 'sector', 'year']

.. py:function:: parse_ashe_annual_change_by_occupation(file_path)

   Parse ASHE Figure 7: annual % change in weekly earnings by occupation (NI, FT).

   Full-time employees only. NISRA labels the column "Industry" in the source
   spreadsheet, but the rows are SOC major occupation groups; the table is
   disambiguated from Figure 8 by subtitle keyword 'by occupation'.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - occupation: str (SOC major group label)
             - annual_change_pct: float (% change in median gross weekly earnings)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_annual_change_by_occupation(path)
   >>> sorted(df.columns.tolist())
   ['annual_change_pct', 'occupation']

.. py:function:: parse_ashe_annual_change_by_industry(file_path)

   Parse ASHE Figure 8: annual % change in weekly earnings by industry (NI, FT).

   Full-time employees by SIC industry section. Disambiguated from Figure 7 by
   subtitle keyword 'by industry'.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - industry: str (SIC section label)
             - annual_change_pct: float (% change in median gross weekly earnings)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_annual_change_by_industry(path)
   >>> sorted(df.columns.tolist())
   ['annual_change_pct', 'industry']

.. py:function:: parse_ashe_pay_distribution_timeseries(file_path)

   Parse ASHE Figure 11: proportion of low/middle/high-paid employee jobs in NI.

   Identified by column signature
   ['Year', 'Low-paid jobs', 'Middle-paid jobs', 'High-paid jobs'].
   Covers April 2005 to the latest publication year. NISRA defines pay bands
   relative to UK median hourly earnings: low-paid = below two-thirds,
   high-paid = above 1.5x, middle-paid = the rest.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - year: int
             - low_paid_pct: float
             - middle_paid_pct: float
             - high_paid_pct: float
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_pay_distribution_timeseries(path)
   >>> sorted(df.columns.tolist())
   ['high_paid_pct', 'low_paid_pct', 'middle_paid_pct', 'year']

.. py:function:: parse_ashe_pay_distribution_by_classification(file_path)

   Parse ASHE Figure 12: pay distribution by working pattern, age, industry, occupation.

   Cross-sectional breakdown of low/middle/high-paid proportions for the latest
   reference year, with multiple classification axes stacked in a single sheet.

   :param file_path: Path to the ASHE linked tables Excel file
   :returns: - classification: str (e.g. 'Working pattern', 'Age group', 'Industry', 'Occupation')
             - subset: str (the specific category, e.g. 'Full-time', '22-29', 'Education')
             - low_paid_pct: float
             - middle_paid_pct: float
             - high_paid_pct: float
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> _, year = get_latest_ashe_publication_url()
   >>> path = download_file(get_ashe_file_url(year, 'linked'), cache_ttl_hours=90*24)
   >>> df = parse_ashe_pay_distribution_by_classification(path)
   >>> sorted(df.columns.tolist())
   ['classification', 'high_paid_pct', 'low_paid_pct', 'middle_paid_pct', 'subset']

.. py:function:: get_real_earnings(force_refresh = False)

   Get ASHE nominal vs real weekly earnings timeseries for NI (Figure 2).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: year, nominal_weekly_earnings, real_weekly_earnings
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_real_earnings()
   >>> 'real_weekly_earnings' in df.columns
   True

.. py:function:: get_real_earnings_change_by_pattern(force_refresh = False)

   Get ASHE annual % change in weekly earnings by work pattern (Figure 3).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: work_pattern, nominal_change_pct, real_change_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_real_earnings_change_by_pattern()
   >>> 'real_change_pct' in df.columns
   True

.. py:function:: get_real_earnings_index_by_sector(force_refresh = False)

   Get ASHE real earnings index by sector for NI (Figure 6).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: year, sector, real_earnings_index
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_real_earnings_index_by_sector()
   >>> 'real_earnings_index' in df.columns
   True

.. py:function:: get_annual_change_by_occupation(force_refresh = False)

   Get ASHE annual % change in weekly earnings by occupation, NI (Figure 7).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: occupation, annual_change_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_annual_change_by_occupation()
   >>> 'annual_change_pct' in df.columns
   True

.. py:function:: get_annual_change_by_industry(force_refresh = False)

   Get ASHE annual % change in weekly earnings by industry, NI (Figure 8).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: industry, annual_change_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_annual_change_by_industry()
   >>> 'annual_change_pct' in df.columns
   True

.. py:function:: get_pay_distribution_timeseries(force_refresh = False)

   Get ASHE low/middle/high-paid proportion timeseries for NI (Figure 11).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: year, low_paid_pct, middle_paid_pct, high_paid_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_pay_distribution_timeseries()
   >>> 'low_paid_pct' in df.columns
   True

.. py:function:: get_pay_distribution_by_classification(force_refresh = False)

   Get ASHE pay distribution by classification, NI, latest year (Figure 12).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: classification, subset, low_paid_pct, middle_paid_pct, high_paid_pct
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_pay_distribution_by_classification()
   >>> 'classification' in df.columns
   True

.. py:function:: validate_ashe_data(df)

   Validate ASHE earnings data integrity.

   :param df: DataFrame from ASHE functions
   :returns: True if validation passes, False otherwise
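
Two definitions quoted in this reference reduce to simple arithmetic: the gender pay gap described under ``parse_ashe_gender_pay_gap`` (male minus female median hourly earnings, as a percentage of male earnings) and the pay bands described under ``parse_ashe_pay_distribution_timeseries`` (below two-thirds of the UK median is low-paid, above 1.5x is high-paid, the rest is middle-paid). The sketch below restates both in plain Python purely as an illustration. The helper names ``gender_pay_gap_pct`` and ``classify_pay_band`` and the sample figures are hypothetical; they are not part of ``bolster`` and are not NISRA data.

```python
def gender_pay_gap_pct(male_median_hourly: float, female_median_hourly: float) -> float:
    """Gap between male and female median hourly pay, as a % of male pay.

    Hypothetical helper: a positive result means men are paid more.
    """
    if male_median_hourly <= 0:
        raise ValueError("male_median_hourly must be positive")
    return (male_median_hourly - female_median_hourly) / male_median_hourly * 100


def classify_pay_band(hourly_pay: float, uk_median_hourly: float) -> str:
    """Band a job relative to the UK median hourly pay.

    Hypothetical helper: below two-thirds of the median is 'low',
    above 1.5x the median is 'high', anything else is 'middle'.
    """
    if hourly_pay < uk_median_hourly * 2 / 3:
        return "low"
    if hourly_pay > uk_median_hourly * 1.5:
        return "high"
    return "middle"


# Made-up figures, for illustration only.
print(gender_pay_gap_pct(16.00, 15.00))   # 6.25
print(classify_pay_band(9.00, 15.00))     # low
print(classify_pay_band(25.00, 15.00))    # high
```

A negative value from the first helper corresponds to the case noted under ``parse_ashe_hourly_earnings_by_pattern_gender``, where the female median exceeds the male median.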