bolster.data_sources.ni_house_price_index
=========================================

.. py:module:: bolster.data_sources.ni_house_price_index

.. autoapi-nested-parse::

   Northern Ireland House Price Index Data.

   Provides access to the quarterly house price index, standardised prices, and sales volumes
   for Northern Ireland, with breakdowns by:

   - Property type (Detached, Semi-Detached, Terrace, Apartment)
   - New vs Existing dwellings
   - Local Government District (11 LGDs)
   - Urban vs Rural areas

   Data Source:
      **Publication Page**: https://www.finance-ni.gov.uk/publications/ni-house-price-index-statistical-reports

      The module automatically scrapes this page to find the latest quarterly Excel file,
      which contains multiple worksheets with different data breakdowns.

   Update Frequency: Quarterly

   Geographic Coverage: Northern Ireland

   Reference Period: Q1 2005 - present

   Index Base: Q1 2023 = 100

   See `here <https://andrewbolster.info/2022/03/NI-House-Price-Index.html>`_ for more details.

   .. rubric:: Example

   >>> from bolster.data_sources.ni_house_price_index import get_hpi_trends, get_sales_volumes
   >>> hpi = get_hpi_trends()
   >>> 'NI House Price Index' in hpi.columns
   True
   >>> sales = get_sales_volumes()
   >>> 'Total' in sales.columns
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.ni_house_price_index.logger
   bolster.data_sources.ni_house_price_index.DEFAULT_URL
   bolster.data_sources.ni_house_price_index.CACHE_DIR
   bolster.data_sources.ni_house_price_index.TABLE_TRANSFORMATION_MAP


Exceptions
----------

.. autoapisummary::

   bolster.data_sources.ni_house_price_index.NIHPIDataError
   bolster.data_sources.ni_house_price_index.NIHPIDataNotFoundError


Functions
---------

.. autoapisummary::

   bolster.data_sources.ni_house_price_index.clear_cache
   bolster.data_sources.ni_house_price_index.get_source_url
   bolster.data_sources.ni_house_price_index.pull_sources
   bolster.data_sources.ni_house_price_index.basic_cleanup
   bolster.data_sources.ni_house_price_index.cleanup_contents
   bolster.data_sources.ni_house_price_index.cleanup_price_by_property_type_agg
   bolster.data_sources.ni_house_price_index.cleanup_price_by_property_type
   bolster.data_sources.ni_house_price_index.cleanup_with_munged_quarters_and_total_rows
   bolster.data_sources.ni_house_price_index.cleanup_with_LGDs
   bolster.data_sources.ni_house_price_index.cleanup_combined_year_quarter
   bolster.data_sources.ni_house_price_index.cleanup_missing_year_quarter
   bolster.data_sources.ni_house_price_index.transform_sources
   bolster.data_sources.ni_house_price_index.get_all_tables
   bolster.data_sources.ni_house_price_index.get_hpi_trends
   bolster.data_sources.ni_house_price_index.get_sales_volumes
   bolster.data_sources.ni_house_price_index.get_average_prices
   bolster.data_sources.ni_house_price_index.get_hpi_by_lgd
   bolster.data_sources.ni_house_price_index.get_hpi_by_property_type
   bolster.data_sources.ni_house_price_index.build


Module Contents
---------------

.. py:data:: logger

.. py:data:: DEFAULT_URL
   :value: 'https://www.finance-ni.gov.uk/publications/ni-house-price-index-statistical-reports'

.. py:data:: CACHE_DIR

.. py:data:: TABLE_TRANSFORMATION_MAP

.. py:exception:: NIHPIDataError

   Bases: :py:obj:`Exception`

   Base exception for NI HPI data errors.

.. py:exception:: NIHPIDataNotFoundError

   Bases: :py:obj:`NIHPIDataError`

   Data file not available.

.. py:function:: clear_cache()

   Clear all cached HPI data files.

.. py:function:: get_source_url(base_url=DEFAULT_URL)

   Find the URL of the latest HPI Excel file from the publication page.

   :param base_url: URL of the publication listing page
   :returns: URL of the Excel file
   :raises NIHPIDataNotFoundError: If no Excel file is found

.. py:function:: pull_sources(base_url = DEFAULT_URL, force_refresh = False, cache_ttl_hours = 24 * 7)

   Pull raw NI House Price Index Excel from finance-ni.gov.uk.

   Downloads the latest HPI Excel file and returns all worksheets as a dictionary
   of DataFrames. Files are cached locally to avoid repeated downloads.

   :param base_url: URL of the publication listing page
   :param force_refresh: If True, bypass cache and download fresh data
   :param cache_ttl_hours: Cache validity in hours (default: 7 days)
   :returns: Dictionary mapping sheet names to raw DataFrames
   :raises NIHPIDataNotFoundError: If the source file is not found or the download fails

.. py:function:: basic_cleanup(df, offset=1)

   Generic cleanup operations for NI HPI data.

   Operations performed:

   - Re-header from the offset row and translate the table to eliminate incorrect headers
   - Remove any columns with 'Nan' or 'None' in the given offset row
   - If 'NI' appears and all the values are 100, remove it
   - Remove the first 'all nan' row and every row below it (this catches most tail-notes)
   - If 'Sale Year', 'Sale Quarter' appear in the columns, replace them with 'Year', 'Quarter' respectively
   - For Year, forward fill any none/nan values
   - If Year/Quarter appear, add a new composite 'Period' column of PeriodIndex values
     representing the year/quarter (e.g. 2022-Q1)
   - Reset and drop the index
   - Attempt to infer the new/current column object types

   :param df: DataFrame to clean
   :param offset: Row offset to find headers
   :returns: Cleaned DataFrame

.. py:function:: cleanup_contents(df)

   Fix Contents table of NI HPI Stats.

   - Shift/rebuild headers
   - Strip Figures, because they will be broken anyway

   :param df: Raw DataFrame from Excel
   :returns: Cleaned DataFrame

.. py:function:: cleanup_price_by_property_type_agg(df, offset = 2)

   NI HPI & Standardised Price Statistics by Property Type (Aggregate Table).

   Standard cleanup with a split to remove trailing index date data.

   :param df: Raw DataFrame from Excel
   :param offset: Row offset to find headers (default: 2)
   :returns: Cleaned DataFrame

.. py:function:: cleanup_price_by_property_type(df, offset = 2)

   NI HPI & Standardised Price Statistics by Property Type (Per Class).

   Standard cleanup, removing the property class from the table columns.

   :param df: Raw DataFrame from Excel
   :param offset: Row offset to find headers (default: 2)
   :returns: Cleaned DataFrame with simplified column names

.. py:function:: cleanup_with_munged_quarters_and_total_rows(df, offset=3)

   Number of Verified Residential Property Sales.

   - Regex 'Quarter X' to 'QX' in future 'Sales Quarter' column
   - Drop Year Total rows
   - Clear any newlines from the future 'Sales Year' column
   - Call ``basic_cleanup`` with offset=3

   :param df: Raw DataFrame from Excel
   :param offset: Number of header rows to skip during cleanup
   :returns: Cleaned DataFrame

.. py:function:: cleanup_with_LGDs(df, offset = 2)

   Standardised House Price & Index for each Local Government District.

   Builds multi-index of LGD / Metric [Index,Price] for the 11 NI LGDs.

   :param df: Raw DataFrame from Excel
   :param offset: Row offset to find headers (default: 2)
   :returns: Cleaned DataFrame with LGD multi-index columns

.. py:function:: cleanup_combined_year_quarter(df, offset = 2)

   Cleanup tables with combined 'Q1 2005' year/quarter format.

   Parses the combined format into Period, Year, and Quarter columns
   for consistency with other tables.

   :param df: Raw DataFrame from Excel
   :param offset: Row offset to find headers (default: 2)
   :returns: Cleaned DataFrame with Period, Year, Quarter columns

.. py:function:: cleanup_missing_year_quarter(df, offset = 1)

   Standardised House Price & Index for Rural Areas by drive times.

   Inserts Year/Quarter headers and cleans normally.

   :param df: Raw DataFrame from Excel
   :param offset: Row offset to find headers (default: 1)
   :returns: Cleaned DataFrame

.. py:function:: transform_sources(source_df)

   Transform all raw tables using registered transformation functions.

   :param source_df: Dictionary of raw DataFrames from Excel file
   :returns: Dictionary of cleaned/transformed DataFrames
   :raises RuntimeError: If transformation fails for any table

.. py:function:: get_all_tables(force_refresh = False)

   Get all HPI tables as a dictionary of DataFrames.

   This is the main entry point for accessing NI House Price Index data.
   Returns all available tables in a dictionary keyed by table name.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: Dictionary mapping table names to cleaned DataFrames. Tables include:

      - Table 1 (NI HPI Trends)
      - Table 2/2a-d (HPI by Property Type)
      - Table 3/3a-c (New/Existing Dwelling)
      - Table 4 (Sales Volumes by Property Type)
      - Table 5/5a (HPI/Sales by LGD)
      - Table 6-8 (Urban/Rural/Drive Times)
      - Table 9/9a-d (Average Sale Prices)
      - Table 10a-k (Sales Volumes by Property Type per LGD)

   .. rubric:: Example

   >>> tables = get_all_tables()
   >>> 'Table 1' in tables
   True

.. py:function:: get_hpi_trends(force_refresh = False)

   Get NI House Price Index trends over time (Table 1).

   Returns quarterly HPI values, standardised prices, and percentage changes
   from Q1 2005 to present.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with columns:

      - Period: Quarterly period (e.g., 2005Q1)
      - Year: Year
      - Quarter: Quarter (Q1-Q4)
      - NI House Price Index: Index value (Q1 2023 = 100)
      - NI House Standardised Price: Price in GBP
      - Quarterly Change: Percentage change from previous quarter
      - Annual Change: Percentage change from same quarter previous year

   .. rubric:: Example

   >>> hpi = get_hpi_trends()
   >>> 'NI House Price Index' in hpi.columns
   True

.. py:function:: get_sales_volumes(force_refresh = False)

   Get property sales volumes by type (Table 4).

   Returns quarterly counts of verified residential property sales
   broken down by property type.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with columns:

      - Period: Quarterly period
      - Year: Year
      - Quarter: Quarter
      - Detached: Detached house sales
      - Semi-Detached: Semi-detached sales
      - Terrace: Terraced house sales
      - Apartment: Apartment sales
      - Total: Total sales

   .. rubric:: Example

   >>> sales = get_sales_volumes()
   >>> 'Total' in sales.columns
   True

.. py:function:: get_average_prices(force_refresh = False)

   Get NI average sale prices over time (Table 9).

   Returns simple mean, median, and standardised (HPI) prices for all property sales.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with columns:

      - Period: Quarterly period
      - Year: Year
      - Quarter: Quarter
      - Simple Mean: Average sale price
      - Simple Median: Median sale price
      - Standardised Price (HPI): Quality-adjusted price

   .. rubric:: Example

   >>> prices = get_average_prices()
   >>> 'Simple Median' in prices.columns
   True

.. py:function:: get_hpi_by_lgd(force_refresh = False)

   Get HPI and prices for each Local Government District (Table 5).

   Returns standardised house prices and HPI for all 11 NI LGDs.

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with multi-index columns:

      - Period, Year, Quarter: Time dimensions
      - {LGD_Name}: For each of the 11 LGDs

        - Index: HPI value
        - Price: Standardised price

      LGDs include: Antrim and Newtownabbey, Ards and North Down,
      Armagh City Banbridge and Craigavon, Belfast, Causeway Coast and Glens,
      Derry City and Strabane, Fermanagh and Omagh, Lisburn and Castlereagh,
      Mid and East Antrim, Mid Ulster, Newry Mourne and Down

   .. rubric:: Example

   >>> lgd = get_hpi_by_lgd()
   >>> 'Period' in lgd.columns
   True

.. py:function:: get_hpi_by_property_type(force_refresh = False)

   Get HPI summary by property type (Table 2).

   Returns latest quarter's HPI and price statistics broken down by
   property type (Detached, Semi-Detached, Terrace, Apartment).

   :param force_refresh: If True, bypass cache and download fresh data
   :returns: DataFrame with columns:

      - Property Type: Type of property
      - Index: HPI value
      - Percentage Change on Previous Quarter
      - Percentage Change over 12 months
      - Standardised Price

   .. rubric:: Example

   >>> by_type = get_hpi_by_property_type()
   >>> 'Property Type' in by_type.columns
   True

.. py:function:: build()

   Pulls and cleans up the latest NI House Price Index Data.

   .. deprecated:: Use :func:`get_all_tables` instead for a more descriptive API.

   :returns: Dictionary of cleaned DataFrames keyed by table name.
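
.. rubric:: Illustrative sketch: quarter label normalisation

The quarter handling described for ``cleanup_with_munged_quarters_and_total_rows`` (rewriting 'Quarter X' as 'QX') and ``cleanup_combined_year_quarter`` (splitting 'Q1 2005' into Period, Year, and Quarter) can be approximated with the standard library alone. The sketch below is not the module's implementation: ``normalise_quarter`` and ``split_combined_period`` are hypothetical helper names, and the exact regular expressions are assumptions about how the source labels look.

```python
import re
from typing import Tuple

# Hypothetical helpers for illustration only; they are NOT part of bolster.


def normalise_quarter(label: str) -> str:
    """Rewrite labels such as 'Quarter 3' as 'Q3'; other text is left unchanged."""
    return re.sub(r"Quarter\s*([1-4])", r"Q\1", label.strip())


def split_combined_period(label: str) -> Tuple[str, int, str]:
    """Split a combined 'Q1 2005' label into (period, year, quarter)."""
    match = re.fullmatch(r"\s*(Q[1-4])\s+(\d{4})\s*", label)
    if match is None:
        # Assumed behaviour: fail loudly on labels that do not fit the pattern.
        raise ValueError(f"Unrecognised year/quarter label: {label!r}")
    quarter, year = match.group(1), int(match.group(2))
    return f"{year}{quarter}", year, quarter


print(normalise_quarter("Quarter 3"))    # Q3
print(split_combined_period("Q1 2005"))  # ('2005Q1', 2005, 'Q1')
```

The period string produced here uses the ``2005Q1`` form shown in the ``get_hpi_trends`` column description; the real module builds pandas periods rather than plain strings.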