bolster.data_sources.nisra.planning_statistics
==============================================

.. py:module:: bolster.data_sources.nisra.planning_statistics

.. autoapi-nested-parse::

   Northern Ireland Planning Activity Statistics.

   Quarterly and annual planning application statistics for Northern Ireland,
   published by the Department for Infrastructure (DfI). Provides counts of
   planning applications received, decided, approved and withdrawn, both NI-wide
   as a quarterly time series back to Q1 2002/03 and broken down by the 11 local
   councils.

   Data Source:
       **Hub page**: https://www.infrastructure-ni.gov.uk/articles/planning-activity-statistics

       The module scrapes the hub page for the latest publication, then scrapes
       that publication page for the quarterly statistical tables Excel file.

   Update Frequency:
       Quarterly (provisional) plus an annual final release after each
       financial year ends (April-March).

   Geographic Coverage:
       Northern Ireland - whole-country totals plus the 11 local council areas
       (Antrim & Newtownabbey, Ards & North Down, Armagh City Banbridge &
       Craigavon, Belfast, Causeway Coast & Glens, Derry City & Strabane,
       Fermanagh & Omagh, Lisburn & Castlereagh, Mid & East Antrim, Mid Ulster,
       Newry Mourne & Down).

   Time Series:
       Q1 2002/03 onwards for the NI-wide series (sheet 1.1).
       Recent quarters plus current/prior financial year for council-area data
       (sheet 1.2).

   .. rubric:: Example

   >>> from bolster.data_sources.nisra import planning_statistics
   >>> df = planning_statistics.get_latest_data()
   >>> 'applications_received' in df.columns
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.nisra.planning_statistics.logger
   bolster.data_sources.nisra.planning_statistics.PLANNING_HUB_URL
   bolster.data_sources.nisra.planning_statistics.INFRA_BASE_URL


Functions
---------

.. autoapisummary::

   bolster.data_sources.nisra.planning_statistics.get_latest_publication_url
   bolster.data_sources.nisra.planning_statistics.get_latest_xlsx_url
   bolster.data_sources.nisra.planning_statistics.parse_planning_applications
   bolster.data_sources.nisra.planning_statistics.parse_planning_by_council
   bolster.data_sources.nisra.planning_statistics.get_latest_data
   bolster.data_sources.nisra.planning_statistics.get_latest_council_data
   bolster.data_sources.nisra.planning_statistics.validate_data
   bolster.data_sources.nisra.planning_statistics.get_annual_totals
   bolster.data_sources.nisra.planning_statistics.get_council_summary


Module Contents
---------------

.. py:data:: logger

.. py:data:: PLANNING_HUB_URL
   :value: 'https://www.infrastructure-ni.gov.uk/articles/planning-activity-statistics'


.. py:data:: INFRA_BASE_URL
   :value: 'https://www.infrastructure-ni.gov.uk'


.. py:function:: get_latest_publication_url()

   Scrape the planning statistics hub page for the latest publication page URL.

   The hub lists publications newest-first; the first non-guidance link with a
   "Northern Ireland planning statistics" title is the latest release
   (provisional quarterly or final annual).

   :returns: URL of the latest publication page (containing the XLSX/ODS files).

   :raises NISRADataNotFoundError: If the hub page cannot be fetched or no
       publication link can be located.

   .. rubric:: Example

   >>> url = get_latest_publication_url()
   >>> url.startswith("https://www.infrastructure-ni.gov.uk/publications/")
   True


.. py:function:: get_latest_xlsx_url(publication_url = None)

   Find the XLSX file URL on a planning statistics publication page.

   :param publication_url: Publication page URL. If None, calls
       :func:`get_latest_publication_url` to find the most recent.

   :returns: URL of the quarterly statistical tables Excel file.

   :raises NISRADataNotFoundError: If the publication page has no XLSX link.

   .. rubric:: Example

   >>> url = get_latest_xlsx_url()
   >>> url.endswith(".xlsx")
   True


.. py:function:: parse_planning_applications(file_path)

   Parse the NI-wide quarterly applications time series (sheet ``1.1``).

   Sheet 1.1 contains the quarterly headline series back to Q1 2002/03. Its
   Year column uses merged cells, which are forward-filled here, and each
   financial year has an annual total row, which is filtered out.

   :param file_path: Path to a downloaded planning statistics XLSX file.

   :returns: DataFrame with one row per quarter and the following columns:

             - ``date`` (datetime): First day of the calendar quarter
             - ``financial_year`` (str): e.g. ``"2024/25"``
             - ``quarter`` (str): One of ``Q1``-``Q4``
             - ``year`` (int): Calendar year of ``date``
             - ``applications_received`` (int)
             - ``applications_decided`` (int)
             - ``applications_approved`` (int)
             - ``applications_withdrawn`` (int)
             - ``approval_rate`` (float): Proportion 0.0-1.0
             - ``mid_year_population`` (int): NI mid-year population estimate
               used for that financial year's per-10,000 rates
             - ``applications_per_10k`` (float): Applications received per
               10,000 population
   :rtype: DataFrame

   :raises NISRAValidationError: If the sheet structure is unexpected.

   .. rubric:: Example

   >>> import bolster.data_sources.nisra.planning_statistics as ps
   >>> path = ps.get_latest_data.__wrapped__ if False else None  # docs only


.. py:function:: parse_planning_by_council(file_path)

   Parse the council-area planning applications data (sheet ``1.2``).

   Sheet 1.2 stacks five sub-tables (received / decided / approved /
   approval-rate / withdrawn), each with one row per quarter and one column for
   each of the 11 NI councils. This function unpivots all five into a tidy long
   DataFrame.

   :param file_path: Path to a downloaded planning statistics XLSX file.

   :returns: DataFrame with one row per (date, council) and the following
             columns:

             - ``date`` (datetime), ``financial_year`` (str), ``quarter`` (str),
               ``year`` (int)
             - ``council`` (str): Council name
             - ``applications_received`` (int | None)
             - ``applications_decided`` (int | None)
             - ``applications_approved`` (int | None)
             - ``applications_withdrawn`` (int | None)
             - ``approval_rate`` (float | None): Proportion 0.0-1.0
   :rtype: DataFrame

   :raises NISRAValidationError: If the sheet structure is unexpected (no
       council header rows / parseable date rows found).


.. py:function:: get_latest_data(force_refresh = False)

   Download and parse the NI-wide quarterly planning applications series.

   :param force_refresh: If True, bypass the local cache and re-download.

   :returns: DataFrame from :func:`parse_planning_applications` (NI-wide
             quarterly time series, sheet 1.1).

   :raises NISRADataNotFoundError: If the latest publication or XLSX cannot be
       located.
   :raises NISRAValidationError: If the downloaded file cannot be parsed.

   .. rubric:: Example

   >>> df = get_latest_data()
   >>> 'applications_received' in df.columns
   True


.. py:function:: get_latest_council_data(force_refresh = False)

   Download and parse the council-area planning applications data.

   :param force_refresh: If True, bypass the local cache and re-download.

   :returns: DataFrame from :func:`parse_planning_by_council` (one row per
             date, council).

   :raises NISRADataNotFoundError: If the latest publication or XLSX cannot be
       located.
   :raises NISRAValidationError: If the downloaded file cannot be parsed.

   .. rubric:: Example

   >>> df = get_latest_council_data()
   >>> 'council' in df.columns
   True


.. py:function:: validate_data(df)

   Validate an NI-wide planning applications DataFrame.

   :param df: DataFrame from :func:`get_latest_data` or
       :func:`parse_planning_applications`.

   :returns: ``True`` if all checks pass.

   :raises NISRAValidationError: If the DataFrame is empty, missing required
       columns, has implausible values, or has too short a time series.

   .. rubric:: Example

   >>> df = get_latest_data()
   >>> validate_data(df)
   True


.. py:function:: get_annual_totals(df)

   Aggregate a quarterly DataFrame to annual (financial-year) totals.

   :param df: DataFrame from :func:`get_latest_data`.

   :returns: DataFrame with one row per financial year and the following
             columns:

             - ``financial_year`` (str)
             - ``applications_received`` (int)
             - ``applications_decided`` (int)
             - ``applications_approved`` (int)
             - ``applications_withdrawn`` (int)
             - ``approval_rate`` (float): Weighted by decisions
               (approved / decided)
             - ``quarters`` (int): Number of quarters aggregated (4 except for
               the current in-progress year)
   :rtype: DataFrame

   .. rubric:: Example

   >>> df = get_latest_data()
   >>> annual = get_annual_totals(df)
   >>> 'applications_received' in annual.columns
   True


.. py:function:: get_council_summary(council_df, financial_year = None)

   Summarise council-area data by council, across all financial years or a
   single one.

   :param council_df: DataFrame from :func:`get_latest_council_data`.
   :param financial_year: Optional financial year to filter to
       (e.g. ``"2024/25"``). If None, summarises across all available quarters.

   :returns: DataFrame with one row per council, sorted by
             ``applications_received`` descending.

   .. rubric:: Example

   >>> council_df = get_latest_council_data()
   >>> summary = get_council_summary(council_df, financial_year='2024/25')
   >>> 'council' in summary.columns
   True
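
The per-council summary documented for ``get_council_summary`` can be approximated offline with plain pandas. The following is a minimal sketch, not the library's implementation: ``summarise_by_council`` is a hypothetical stand-in, the figures are made up, and recomputing ``approval_rate`` as approved / decided from the summed counts is an assumption borrowed from the weighting described for ``get_annual_totals``.

```python
import pandas as pd

COUNT_COLUMNS = [
    "applications_received",
    "applications_decided",
    "applications_approved",
    "applications_withdrawn",
]


def summarise_by_council(council_df, financial_year=None):
    """Hypothetical stand-in for get_council_summary; not the library's code."""
    df = council_df
    if financial_year is not None:
        # Keep only the quarters belonging to the requested financial year.
        df = df[df["financial_year"] == financial_year]
    # Sum the quarterly counts for each council.
    summary = df.groupby("council", as_index=False)[COUNT_COLUMNS].sum()
    # Assumption: derive the rate from the totals so that quarters with more
    # decisions carry more weight than a simple mean of quarterly rates.
    summary["approval_rate"] = (
        summary["applications_approved"] / summary["applications_decided"]
    )
    return summary.sort_values(
        "applications_received", ascending=False
    ).reset_index(drop=True)


# Made-up figures for illustration only; not real DfI data.
sample = pd.DataFrame(
    {
        "financial_year": ["2024/25", "2024/25", "2024/25", "2023/24"],
        "quarter": ["Q1", "Q2", "Q1", "Q4"],
        "council": ["Belfast", "Belfast", "Mid Ulster", "Belfast"],
        "applications_received": [100, 120, 90, 80],
        "applications_decided": [90, 110, 80, 70],
        "applications_approved": [81, 99, 76, 63],
        "applications_withdrawn": [5, 6, 4, 3],
    }
)

summary = summarise_by_council(sample, financial_year="2024/25")
print(summary)
```

Passing ``financial_year=None`` to the sketch would also fold the 2023/24 row into the Belfast totals, mirroring the "all available quarters" behaviour described above.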
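
The relationship between the ``financial_year``, ``quarter``, ``date`` and ``year`` columns may be easier to see in code. This is a small sketch, assuming quarters follow the April-March financial year so that Q1 covers April-June and Q4 covers January-March of the following calendar year; ``quarter_start`` is a hypothetical helper and is not part of the module.

```python
from datetime import date

# Assumption: Q1 starts in April because the financial year runs April-March.
QUARTER_START_MONTH = {"Q1": 4, "Q2": 7, "Q3": 10, "Q4": 1}


def quarter_start(financial_year, quarter):
    """Hypothetical helper: first day of the calendar quarter."""
    # "2024/25" -> 2024, the calendar year in which the financial year begins.
    start_year = int(financial_year.split("/")[0])
    month = QUARTER_START_MONTH[quarter]
    # Q4 (January-March) falls in the calendar year after the start year.
    year = start_year + 1 if quarter == "Q4" else start_year
    return date(year, month, 1)


first = quarter_start("2002/03", "Q1")
last = quarter_start("2024/25", "Q4")
print(first, last)
```

Under these assumptions the ``year`` column would correspond to ``quarter_start(...).year``, which differs from the first year of ``financial_year`` only for Q4.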