bolster.data_sources.psni.pace
==============================

.. py:module:: bolster.data_sources.psni.pace

.. autoapi-nested-parse::

   PSNI Police and Criminal Evidence (PACE) Order Statistics.

   Provides access to annual PACE statistics for Northern Ireland, covering:

   - Stop and search activity (monthly counts by reason: stolen articles,
     offensive weapons/blade or point, going equipped/prohibited articles,
     fireworks)
   - Arrests under PACE by quarter, gender, and whether a solicitor or
     friend/relative was requested during detention

   Each annual Excel workbook covers a single financial year (April–March) and
   is published by PSNI Statistics Branch each May on the PSNI publications index:
   https://www.psni.police.uk/about-us/our-publications-and-reports/official-statistics/police-and-criminal-evidence-pace-order

   **URL discovery note**: The PSNI publications index page is protected by
   Cloudflare and cannot be scraped programmatically. Direct asset URLs under
   ``/sites/default/files/`` *can* be fetched with browser-like ``User-Agent``
   and ``Referer`` headers, but the filename portion of the URL is not
   predictable: it includes the year in ``YYYY.YY`` format and may include a
   revision suffix such as ``a2``. The :data:`PACE_URLS` dict therefore
   hard-codes confirmed download URLs and should be updated each May when PSNI
   publishes the new edition. Use :func:`get_latest_pace_url` to retrieve the
   most recent known URL.

   Data Source: PSNI Statistics Branch
   https://www.psni.police.uk/about-us/our-publications-and-reports/official-statistics/police-and-criminal-evidence-pace-order

   Update Frequency: Annual (published each May)

   Geographic Coverage: Northern Ireland (NI-wide aggregate)

   Time Coverage: One financial year per workbook; ``PACE_URLS`` spans
   2024/25–2025/26

   .. rubric:: Example

   >>> from bolster.data_sources.psni import pace
   >>> url = pace.get_latest_pace_url()
   >>> url.startswith("https://")
   True
   >>> df = pace.get_latest_pace(breakdown="stop_search")
   >>> "reason" in df.columns
   True
   >>> pace.validate_pace(df, "stop_search")
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.psni.pace.logger
   bolster.data_sources.psni.pace.PACE_URLS


Functions
---------

.. autoapisummary::

   bolster.data_sources.psni.pace.get_latest_pace_url
   bolster.data_sources.psni.pace.parse_stop_search
   bolster.data_sources.psni.pace.parse_arrests
   bolster.data_sources.psni.pace.get_latest_pace
   bolster.data_sources.psni.pace.validate_pace


Module Contents
---------------

.. py:data:: logger

.. py:data:: PACE_URLS
   :type: dict[str, str]

.. py:function:: get_latest_pace_url()

   Return the download URL for the most recent known PACE annual workbook.

   The URL is drawn from :data:`PACE_URLS`. Update that dict each May when
   PSNI publishes a new edition.

   :returns: Direct download URL for the latest PACE Excel workbook.

   .. rubric:: Example

   >>> from bolster.data_sources.psni.pace import get_latest_pace_url
   >>> url = get_latest_pace_url()
   >>> url.startswith("https://www.psni.police.uk/")
   True


.. py:function:: parse_stop_search(file_path)

   Parse Table 1 (monthly stop & search counts) from a PACE Excel workbook.

   The table covers stop and search activity for a single financial year,
   broken down by month (Apr–Mar) and search reason.

   :param file_path: Local path to the downloaded PACE Excel workbook.

   :returns: DataFrame with columns:

      - ``financial_year``: e.g. ``"2025/26"``
      - ``year``: integer start year of the financial year (e.g. ``2025``)
      - ``month``: month abbreviation, e.g. ``"Apr"``
      - ``reason``: search reason category
      - ``metric``: ``"Searches"`` or ``"Arrests"``
      - ``count``: integer count
   :rtype: pandas.DataFrame

   :raises PSNIValidationError: If the expected table structure is not found.

   .. rubric:: Example

   >>> from bolster.data_sources.psni.pace import parse_stop_search
   >>> df = parse_stop_search("/tmp/pace_2025_26.xlsx")  # doctest: +SKIP
   >>> list(df.columns)  # doctest: +SKIP
   ['financial_year', 'year', 'month', 'reason', 'metric', 'count']


.. py:function:: parse_arrests(file_path)

   Parse Table 2 (quarterly PACE arrests) from a PACE Excel workbook.

   The table covers arrests under PACE for a single financial year, broken
   down by quarter and category (total, male, female, unknown/other, and
   whether a solicitor or friend/relative was requested during detention).

   :param file_path: Local path to the downloaded PACE Excel workbook.

   :returns: DataFrame with columns:

      - ``financial_year``: e.g. ``"2025/26"``
      - ``year``: integer start year of the financial year (e.g. ``2025``)
      - ``quarter``: quarter label, e.g. ``"Q1 (Apr–Jun)"``
      - ``category``: demographic/request category
      - ``count``: integer count
   :rtype: pandas.DataFrame

   :raises PSNIValidationError: If the expected table structure is not found.

   .. rubric:: Example

   >>> from bolster.data_sources.psni.pace import parse_arrests
   >>> df = parse_arrests("/tmp/pace_2025_26.xlsx")  # doctest: +SKIP
   >>> list(df.columns)  # doctest: +SKIP
   ['financial_year', 'year', 'quarter', 'category', 'count']


.. py:function:: get_latest_pace(breakdown = 'stop_search', force_refresh = False)

   Download and return the latest PACE statistics.

   Downloads the most recent PACE Excel workbook (from :data:`PACE_URLS`),
   caches it locally for one year, and returns either the stop & search or
   the arrests breakdown.

   :param breakdown: Which table to return: ``"stop_search"`` (Table 1, monthly
                     stop & search counts) or ``"arrests"`` (Table 2, quarterly
                     arrest demographics). Default: ``"stop_search"``.
   :param force_refresh: If ``True``, bypass the cache and re-download.
                         Default: ``False``.

   :returns: DataFrame; see :func:`parse_stop_search` or :func:`parse_arrests`
             for column descriptions.

   :raises ValueError: If ``breakdown`` is not ``"stop_search"`` or ``"arrests"``.
   :raises PSNIDataNotFoundError: If the download fails.
   :raises PSNIValidationError: If the workbook structure is not as expected.

   .. rubric:: Example

   >>> from bolster.data_sources.psni.pace import get_latest_pace
   >>> df = get_latest_pace(breakdown="stop_search")  # doctest: +SKIP
   >>> "reason" in df.columns  # doctest: +SKIP
   True
   >>> df = get_latest_pace(breakdown="arrests")  # doctest: +SKIP
   >>> "category" in df.columns  # doctest: +SKIP
   True


.. py:function:: validate_pace(df, breakdown)

   Validate a PACE DataFrame for structural integrity.

   Checks that the DataFrame has the required columns and contains at least
   some data.

   :param df: DataFrame returned by :func:`parse_stop_search` or
              :func:`parse_arrests`.
   :param breakdown: ``"stop_search"`` or ``"arrests"``; selects the expected
                     column set.

   :returns: ``True`` if the DataFrame passes all checks.

   :raises PSNIValidationError: If any check fails (empty DataFrame, missing
                                columns, non-positive counts).

   .. rubric:: Example

   >>> import pandas as pd
   >>> from bolster.data_sources.psni.pace import validate_pace, PSNIValidationError
   >>> validate_pace(pd.DataFrame(), "stop_search")
   Traceback (most recent call last):
       ...
   bolster.data_sources.psni._base.PSNIValidationError: PACE DataFrame is empty
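
The checks documented for :func:`validate_pace` (non-empty data, required columns, positive counts) can be illustrated without pandas. What follows is a minimal sketch, not the library's implementation. ``check_pace_records`` and ``SketchValidationError`` are hypothetical names, rows are plain dicts (for example from ``df.to_dict("records")``), the sample row holds invented values, and reading "non-positive counts" as a non-positive total is an assumption.

```python
# Hypothetical, pandas-free sketch of the checks documented for validate_pace.
# Column sets are copied from the parse_stop_search / parse_arrests docs above;
# the exact rules bolster applies are an assumption.

REQUIRED_COLUMNS = {
    "stop_search": {"financial_year", "year", "month", "reason", "metric", "count"},
    "arrests": {"financial_year", "year", "quarter", "category", "count"},
}


class SketchValidationError(ValueError):
    """Stand-in for PSNIValidationError (hypothetical name)."""


def check_pace_records(records, breakdown):
    """Return True if the rows look like a parsed PACE table, else raise."""
    if breakdown not in REQUIRED_COLUMNS:
        raise ValueError(f"Unknown breakdown: {breakdown!r}")
    if not records:
        raise SketchValidationError("PACE data is empty")

    required = REQUIRED_COLUMNS[breakdown]
    for index, row in enumerate(records):
        # Iterating a dict yields its keys, so this is "required minus present".
        missing = required.difference(row)
        if missing:
            raise SketchValidationError(
                f"Row {index} is missing columns: {sorted(missing)}"
            )

    # Assumption: "non-positive counts" means the counts do not sum to > 0.
    if sum(row["count"] for row in records) <= 0:
        raise SketchValidationError("Counts must sum to a positive total")
    return True


# Illustrative row only; the values are invented, not PSNI figures.
rows = [
    {
        "financial_year": "2025/26",
        "year": 2025,
        "month": "Apr",
        "reason": "Stolen articles",
        "metric": "Searches",
        "count": 12,
    },
]
print(check_pace_records(rows, "stop_search"))
```

Keeping the required column sets in one mapping keyed by ``breakdown`` means the same function covers both tables, which mirrors how ``validate_pace`` takes ``breakdown`` as its second argument.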