bolster.data_sources.psni.police_ombudsman
==========================================

.. py:module:: bolster.data_sources.psni.police_ombudsman

.. autoapi-nested-parse::

   PSNI Police Ombudsman Complaint Statistics.

   Provides access to complaint statistics published by the Police Ombudsman
   for Northern Ireland (PONI), covering:

   - Annual complaint totals back to 2000/01
   - Complaints by policing district (2011/12 onwards)
   - Allegations by type and subtype (2011/12 onwards)
   - Complaint closures by outcome (2011/12 onwards)
   - Quarterly complaint and allegation counts (latest 5 years)

   Data Source:
       **Annual**: Police Ombudsman annual statistics bulletin
       https://www.policeombudsman.org/statistics-and-research/complaint-statistics-in-northern-ireland

       **Quarterly**: Police Ombudsman quarterly statistical bulletin
       https://www.policeombudsman.org/statistics-and-research/quarterly-reports

       Published under the Open Government Licence v3.0.

   Update Frequency:
       - Annual: once per year (summer, covering the previous financial year)
       - Quarterly: four times per year

   Geographic Coverage:
       Northern Ireland: 11 Policing Districts aligned with Local Government
       Districts (LGDs).

   Time Coverage:
       - Totals: 2000/01 to present
       - District / allegation / outcome breakdowns: 2011/12 to present
       - Quarterly: latest 5 financial years

   .. rubric:: Example

   >>> from bolster.data_sources.psni import police_ombudsman
   >>> df = police_ombudsman.get_latest_complaints()
   >>> 'year' in df.columns
   True
   >>> url = police_ombudsman.get_annual_publication_url()
   >>> url.startswith("https://")
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.psni.police_ombudsman.logger


Functions
---------

.. autoapisummary::

   bolster.data_sources.psni.police_ombudsman.get_quarterly_publication_url
   bolster.data_sources.psni.police_ombudsman.get_annual_publication_url
   bolster.data_sources.psni.police_ombudsman.parse_annual
   bolster.data_sources.psni.police_ombudsman.parse_quarterly
   bolster.data_sources.psni.police_ombudsman.get_latest_complaints
   bolster.data_sources.psni.police_ombudsman.validate_complaints


Module Contents
---------------

.. py:data:: logger

.. py:function:: get_quarterly_publication_url()

   Scrape the quarterly-reports page for the latest .xlsx download link.

   policeombudsman.org returns 403 to default User-Agents; this function uses
   a browser-like UA via ``bolster.utils.web.session``.

   :returns: Absolute URL of the latest quarterly Excel spreadsheet.

   :raises PSNIDataNotFoundError: If the page cannot be retrieved or no .xlsx link is found.

   .. rubric:: Example

   >>> url = get_quarterly_publication_url()
   >>> url.startswith("https://")
   True


.. py:function:: get_annual_publication_url()

   Scrape the complaint-statistics page for the latest .xlsx download link.

   policeombudsman.org returns 403 to default User-Agents; this function uses
   a browser-like UA via ``bolster.utils.web.session``.

   :returns: Absolute URL of the latest annual Excel spreadsheet.

   :raises PSNIDataNotFoundError: If the page cannot be retrieved or no .xlsx link is found.

   .. rubric:: Example

   >>> url = get_annual_publication_url()
   >>> url.startswith("https://")
   True


.. py:function:: parse_annual(file_path)

   Parse the annual Police Ombudsman statistics Excel workbook.

   Extracts four key tables from the workbook:

   - ``totals``: total complaints, 2000/01 onwards (T1)
   - ``by_district``: complaints by policing district, 2011/12 onwards (T8)
   - ``by_allegation_type``: allegations by type and subtype, 2011/12 onwards (T10)
   - ``by_outcome``: complaint closures by outcome, 2011/12 onwards (T12)

   :param file_path: Local path (or file-like object) to the downloaded ``.xlsx`` file.

   :returns: Dict mapping breakdown name to tidy DataFrame.
             All DataFrames include ``year`` (int, financial-year start) and
             ``year_label`` (e.g. ``"2024/25"``) columns.

   :raises PSNIDataNotFoundError: If required sheets cannot be found.

   .. rubric:: Example

   The call needs a previously downloaded workbook, so it is skipped under
   doctest; ``"annual.xlsx"`` is a hypothetical local path.

   >>> from bolster.data_sources.psni import police_ombudsman
   >>> tables = police_ombudsman.parse_annual("annual.xlsx")  # doctest: +SKIP
   >>> sorted(tables)  # doctest: +SKIP
   ['by_allegation_type', 'by_district', 'by_outcome', 'totals']


.. py:function:: parse_quarterly(file_path)

   Parse a quarterly Police Ombudsman statistics Excel workbook.

   Extracts three tables:

   - ``complaints``: complaints received by quarter × year
   - ``allegations``: allegations received by quarter × year
   - ``by_district``: complaints by policing district × year

   The quarterly workbook covers the latest five financial years, with four
   quarters per year plus totals.

   :param file_path: Local path (or file-like object) to the downloaded ``.xlsx`` file.

   :returns: Dict mapping key name to long-form DataFrame. Each DataFrame
             includes ``year_label`` (e.g. ``"2024/25"``) and ``year``
             (int, financial-year start).

   :raises PSNIDataNotFoundError: If required sheets cannot be found.

   .. rubric:: Example

   The call needs a previously downloaded workbook, so it is skipped under
   doctest; ``"quarterly.xlsx"`` is a hypothetical local path.

   >>> from bolster.data_sources.psni import police_ombudsman
   >>> tables = police_ombudsman.parse_quarterly("quarterly.xlsx")  # doctest: +SKIP
   >>> sorted(tables)  # doctest: +SKIP
   ['allegations', 'by_district', 'complaints']


.. py:function:: get_latest_complaints(breakdown = 'totals', force_refresh = False)

   Download and return the latest Police Ombudsman complaint data.

   For ``totals``, ``by_district``, ``by_allegation_type``, and ``by_outcome``
   the annual publication is used (richest historical coverage). For
   ``quarterly`` the latest quarterly bulletin is used.

   :param breakdown: One of:

                     - ``"totals"``: total complaints, 2000/01 to present (default)
                     - ``"by_district"``: complaints by policing district, 2011/12 onwards
                     - ``"by_allegation_type"``: allegations by type, 2011/12 onwards
                     - ``"by_outcome"``: closures by outcome, 2011/12 onwards
                     - ``"quarterly"``: quarterly complaints, latest 5 financial years
   :param force_refresh: If ``True``, bypass the cache and re-download the source file.

   :returns: Tidy DataFrame for the requested breakdown.

   :raises ValueError: If *breakdown* is not one of the recognised values.
   :raises PSNIDataNotFoundError: If the source cannot be downloaded.

   .. rubric:: Example

   >>> df = get_latest_complaints()
   >>> set(["year", "complaints"]).issubset(df.columns)
   True
   >>> df_d = get_latest_complaints("by_district")
   >>> "district" in df_d.columns
   True


.. py:function:: validate_complaints(df, breakdown)

   Validate a Police Ombudsman complaints DataFrame.

   Checks that:

   - The DataFrame is non-empty.
   - Required columns for the given *breakdown* are present.
   - The ``year`` column contains plausible financial-year start years.
   - Complaint / allegation counts are non-negative.

   :param df: DataFrame to validate (as returned by :func:`get_latest_complaints`).
   :param breakdown: One of ``"totals"``, ``"by_district"``,
                     ``"by_allegation_type"``, ``"by_outcome"``, ``"quarterly"``.

   :returns: ``True`` if validation passes.

   :raises PSNIValidationError: If any check fails.

   .. rubric:: Example

   >>> import pandas as pd
   >>> df = pd.DataFrame({"year": [2020, 2021], "complaints": [3000, 3100]})
   >>> validate_complaints(df, "totals")
   True
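
The parsers above describe every DataFrame as carrying a ``year`` column (the integer financial-year start) alongside a ``year_label`` such as ``"2024/25"``. The sketch below illustrates that pairing with the standard library only. The helper names ``financial_year_start`` and ``financial_year_label`` are hypothetical and are not part of ``bolster``; the module's own conversion may differ.

```python
import re

# Hypothetical helpers for illustration only; not part of bolster.
# A financial-year label is assumed to look like "2024/25":
# a four-digit start year, a slash, and the last two digits of the next year.
_LABEL = re.compile(r"(\d{4})/(\d{2})")


def financial_year_start(label: str) -> int:
    """Return the start year encoded in a label such as "2024/25"."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a financial-year label: {label!r}")
    start = int(match.group(1))
    end_suffix = int(match.group(2))
    # The suffix must be the two-digit form of the following calendar year.
    if (start + 1) % 100 != end_suffix:
        raise ValueError(f"inconsistent financial-year label: {label!r}")
    return start


def financial_year_label(start: int) -> str:
    """Build the label for a financial year beginning in ``start``."""
    return f"{start}/{(start + 1) % 100:02d}"


labels = ["2000/01", "2011/12", "2024/25"]
starts = [financial_year_start(label) for label in labels]
print(starts)  # [2000, 2011, 2024]
```

Rejecting labels whose suffix does not follow the start year (for example ``"2024/26"``) is a design choice of this sketch, in the spirit of the plausibility check that ``validate_complaints`` is documented to apply to ``year``.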