bolster.data_sources.psni.stop_and_search
=========================================

.. py:module:: bolster.data_sources.psni.stop_and_search

.. autoapi-nested-parse::

   PSNI Stop and Search Statistics.

   Provides access to Police Service of Northern Ireland stop and search data,
   covering individual stop and search records from 2017/18 to the latest
   available financial year.

   Data includes:

   - Financial year and quarter (quarterly breakdowns)
   - Legislation used (Misuse of Drugs Act, PACE, Justice & Security Act, etc.)
   - PACE-specific reasons for search (stolen articles, prohibited articles,
     blade/point, fireworks)
   - Subject demographics: age group and gender
   - Geographic level: Northern Ireland-wide (no district breakdown in this dataset)

   Data Source:
      **Primary Source**: OpenDataNI, Stop and Search Statistics 2017/18–2024/25

      https://www.opendatani.gov.uk/dataset/stop-and-search

      Data is published by the PSNI under the Open Government Licence v3.0.

   - Update Frequency: Annual (full dataset refreshed with each release)
   - Geographic Coverage: Northern Ireland (NI-wide only; no district breakdown)
   - Time Coverage: 2017/18 financial year to present
   - Row count: ~199,000 individual stop and search records

   .. rubric:: Example

   >>> from bolster.data_sources.psni import stop_and_search
   >>> df = stop_and_search.get_latest_stop_and_search()
   >>> 'financial_year' in df.columns
   True
   >>> stop_and_search.validate_stop_and_search(df)
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.psni.stop_and_search.logger
   bolster.data_sources.psni.stop_and_search.OPENDATANI_API
   bolster.data_sources.psni.stop_and_search.DATASET_ID
   bolster.data_sources.psni.stop_and_search.FALLBACK_CSV_URL
   bolster.data_sources.psni.stop_and_search.CACHE_TTL_HOURS
   bolster.data_sources.psni.stop_and_search.COLUMN_RENAMES
   bolster.data_sources.psni.stop_and_search.QUARTER_ORDER
   bolster.data_sources.psni.stop_and_search.AGE_GROUP_ORDER


Functions
---------

.. autoapisummary::

   bolster.data_sources.psni.stop_and_search.get_latest_dataset_url
   bolster.data_sources.psni.stop_and_search.get_latest_stop_and_search
   bolster.data_sources.psni.stop_and_search.validate_stop_and_search


Module Contents
---------------

.. py:data:: logger

.. py:data:: OPENDATANI_API
   :value: 'https://admin.opendatani.gov.uk/api/3/action'


.. py:data:: DATASET_ID
   :value: '421d96c1-fa5b-43e7-914c-b9a13e163d33'


.. py:data:: FALLBACK_CSV_URL
   :value: 'https://admin.opendatani.gov.uk/dataset/421d96c1-fa5b-43e7-914c-b9a13e163d33/resource/73fcba18-4...


.. py:data:: CACHE_TTL_HOURS
   :value: 720


.. py:data:: COLUMN_RENAMES
   :type: dict[str, str]


.. py:data:: QUARTER_ORDER
   :value: ['April to June', 'July to September', 'October to December', 'January to March']


.. py:data:: AGE_GROUP_ORDER
   :value: ['Under 18', '18 to 25', '26 to 35', '36 to 45', '46 to 55', '56 to 65', 'Over 65', 'Not Specified']


.. py:function:: get_latest_dataset_url()

   Query the OpenDataNI CKAN API to find the latest Stop and Search CSV URL.

   Fetches resource metadata for the stop-and-search dataset from the
   OpenDataNI CKAN API and returns the download URL for the CSV resource.
   Falls back to the known direct URL if the API request fails.

   :returns: Download URL for the latest stop and search CSV file.

   .. rubric:: Example

   >>> url = get_latest_dataset_url()
   >>> url.startswith("https://")
   True
   >>> url.endswith(".csv")
   True


.. py:function:: get_latest_stop_and_search(force_refresh = False)

   Download and return the latest PSNI Stop and Search dataset.

   Fetches the current stop and search data from OpenDataNI, caches it locally
   for ~30 days, and returns a cleaned DataFrame with snake_case column names
   and appropriate dtypes.

   The dataset covers individual stop and search records for Northern Ireland
   from financial year 2017/18 to the most recently published year. Note that
   the dataset does **not** include a district-level geographic breakdown; all
   records are at Northern Ireland level.

   :param force_refresh: If True, bypass the local cache and re-download the data.

   :returns: DataFrame with columns:

             - financial_year (category): e.g. ``"2023/24"``
             - geographical_level (category): always ``"Northern Ireland"``
             - legislation (category): legislation under which the search was conducted
             - pace_reason_stolen_articles (bool): PACE reason, stolen articles
             - pace_reason_prohibited_articles (bool): PACE reason, prohibited articles
             - pace_reason_blade_or_point (bool): PACE reason, blade or point
             - pace_reason_fireworks (bool): PACE reason, fireworks
             - quarter (Categorical[ordered]): quarter label, e.g. ``"April to June"``
             - age_group (Categorical[ordered]): age band of the subject
             - gender (category): subject gender

   :raises PSNIDataNotFoundError: If the download fails.
   :raises PSNIValidationError: If the downloaded file does not match the expected schema.

   .. rubric:: Example

   >>> df = get_latest_stop_and_search()
   >>> len(df) > 100_000
   True
   >>> sorted(df['financial_year'].cat.categories.tolist())  # doctest: +SKIP
   ['2017/18', '2018/19', '2019/20', '2020/21', '2021/22', '2022/23', '2023/24', '2024/25']


.. py:function:: validate_stop_and_search(df)

   Validate the integrity of a Stop and Search DataFrame.

   Checks that the DataFrame has the expected shape, required columns,
   a sensible set of financial years, no unexpected null values in key fields,
   and that PACE boolean columns contain only booleans.

   :param df: DataFrame to validate (e.g. from :func:`get_latest_stop_and_search`).

   :returns: ``True`` if all checks pass.

   :raises PSNIValidationError: If any check fails, with a descriptive message.

   .. rubric:: Example

   >>> df = get_latest_stop_and_search()
   >>> validate_stop_and_search(df)
   True
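
.. rubric:: Sketch: CSV lookup with fallback

The lookup-with-fallback behaviour described for :func:`get_latest_dataset_url`
might look roughly like the sketch below. It is not the module's actual
implementation, which is not shown in this reference. It assumes the standard
CKAN ``package_show`` response shape (``success`` flag, ``result.resources``
list, per-resource ``format`` and ``url`` keys) and works on an already-parsed
JSON payload so that it runs without network access. The names
``pick_csv_url`` and ``EXAMPLE_FALLBACK_URL`` are hypothetical.

.. code-block:: python

   from typing import Any

   # Hypothetical placeholder: the real FALLBACK_CSV_URL is truncated in this
   # reference, so it is deliberately not reproduced here.
   EXAMPLE_FALLBACK_URL = "https://example.invalid/stop-and-search.csv"


   def pick_csv_url(package: dict[str, Any], fallback: str) -> str:
       """Return the first CSV resource URL from a parsed CKAN payload.

       Falls back to ``fallback`` when the payload reports failure or
       lists no CSV resource with a URL.
       """
       if not package.get("success"):
           return fallback
       resources = (package.get("result") or {}).get("resources") or []
       for resource in resources:
           is_csv = str(resource.get("format", "")).upper() == "CSV"
           if is_csv and resource.get("url"):
               return resource["url"]
       return fallback


   # Made-up payload mimicking a package_show response.
   sample = {
       "success": True,
       "result": {
           "resources": [
               {"format": "PDF", "url": "https://example.invalid/notes.pdf"},
               {"format": "csv", "url": "https://example.invalid/data.csv"},
           ]
       },
   }
   print(pick_csv_url(sample, EXAMPLE_FALLBACK_URL))
   print(pick_csv_url({"success": False}, EXAMPLE_FALLBACK_URL))

Keeping the selection logic separate from the HTTP request means the fallback
path can be exercised in tests without contacting OpenDataNI.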