bolster.data_sources.psni.crime_statistics

PSNI Police Recorded Crime Statistics.

Provides access to police recorded crime statistics for Northern Ireland.

Data includes:

  • Monthly crime counts by crime type and policing district

  • Geographic breakdown by 11 policing districts (aligned with LGDs)

  • Outcome data (charges, cautions, etc.) by district

  • Historical time series from April 2001 to December 2021

  • Integration with NISRA datasets via LGD and NUTS3 codes

Data Source:

Primary Source: OpenDataNI - Police Recorded Crime in Northern Ireland

https://www.opendatani.gov.uk/dataset/police-recorded-crime-in-northern-ireland

DATA LIMITATION — STALE SINCE JANUARY 2022:

The OpenDataNI dataset was last updated on 27 January 2022 and contains data only through December 2021. PSNI stopped pushing updates to OpenDataNI after that date. The PSNI official statistics page publishes quarterly Excel files with current data, but psni.police.uk is protected by Cloudflare, which blocks automated downloads.

Calling get_latest_crime_statistics() will raise PSNIDataStaleError to make this limitation explicit. The historical data (Apr 2001–Dec 2021) remains accessible via get_historical_crime_statistics().
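
The fallback this behaviour implies can be sketched as follows. This is a minimal, self-contained illustration rather than the library's own code: the import location of PSNIDataStaleError is not shown on this page, so the exception and both loader functions below are stand-ins, and load_crime_statistics is a hypothetical helper name.

```python
class PSNIDataStaleError(Exception):
    """Stand-in for the library's exception, defined here so the sketch runs alone."""


def get_latest_crime_statistics():
    # Stand-in mirroring the documented behaviour: this call always raises.
    raise PSNIDataStaleError("OpenDataNI source has not been updated since January 2022")


def get_historical_crime_statistics():
    # Stand-in returning a placeholder string instead of the real DataFrame.
    return "historical data (Apr 2001 to Dec 2021)"


def load_crime_statistics():
    """Hypothetical helper: try the latest data, fall back to the historical series."""
    try:
        return get_latest_crime_statistics(), False
    except PSNIDataStaleError:
        # The second element flags that the caller received stale data.
        return get_historical_crime_statistics(), True


data, is_stale = load_crime_statistics()
```

Returning an explicit staleness flag keeps the limitation visible to downstream code instead of silently substituting older data.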

For 2022+ data, consult PSNI directly:

  • Official stats page: https://www.psni.police.uk/about-us/our-publications-and-reports/official-statistics/police-recorded-crime-statistics

  • Contact: statistics@psni.police.uk

Update Frequency: Quarterly (end of Jan, May, Jul, Oct); STALE SINCE 2022

Geographic Coverage: Northern Ireland (11 policing districts + NI total)

Reference Date: Month of crime occurrence

Time Coverage: April 2001 to December 2021

Example

>>> from bolster.data_sources.psni import crime_statistics
>>> df = crime_statistics.get_historical_crime_statistics()
>>> sorted(df.columns.tolist())
['calendar_year', 'count', 'crime_type', 'data_measure', 'date', 'lgd_code', 'month', 'nuts3_code', 'nuts3_name', 'policing_district']
>>> belfast_lgd = crime_statistics.get_lgd_code('Belfast City')
>>> belfast_lgd
'N09000003'

Attributes

logger

CRIME_STATISTICS_URL

DATA_GUIDE_URL

PSNI_OFFICIAL_STATS_URL

PSNI_STATISTICS_EMAIL

Functions

get_data_source_info()

Get information about crime statistics data sources.

parse_crime_statistics_file(file_path[, ...])

Parse PSNI crime statistics CSV file.

get_latest_crime_statistics([force_refresh, ...])

Raises PSNIDataStaleError — use get_historical_crime_statistics() instead.

get_historical_crime_statistics([force_refresh, ...])

Get historical police recorded crime statistics (April 2001 – December 2021).

validate_crime_statistics(df)

Validate crime statistics data integrity.

filter_by_district(df, district)

Filter crime statistics to specific policing district(s).

filter_by_crime_type(df, crime_type)

Filter crime statistics to specific crime type(s).

filter_by_date_range(df[, start_date, end_date])

Filter crime statistics to a date range.

get_total_crimes_by_district(df[, year])

Calculate total recorded crimes by policing district.

get_crime_trends(df[, crime_type, district, measure])

Get monthly crime trends for a specific crime type and district.

get_outcome_rates_by_district(df[, year, crime_type])

Calculate crime outcome rates by policing district.

get_available_crime_types(df)

Get list of all crime types in the dataset.

get_available_districts(df)

Get list of all policing districts in the dataset.

Module Contents

bolster.data_sources.psni.crime_statistics.logger
bolster.data_sources.psni.crime_statistics.CRIME_STATISTICS_URL = 'https://admin.opendatani.gov.uk/dataset/80dc9542-7b2a-48f5-bbf4-ccc7040d36af/resource/6fd51851-d...
bolster.data_sources.psni.crime_statistics.DATA_GUIDE_URL = 'https://admin.opendatani.gov.uk/dataset/80dc9542-7b2a-48f5-bbf4-ccc7040d36af/resource/51cd6a9e-6...
bolster.data_sources.psni.crime_statistics.PSNI_OFFICIAL_STATS_URL = 'https://www.psni.police.uk/about-us/our-publications-and-reports/official-statistics/police-reco...
bolster.data_sources.psni.crime_statistics.PSNI_STATISTICS_EMAIL = 'statistics@psni.police.uk'

bolster.data_sources.psni.crime_statistics.get_data_source_info()

Get information about crime statistics data sources.

Returns a dictionary with URLs and contact information for accessing PSNI crime statistics. Use this when you need data beyond December 2021.

Returns:

Dictionary with keys:

  • opendatani_url: OpenDataNI dataset URL (data through Dec 2021)

  • data_guide_url: PDF data guide URL

  • psni_official_url: PSNI official statistics page (current data)

  • contact_email: PSNI Statistics Branch email

  • data_limitation: Description of OpenDataNI data limitations

  • last_update: Last known update date for OpenDataNI

Return type:

dict

Example

>>> info = get_data_source_info()
>>> sorted(info.keys())
['contact_email', 'data_guide_url', 'data_limitation', 'last_update', 'opendatani_url', 'psni_official_url']

bolster.data_sources.psni.crime_statistics.parse_crime_statistics_file(file_path, add_geographic_codes=True)

Parse PSNI crime statistics CSV file.

The file is in long format with columns for year, month, district, crime type, data measure, and count. This function reads the CSV, cleans column names, adds date parsing, and optionally adds LGD and NUTS3 geographic codes for cross-dataset integration.

Parameters:
  • file_path (str | pathlib.Path) – Path to the crime statistics CSV file

  • add_geographic_codes (bool) – If True, add LGD and NUTS3 code columns

Returns:

DataFrame with columns:

  • calendar_year: int (year of crime)

  • month: str (month name: Apr, May, …, Dec)

  • policing_district: str (district name or “Northern Ireland”)

  • crime_type: str (Home Office crime classification)

  • data_measure: str (type of measure: crime count, outcome number, or outcome rate)

  • count: float (value; can be a count or a percentage)

  • date: datetime (first day of month)

  • lgd_code: str (ONS LGD code, if add_geographic_codes=True)

  • nuts3_code: str (NUTS3 region code, if add_geographic_codes=True)

  • nuts3_name: str (NUTS3 region name, if add_geographic_codes=True)

Return type:

pandas.DataFrame

Raises:

PSNIValidationError – If file structure is unexpected

Example

>>> path = download_file(CRIME_STATISTICS_URL, cache_ttl_hours=24*7)
>>> df = parse_crime_statistics_file(path)
>>> 'crime_type' in df.columns
True
>>> len(df) > 0
True
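
The parsing steps described above (clean column names, convert types, derive a first-of-month date) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: the raw header names and the sample counts are made up, parse_rows is a hypothetical helper, and plain dictionaries stand in for the pandas DataFrame.

```python
import csv
import io
from datetime import date

# Month abbreviations in calendar order, used to turn "Nov" into 11.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Made-up sample in long format; the raw header spelling is an assumption.
SAMPLE_CSV = """Calendar_Year,Month,Policing_District,Crime_Type,Data_Measure,Count
2021,Nov,Belfast City,Total police recorded crime,Police Recorded Crime,10
2021,Dec,Belfast City,Total police recorded crime,Police Recorded Crime,12
"""


def parse_rows(text):
    """Hypothetical helper: read long-format CSV text into cleaned dict rows."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Clean column names: strip whitespace and lower-case them.
        row = {key.strip().lower(): value.strip() for key, value in raw.items()}
        row["calendar_year"] = int(row["calendar_year"])
        row["count"] = float(row["count"])
        # Date parsing: the first day of the recorded month.
        row["date"] = date(row["calendar_year"], MONTHS.index(row["month"]) + 1, 1)
        rows.append(row)
    return rows


rows = parse_rows(SAMPLE_CSV)
```

Looking the month up in an explicit tuple avoids the locale dependence of strptime's %b directive.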

bolster.data_sources.psni.crime_statistics.get_latest_crime_statistics(force_refresh=False, add_geographic_codes=True)

Raises PSNIDataStaleError — use get_historical_crime_statistics() instead.

The OpenDataNI source was last updated January 2022. PSNI’s official site publishes current data but is Cloudflare-protected and inaccessible to automated downloads. Use get_historical_crime_statistics() to access the data available (Apr 2001–Dec 2021).

Raises:

PSNIDataStaleError – Always — this data source has no accessible update.

bolster.data_sources.psni.crime_statistics.get_historical_crime_statistics(force_refresh=False, add_geographic_codes=True)

Get historical police recorded crime statistics (April 2001 – December 2021).

Downloads the crime statistics CSV from OpenDataNI. This dataset covers April 2001 through December 2021 and has not been updated since January 2022. For 2022+ data, consult PSNI directly.

Parameters:
  • force_refresh (bool) – If True, bypass cache and download fresh data

  • add_geographic_codes (bool) – If True, add LGD and NUTS3 code columns

Returns:

DataFrame with columns: date, calendar_year, month, policing_district, crime_type, data_measure, count, lgd_code, nuts3_code, nuts3_name

Return type:

pandas.DataFrame

Raises:

Example

>>> df = get_historical_crime_statistics()
>>> sorted(df.columns.tolist())
['calendar_year', 'count', 'crime_type', 'data_measure', 'date', 'lgd_code', 'month', 'nuts3_code', 'nuts3_name', 'policing_district']
>>> df['date'].max().year
2021

bolster.data_sources.psni.crime_statistics.validate_crime_statistics(df)

Validate crime statistics data integrity.

Performs sanity checks on the crime statistics data:

  • Non-negative crime counts

  • Reasonable date ranges

  • Expected policing districts present

  • No unexpected missing data

Parameters:

df (pandas.DataFrame) – DataFrame from parse_crime_statistics_file or get_historical_crime_statistics

Returns:

True if validation passes

Raises:

PSNIValidationError – If validation fails

Return type:

bool

Example

>>> df = get_historical_crime_statistics()
>>> validate_crime_statistics(df)
True
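
The four checks listed for this function can be sketched on plain dictionaries as below. This is only an illustration of the kind of checks described: ValidationError stands in for PSNIValidationError, validate_rows is a hypothetical helper, the expected-district set is a two-item subset, and the date bounds are taken from the documented coverage rather than from the library's source.

```python
from datetime import date


class ValidationError(Exception):
    """Hypothetical stand-in for the library's PSNIValidationError."""


# Bounds taken from the documented coverage (April 2001 to December 2021).
EARLIEST, LATEST = date(2001, 4, 1), date(2021, 12, 1)
# Illustrative subset; a full check would list all 11 districts plus the NI total.
EXPECTED_DISTRICTS = {"Belfast City", "Northern Ireland"}


def validate_rows(rows):
    """Return True if every check passes, otherwise raise ValidationError."""
    for row in rows:
        if row["count"] is None:
            raise ValidationError(f"missing count in {row}")
        if row["count"] < 0:
            raise ValidationError(f"negative count in {row}")
        if not EARLIEST <= row["date"] <= LATEST:
            raise ValidationError(f"date out of range in {row}")
    missing = EXPECTED_DISTRICTS - {row["policing_district"] for row in rows}
    if missing:
        raise ValidationError(f"districts not found: {sorted(missing)}")
    return True


good_rows = [
    {"policing_district": "Belfast City", "date": date(2021, 12, 1), "count": 12.0},
    {"policing_district": "Northern Ireland", "date": date(2021, 12, 1), "count": 40.0},
]
# Same data with one negative count, which should be rejected.
bad_rows = [dict(good_rows[0], count=-1.0), good_rows[1]]

passed = validate_rows(good_rows)
try:
    validate_rows(bad_rows)
    bad_rejected = False
except ValidationError:
    bad_rejected = True
```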

bolster.data_sources.psni.crime_statistics.filter_by_district(df, district)

Filter crime statistics to specific policing district(s).

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

  • district (str | list[str]) – District name(s) to filter (e.g., “Belfast City” or [“Belfast City”, “Derry City & Strabane”])

Returns:

Filtered DataFrame

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> belfast = filter_by_district(df, "Belfast City")
>>> belfast['policing_district'].unique().tolist()
['Belfast City']
>>>
>>> # Multiple districts
>>> cities = filter_by_district(df, ["Belfast City", "Derry City & Strabane"])
>>> len(cities['policing_district'].unique()) == 2
True
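
Accepting either a single name or a list of names usually comes down to normalising the argument to a set before matching. The sketch below shows that idea on plain dictionaries; filter_rows_by_district is a hypothetical helper and the counts are made up, so it should not be read as the library's implementation.

```python
def filter_rows_by_district(rows, district):
    """Keep rows whose district matches a single name or any name in a list."""
    # A bare string is wrapped in a set so both call styles share one code path.
    wanted = {district} if isinstance(district, str) else set(district)
    return [row for row in rows if row["policing_district"] in wanted]


# Made-up counts for three of the documented district names.
rows = [
    {"policing_district": "Belfast City", "count": 12.0},
    {"policing_district": "Derry City & Strabane", "count": 5.0},
    {"policing_district": "Northern Ireland", "count": 40.0},
]

belfast = filter_rows_by_district(rows, "Belfast City")
cities = filter_rows_by_district(rows, ["Belfast City", "Derry City & Strabane"])
```

The same normalisation would apply to filtering by crime type.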

bolster.data_sources.psni.crime_statistics.filter_by_crime_type(df, crime_type)

Filter crime statistics to specific crime type(s).

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

  • crime_type (str | list[str]) – Crime type(s) to filter (e.g., “Burglary” or [“Violence with injury”, “Robbery”])

Returns:

Filtered DataFrame

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> violence = filter_by_crime_type(df, "Violence with injury (including homicide & death/serious injury by unlawful driving)")
>>> len(violence) > 0
True

bolster.data_sources.psni.crime_statistics.filter_by_date_range(df, start_date=None, end_date=None)

Filter crime statistics to a date range.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

  • start_date (str | datetime.datetime | None) – Start date (inclusive), e.g., “2020-01-01” or datetime

  • end_date (str | datetime.datetime | None) – End date (inclusive), e.g., “2021-12-31” or datetime

Returns:

Filtered DataFrame

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> # Get 2020 data
>>> df_2020 = filter_by_date_range(df, "2020-01-01", "2020-12-31")
>>> df_2020['calendar_year'].unique().tolist()
[2020]
>>>
>>> # Get data from 2018 onwards
>>> recent = filter_by_date_range(df, start_date="2018-01-01")
>>> len(recent) > 0
True
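
Because each record is dated to the first day of its month, inclusive bounds with optional ends behave as in the sketch below. It is a standard-library illustration with hypothetical helper names, and it accepts ISO strings or date objects only, which is narrower than the documented str | datetime parameters.

```python
from datetime import date


def _as_date(value):
    """Accept an ISO string such as "2020-01-01", a date, or None."""
    return date.fromisoformat(value) if isinstance(value, str) else value


def filter_rows_by_date_range(rows, start_date=None, end_date=None):
    """Keep rows whose date lies within the inclusive, optionally open-ended range."""
    start, end = _as_date(start_date), _as_date(end_date)
    return [
        row for row in rows
        if (start is None or row["date"] >= start)
        and (end is None or row["date"] <= end)
    ]


# Two months (January and December) for each of three years.
rows = [{"date": date(year, month, 1)} for year in (2019, 2020, 2021) for month in (1, 12)]

only_2020 = filter_rows_by_date_range(rows, "2020-01-01", "2020-12-31")
from_2020 = filter_rows_by_date_range(rows, start_date="2020-01-01")
```

An end date of 2020-12-31 still includes December 2020, since that month's rows carry the date 2020-12-01.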

bolster.data_sources.psni.crime_statistics.get_total_crimes_by_district(df, year=None)

Calculate total recorded crimes by policing district.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

  • year (int | None) – Optional year to filter (uses all years if None)

Returns:

DataFrame with columns: policing_district, lgd_code, nuts3_code, total_crimes

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> totals_2021 = get_total_crimes_by_district(df, year=2021)
>>> sorted(totals_2021.columns.tolist())
['lgd_code', 'nuts3_code', 'policing_district', 'total_crimes']
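
The aggregation amounts to an optional year filter followed by a per-district sum. The sketch below shows that shape on plain dictionaries. Which measure and crime type rows the library actually sums is not stated here, so the sample rows are assumed to be already restricted to a single count measure and crime type; total_by_district is a hypothetical helper and the counts are made up.

```python
from collections import defaultdict


def total_by_district(rows, year=None):
    """Sum counts per district, optionally restricted to one calendar year."""
    totals = defaultdict(float)
    for row in rows:
        if year is not None and row["calendar_year"] != year:
            continue
        totals[row["policing_district"]] += row["count"]
    return dict(totals)


# Made-up monthly counts, assumed to share one measure and one crime type.
rows = [
    {"policing_district": "Belfast City", "calendar_year": 2020, "count": 8.0},
    {"policing_district": "Belfast City", "calendar_year": 2021, "count": 10.0},
    {"policing_district": "Belfast City", "calendar_year": 2021, "count": 12.0},
    {"policing_district": "Derry City & Strabane", "calendar_year": 2021, "count": 5.0},
]

totals_2021 = total_by_district(rows, year=2021)
totals_all = total_by_district(rows)
```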

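bolster.data_sources.psni.crime_statistics.get_crime_trends(df[, crime_type, district, measure])

Get monthly crime trends for a specific crime type and district.

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics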

  • crime_type (str) – Crime type to analyze (default: total crimes)

  • district (str) – Policing district (default: Northern Ireland total)

  • measure (str) – Data measure to use (default: Police Recorded Crime)

Returns:

DataFrame with columns: date, calendar_year, month, count

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> trends = get_crime_trends(df, district="Belfast City")
>>> sorted(trends.columns.tolist())
['calendar_year', 'count', 'date', 'month']
>>> len(trends) > 0
True

bolster.data_sources.psni.crime_statistics.get_outcome_rates_by_district(df, year=None, crime_type='Total police recorded crime')

Calculate crime outcome rates by policing district.

Outcome rate represents the percentage of crimes with an outcome (charge, caution, community resolution, etc.).

Parameters:
  • df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

  • year (int | None) – Optional year to filter (uses all years if None)

  • crime_type (str) – Crime type to analyze (default: total crimes)

Returns:

DataFrame with columns: policing_district, lgd_code, average_outcome_rate

Return type:

pandas.DataFrame

Example

>>> df = get_historical_crime_statistics()
>>> outcomes = get_outcome_rates_by_district(df, year=2021)
>>> 'average_outcome_rate' in outcomes.columns
True
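
One plausible reading of average_outcome_rate is the unweighted mean of the monthly outcome-rate rows for each district, and that reading is what the sketch below implements. Treat it as an assumption: the exact data_measure label for outcome rates is not given on this page, the library may weight by crime counts instead, average_outcome_rate is a hypothetical helper, and the percentages are made up.

```python
from collections import defaultdict
from statistics import mean

# Assumed label; the exact data_measure string for outcome rates is not documented here.
OUTCOME_RATE_MEASURE = "Outcome rate"


def average_outcome_rate(rows, year=None):
    """Unweighted mean of monthly outcome-rate percentages per district."""
    rates = defaultdict(list)
    for row in rows:
        if row["data_measure"] != OUTCOME_RATE_MEASURE:
            continue  # skip crime counts and outcome numbers
        if year is not None and row["calendar_year"] != year:
            continue
        rates[row["policing_district"]].append(row["count"])
    return {district: mean(values) for district, values in rates.items()}


# Made-up rows: two monthly rates and one crime-count row that must be ignored.
rows = [
    {"policing_district": "Belfast City", "calendar_year": 2021,
     "data_measure": "Outcome rate", "count": 25.0},
    {"policing_district": "Belfast City", "calendar_year": 2021,
     "data_measure": "Outcome rate", "count": 35.0},
    {"policing_district": "Belfast City", "calendar_year": 2021,
     "data_measure": "Police Recorded Crime", "count": 12.0},
]

averages = average_outcome_rate(rows, year=2021)
```

An unweighted mean treats a quiet month and a busy month equally, so it can differ from the rate computed over all crimes in the year.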

bolster.data_sources.psni.crime_statistics.get_available_crime_types(df)

Get list of all crime types in the dataset.

Parameters:

df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

Returns:

Sorted list of crime type names

Return type:

list[str]

Example

>>> df = get_historical_crime_statistics()
>>> crime_types = get_available_crime_types(df)
>>> isinstance(crime_types, list)
True
>>> 'Total police recorded crime' in crime_types
True

bolster.data_sources.psni.crime_statistics.get_available_districts(df)

Get list of all policing districts in the dataset.

Parameters:

df (pandas.DataFrame) – DataFrame from get_historical_crime_statistics

Returns:

Sorted list of district names

Return type:

list[str]

Example

>>> df = get_historical_crime_statistics()
>>> districts = get_available_districts(df)
>>> isinstance(districts, list)
True
>>> 'Northern Ireland' in districts
True