bolster.data_sources.dva
========================

.. py:module:: bolster.data_sources.dva

.. autoapi-nested-parse::

   DVA (Driver & Vehicle Agency) Monthly Tests Statistics Module.

   This module provides access to Northern Ireland's Driver & Vehicle Agency
   monthly test statistics, including vehicle tests, driver tests, and theory
   tests. Data is published monthly by the Department for Infrastructure (DfI)
   Northern Ireland.

   Data Coverage:
       - Vehicle Tests (Full & Retests): April 2014 - Present
       - Driver Tests: April 2014 - Present
       - Theory Tests: April 2014 - Present
       - Test breakdowns by category and test centre

   Data Source:
       The Department for Infrastructure Northern Ireland publishes Driver &
       Vehicle Agency statistics through its publications portal at
       https://www.infrastructure-ni.gov.uk/publications?f%5B0%5D=type%3Astatisticalreports.
       The monthly statistics cover vehicle tests, driver tests, and theory
       tests conducted across Northern Ireland.

   Update Frequency:
       Each monthly publication covers the previous month's test statistics.
       The Department for Infrastructure Analytics Branch releases the data
       approximately 4-6 weeks after the reference month ends.

   Publication Details:
       - Published by: Department for Infrastructure (DfI) - Analytics Branch
       - Data Source: DVA Business & Regulatory Statistics

   .. rubric:: Example

   >>> from bolster.data_sources import dva
   >>> # Get latest vehicle test statistics
   >>> df = dva.get_latest_vehicle_tests()
   >>> 'tests_conducted' in df.columns
   True
   >>> # Get latest driver test statistics
   >>> df = dva.get_latest_driver_tests()
   >>> len(df) > 0
   True
   >>> # Get latest theory test statistics
   >>> df = dva.get_latest_theory_tests()
   >>> len(df) > 0
   True
   >>> # Get all test types combined
   >>> data = dva.get_latest_all_tests()
   >>> sorted(data.keys())
   ['driver', 'theory', 'vehicle']


Attributes
----------

.. autoapisummary::

   bolster.data_sources.dva.logger
   bolster.data_sources.dva.CACHE_DIR
   bolster.data_sources.dva.DVA_PUBLICATIONS_URL
   bolster.data_sources.dva.DVA_SEARCH_TERM


Exceptions
----------

.. autoapisummary::

   bolster.data_sources.dva.DVADataError
   bolster.data_sources.dva.DVADataNotFoundError


Functions
---------

.. autoapisummary::

   bolster.data_sources.dva.get_latest_dva_publication_url
   bolster.data_sources.dva.parse_vehicle_tests
   bolster.data_sources.dva.parse_driver_tests
   bolster.data_sources.dva.parse_theory_tests
   bolster.data_sources.dva.get_latest_vehicle_tests
   bolster.data_sources.dva.get_latest_driver_tests
   bolster.data_sources.dva.get_latest_theory_tests
   bolster.data_sources.dva.get_latest_all_tests
   bolster.data_sources.dva.get_tests_by_year
   bolster.data_sources.dva.get_tests_by_month
   bolster.data_sources.dva.calculate_growth_rates
   bolster.data_sources.dva.get_summary_statistics
   bolster.data_sources.dva.validate_dva_test_data


Module Contents
---------------

.. py:data:: logger

.. py:data:: CACHE_DIR

.. py:data:: DVA_PUBLICATIONS_URL
   :value: 'https://www.infrastructure-ni.gov.uk/publications/type/statistics'


.. py:data:: DVA_SEARCH_TERM
   :value: 'driver-and-vehicle-agency-monthly-tests-conducted'


.. py:exception:: DVADataError

   Bases: :py:obj:`Exception`

   Base exception for DVA data errors.


.. py:exception:: DVADataNotFoundError

   Bases: :py:obj:`DVADataError`

   Data file not available.


.. py:function:: get_latest_dva_publication_url()

   Get the URL of the latest DVA Monthly Tests publication.

   Attempts to find the most recent DVA monthly tests statistics publication
   by trying recent months in reverse order.

   :returns: Tuple of (excel_url, publication_title, publication_date)

   :raises DVADataNotFoundError: If unable to find any recent publication

   .. rubric:: Example

   >>> url, title, pub_date = get_latest_dva_publication_url()
   >>> url.startswith('https://')
   True


.. py:function:: parse_vehicle_tests(file_path)

   Parse DVA vehicle tests data from Excel file.

   Extracts full vehicle tests conducted from Table 1.1a.

   :param file_path: Path to the DVA Excel file

   :returns: DataFrame with columns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - tests_conducted: int (full tests conducted)
             - rolling_12_month_total: int (optional, rolling 12-month sum)

   .. rubric:: Example

   >>> url, _, _ = get_latest_dva_publication_url()
   >>> path = _download_file(url)
   >>> df = parse_vehicle_tests(path)
   >>> 'tests_conducted' in df.columns
   True
   >>> len(df) > 0
   True


.. py:function:: parse_driver_tests(file_path)

   Parse DVA driver tests data from Excel file.

   Extracts driver tests conducted from Table 2.1.

   :param file_path: Path to the DVA Excel file

   :returns: DataFrame with columns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - tests_conducted: int (driver tests conducted)
             - rolling_12_month_total: int (optional, rolling 12-month sum)

   .. rubric:: Example

   >>> url, _, _ = get_latest_dva_publication_url()
   >>> path = _download_file(url)
   >>> df = parse_driver_tests(path)
   >>> 'tests_conducted' in df.columns
   True
   >>> len(df) > 0
   True


.. py:function:: parse_theory_tests(file_path)

   Parse DVA theory tests data from Excel file.

   Extracts theory tests conducted from Table 3.1.

   :param file_path: Path to the DVA Excel file

   :returns: DataFrame with columns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - tests_conducted: int (theory tests conducted)
             - rolling_12_month_total: int (optional, rolling 12-month sum)

   .. rubric:: Example

   >>> url, _, _ = get_latest_dva_publication_url()
   >>> path = _download_file(url)
   >>> df = parse_theory_tests(path)
   >>> 'tests_conducted' in df.columns
   True
   >>> len(df) > 0
   True


.. py:function:: get_latest_vehicle_tests(force_refresh = False)

   Get the latest vehicle test statistics.

   Downloads and parses the most recent DVA monthly tests publication.
   Results are cached for 7 days unless ``force_refresh=True``.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns: DataFrame with monthly vehicle test data

   .. rubric:: Example

   >>> df = get_latest_vehicle_tests()
   >>> 'tests_conducted' in df.columns
   True


.. py:function:: get_latest_driver_tests(force_refresh = False)

   Get the latest driver test statistics.

   Downloads and parses the most recent DVA monthly tests publication.
   Results are cached for 7 days unless ``force_refresh=True``.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns: DataFrame with monthly driver test data

   .. rubric:: Example

   >>> df = get_latest_driver_tests()
   >>> len(df) > 0
   True


.. py:function:: get_latest_theory_tests(force_refresh = False)

   Get the latest theory test statistics.

   Downloads and parses the most recent DVA monthly tests publication.
   Results are cached for 7 days unless ``force_refresh=True``.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns: DataFrame with monthly theory test data

   .. rubric:: Example

   >>> df = get_latest_theory_tests()
   >>> len(df) > 0
   True


.. py:function:: get_latest_all_tests(force_refresh = False)

   Get all test types (vehicle, driver, theory) from the latest publication.

   Downloads the file once and parses all three test types.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns: Dictionary with keys 'vehicle', 'driver', 'theory' containing DataFrames

   .. rubric:: Example

   >>> data = get_latest_all_tests()
   >>> sorted(data.keys())
   ['driver', 'theory', 'vehicle']


.. py:function:: get_tests_by_year(df, year)

   Filter test data for a specific year.

   :param df: Test statistics DataFrame
   :param year: Year to filter for

   :returns: DataFrame with only the specified year's data

   .. rubric:: Example

   >>> df = get_latest_vehicle_tests()
   >>> df_2024 = get_tests_by_year(df, 2024)
   >>> 'tests_conducted' in df_2024.columns
   True


.. py:function:: get_tests_by_month(df, month, year)

   Get test data for a specific month and year.

   :param df: Test statistics DataFrame
   :param month: Month name (e.g., 'January', 'December')
   :param year: Year

   :returns: DataFrame with single row for the specified month

   .. rubric:: Example

   >>> df = get_latest_vehicle_tests()
   >>> dec_2025 = get_tests_by_month(df, 'December', 2025)
   >>> 'tests_conducted' in dec_2025.columns
   True


.. py:function:: calculate_growth_rates(df, periods = 12)

   Calculate year-on-year growth rates for test statistics.

   :param df: Test statistics DataFrame
   :param periods: Number of months for comparison (default: 12 for YoY)

   :returns: DataFrame with an additional column:

             - yoy_growth: Percentage change vs same month previous year

   .. rubric:: Example

   >>> df = get_latest_vehicle_tests()
   >>> df_growth = calculate_growth_rates(df)
   >>> 'yoy_growth' in df_growth.columns
   True


.. py:function:: get_summary_statistics(df, start_year = None, end_year = None)

   Calculate summary statistics for test data.

   :param df: Test statistics DataFrame
   :param start_year: Optional start year for summary
   :param end_year: Optional end year for summary

   :returns: Dictionary with summary statistics:

             - period: Time period covered
             - total_tests: Total tests in period
             - monthly_mean: Average monthly tests
             - monthly_min: Minimum monthly tests
             - monthly_max: Maximum monthly tests
             - months_count: Number of months included

   .. rubric:: Example

   >>> df = get_latest_vehicle_tests()
   >>> stats = get_summary_statistics(df, start_year=2020)
   >>> sorted(stats.keys())
   ['monthly_max', 'monthly_mean', 'monthly_min', 'months_count', 'period', 'total_tests']


.. py:function:: validate_dva_test_data(df)

   Validate DVA test statistics data integrity.

   :param df: DataFrame from DVA test functions (vehicle, driver, or theory tests)

   :returns: True if validation passes, False otherwise
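
The year-on-year figure described for ``calculate_growth_rates`` can be illustrated offline. The following is a minimal sketch, not the module's implementation: it assumes the growth rate is the percentage change against the row ``periods`` months earlier, and it runs on a synthetic DataFrame that only mimics the documented columns. The helper name ``sketch_growth_rates`` and all values are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for the DataFrame returned by the get_latest_* functions.
# The numbers are invented for illustration; they are not DVA data.
dates = pd.date_range("2023-01-01", periods=24, freq="MS")
df = pd.DataFrame(
    {
        "date": dates,
        "year": dates.year,
        "month": dates.month_name(),
        "tests_conducted": [1000 + 10 * i for i in range(24)],
    }
)


def sketch_growth_rates(frame: pd.DataFrame, periods: int = 12) -> pd.DataFrame:
    """Hypothetical helper: percentage change vs. the row `periods` months earlier."""
    out = frame.sort_values("date").reset_index(drop=True)
    previous = out["tests_conducted"].shift(periods)
    # Rows without a comparison month (the first `periods` rows) stay NaN.
    out["yoy_growth"] = (out["tests_conducted"] - previous) / previous * 100
    return out


growth = sketch_growth_rates(df)

# Same kind of year filter that get_tests_by_year is documented to perform.
growth_2024 = growth[growth["year"] == 2024]
print(growth_2024[["month", "year", "tests_conducted", "yoy_growth"]])
```

With monthly data and no gaps, a 12-row shift lines each month up with the same month of the previous year, which is why the first year of any series has no growth value. If months are missing from the data, a row-based shift would compare the wrong months, so a date-based join may be a safer choice in that case.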