bolster.data_sources.daera_waste
================================

.. py:module:: bolster.data_sources.daera_waste

.. autoapi-nested-parse::

   DAERA NI Local Authority Collected (LAC) Municipal Waste Statistics.

   Quarterly time-series data on local-authority-collected municipal waste management
   across Northern Ireland, published by the Department of Agriculture, Environment
   and Rural Affairs (DAERA).

   Data Source:
       **Discovery page**: https://www.daera-ni.gov.uk/publications/northern-ireland-local-authority-collected-municipal-waste-management-statistics-time-series-data

       The module scrapes the DAERA publications page to auto-discover the current CSV
       URL (which changes with each release, e.g. ``2026-04/...``). It then downloads the
       time-series CSV and returns a tidy long-format DataFrame.

   Update Frequency:
       Quarterly (provisional) with finalised annual revisions. The current series
       runs from Q1 2006/07 to the most recent available quarter.

   Geographic Coverage:
       All NI council areas including both pre- and post-2015 boundaries, plus a
       Northern Ireland aggregate row. The 11 post-2015 LGD councils are:
       Antrim & Newtownabbey, Ards & North Down, Armagh City Banbridge & Craigavon,
       Belfast, Causeway Coast & Glens, Derry City & Strabane, Fermanagh & Omagh,
       Lisburn & Castlereagh, Mid & East Antrim, Mid Ulster, Newry Mourne & Down.

   .. rubric:: Example

   >>> from bolster.data_sources import daera_waste
   >>> df = daera_waste.get_latest_waste_statistics()
   >>> 'council_area' in df.columns
   True
   >>> 'tonnes' in df.columns
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.daera_waste.logger
   bolster.data_sources.daera_waste.DAERA_PUBLICATION_PAGE
   bolster.data_sources.daera_waste.DAERA_BASE_URL
   bolster.data_sources.daera_waste.NI_COUNCILS_POST_2015


Exceptions
----------

.. autoapisummary::

   bolster.data_sources.daera_waste.DAERADataNotFoundError
   bolster.data_sources.daera_waste.DAERAValidationError


Functions
---------

.. autoapisummary::

   bolster.data_sources.daera_waste.get_waste_publication_url
   bolster.data_sources.daera_waste.parse_waste_file
   bolster.data_sources.daera_waste.get_latest_waste_statistics
   bolster.data_sources.daera_waste.validate_waste_data


Module Contents
---------------

.. py:data:: logger

.. py:data:: DAERA_PUBLICATION_PAGE
   :value: 'https://www.daera-ni.gov.uk/publications/northern-ireland-local-authority-collected-municipal-wa...

.. py:data:: DAERA_BASE_URL
   :value: 'https://www.daera-ni.gov.uk'

.. py:data:: NI_COUNCILS_POST_2015

.. py:exception:: DAERADataNotFoundError

   Bases: :py:obj:`Exception`

   DAERA data file or publication page could not be located.

   Initialize self. See help(type(self)) for accurate signature.


.. py:exception:: DAERAValidationError

   Bases: :py:obj:`Exception`

   DAERA DataFrame failed validation checks.

   Initialize self. See help(type(self)) for accurate signature.


.. py:function:: get_waste_publication_url(prefer = 'csv')

   Scrape the DAERA publications page for the latest LAC waste CSV/Excel URL.

   The URL contains a date component (e.g. ``2026-04/``) that changes with each
   release, so this function fetches the page and finds the current link.

   :param prefer: Preferred file type: ``"csv"`` (default) or ``"xlsx"``.

   :returns: Absolute URL of the latest time-series file.

   :raises DAERADataNotFoundError: If the publication page cannot be fetched or no
       matching link is found.

   .. rubric:: Example

   >>> url = get_waste_publication_url()
   >>> url.endswith(".csv") or url.endswith(".xlsx")
   True
   >>> "daera-ni.gov.uk" in url
   True


.. py:function:: parse_waste_file(file_path)

   Parse a DAERA LAC municipal waste time-series CSV file.

   Reads the CSV (which uses commas as thousands separators in numeric columns),
   renames columns to clean internal names, and returns a tidy long-format DataFrame.

   Metadata columns (``QuarterCode``, ``QuarterName``, ``FinancialYear``,
   ``AreaCode``, ``AreaName``, ``WasteManagementGroup``, ``DataStatus``) are
   retained alongside all numeric waste metric columns.

   :param file_path: Path to a downloaded ``.csv`` waste time-series file.

   :returns: DataFrame with one row per (quarter, council area) and columns including
       ``financial_year``, ``quarter_code``, ``quarter_name``, ``area_code``,
       ``council_area``, ``waste_management_group``, ``data_status``, plus numeric
       waste metrics.

   :raises DAERAValidationError: If the file cannot be read or lacks the expected
       structure.

   .. rubric:: Example

   >>> import tempfile, pathlib
   >>> # In practice, use get_latest_waste_statistics() instead
   >>> # parse_waste_file(pathlib.Path("/path/to/download.csv"))


.. py:function:: get_latest_waste_statistics(force_refresh = False)

   Download and parse the latest DAERA LAC municipal waste statistics.

   Scrapes the DAERA publications page for the current CSV URL (handling
   date-stamped paths that change with each release), downloads the file with
   30-day caching, and returns a parsed DataFrame.

   :param force_refresh: If ``True``, bypass the local cache and re-download.

   :returns: DataFrame from :func:`parse_waste_file`.

   :raises DAERADataNotFoundError: If the publication page or file cannot be fetched.
   :raises DAERAValidationError: If the downloaded file cannot be parsed.

   .. rubric:: Example

   >>> df = get_latest_waste_statistics()
   >>> 'council_area' in df.columns
   True
   >>> (df['lac_waste_arisings_tonnes'] >= 0).all()
   True


.. py:function:: validate_waste_data(df)

   Validate a DAERA LAC municipal waste DataFrame.

   :param df: DataFrame from :func:`get_latest_waste_statistics` or
       :func:`parse_waste_file`.

   :returns: ``True`` if all checks pass.

   :raises DAERAValidationError: If the DataFrame is empty, missing required columns,
       has negative tonnage values, lacks expected NI councils, or covers an
       implausibly short time span.

   .. rubric:: Example

   >>> df = get_latest_waste_statistics()
   >>> validate_waste_data(df)
   True
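
The checks listed for ``validate_waste_data`` can be illustrated with a minimal
sketch. This is not the module's implementation: ``sketch_validate``,
``SketchValidationError``, ``REQUIRED_COLUMNS``, ``EXPECTED_COUNCILS`` and
``MIN_FINANCIAL_YEARS`` are hypothetical names, the required-column set and the
time-span threshold are assumptions, and the council spellings may differ from
those in the published CSV. The sample rows are synthetic, not DAERA figures.

.. code-block:: python

   import pandas as pd

   # Assumed values for illustration only; the real module may use different
   # required columns, council names and thresholds.
   REQUIRED_COLUMNS = {"financial_year", "council_area", "lac_waste_arisings_tonnes"}
   EXPECTED_COUNCILS = {"Belfast", "Mid Ulster"}  # deliberately a small subset
   MIN_FINANCIAL_YEARS = 2


   class SketchValidationError(Exception):
       """Hypothetical stand-in for DAERAValidationError."""


   def sketch_validate(df: pd.DataFrame) -> bool:
       """Run the five documented checks in order; raise on the first failure."""
       if df.empty:
           raise SketchValidationError("DataFrame is empty")

       missing = REQUIRED_COLUMNS - set(df.columns)
       if missing:
           raise SketchValidationError(f"Missing required columns: {sorted(missing)}")

       # Coerce to numeric so stray strings become NaN rather than raising here.
       tonnes = pd.to_numeric(df["lac_waste_arisings_tonnes"], errors="coerce")
       if (tonnes < 0).any():
           raise SketchValidationError("Negative tonnage values found")

       absent = EXPECTED_COUNCILS - set(df["council_area"])
       if absent:
           raise SketchValidationError(f"Expected councils absent: {sorted(absent)}")

       if df["financial_year"].nunique() < MIN_FINANCIAL_YEARS:
           raise SketchValidationError("Time span is implausibly short")

       return True


   # Synthetic rows, used only to exercise the checks.
   sample = pd.DataFrame(
       {
           "financial_year": ["2022/23", "2022/23", "2023/24", "2023/24"],
           "council_area": ["Belfast", "Mid Ulster", "Belfast", "Mid Ulster"],
           "lac_waste_arisings_tonnes": [100.0, 50.0, 110.0, 55.0],
       }
   )
   print(sketch_validate(sample))

Raising on the first failed check keeps the error message specific to a single
problem; a validator could equally collect every failure and report them together.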