bolster.data_sources.nisra.tourism.occupancy
============================================

.. py:module:: bolster.data_sources.nisra.tourism.occupancy

.. autoapi-nested-parse::

   NISRA Monthly Occupancy Statistics Data Source.

   Provides access to monthly hotel and accommodation occupancy data for Northern Ireland.

   Data includes:

   - Hotel room and bed occupancy rates from 2011 to present
   - Rooms and beds sold monthly (hotels)
   - Small Service Accommodation (SSA) occupancy from 2013 to present
     (B&Bs, guest houses, and similar establishments)
   - Rooms and beds sold monthly (SSA)

   The survey provides indicative monthly occupancy rates that are revised and
   finalised in the Annual Publication.

   Data Source:
       **Mother Page**: https://www.nisra.gov.uk/statistics/tourism/occupancy-surveys

       This page lists all occupancy survey publications. The module automatically
       scrapes this page to find the latest "Hotel Occupancy" or "Small Service
       Accommodation" publication, then downloads the Excel file.

   Update Frequency: Monthly (published around the 15th of the following month)

   Geographic Coverage: Northern Ireland

   Reference Date: Month of survey

   .. rubric:: Example

   >>> from bolster.data_sources.nisra.tourism import occupancy
   >>> df = occupancy.get_latest_hotel_occupancy()
   >>> 'room_occupancy' in df.columns
   True
   >>> df_sold = occupancy.get_latest_rooms_beds_sold()
   >>> 'rooms_sold' in df_sold.columns
   True
   >>> df_combined = occupancy.get_combined_occupancy()
   >>> 'accommodation_type' in df_combined.columns
   True


Attributes
----------

.. autoapisummary::

   bolster.data_sources.nisra.tourism.occupancy.logger
   bolster.data_sources.nisra.tourism.occupancy.OCCUPANCY_BASE_URL


Functions
---------

.. autoapisummary::

   bolster.data_sources.nisra.tourism.occupancy.get_latest_hotel_occupancy_publication_url
   bolster.data_sources.nisra.tourism.occupancy.get_latest_ssa_occupancy_publication_url
   bolster.data_sources.nisra.tourism.occupancy.parse_hotel_occupancy_rates
   bolster.data_sources.nisra.tourism.occupancy.parse_rooms_beds_sold
   bolster.data_sources.nisra.tourism.occupancy.get_latest_hotel_occupancy
   bolster.data_sources.nisra.tourism.occupancy.get_latest_rooms_beds_sold
   bolster.data_sources.nisra.tourism.occupancy.parse_ssa_occupancy_rates
   bolster.data_sources.nisra.tourism.occupancy.parse_ssa_rooms_beds_sold
   bolster.data_sources.nisra.tourism.occupancy.get_latest_ssa_occupancy
   bolster.data_sources.nisra.tourism.occupancy.get_latest_ssa_rooms_beds_sold
   bolster.data_sources.nisra.tourism.occupancy.get_combined_occupancy
   bolster.data_sources.nisra.tourism.occupancy.compare_accommodation_types
   bolster.data_sources.nisra.tourism.occupancy.get_occupancy_by_year
   bolster.data_sources.nisra.tourism.occupancy.get_occupancy_summary_by_year
   bolster.data_sources.nisra.tourism.occupancy.get_seasonal_patterns
   bolster.data_sources.nisra.tourism.occupancy.validate_occupancy_data


Module Contents
---------------

.. py:data:: logger

.. py:data:: OCCUPANCY_BASE_URL
   :value: 'https://www.nisra.gov.uk/statistics/tourism/occupancy-surveys'


.. py:function:: get_latest_hotel_occupancy_publication_url()

   Scrape NISRA occupancy surveys page to find the latest hotel occupancy file.

   Navigates the publication structure:

   1. Scrapes mother page for latest hotel occupancy publication
   2. Follows link to publication detail page
   3. Finds hotel occupancy Excel file

   :returns: Tuple of (excel_file_url, publication_date)

   :raises NISRADataNotFoundError: If publication or file not found


.. py:function:: get_latest_ssa_occupancy_publication_url()

   Scrape NISRA occupancy surveys page to find the latest SSA file.

   SSA = Small Service Accommodation (B&Bs, guest houses, etc.)

   Navigates the publication structure:

   1. Scrapes mother page for latest SSA occupancy publication
   2. Follows link to publication detail page
   3. Finds SSA occupancy Excel file

   :returns: Tuple of (excel_file_url, publication_date)

   :raises NISRADataNotFoundError: If publication or file not found


.. py:function:: parse_hotel_occupancy_rates(file_path)

   Parse NISRA hotel occupancy rates from Excel file (Table 1).

   :param file_path: Path to the hotel occupancy Excel file

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - room_occupancy: float (room occupancy rate, 0-1)
             - bed_occupancy: float (bed occupancy rate, 0-1)
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file structure is unexpected


.. py:function:: parse_rooms_beds_sold(file_path)

   Parse NISRA hotel rooms and beds sold from Excel file (Table 3).

   :param file_path: Path to the hotel occupancy Excel file

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - rooms_sold: float (number of rooms sold)
             - beds_sold: float (number of beds sold)
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file structure is unexpected


.. py:function:: get_latest_hotel_occupancy(force_refresh = False)

   Get the latest monthly hotel occupancy rates data.

   Automatically discovers and downloads the most recent hotel occupancy data
   from the NISRA website.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - room_occupancy: float (room occupancy rate, 0-1)
             - bed_occupancy: float (bed occupancy rate, 0-1)
   :rtype: DataFrame with columns

   :raises NISRADataNotFoundError: If latest publication cannot be found
   :raises NISRAValidationError: If file structure is unexpected

   .. rubric:: Example

   >>> df = get_latest_hotel_occupancy()
   >>> 'room_occupancy' in df.columns
   True


.. py:function:: get_latest_rooms_beds_sold(force_refresh = False)

   Get the latest monthly rooms and beds sold data.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - rooms_sold: float (number of rooms sold)
             - beds_sold: float (number of beds sold)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_latest_rooms_beds_sold()
   >>> 'rooms_sold' in df.columns
   True


.. py:function:: parse_ssa_occupancy_rates(file_path)

   Parse NISRA SSA occupancy rates from Excel file (Table 1).

   SSA = Small Service Accommodation (B&Bs, guest houses, etc.)

   :param file_path: Path to the SSA occupancy Excel file

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - room_occupancy: float (room occupancy rate, 0-1)
             - bed_occupancy: float (bed occupancy rate, 0-1)
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file structure is unexpected


.. py:function:: parse_ssa_rooms_beds_sold(file_path)

   Parse NISRA SSA rooms and beds sold from Excel file (Table 2).

   Note: SSA uses Table 2 for rooms/beds sold, while Hotel uses Table 3.

   :param file_path: Path to the SSA occupancy Excel file

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - rooms_sold: float (number of rooms sold)
             - beds_sold: float (number of beds sold)
   :rtype: DataFrame with columns

   :raises NISRAValidationError: If file structure is unexpected


.. py:function:: get_latest_ssa_occupancy(force_refresh = False)

   Get the latest monthly SSA occupancy rates data.

   SSA = Small Service Accommodation (B&Bs, guest houses, etc.)

   Automatically discovers and downloads the most recent SSA occupancy data
   from the NISRA website.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - room_occupancy: float (room occupancy rate, 0-1)
             - bed_occupancy: float (bed occupancy rate, 0-1)
   :rtype: DataFrame with columns

   :raises NISRADataNotFoundError: If latest publication cannot be found
   :raises NISRAValidationError: If file structure is unexpected

   .. rubric:: Example

   >>> df = get_latest_ssa_occupancy()
   >>> 'room_occupancy' in df.columns
   True


.. py:function:: get_latest_ssa_rooms_beds_sold(force_refresh = False)

   Get the latest monthly SSA rooms and beds sold data.

   SSA = Small Service Accommodation (B&Bs, guest houses, etc.)

   :param force_refresh: If True, bypass cache and download fresh data

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - rooms_sold: float (number of rooms sold)
             - beds_sold: float (number of beds sold)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_latest_ssa_rooms_beds_sold()
   >>> 'rooms_sold' in df.columns
   True


.. py:function:: get_combined_occupancy(force_refresh = False)

   Get combined hotel and SSA occupancy data with accommodation type column.

   This function fetches both hotel and SSA occupancy data and combines them
   into a single DataFrame with an 'accommodation_type' column to distinguish
   between the two accommodation types.

   :param force_refresh: If True, bypass cache and download fresh data

   :returns:

             - date: datetime (first day of month)
             - year: int
             - month: str (month name)
             - room_occupancy: float (room occupancy rate, 0-1)
             - bed_occupancy: float (bed occupancy rate, 0-1)
             - accommodation_type: str ('hotel' or 'ssa')
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_combined_occupancy()
   >>> 'accommodation_type' in df.columns
   True


.. py:function:: compare_accommodation_types(df, metric = 'room_occupancy')

   Compare occupancy between hotel and SSA by year.

   :param df: DataFrame from get_combined_occupancy()
   :param metric: Which occupancy metric to compare

   :returns:

             - year: int
             - hotel_{metric}: float
             - ssa_{metric}: float
             - difference: float (hotel - ssa)
             - ratio: float (hotel / ssa)
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_combined_occupancy()
   >>> comparison = compare_accommodation_types(df)
   >>> 'difference' in comparison.columns
   True


.. py:function:: get_occupancy_by_year(df, year)

   Filter occupancy data for a specific year.

   :param df: DataFrame from get_latest_hotel_occupancy()
   :param year: Year to filter

   :returns: Filtered DataFrame


.. py:function:: get_occupancy_summary_by_year(df)

   Calculate annual occupancy averages and statistics.

   :param df: DataFrame from get_latest_hotel_occupancy()

   :returns:

             - year: int
             - avg_room_occupancy: float
             - avg_bed_occupancy: float
             - months_reported: int
   :rtype: DataFrame with columns


.. py:function:: get_seasonal_patterns(df)

   Calculate average occupancy by month across all years.

   :param df: DataFrame from get_latest_hotel_occupancy()

   :returns:

             - month: str (month name)
             - avg_room_occupancy: float
             - avg_bed_occupancy: float
   :rtype: DataFrame with columns

   .. rubric:: Example

   >>> df = get_latest_hotel_occupancy()
   >>> seasonal = get_seasonal_patterns(df)
   >>> 'avg_room_occupancy' in seasonal.columns
   True


.. py:function:: validate_occupancy_data(df)

   Validate tourism occupancy data integrity.

   :param df: DataFrame from get_latest_occupancy_data

   :returns: True if validation passes, False otherwise
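
This page does not say which checks ``validate_occupancy_data`` performs. The sketch below is therefore not its implementation. It only illustrates the kind of integrity checks that the documented schema implies: the five listed columns, occupancy rates between 0 and 1, and dates on the first day of the month. The name ``check_occupancy_records`` and the sample rows are hypothetical, and the function works on plain dictionaries (for example the result of ``df.to_dict("records")``) so that it runs without network access.

```python
from datetime import date

# Column names taken from the documented occupancy schema.
REQUIRED_KEYS = {"date", "year", "month", "room_occupancy", "bed_occupancy"}


def check_occupancy_records(records):
    """Hypothetical integrity check; NOT the module's validate_occupancy_data."""
    if not records:
        return False
    for row in records:
        # Every documented column must be present.
        if not REQUIRED_KEYS.issubset(row):
            return False
        # Rates are documented as fractions between 0 and 1.
        for key in ("room_occupancy", "bed_occupancy"):
            if not 0 <= row[key] <= 1:
                return False
        # Dates are documented as the first day of the month;
        # also require the year column to agree with the date.
        if row["date"].day != 1 or row["date"].year != row["year"]:
            return False
    return True


# Synthetic rows for illustration only; these are not NISRA figures.
good_rows = [
    {"date": date(2024, 1, 1), "year": 2024, "month": "January",
     "room_occupancy": 0.5, "bed_occupancy": 0.4},
    {"date": date(2024, 2, 1), "year": 2024, "month": "February",
     "room_occupancy": 0.6, "bed_occupancy": 0.45},
]
# Same rows, but with a rate expressed as a percentage instead of a fraction.
bad_rows = [dict(row, room_occupancy=row["room_occupancy"] * 100) for row in good_rows]

print(check_occupancy_records(good_rows))
print(check_occupancy_records(bad_rows))
```

A check of this shape would catch a common parsing slip, where rates arrive as percentages (0-100) rather than the documented 0-1 fractions.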