asi_core.make_dataset
=====================

.. py:module:: asi_core.make_dataset

.. autoapi-nested-parse::

   This module provides functions to create ASI datasets, e.g., for machine learning applications.


Attributes
----------

.. autoapisummary::

   asi_core.make_dataset.Q25_LEFT_CENTRATION_THRESHOLD
   asi_core.make_dataset.Q25_RIGHT_CENTRATION_THRESHOLD


Functions
---------

.. autoapisummary::

   asi_core.make_dataset.load_asi_list
   asi_core.make_dataset.read_asi_meta_data
   asi_core.make_dataset.check_asi_list
   asi_core.make_dataset.check_asi
   asi_core.make_dataset.load_transform_save_asi
   asi_core.make_dataset.create_asi_list
   asi_core.make_dataset.create_asi_dataset
   asi_core.make_dataset.read_asi_dataset
   asi_core.make_dataset.merge_meteo_and_asi_data
   asi_core.make_dataset.map_asi_to_timestamps
   asi_core.make_dataset.select_by_dni_var_classes
   asi_core.make_dataset.check_Q25_asi_cropping
   asi_core.make_dataset.get_dates_from_csv
   asi_core.make_dataset.filter_timestamps_by_sun_elevation


Module Contents
---------------

.. py:data:: Q25_LEFT_CENTRATION_THRESHOLD
   :value: 200


.. py:data:: Q25_RIGHT_CENTRATION_THRESHOLD
   :value: 400


.. py:function:: load_asi_list(csv_files, asi_root=None, col_timestamp='timestamp', col_rel_path='rel_path', col_filename='file_name')

   Loads list of ASI from csv files. If asi_root is passed, the absolute path of each ASI image is appended to the resulting dataframe.

   :param csv_files: list of csv files (full paths).
   :param asi_root: root directory of asi images.
   :param col_timestamp: column name of timestamps of ASI.
   :param col_rel_path: column name of relative path of ASI wrt root directory.
   :param col_filename: column name of asi file names.
   :return: dataframe of merged csv files.


.. py:function:: read_asi_meta_data(filename, is_mobotix=True, name_convention='dlr', tz='UTC+0100')

   Extracts meta data of an all-sky image from its name.

   :param filename: file path of all-sky image.
   :param name_convention: determines naming convention of all-sky images.
   :return: meta data as dict.


.. py:function:: check_asi_list(asi_list, tz=1, limit_exp_time=None, name_convention='dlr', n_workers=0)

   Checks a list of all-sky image files for corruption and returns dataframe with additional data.

   :param asi_list: list of asi files.
   :param tz: timezone as int (+/- UTC).
   :param limit_exp_time: limit of valid exposure time.
   :param name_convention: asi file name convention.
   :param n_workers: number of workers to use for parallel processing.
   :return: meta data of asi files as dataframe.


.. py:function:: check_asi(filename, tz='UTC+0100', is_mobotix=True, limit_exp_time=None, name_convention='dlr')

   Checks a single ASI (All-Sky Image) file for corruption and extracts metadata.

   :param filename: Path to the ASI image file.
   :param tz: Timezone for parsing metadata timestamps. Default is "UTC+0100".
   :param is_mobotix: Boolean indicating whether the image follows the Mobotix format. Default is True.
   :param limit_exp_time: Optional threshold for maximum exposure time. If exceeded, a warning is logged.
   :param name_convention: Naming convention used for parsing metadata. Default is 'dlr'.
   :return: A dictionary containing:

      - 'name': Extracted image name from metadata (or NaN if unavailable).
      - 'timestamp': Extracted timestamp from metadata (or NaN if unavailable).
      - 'exposure_time': Extracted exposure time from metadata (or NaN if unavailable).
      - 'illuminance': Extracted illuminance value from metadata (or NaN if unavailable).
      - 'width': Image width in pixels.
      - 'height': Image height in pixels.
      - 'corrupted': Boolean indicating if the image is corrupted (True if corrupted, False otherwise).
   :raises Warning: Logs a warning if the exposure time exceeds `limit_exp_time`.


.. py:function:: load_transform_save_asi(rel_path, all_sky_imager, source_dir, target_dir)

   Loads, transforms and saves transformed all-sky image.

   :param rel_path: relative file path of image.
   :param all_sky_imager: camera used to take image.
   :type all_sky_imager: AllSkyImager.
   :param source_dir: directory of raw images.
   :param target_dir: directory of transformed images.
   :return: True/False depending on success.


.. py:function:: create_asi_list(asi_root, do_check=False, name_convention='dlr', csv_file=None, n_workers=0)

   Gets all ASI within asi_root and saves the list to csv.

   :param asi_root: root folder where images are stored.
   :param do_check: if true, all images are checked for validity.
   :param name_convention: asi file name convention.
   :param csv_file: csv file to save results.
   :param n_workers: number of workers to use for parallel processing.
   :return: None.


.. py:function:: create_asi_dataset(asi_series, source_dir, target_dir, camera_data_dir, n_workers=0, asi_tfms=None)

   Creates an ASI dataset from all passed filenames in target_dir.

   :param asi_series: pd.Series of all-sky images, with timestamp of acquisition as index and camera name as name.
   :param source_dir: directory of raw images.
   :param target_dir: directory of transformed images.
   :param camera_data_dir: directory of yaml files containing camera data.
   :param n_workers: number of workers to use for parallel processing.
   :param kwargs: kwargs for applying transformation.
   :return: pd.Series of successfully saved (transformed) images.


.. py:function:: read_asi_dataset(csv_file, img_dir=None, asi_path_col='rel_path', drop_asi_filepath=True, filter_dates=None)

   Reads an ASI dataset from a CSV file and optionally filters by date.

   :param csv_file: Path to the CSV file containing ASI metadata.
   :param img_dir: Optional directory path where ASI images are stored. If provided, file paths will be adjusted accordingly.
   :param asi_path_col: Column name in the CSV that contains the relative file paths of ASI images. Default is 'rel_path'.
   :param drop_asi_filepath: Whether to drop the ASI file path column from the returned DataFrame. Default is True.
   :param filter_dates: Optional list of dates to filter the dataset. Only entries matching these dates will be retained.
   :return: - asi_files: A Pandas Series containing file paths to ASI images.
            - df: A Pandas DataFrame with metadata, optionally filtered and with the ASI path column removed.


.. py:function:: merge_meteo_and_asi_data(df_meteo, df_asi, temporal_resolution='30s', max_delta_t=15, parameters_to_cast=None)

   Merges meteorological data with ASI data based on timestamps.

   :param df_meteo: Pandas DataFrame containing meteorological data indexed by timestamp.
   :param df_asi: Pandas DataFrame containing ASI metadata indexed by timestamp.
   :param temporal_resolution: Time rounding resolution for ASI timestamps (e.g., '30s' for 30 seconds). Default is '30s'.
   :param max_delta_t: Maximum allowed time difference (in seconds) for matching ASI data to meteorological data. Default is 15 seconds.
   :param parameters_to_cast: Optional dictionary specifying data types for certain parameters after merging.
   :return: A Pandas DataFrame with meteorological and ASI data merged, indexed by timestamp.


.. py:function:: map_asi_to_timestamps(df, round_to='60s', max_delta_t=10, valid_exp_times=None, max_delta_exp_time=10, multi_exposure=False, inplace=False)

   Maps ASI acquisition time to a rounded timestamp.

   :param df: dataframe containing a column 'timestamp'.
   :param round_to: string of resolution to round timestamps to.
   :param max_delta_t: maximum allowed deviation from the rounded timestamp in seconds.
   :param valid_exp_times: tuple of valid exposure times to be considered.
   :param inplace: if true, overwrites existing dataframe.
   :return: dataframe with rounded timestamp as index.


.. py:function:: select_by_dni_var_classes(dni_var_classes, selected_classes, include_by='H')

   Selects timestamps by DNI variability class. A timestamp is selected if the timestamp itself or the included time frame has a DNI variability class contained in selected_classes.

   :param dni_var_classes: pd.Series of dni var classes with DatetimeIndex.
   :param selected_classes: list of dni var classes to filter by.
   :param include_by: determines size of time frame (e.g., 'H' means hour).
   :return: selected timestamps as DatetimeIndex.


.. py:function:: check_Q25_asi_cropping(rel_path_to_image, asi_root)

   Determines how ASIs from the Q25 all-sky imager have been cropped for custom resolution. The function is used for images of the Kontas camera in the interval from '20160920' to '20190612'.

   :param rel_path_to_image: path to asi image, relative to asi root directory.
   :param asi_root: absolute path to asi root directory.
   :return: string specifying how the image has been cropped (left, center, right). If the image can't be opened, the function returns nan.


.. py:function:: get_dates_from_csv(csv_file, col_name='date')

   Extracts unique dates from a specified column in a CSV file.

   :param csv_file: Path to the CSV file containing date information.
   :param col_name: Column name in the CSV that contains date values. Default is 'date'.
   :return: A NumPy array of unique dates extracted from the specified column.


.. py:function:: filter_timestamps_by_sun_elevation(ts, min_el, sun_el=None, latitude=None, longitude=None, altitude=None)

   Filters timestamps based on minimum solar elevation.

   :param ts: Pandas DatetimeIndex of timestamps to be filtered.
   :param min_el: Minimum solar elevation angle (in degrees) required for timestamps to be retained.
   :param sun_el: Optional Pandas Series containing precomputed solar elevations for the timestamps. If None, solar elevation will be computed using latitude, longitude, and altitude.
   :param latitude: Latitude of the location (required if sun_el is not provided).
   :param longitude: Longitude of the location (required if sun_el is not provided).
   :param altitude: Altitude of the location in meters (optional, used when computing solar elevation).
   :return: A filtered Pandas DatetimeIndex containing only timestamps where solar elevation exceeds min_el.
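

The filtering described for ``filter_timestamps_by_sun_elevation`` when ``sun_el`` is supplied can be illustrated with plain pandas. This is a minimal sketch, not the ``asi_core`` implementation: the helper name ``keep_above_min_elevation`` and the elevation values are hypothetical, the strict ``>`` comparison is taken from the word "exceeds" in the return description, and the case where elevation is computed from latitude, longitude, and altitude is not covered.

```python
import pandas as pd


def keep_above_min_elevation(ts, sun_el, min_el):
    """Return the timestamps in ts whose solar elevation exceeds min_el (degrees).

    Hypothetical helper for illustration only; not part of asi_core.
    """
    # Align the precomputed elevations with the timestamps to be filtered.
    el = sun_el.reindex(ts)
    # Strict comparison; timestamps without an elevation value (NaN) are dropped.
    return ts[(el > min_el).to_numpy()]


# Four timestamps two hours apart, with made-up elevations in degrees.
ts = pd.date_range("2023-06-01 04:00", periods=4, freq="120min", tz="UTC")
sun_el = pd.Series([-5.0, 12.0, 35.0, 55.0], index=ts)

filtered = keep_above_min_elevation(ts, sun_el, min_el=10.0)
print(filtered)
```

With these values only the first timestamp, whose elevation is below the threshold, is removed.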