API Reference

Core Modules

soft.soft

soft.soft.array_row_intersection(a: ndarray, b: ndarray) ndarray[source]

Finds the intersection of rows between two 2D numpy arrays.

This function identifies rows that are present in both input arrays a and b and returns the rows from a that are also in b. The function is optimized for performance using numpy operations.

Parameters: a (numpy.ndarray): A 2D numpy array. b (numpy.ndarray): A 2D numpy array.

Returns: numpy.ndarray: A 2D numpy array containing the rows from a that are also present in b.

Note: This function is adapted from a solution provided by Vasilis Lemonidis on Stack Overflow. All credit goes to the original author. Source: https://stackoverflow.com/a/40600991

Example:
  a = np.array([[1, 2], [3, 4], [5, 6]])
  b = np.array([[3, 4], [7, 8]])
  result = array_row_intersection(a, b)  # array([[3, 4]])

PSA: This docstring has been written with the assistance of AI.
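The row-as-void-view trick from the cited Stack Overflow answer can be sketched as follows; `row_intersection` here is an illustrative re-implementation, not the library code:

```python
import numpy as np

def row_intersection(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Rows of `a` that also appear in `b` (both 2D, same dtype and width)."""
    # View each row as a single opaque element so 1D set operations apply.
    a_c = np.ascontiguousarray(a)
    b_c = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a_c.dtype.itemsize * a_c.shape[1]))
    mask = np.isin(a_c.view(void_dt).ravel(), b_c.view(void_dt).ravel())
    return a[mask]

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[3, 4], [7, 8]])
print(row_intersection(a, b))  # [[3 4]]
```

Comparing whole rows as single void elements avoids a Python-level loop, which is why the approach scales well for large arrays.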

soft.soft.associate(datapath: str, verbose: bool = False, number_of_workers: int = None) None[source]

Perform association of FITS files using parallel processing.

This function processes FITS files in datapath/02-id directory, performing association using the back_and_forth_matching_PARALLEL function. It divides the data into subgroups, processes each subgroup in parallel using multiprocessing, and saves the associated results in datapath/03-assoc directory.

Parameters:
  • datapath (str) – The base directory path containing the data.
  • verbose (bool, optional) – If True, prints additional information. Default is False.
  • number_of_workers (int, optional) – Number of worker processes to use for the parallel run.

Returns: None

Note: This function assumes the presence of necessary directories (02-id and 03-assoc) and uses multiprocessing for parallelization. It also assumes the availability of back_and_forth_matching_PARALLEL function and color for colored console output.

Example: associate("/path/to/data/")

PSA: This docstring has been written with the assistance of AI.

soft.soft.back_and_forth_matching_PARALLEL(fname1: str, fname2: str, round: int, datapath: str, verbose: bool = False) None[source]

Performs parallel forward and backward matching of unique IDs between two FITS files.

This function reads two FITS files, performs forward and backward matching of unique IDs based on the largest intersection of pixel coordinates, and saves the modified second FITS file.

Parameters:
  • fname1 (str) – File path to the first FITS file.
  • fname2 (str) – File path to the second FITS file.
  • round (int) – Round number or identifier for the current processing round.
  • datapath (str) – Directory path where temporary and output files will be saved.

Returns: None

Note: This function assumes array_row_intersection function is defined elsewhere in your code. It uses tqdm for progress monitoring and assumes numpy arrays for file operations.

Example: back_and_forth_matching_PARALLEL('file1.fits', 'file2.fits', 1, '/path/to/data/')

PSA: This docstring has been written with the assistance of AI.

class soft.soft.color[source]

Bases: object

BLUE = '\x1b[94m'
BOLD = '\x1b[1m'
CYAN = '\x1b[96m'
DARKCYAN = '\x1b[36m'
END = '\x1b[0m'
GREEN = '\x1b[92m'
PURPLE = '\x1b[95m'
RED = '\x1b[91m'
UNDERLINE = '\x1b[4m'
YELLOW = '\x1b[93m'
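These attributes are plain ANSI escape sequences; a quick illustration of how they compose on an ANSI-capable terminal (a subset of the class is restated here so the snippet is self-contained):

```python
# ANSI escape codes as plain class attributes (subset of the class above).
class color:
    BOLD = '\x1b[1m'
    GREEN = '\x1b[92m'
    RED = '\x1b[91m'
    END = '\x1b[0m'

# Style codes accumulate until END resets the terminal state.
msg = color.BOLD + color.GREEN + "done" + color.END
print(msg)
```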
soft.soft.detection(img: ndarray, l_thr: float, h_thr: float, min_distance: int, sign: str = 'both', separation: bool = False, verbose: bool = False) tuple[ndarray, ndarray, ndarray] | tuple[ndarray, ndarray][source]

Detects features in an image using a threshold and watershed algorithm based on the specified sign of features.

This function processes an input image to detect features either of positive values, negative values, or both. The detection is based on thresholding and the watershed algorithm, which can be applied with or without separation.

Parameters:
  • img (numpy.ndarray) – The input image to process.
  • l_thr (float) – The lower threshold value for detection.
  • h_thr (float) – The threshold value used for peak detection (see peak_local_max).
  • min_distance (int) – The minimum distance between features for the watershed algorithm.
  • sign (str, optional) – Specifies the type of features to detect. Options are “both”, “pos”, or “neg”. Default is “both”.
  • separation (bool, optional) – If True, applies separation in the watershed routine. Default is False.
  • verbose (bool, optional) – If True, prints additional information. Default is False.

Returns: Union[tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray], tuple[numpy.ndarray, numpy.ndarray]]:

  • If sign is “both”: Returns a tuple of three elements:

    labels (numpy.ndarray): The combined labels of detected features.
    coords_pos (numpy.ndarray): Coordinates of positive features.
    coords_neg (numpy.ndarray): Coordinates of negative features.

  • If sign is “pos” or “neg”: Returns a tuple of two elements:

    labels (numpy.ndarray): The labels of detected features.
    coords (numpy.ndarray): Coordinates of the detected features.

Raises: ValueError: If sign is not “both”, “pos”, or “neg”.

Example: labels, coords_pos, coords_neg = detection(img, l_thr=50, min_distance=10, sign="both", separation=True)

PSA: This docstring has been written with the assistance of AI.
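The threshold-and-label stage of the detection can be illustrated with scipy alone. This is a simplified sketch: the actual routine additionally seeds a watershed with local maxima, which this toy version omits.

```python
import numpy as np
from scipy import ndimage as ndi

def detect_pos_sketch(img, l_thr):
    """Threshold the image, label connected blobs, and return the
    brightest pixel of each blob as its coordinate (toy version)."""
    mask = img > l_thr
    labels, n = ndi.label(mask)
    # Brightest pixel per labeled blob, in label order.
    coords = np.array(ndi.maximum_position(img, labels, range(1, n + 1)))
    return labels, coords

img = np.zeros((10, 10))
img[2, 2] = 5.0
img[7, 7] = 3.0
labels, coords = detect_pos_sketch(img, l_thr=1.0)
print(coords)  # coordinates (2, 2) and (7, 7)
```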

soft.soft.housekeeping(datapath: str) None[source]

Ensures the existence and proper state of specific directories and their contents within a given data path.

This function performs the following tasks:

  1. Checks if the directories “01-mask”, “02-id”, and “03-assoc” exist within the specified datapath. If none of these directories exist, it creates them.

  2. If the directories exist, it checks for files within them.

    • If all three directories contain the same number of files and are not empty, it prints a warning that the directories are not empty and then deletes all files in these directories.

    • If the number of files in “01-mask” and “02-id” do not match, it deletes all files in these two directories.

    • If the directories are empty, it prints a message indicating so.

Parameters: datapath (str): The path where the directories “01-mask”, “02-id”, and “03-assoc” are located or should be created.

Note: This function assumes the existence of a color class with BOLD, RED, GREEN, and END attributes for formatting output.

Example: housekeeping("/path/to/root/data/folder/")

PSA: This docstring has been written with the assistance of AI.
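The directory bookkeeping described above amounts to the following skeleton (creation and file-count check only; the deletion branches are omitted, and `housekeeping_sketch` is an illustrative name, not the library function):

```python
import glob
import os

def housekeeping_sketch(datapath):
    """Create the three working directories if absent and report how many
    files each currently holds. Illustrative subset of the real routine."""
    counts = {}
    for d in ("01-mask", "02-id", "03-assoc"):
        path = os.path.join(datapath, d)
        os.makedirs(path, exist_ok=True)
        counts[d] = len(glob.glob(os.path.join(path, "*")))
    return counts
```

On a fresh tree, each directory is created and its count is 0; the real function would compare these counts to decide what to delete.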

soft.soft.identification(labels: ndarray, min_size: int, verbose: bool = False) ndarray[source]

Identifies and filters clumps in the input label array based on a minimum size threshold.

This function processes the input label array, retaining only those clumps (connected components) that meet the specified minimum size. Clumps smaller than the minimum size are removed (set to zero).

Parameters: labels (numpy.ndarray): The input array of labels representing different clumps. min_size (int): The minimum size (number of pixels) a clump must have to be retained.

Returns: numpy.ndarray: The filtered label array with only clumps meeting the minimum size retained.

Raises: ValueError: If no clumps survive the identification process.

Note: Future versions may include a “verbose” option to print the number of clumps removed.

Example: filtered_labels = identification(labels, min_size=50)

PSA: This docstring has been written with the assistance of AI.
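The size cut can be sketched with a label histogram (`filter_small` is an illustrative re-implementation, not the library code):

```python
import numpy as np

def filter_small(labels: np.ndarray, min_size: int) -> np.ndarray:
    """Zero out labeled clumps covering fewer than `min_size` pixels."""
    counts = np.bincount(labels.ravel())        # pixels per label
    small = np.flatnonzero(counts < min_size)   # labels failing the cut
    out = labels.copy()
    out[np.isin(out, small)] = 0                # background (0) stays 0
    return out

labels = np.array([[1, 1, 0],
                   [1, 2, 0],
                   [0, 2, 3]])
print(filter_small(labels, min_size=2))  # label 3 (a single pixel) removed
```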

soft.soft.img_pre_neg(img: ndarray, l_thr: float) ndarray[source]
soft.soft.img_pre_pos(img: ndarray, thr: float) ndarray[source]
soft.soft.peak_local_max(img, min_dist, h_thr, sign)[source]

Detect local maxima (or minima) in an image while enforcing a minimum distance between detected peaks.

Parameters:
  • img (ndarray) – Input image (2D).

  • min_dist (int, optional) – Minimum allowed Euclidean distance between detected peaks. Default is 5.

  • h_thr (float, optional) – Threshold value for preprocessing peaks. Default is 0.5.

  • sign ({"pos", "neg"}) – Whether to detect positive (“pos”) or negative (“neg”) peaks.

Returns:

centroids – Array of (row, col) centroid coordinates of detected peaks.

Return type:

ndarray of shape (N, 2)
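The peak criterion can be approximated with a maximum filter. This is an assumed simplification, not the package's exact implementation:

```python
import numpy as np
from scipy import ndimage as ndi

def local_peaks(img, min_dist=5, h_thr=0.5):
    """A pixel counts as a peak if it equals the maximum of its
    (2*min_dist+1)-sized neighborhood and exceeds h_thr."""
    footprint = 2 * min_dist + 1
    is_max = ndi.maximum_filter(img, size=footprint) == img
    # (row, col) pairs, shape (N, 2), in row-major order.
    return np.argwhere(is_max & (img > h_thr))

img = np.zeros((20, 20))
img[4, 4] = 1.0
img[15, 12] = 2.0
print(local_peaks(img, min_dist=3, h_thr=0.5))
```

For negative peaks ("neg"), the same construction would be applied to `-img`.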

soft.soft.process_image(datapath: str, data: str, l_thr: float, h_thr: float, min_distance: int, sign: str = 'both', separation: bool = True, min_size: int = 4, verbose: bool = False) None[source]

Processes an astronomical image by detecting and identifying clumps within it, and saves the results.

This function performs the following steps:

  1. Reads the input FITS image file.

  2. Detects clumps in the image using the specified detection parameters.

  3. Saves the detected clumps to the “01-mask” directory.

  4. Identifies and filters clumps based on the minimum size.

  5. Saves the filtered clumps to the “02-id” directory.

Parameters:
  • datapath (str) – The path where the results will be saved.
  • data (str) – The path to the input FITS image file.
  • l_thr (float) – The threshold value for clump detection.
  • h_thr (float) – The threshold value used for peak detection (see peak_local_max).
  • min_distance (int) – The minimum distance between detected clumps.
  • sign (str, optional) – Specifies the type of clumps to detect. Options are “both”, “pos”, or “neg”. Default is “both”.
  • separation (bool, optional) – If True, applies separation in the detection routine. Default is True.
  • min_size (int, optional) – The minimum size (number of pixels) a clump must have to be retained. Default is 4.
  • verbose (bool, optional) – If True, prints additional information. Default is False.

Returns: None

Example: process_image("/path/to/data/", "image.fits", l_thr=50, min_distance=10, sign="both", separation=True, min_size=4)

PSA: This docstring has been written with the assistance of AI.

soft.soft.tabulation(files: str, filesB: str, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]

Analyzes a series of FITS files and their corresponding background files to extract properties of labeled regions across frames, merge common labels, compute lifetime, and calculate velocities.

Parameters:
  • files (list of str) – List of paths to FITS files containing labeled regions.
  • filesB (list of str) – List of paths to background FITS files corresponding to ‘files’.
  • dx (float) – Pixel size in the x-direction (spatial resolution).
  • dt (float) – Time interval between frames (temporal resolution).
  • cores (int) – Number of CPU cores to use for parallel processing.
  • minliftime (int, optional) – Minimum lifetime (number of frames a label persists) to consider. Default is 4.

Returns:

pd.DataFrame – DataFrame containing tabulated data with columns [‘label’, ‘Lifetime’, ‘X’, ‘Y’, ‘Area’, ‘Flux’, ‘Frames’, ‘Vx’, ‘Vy’, ‘stdVx’, ‘stdVy’].

Raises: ValueError: If there are inconsistencies in the data, such as non-consecutive frames or multiple labels in a group.

Notes:

  • The function assumes that ‘files’ and ‘filesB’ are aligned, i.e., each entry in ‘files’ corresponds to the same index in ‘filesB’ for background data.

  • ‘dx’ and ‘dt’ are used to compute velocities (‘Vx’, ‘Vy’) based on numerical differentiation of positions (‘X’, ‘Y’).

PSA: This docstring has been written with the aid of AI.
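The velocity step mentioned in the notes, numerical differentiation of centroid positions scaled by `dx` and `dt`, presumably reduces to something like the following (the exact scheme and units in the library may differ; the values here are hypothetical):

```python
import numpy as np

dx, dt = 0.1, 45.0  # hypothetical: arcsec/pixel and seconds/frame
x = np.array([10.0, 10.5, 11.2, 11.8])  # centroid column, one value per frame

# Central differences in the interior, one-sided differences at the ends.
vx = np.gradient(x) * dx / dt  # arcsec per second
print(vx)
```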

soft.soft.tabulation_parallel(files: list, filesB: list, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]

Process segmentation maps (files) and source FITS files (filesB) in parallel. Extracts blob properties (centroid, area, flux, eccentricity), tracks them across frames, and computes velocities.

Parameters:
  • files (list) – List of segmentation FITS files (each pixel labeled by blob ID).

  • filesB (list) – List of source FITS files with intensity values.

  • dx (float) – Spatial resolution (e.g. arcsec/pixel).

  • dt (float) – Temporal resolution (frame cadence).

  • cores (int) – Number of CPU cores for parallel processing.

  • minliftime (int, optional) – Minimum number of frames a blob must persist to be included (default=4).

Returns:

Summary table with one row per tracked blob: label, Lifetime, arrays of X, Y, Area, Flux, Frames, ecc, Vx, Vy, stdVx, stdVy.

Return type:

pandas.DataFrame

soft.soft.tabulation_parallel_doppler(files: str, filesD: str, filesB: str, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]
soft.soft.track_all(datapath: str, cores: int, min_distance: int, l_thr: float, h_thr: float, min_size: int, dx: float, dt: float, sign: str, separation: bool, verbose: bool = False, doppler: bool = False) None[source]

Executes a pipeline for feature detection, identification, association, tabulation, and data storage based on astronomical FITS files.

Parameters:
  • datapath (str) – Path to the main data directory.
  • cores (int) – Number of CPU cores to utilize for parallel processing.
  • min_distance (int) – Minimum distance between features for detection.
  • l_thr (float) – Lower threshold value for feature detection.
  • h_thr (float) – Threshold value used for peak detection (see peak_local_max).
  • min_size (int) – Minimum size threshold for identified features.
  • dx (float) – Pixel size in the x-direction (spatial resolution) for velocity computation.
  • dt (float) – Time interval between frames (temporal resolution) for velocity computation.
  • sign (str) – Sign convention for feature detection (‘pos’, ‘neg’, or ‘both’).
  • separation (bool) – If True, applies separation in the detection routine.
  • verbose (bool, optional) – If True, displays detailed progress information. Default is False.
  • doppler (bool, optional) – If True, includes Doppler files for tabulation. Default is False.

Returns: None. Outputs are saved as FITS files and a JSON file containing tabulated data.

Raises: FileNotFoundError: If there are issues with file paths or missing directories.

Notes:

  • This function coordinates the detection, identification, association, tabulation, and storage of astronomical features across multiple FITS files in the specified ‘datapath’.

  • The process involves multiple subprocesses, including cleaning up temporary files, feature detection, ID assignment, association of features across frames, tabulation of feature properties (such as position, area, and flux), and saving the resulting tabulated data as a JSON file.

PSA: This docstring has been written with the aid of AI.

soft.soft.unique_id(id_data: str, datapath: str, verbose: bool = False) None[source]

Assigns unique IDs to clumps in a list of FITS image files and saves the modified files.

This function processes each FITS file in the provided list, replacing each unique non-zero clump identifier with a globally unique ID. The modified images are saved in the “02-id” directory within the specified datapath.

Parameters: id_data (list): A list of paths to the FITS files to be processed. datapath (str): The path where the modified files will be saved.

Returns: None

Example: unique_id(["image1.fits", "image2.fits"], "/path/to/data/")

PSA: This docstring has been written with the assistance of AI.
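The global renumbering can be sketched in memory as follows. `relabel_global` is a hypothetical helper illustrating the idea, not the library function (which reads and writes FITS files):

```python
import numpy as np

def relabel_global(frames):
    """Replace per-frame clump labels with globally unique IDs."""
    next_id = 1
    out = []
    for frame in frames:
        new = np.zeros_like(frame)
        for old in np.unique(frame):
            if old == 0:          # 0 is background, keep it
                continue
            new[frame == old] = next_id
            next_id += 1
        out.append(new)
    return out

f1 = np.array([[1, 0], [0, 2]])
f2 = np.array([[1, 1], [0, 0]])
g1, g2 = relabel_global([f1, f2])
print(g2)  # frame 2's clump "1" became global ID 3
```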

soft.soft.watershed_routine(img: ndarray, l_thr: float, h_thr: float, min_dist: int, sign: str, separation: bool = False) tuple[ndarray, ndarray][source]

soft.largesets

soft.largesets.associate_LF(img_n1, matches_1, matches_2)[source]
soft.largesets.back_and_forth_matching_LF(file1, file2)[source]
class soft.largesets.color[source]

Bases: object

BLUE = '\x1b[94m'
BOLD = '\x1b[1m'
CYAN = '\x1b[96m'
DARKCYAN = '\x1b[36m'
END = '\x1b[0m'
GREEN = '\x1b[92m'
PURPLE = '\x1b[95m'
RED = '\x1b[91m'
UNDERLINE = '\x1b[4m'
YELLOW = '\x1b[93m'
soft.largesets.detection_ss(img, l_thr, sign='both')[source]
soft.largesets.housekeeping_LF(datapath: str) None[source]
soft.largesets.identification(labels, min_size, verbose=False)[source]
soft.largesets.img_pre_neg(img, l_thr)[source]
soft.largesets.img_pre_pos(img, l_thr)[source]
soft.largesets.process_image_ss(datapath: str, data: str, l_thr: int, sign, min_size: int = 4, verbose=False) None[source]
soft.largesets.simple_labels(img)[source]
soft.largesets.tabulation_parallel_doppler(files: str, filesD: str, filesB: str, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]
soft.largesets.track_all_LF(datapath, cores, l_thr, sign, m_size, dx, dt, N, verbose=False)[source]
soft.largesets.unique_id(id_data, datapath, verbose)[source]

soft.sunspots

soft.sunspots.array_row_intersection(a: ndarray, b: ndarray) ndarray[source]

Finds the intersection of rows between two 2D numpy arrays.

This function identifies rows that are present in both input arrays a and b and returns the rows from a that are also in b. The function is optimized for performance using numpy operations.

Parameters: a (numpy.ndarray): A 2D numpy array. b (numpy.ndarray): A 2D numpy array.

Returns: numpy.ndarray: A 2D numpy array containing the rows from a that are also present in b.

Note: This function is adapted from a solution provided by Vasilis Lemonidis on Stack Overflow. All credit goes to the original author. Source: https://stackoverflow.com/a/40600991

Example:
  a = np.array([[1, 2], [3, 4], [5, 6]])
  b = np.array([[3, 4], [7, 8]])
  result = array_row_intersection(a, b)  # array([[3, 4]])

PSA: This docstring has been written with the assistance of AI.

soft.sunspots.associate(datapath: str, verbose: bool = False) None[source]

Perform association of FITS files using parallel processing.

This function processes FITS files in datapath/02-id directory, performing association using the back_and_forth_matching_PARALLEL function. It divides the data into subgroups, processes each subgroup in parallel using multiprocessing, and saves the associated results in datapath/03-assoc directory.

Parameters: datapath (str): The base directory path containing the data.

Returns: None

Note: This function assumes the presence of necessary directories (02-id and 03-assoc) and uses multiprocessing for parallelization. It also assumes the availability of back_and_forth_matching_PARALLEL function and color for colored console output.

Example: associate("/path/to/data/")

PSA: This docstring has been written with the assistance of AI.

soft.sunspots.back_and_forth_matching_PARALLEL(fname1: str, fname2: str, round: int, datapath: str, verbose: bool = False) None[source]

Performs parallel forward and backward matching of unique IDs between two FITS files.

This function reads two FITS files, performs forward and backward matching of unique IDs based on the largest intersection of pixel coordinates, and saves the modified second FITS file.

Parameters:
  • fname1 (str) – File path to the first FITS file.
  • fname2 (str) – File path to the second FITS file.
  • round (int) – Round number or identifier for the current processing round.
  • datapath (str) – Directory path where temporary and output files will be saved.

Returns: None

Note: This function assumes array_row_intersection function is defined elsewhere in your code. It uses tqdm for progress monitoring and assumes numpy arrays for file operations.

Example: back_and_forth_matching_PARALLEL('file1.fits', 'file2.fits', 1, '/path/to/data/')

PSA: This docstring has been written with the assistance of AI.

class soft.sunspots.color[source]

Bases: object

BLUE = '\x1b[94m'
BOLD = '\x1b[1m'
CYAN = '\x1b[96m'
DARKCYAN = '\x1b[36m'
END = '\x1b[0m'
GREEN = '\x1b[92m'
PURPLE = '\x1b[95m'
RED = '\x1b[91m'
UNDERLINE = '\x1b[4m'
YELLOW = '\x1b[93m'
soft.sunspots.detection_ss(img: ndarray, l_thr: int) tuple[ndarray, ndarray, ndarray] | tuple[ndarray, ndarray][source]
soft.sunspots.housekeeping(datapath: str) None[source]

Ensures the existence and proper state of specific directories and their contents within a given data path.

This function performs the following tasks:

  1. Checks if the directories “01-mask”, “02-id”, and “03-assoc” exist within the specified datapath. If none of these directories exist, it creates them.

  2. If the directories exist, it checks for files within them.

    • If all three directories contain the same number of files and are not empty, it prints a warning that the directories are not empty and then deletes all files in these directories.

    • If the number of files in “01-mask” and “02-id” do not match, it deletes all files in these two directories.

    • If the directories are empty, it prints a message indicating so.

Parameters: datapath (str): The path where the directories “01-mask”, “02-id”, and “03-assoc” are located or should be created.

Note: This function assumes the existence of a color class with BOLD, RED, GREEN, and END attributes for formatting output.

Example: housekeeping("/path/to/root/data/folder/")

PSA: This docstring has been written with the assistance of AI.

soft.sunspots.identification(labels: ndarray, min_size: int, verbose: bool = False) ndarray[source]

Identifies and filters clumps in the input label array based on a minimum size threshold.

This function processes the input label array, retaining only those clumps (connected components) that meet the specified minimum size. Clumps smaller than the minimum size are removed (set to zero).

Parameters: labels (numpy.ndarray): The input array of labels representing different clumps. min_size (int): The minimum size (number of pixels) a clump must have to be retained.

Returns: numpy.ndarray: The filtered label array with only clumps meeting the minimum size retained.

Raises: ValueError: If no clumps survive the identification process.

Note: Future versions may include a “verbose” option to print the number of clumps removed.

Example: filtered_labels = identification(labels, min_size=50)

PSA: This docstring has been written with the assistance of AI.

soft.sunspots.img_pre_neg(img: ndarray, l_thr: float) ndarray[source]
soft.sunspots.img_pre_pos(img: ndarray, l_thr: float) ndarray[source]
soft.sunspots.process_image_ss(datapath: str, data: str, l_thr: int, min_size: int = 4) None[source]
soft.sunspots.simple_labels(img: ndarray) ndarray[source]
soft.sunspots.tabulation_parallel(files: str, filesB: str, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]
soft.sunspots.tabulation_parallel_doppler(files: str, filesD: str, filesB: str, dx: float, dt: float, cores: int, minliftime: int = 4) DataFrame[source]
soft.sunspots.track_sunspots(datapath: str, cores: int, l_thr: float, min_size: int, dx: float, dt: float, doppler: bool = False, verbose: bool = False) None[source]
soft.sunspots.unique_id(id_data: str, datapath: str, verbose: bool = False) None[source]

Assigns unique IDs to clumps in a list of FITS image files and saves the modified files.

This function processes each FITS file in the provided list, replacing each unique non-zero clump identifier with a globally unique ID. The modified images are saved in the “02-id” directory within the specified datapath.

Parameters: id_data (list): A list of paths to the FITS files to be processed. datapath (str): The path where the modified files will be saved.

Returns: None

Example: unique_id(["image1.fits", "image2.fits"], "/path/to/data/")

PSA: This docstring has been written with the assistance of AI.

soft.extras

soft.extras.amp_map(datapath, minlifetime=15, flag='x', title=None)[source]
class soft.extras.color[source]

Bases: object

BLUE = '\x1b[94m'
BOLD = '\x1b[1m'
CYAN = '\x1b[96m'
DARKCYAN = '\x1b[36m'
END = '\x1b[0m'
GREEN = '\x1b[92m'
PURPLE = '\x1b[95m'
RED = '\x1b[91m'
UNDERLINE = '\x1b[4m'
YELLOW = '\x1b[93m'
soft.extras.extrusion(feature, df, asc_files, return_center=False)[source]
soft.extras.follow_video(stacked_masks, savepath='', filename='follow_video')[source]
soft.extras.freq_map(datapath, minlifetime=15, flag='x', title=None)[source]
soft.extras.get_roi(img: ndarray, size: int = 20) tuple[ndarray, tuple[int, int]][source]

Automatically identify the region of interest (ROI) in the image with the most values near zero.

Parameters: img (numpy.ndarray): The input image. size (int): The size of the square ROI.

Returns: tuple: A tuple containing the ROI and the bounding box coordinates (i, j).

PSA: This docstring was written with the aid of AI.
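A brute-force version of the "most values near zero" search reads as follows. `get_roi_sketch` and its near-zero tolerance are illustrative assumptions; the library version may use a different criterion:

```python
import numpy as np

def get_roi_sketch(img: np.ndarray, size: int = 20):
    """Return the size x size window with the most near-zero pixels,
    plus its top-left corner (i, j). Brute-force illustration."""
    near_zero = (np.abs(img) < 1e-3).astype(int)
    best, best_ij = -1, (0, 0)
    for i in range(img.shape[0] - size + 1):
        for j in range(img.shape[1] - size + 1):
            score = near_zero[i:i + size, j:j + size].sum()
            if score > best:
                best, best_ij = score, (i, j)
    i, j = best_ij
    return img[i:i + size, j:j + size], (i, j)

img = np.ones((6, 6))
img[3:, :3] = 0.0            # a 3x3 block of zeros in the bottom-left
roi, corner = get_roi_sketch(img, size=3)
print(corner)  # (3, 0)
```

A 2D convolution of the near-zero mask with a `size x size` kernel would give the same scores without the double loop.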

soft.extras.trackvideo(images_files, cs_files, vmin, vmax, crop=False, filename='video', savepath='')[source]