API Reference¶

Data ¶

Image ¶

dl_utils.data.image.byte_imread(data: bytes) → ndarray[source]¶

dl_utils.data.image.byte_imwrite(image: ndarray, format='PNG', **kwargs) → bytes[source]¶

Video ¶

dl_utils.data.video.save_video(frames: np.ndarray | torch.Tensor, save_path: str | PathLike[str] | Path, fps: int | float = 30, codec: str = 'avc1')[source]¶

Parameters:

frames – Video frames with shape (F, H, W, C). Pixel values should be in the range [0, 255].
save_path – Path where the video is saved.
fps – Frame rate of the video. Default is 30.
codec – Codec of the video. Default is avc1.

dl_utils.data.video.load_video(video_path: str | PathLike[str] | Path, resize: Tuple[int, int] | int = None, center_crop: Tuple[int, int] | int = None, max_frames: int = None) → ndarray[source]¶

Load a video file.

Parameters:

video_path – Path to the video file.
resize – Resize frames to the specified size. If None, no resizing. Accepts (width, height) or int.
center_crop – Center crop frames to the specified size. If None, no cropping. Accepts (width, height) or int.
max_frames – Maximum number of frames to load. If None, load all frames.

Returns:

Frames as a NumPy array with shape (F, H, W, C). Pixel values are in [0, 255], color order is RGB.

Note

If the video is grayscale, the color channel will be replicated to 3.
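
The resize and center_crop arguments both accept either a (width, height) pair or a single int. The following is a minimal sketch of how such an argument could be turned into a crop box. It assumes that an int means a square size and that a crop larger than the frame is clamped to the frame; the source states neither. The helpers normalize_size and center_crop_box are hypothetical and not part of dl_utils.

```python
from typing import Tuple, Union

Size = Union[int, Tuple[int, int]]


def normalize_size(size: Size) -> Tuple[int, int]:
    """Expand an int into (width, height); pass pairs through.

    Assumption: an int means a square target. The library may interpret it differently.
    """
    if isinstance(size, int):
        return size, size
    width, height = size
    return int(width), int(height)


def center_crop_box(frame_w: int, frame_h: int, size: Size) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered crop, clamped to the frame."""
    crop_w, crop_h = normalize_size(size)
    crop_w = min(crop_w, frame_w)
    crop_h = min(crop_h, frame_h)
    left = (frame_w - crop_w) // 2
    top = (frame_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


print(center_crop_box(1920, 1080, 720))  # (600, 180, 1320, 900)
```

Because frames are laid out as (F, H, W, C), a box like this would be applied as frames[:, top:bottom, left:right].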

dl_utils.data.video.get_video_fps(video_path: str | PathLike[str] | Path) → float[source]¶

Retrieve the FPS of a video.

Parameters: video_path – Path to the video file.
Returns: The FPS of the video.

dl_utils.data.video.get_video_frame_count(video_path: str | PathLike[str] | Path) → int[source]¶

Retrieve the total number of frames in a video.

Parameters: video_path – Path to the video file.
Returns: The number of frames in the video.

dl_utils.data.video.get_video_duration(video_path: str | PathLike[str] | Path) → Tuple[float, int, float][source]¶

Retrieve the FPS, frame count, and duration (in seconds) of a video.

Parameters: video_path – Path to the video file.
Returns: A tuple containing FPS, frame count, and duration in seconds.
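
For a constant-frame-rate video, the three returned values are usually related by duration = frame count / FPS. The sketch below shows that arithmetic. It assumes this is how the duration is derived, which the source does not state; duration_seconds is a hypothetical helper, not a dl_utils function.

```python
def duration_seconds(fps: float, frame_count: int) -> float:
    """Duration in seconds, assuming a constant frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


print(duration_seconds(30.0, 900))  # 30.0
print(duration_seconds(25.0, 100))  # 4.0
```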

dl_utils.data.video.get_video_duration_batch(video_paths: List[str | PathLike[str] | Path]) → List[float][source]¶

Retrieve the duration of each video in a batch.

Parameters: video_paths – List of paths to videos.
Returns: A list of tuples, each containing FPS, frame count, and duration in seconds.

dl_utils.data.video.convert_to_h265(input_file: AnyStr, output_file: AnyStr, ffmpeg_exec: AnyStr = '/usr/bin/ffmpeg', keyint: int = None, overwrite: bool = False, verbose: bool = False) → None[source]¶

Convert a video to H.265 format using ffmpeg.

Parameters:

input_file – Input path.
output_file – Output path.
ffmpeg_exec – Path to the ffmpeg executable. Default is /usr/bin/ffmpeg.
keyint – Keyframe interval setting for the encoder. Default is None.
overwrite – Overwrite the existing file.
verbose – Show ffmpeg output.
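
The source does not show how ffmpeg is invoked, so the following is only a sketch of one plausible command line for these parameters. The mapping of overwrite to -y/-n, of verbose to -loglevel, and of keyint to -g is an assumption; build_h265_command is a hypothetical helper that builds the argument list without running it.

```python
from typing import List, Optional


def build_h265_command(
    input_file: str,
    output_file: str,
    ffmpeg_exec: str = "/usr/bin/ffmpeg",
    keyint: Optional[int] = None,
    overwrite: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build (but do not run) an ffmpeg argument list for H.265 encoding."""
    cmd = [ffmpeg_exec]
    # -y overwrites an existing output file; -n refuses to overwrite it.
    cmd.append("-y" if overwrite else "-n")
    if not verbose:
        # Assumption: non-verbose mode means printing errors only.
        cmd += ["-loglevel", "error"]
    cmd += ["-i", input_file, "-c:v", "libx265"]
    if keyint is not None:
        # -g sets the GOP size, i.e. the maximum distance between keyframes.
        cmd += ["-g", str(keyint)]
    cmd.append(output_file)
    return cmd


cmd = build_h265_command("in.mp4", "out.mp4", keyint=48, overwrite=True)
print(" ".join(cmd))
```

A list built this way could be passed to subprocess.run(cmd, check=True) to perform the conversion.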

dl_utils.data.video.convert_to_h264(input_file: AnyStr, output_file: AnyStr, ffmpeg_exec: AnyStr = '/usr/bin/ffmpeg', keyint: int = None, overwrite: bool = False, verbose: bool = False) → None[source]¶

Array ¶

dl_utils.data.array.to_numpy(array: ndarray | Tensor) → ndarray[source]¶: Convert array-like object to numpy array.

dl_utils.data.array.to_tensor(array: ndarray | Tensor) → Tensor[source]¶: Convert array-like object to torch tensor.

dl_utils.data.array.to_original(array: ndarray | Tensor, ori_dtype) → ndarray | Tensor[source]¶: Convert array-like object to original type.

Normalization ¶

Normalize the input array (usually image or video).

Parameters:

data – Input array, can be a NumPy array or a PyTorch tensor.
mean – Scalar or vector of means for each channel.
std – Scalar or vector of standard deviations for each channel.
dim – The channel dimension to normalize along. Default is -1 (last dimension).

Returns:

Normalized image or video in the same type as input (NumPy array or PyTorch tensor).

Examples

>>> import numpy as np
>>> from dl_utils import normalize
>>> img = np.array([[[0, 128, 255]]], dtype=np.float32)
>>> normalize(img, mean=128, std=64)
array([[[-2.      ,  0.      ,  1.984375]]])

>>> import torch
>>> from dl_utils import normalize
>>> img_t = torch.tensor([[[0, 128, 255]]], dtype=torch.float32)
>>> normalize(img_t, mean=torch.tensor([0, 128, 255]), std=torch.tensor([1, 64, 255]))
tensor([[[0., 0., 0.]]], dtype=torch.float64)

Inverse normalize the input array (usually image or video).

Parameters:

data – Input array, can be a NumPy array or a PyTorch tensor, which has been previously normalized.
mean – Scalar or vector of means used in the original normalization.
std – Scalar or vector of standard deviations used in the original normalization.
dim – The channel dimension along which normalization was applied. Default is -1 (last dimension).

Returns:

Denormalized image or video in the same type as input (NumPy array or PyTorch tensor).
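
Normalization and its inverse are the pair y = (x - mean) / std and x = y * std + mean, applied per channel. The sketch below shows the round trip in pure Python on a list of pixels whose last dimension is the channel. It is an illustration of the formulas only, not the library's NumPy/PyTorch implementation; normalize_pixels and denormalize_pixels are hypothetical names.

```python
from typing import List, Sequence, Union

Stat = Union[float, Sequence[float]]


def _per_channel(stat: Stat, channels: int) -> List[float]:
    """Broadcast a scalar to every channel, or validate a per-channel vector."""
    if isinstance(stat, (int, float)):
        return [float(stat)] * channels
    values = [float(v) for v in stat]
    if len(values) != channels:
        raise ValueError("expected one value per channel")
    return values


def normalize_pixels(pixels: List[List[float]], mean: Stat, std: Stat) -> List[List[float]]:
    """Apply (x - mean) / std along the last (channel) dimension."""
    if not pixels:
        return []
    channels = len(pixels[0])
    m, s = _per_channel(mean, channels), _per_channel(std, channels)
    return [[(x - m[c]) / s[c] for c, x in enumerate(px)] for px in pixels]


def denormalize_pixels(pixels: List[List[float]], mean: Stat, std: Stat) -> List[List[float]]:
    """Apply x * std + mean, the inverse of normalize_pixels."""
    if not pixels:
        return []
    channels = len(pixels[0])
    m, s = _per_channel(mean, channels), _per_channel(std, channels)
    return [[x * s[c] + m[c] for c, x in enumerate(px)] for px in pixels]


normalized = normalize_pixels([[0.0, 128.0, 255.0]], mean=128, std=64)
print(normalized)                                # [[-2.0, 0.0, 1.984375]]
print(denormalize_pixels(normalized, 128, 64))   # [[0.0, 128.0, 255.0]]
```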

Json ¶

dl_utils.data.json.load_json(file)[source]¶

dl_utils.data.json.save_json(data, file, save_pretty=False, sort_keys=False)[source]¶

LMDB ¶

class dl_utils.data.lmdb.JsonLmdb(env: Environment, autogrow: bool)[source]¶

Bases: Lmdb

https://pypi.org/project/lmdbm/

Pickle ¶

dl_utils.data.pickle.save_pickle(obj, file)[source]¶

dl_utils.data.pickle.load_pickle(file)[source]¶

Sampling ¶

Evenly sample N elements from input_data. Supports list, NumPy array, or torch tensor. If input_data is empty or n is less than or equal to 0, empty data is returned.

Parameters:

input_data – List, numpy array, or torch tensor to sample from.
n – Number of elements to sample.

Returns:

Sampled data in the same type as input_data.
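
The source does not specify which indices are picked, so the following sketch shows one common way to sample evenly from a list: spread n indices uniformly between the first and last element. The behavior for n = 1 (first element) and for n larger than the input (return everything) is an assumption; evenly_sample_list is a hypothetical helper.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def evenly_sample_list(items: Sequence[T], n: int) -> List[T]:
    """Pick n elements at evenly spaced positions, endpoints included."""
    if n <= 0 or len(items) == 0:
        return []  # matches the documented empty-result cases
    if n >= len(items):
        return list(items)  # assumption: never repeat elements
    if n == 1:
        return [items[0]]  # assumption: a single sample is the first element
    step = (len(items) - 1) / (n - 1)
    # int(x + 0.5) rounds half up, avoiding round()'s half-to-even behavior.
    return [items[int(i * step + 0.5)] for i in range(n)]


print(evenly_sample_list(list(range(10)), 4))  # [0, 3, 6, 9]
print(evenly_sample_list(list(range(10)), 0))  # []
```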

Randomly sample N elements from input_data. Supports list, numpy array, or torch tensor.

Parameters:

input_data – List, numpy array, or torch tensor to sample from.
n – Number of elements to sample.
ordered – Whether to return sampled elements in the original order.
seed – Random seed for reproducibility.
put_back – If True, sample with replacement.

Returns:

Sampled data in the same type as input_data.
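
The interaction of ordered, seed, and put_back can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: it handles lists only, caps n at the input length when sampling without replacement, and uses a private random.Random instance so that the global random state is untouched. random_sample_list is a hypothetical name.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def random_sample_list(
    items: Sequence[T],
    n: int,
    ordered: bool = False,
    seed: Optional[int] = None,
    put_back: bool = False,
) -> List[T]:
    """Randomly pick n elements, optionally with replacement or in original order."""
    if n <= 0 or len(items) == 0:
        return []
    rng = random.Random(seed)  # same seed, same indices
    if put_back:
        # With replacement: the same index may be drawn more than once.
        indices = [rng.randrange(len(items)) for _ in range(n)]
    else:
        # Without replacement: assumption that n is capped at len(items).
        indices = rng.sample(range(len(items)), min(n, len(items)))
    if ordered:
        indices.sort()  # restore the original relative order
    return [items[i] for i in indices]


data = list(range(100))
first = random_sample_list(data, 5, ordered=True, seed=0)
second = random_sample_list(data, 5, ordered=True, seed=0)
print(first == second)  # True
```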

Text ¶

dl_utils.data.text.save_text(text: str, file)[source]¶

dl_utils.data.text.load_text(file) → str[source]¶

Distributed ¶

dl_utils.distributed.gather_objects(list_object: List[Any]) → List[Any][source]¶: Gather a list of objects from multiple GPUs.

dl_utils.distributed.rank0_wrapper(fn)[source]¶: Wrap any function to only run on rank 0.
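
A wrapper of this kind is typically a decorator that checks the process rank before calling the wrapped function. The sketch below reads the rank from the RANK environment variable, which launchers such as torchrun set; how dl_utils itself determines the rank is not shown in the source, so this is an assumption. The names current_rank, rank0_only, and save_checkpoint are hypothetical.

```python
import functools
import os
from typing import Any, Callable


def current_rank() -> int:
    """Read the global rank from the RANK environment variable (0 if unset)."""
    return int(os.environ.get("RANK", "0"))


def rank0_only(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Run fn only on rank 0; every other rank gets None."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        if current_rank() == 0:
            return fn(*args, **kwargs)
        return None

    return wrapper


@rank0_only
def save_checkpoint(step: int) -> str:
    return f"saved step {step}"


os.environ["RANK"] = "0"
print(save_checkpoint(5))  # saved step 5
os.environ["RANK"] = "1"
print(save_checkpoint(5))  # None
```

Because non-zero ranks receive None, callers should not rely on the return value unless it is broadcast to the other ranks afterwards.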

dl_utils.distributed.get_master_addr() → str | None[source]¶

dl_utils.distributed.dist_breakpoint(rank: int = 0)[source]¶: Breakpoint for distributed training. Enter the breakpoint only if the current rank is rank, and block all other processes using distributed barrier.

dl_utils.distributed.get_local_rank() → int[source]¶: Get the local rank, the local index of the GPU.

dl_utils.distributed.get_device() → device[source]¶: Get current rank device.

dl_utils.distributed.rank0_log(fn, *args, **kwargs)[source]¶: Log only on rank 0.

dl_utils.distributed.dist_info(print_fn: ~typing.Callable[[str], ~typing.Any] = <built-in function print>, prefix: str = '')[source]¶: Print torch distributed information for debugging.

dl_utils.distributed.rank0() → bool[source]¶: Whether the current process is global rank 0.

dl_utils.distributed.barrier_if_distributed(*args, **kwargs)[source]¶: Synchronizes all processes if under distributed context.

dl_utils.distributed.get_global_rank() → int[source]¶: Get the global rank, the global index of the GPU.

dl_utils.distributed.local_rank0() → bool[source]¶: Whether the current process is local rank 0 (on its node).

dl_utils.distributed.get_master_port() → int | None[source]¶

dl_utils.distributed.get_world_size() → int[source]¶: Get the (global) world size, the total number of GPUs.

dl_utils.distributed.rank0_print(*args, **kwargs)[source]¶: Print only on rank 0.

dl_utils.distributed.recursive_to(obj: Any, device: str | device = None) → Any[source]¶

Recursively move all torch.Tensor in obj to the given device. Supports: Tensor, list, tuple, dict, set. Leaves other objects intact.

Parameters:

obj – The object to move.
device – The device to move to. If None, uses the current device if a GPU is available, else "cpu".

Returns:

The object with all torch.Tensor moved to the given device.
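
The traversal described above can be sketched without torch by separating the recursion from the per-leaf action. In the sketch, floats stand in for tensors and doubling stands in for moving to a device; recursive_apply is a hypothetical helper, and rebuilding containers with type(obj)(...) is an assumption that does not hold for namedtuples.

```python
from typing import Any, Callable


def recursive_apply(obj: Any, is_leaf: Callable[[Any], bool], fn: Callable[[Any], Any]) -> Any:
    """Apply fn to every leaf inside nested dicts, lists, tuples, and sets."""
    if is_leaf(obj):
        return fn(obj)
    if isinstance(obj, dict):
        return {key: recursive_apply(value, is_leaf, fn) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        # Rebuild the same container type from the transformed elements.
        return type(obj)(recursive_apply(value, is_leaf, fn) for value in obj)
    return obj  # anything else is left intact


batch = {"a": [1.0, "label"], "b": (2.0, {3.0})}
moved = recursive_apply(batch, lambda x: isinstance(x, float), lambda x: x * 2)
print(moved)  # {'a': [2.0, 'label'], 'b': (4.0, {6.0})}
```

With torch installed, the same pattern would use isinstance(x, torch.Tensor) as the leaf test and x.to(device) as the action.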

File System ¶

dl_utils.fs.list_files_multithread(directory, n_jobs=16, depth: int | None = None)[source]¶

List all files in a directory recursively using multiple threads. Useful for listing files on NFS.

Parameters:

directory – The directory to search.
n_jobs – Number of parallel jobs (threads) to use.
depth – Maximum recursion depth. If None, no depth limit.

Returns: List of all file paths found under the directory.

dl_utils.fs.list_files(path: str, depth: int | None = None) → List[str][source]¶

List all files in a folder recursively.

Parameters:

path – Root path to start the search.
depth – Maximum depth to search. If None, there is no depth limit. If 0 or less, stop searching deeper.

Returns: A list of file paths found under the given path.
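
The effect of depth can be illustrated with a small recursive walk over os.scandir. This sketch assumes that depth counts how many directory levels below the root may be entered, so depth=0 returns only the files directly in the root; the source wording ("stop searching deeper") suggests this but does not spell it out. list_files_sketch is a hypothetical stand-in, not the library function.

```python
import os
import tempfile
from typing import List, Optional


def list_files_sketch(path: str, depth: Optional[int] = None) -> List[str]:
    """List files under path, descending at most depth directory levels."""
    files: List[str] = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file():
                files.append(entry.path)
            elif entry.is_dir(follow_symlinks=False):
                if depth is None:
                    files.extend(list_files_sketch(entry.path, None))
                elif depth > 0:
                    files.extend(list_files_sketch(entry.path, depth - 1))
                # depth <= 0: do not descend any further
    return files


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "mid.txt"), os.path.join("a", "b", "deep.txt")):
        open(os.path.join(root, rel), "w").close()
    all_files = sorted(os.path.relpath(p, root) for p in list_files_sketch(root))
    shallow = sorted(os.path.relpath(p, root) for p in list_files_sketch(root, depth=0))
    one_level = sorted(os.path.relpath(p, root) for p in list_files_sketch(root, depth=1))

print(len(all_files), len(one_level), shallow)  # 3 2 ['top.txt']
```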

dl_utils.fs.make_parent_dirs(path: PathLike)[source]¶

Visualize ¶

dl_utils.visualize.plot_distribution(data, remove_outlier=False, percent_range=(0.1, 99.9))[source]¶
