crumpets.torch.dataloader module

class crumpets.torch.dataloader.TorchTurboDataLoader(iterable, batch_size, worker_template, nworkers, length=None, num_mini_batches=1, start_iteration=0, device='cuda:0', gpu_augmentation=False, shared_memory=True)[source]

Bases: crumpets.dataloader.TurboDataLoader

TorchTurboDataLoader is a subclass of TurboDataLoader intended for use with the PyTorch framework. It produces torch tensors instead of numpy arrays.

See TurboDataLoader for more details on its operation.

Parameters
  • iterable – An iterable providing a sample per iteration.

  • batch_size – The amount of samples per batch.

  • worker_template – An actual worker instance, determines the kind of processing. Has to inherit crumpets.broker.Worker.

  • nworkers – Number of workers processing the samples simultaneously. worker_template is copied to create them.

  • length – Specifies the length of the dataset. Defaults to the actual length of iterable (if available). If the given value differs from the default, the number of iterations per epoch is adjusted accordingly.

  • num_mini_batches – Number of mini_batches per batch.

  • start_iteration – Start the iteration counter from this number. Useful when resuming training.

  • shared_memory – Whether to use shared memory to transfer data from workers. If 0 or False, shared memory is disabled. If True, 2*nworkers shared buffers are used. If any number > 0, that number of buffers is used. A value of 1 is strongly discouraged, as it can lead to deadlocks. Permanently storing values returned by a loader may also cause deadlocks. See the example below.

  • device – Torch device to use. Defaults to ‘cuda:0’.

  • gpu_augmentation – Use a Randomizer to perform certain data augmentation operations on the GPU. This disables said operations on the CPU side.
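
Example

A minimal construction and iteration sketch. The names samples and MyWorker are illustrative placeholders (any iterable yielding one sample per iteration and any instance of a crumpets.broker.Worker subclass), and the loop structure at the end is an assumption; the exact iteration protocol is defined by TurboDataLoader.

    import torch
    from crumpets.torch.dataloader import TorchTurboDataLoader

    samples = ...                # placeholder: iterable yielding one sample per iteration
    worker = MyWorker()          # placeholder: instance of a crumpets.broker.Worker subclass

    loader = TorchTurboDataLoader(
        samples,
        batch_size=64,
        worker_template=worker,
        nworkers=4,
        num_mini_batches=1,
        device='cuda:0' if torch.cuda.is_available() else 'cpu',
        shared_memory=True,      # True -> 2*nworkers shared buffers
    )

    # Iteration sketch only; see TurboDataLoader for the exact protocol.
    # Avoid keeping permanent references to returned batches, as that can
    # exhaust the shared-memory buffers and deadlock the loader.
    for iteration, mini_batches in loader:
        for mini_batch in mini_batches:
            ...                  # torch tensors located on `device`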