elektronn2.training.parallelisation module

class elektronn2.training.parallelisation.BackgroundProc(target, dtypes=None, shapes=None, n_proc=1, target_args=(), target_kwargs={}, profile=False)[source]

Bases: elektronn2.training.parallelisation.SharedMem

Data structure to manage repeated background tasks by reusing a fixed number of initially created background process with the same arguments at every time. (E.g. retrieving an augmented batch) Remember to call BackgroundProc.shutdown after use to avoid zombie process and RAM clutter.

Parameters:
  • dtypes – list of dtypes of the target return values
  • shapes – list of shapes of the target return values
  • n_proc (int) – number of background procs to use
  • target (callable) – target function for background proc. Can even be a method of an object, if object data is read-only (then data will not be copied in RAM and the new process is lean). If several procs use random modules, new seeds must be created inside target because they have the same random state at the beginning.
  • target_args (tuple) – Proc args (constant)
  • target_kwargs (dict) – Proc kwargs (constant)
  • profile (Bool) – Whether to print timing results in to stdout

Examples

Use case to retrieve batches from a data structure D:

>>> data, label = D.getbatch(2, strided=False, flip=True, grey_augment_channels=[0])
>>> kwargs = {'strided': False, 'flip': True, 'grey_augment_channels': [0]}
>>> bg = BackgroundProc([np.float32, np.int16], [data.shape,label.shape],                             D.getbatch, n_proc=2, target_args=(2,),                             target_kwargs=kwargs, profile=False)
>>> for i in range(100):
>>>    data, label = bg.get()
get(timeout=False)[source]

This gets the next result from a background process and blocks until the corresponding proc has finished.

reset()[source]

Should be called after an exception (e.g. by pressing ctrl+c) was raised.

shutdown()[source]

Must be called to free memory if the background tasks are no longer needed

class elektronn2.training.parallelisation.SharedQ(n_proc=0, profile=False)[source]

Bases: elektronn2.training.parallelisation.SharedMem

FIFO Queue to process np.ndarrays in the background (also pre-loading of data from disk)

procs must accept list of mp.Array and make items np.ndarray using SharedQ.shm2ndarray, for this the shapes are required as too. The target requires the signature:

>>> target(mp_arrays, shapes, *args, **kwargs)

Whereas mp_array and shape are automatically added internally

All parameters are optional:

Parameters:
  • n_proc (int) – If larger than 0, a message is printed if to few processes are running
  • profile (Bool) – Whether to print timing results in terminal

Examples

Automatic use:

>>> Q = SharedQ(n_proc=2)
>>> Q.startproc(target=, shape= args=, kwargs=)
>>> Q.startproc(target=, shape= args=, kwargs=)
>>> for i in range(5):
>>>     Q.startproc(target=, shape= args=, kwargs=)
>>>     item = Q.get() # starts as many new jobs as to maintain n_proc
>>>     dosomethingelse(item) # processes work in background to pre-fetch data for next iteration
get()[source]

This gets the first results in the queue and blocks until the corresponding proc has finished. If a n_proc value is defined this then new procs must be started before to avoid a warning message.

startproc(dtypes, shapes, target, target_args=(), target_kwargs={})[source]

Starts a new process

procs must accept list of mp.Array and make items np.ndarray using SharedQ.shm2ndarray, or this the shapes are required as too. The target requires the signature:

target(mp_arrays, shapes, *args, **kwargs)

Whereas mp_array and shape are automatically added internally

exception elektronn2.training.parallelisation.TimeoutError(*args, **kwargs)[source]

Bases: exceptions.RuntimeError