Making Predictions¶
Predicting with a trained model¶
Once you have found a good neural network architecture for your task and you have already trained it with elektronn2-train, you can load the saved model from its training directory and make predictions on arbitrary data in the correct format.
In general, use elektronn2.neuromancer.model.modelload() to load a trained model file and elektronn2.neuromancer.model.Model.predict_dense() to create dense predictions for images/volumes of arbitrary sizes (the input image must however be larger than the input patch size). Normally, predictions can only be made with some offset w.r.t. the input image extent (due to the convolutions), but this method provides an option to mirror the raw data such that the returned prediction covers the full extent of the input image (this might however introduce some artifacts, because mirroring is not a natural continuation of the image).
For making predictions, you can write a custom prediction script. Here is a small example:
Prediction example for neuro3d¶
(Optional) If you want to predict on a GPU that is not already assigned in ~/.theanorc or ~/.elektronn2rc, you need to initialise it before the other imports:
from elektronn2.utils.gpu import initgpu
initgpu('auto')  # or initgpu('0') for the first GPU etc.
Find the save directory and choose a model file (you probably want the one called <save_name>-FINAL.mdl, or alternatively <save_name>-LAST.mdl if the training process is not finished yet) and a file that contains the raw images on which you want to execute the neural network, e.g.:
model_path = '~/elektronn2_training/neuro3d/neuro3d-FINAL.mdl'
raw_path = '~/neuro_data_zxy/raw_2.h5' # raw cube for validation
For loading data from (reasonably small) hdf5 files, you can use the h5load() utility function. Here we load the 3D numpy array called ‘raw’ from the input file:
from elektronn2.utils import h5load
raw3d = h5load(raw_path, 'raw')
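As a quick sanity check, you can inspect the loaded array (for this example data, a 3D (z, x, y) cube of uint8 intensities):

print(raw3d.shape, raw3d.dtype)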
Now we load the neural network model:
from elektronn2 import neuromancer as nm
model = nm.model.modelload(model_path)
Input sizes should be at least as large as the spatial input shape of the model’s input node (which you can query by model.input_node.shape.spatial_shape). Smaller inputs are automatically padded.
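For example, a quick interactive check (the printed values depend on the trained architecture):

print(model.input_node.shape.spatial_shape)  # e.g. a 3-tuple of (z, x, y) sizes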
Here we take an arbitrary 32x160x160 subvolume of the raw data to demonstrate the predictions:
raw3d = raw3d[:32, :160, :160]
To match the input node’s expected input shape, we need to prepend an empty axis for the single input channel. An empty axis is sufficient because we trained with only 1 input channel here (the uint8 pixel intensities):
raw4d = raw3d[None, :, :, :] # shape: (f=1, z=32, x=160, y=160)
pred = model.predict_dense(raw4d)
The numpy array pred now contains the predicted output with shape (f=2, z=18, x=56, y=56) (same axis order as the input raw4d, but smaller sizes due to the offsets caused by the convolutions etc.).
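If you need the prediction to cover the full extent of the input image, predict_dense() can mirror-pad the raw data, as described at the top of this section. This sketch assumes your ELEKTRONN2 version exposes the pad_raw argument:

pred_full = model.predict_dense(raw4d, pad_raw=True)  # mirror-pads the input; may introduce border artifacts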
Optimal patch sizes¶
Prediction speed benefits greatly from larger input patch sizes and MFP (see below). It is recommended to impose a larger patch size when making predictions by loading an already trained model with the imposed_patch_size argument:
from elektronn2 import neuromancer as nm
ps = (103, 201, 201)
model = nm.model.modelload(model_path, imposed_patch_size=ps)
During the network initialization that is launched by calling modelload(), invalid values of imposed_patch_size will be rejected, and the first dimension that needs to be changed will be shown. If one of the dimensions does not fit the model, you should be able to find a valid one by trial and error (either in an IPython session or with a script that loops over possible values until the model compiles successfully), as sketched below.
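A minimal sketch of such a loop (assuming model_path from above and that modelload() raises an exception for invalid patch sizes, as described; the candidate range is arbitrary):

from elektronn2 import neuromancer as nm

for z in range(96, 112):  # arbitrary candidate range for the first dimension
    try:
        model = nm.model.modelload(model_path, imposed_patch_size=(z, 201, 201))
        print('Valid patch size:', (z, 201, 201))
        break
    except Exception:  # invalid patch sizes are rejected during initialization
        continue

Note that the successful attempt triggers a full model compilation, so this can take a while.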
To find an optimal patch size that works on your hardware, you can use the elektronn2-profile command, which varies the input size of a given network model until the RAM limit is reached. The script creates a CSV table of the respective speeds. You can find the fastest input size that just fits in your RAM in that table and use it to make predictions.
Theoretically, predicting the whole image in a single patch instead of several tiles would be fastest: for each tile, some calculations have to be repeated, and the larger the tiles are, the more intermediate results can be shared. But this is obviously impossible due to limited GPU-RAM.
Note

GPU-RAM usage can be lowered by enabling garbage collection (set linker = cvm in the [global] section of .theanorc) and by using cuDNN.
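The corresponding entry in ~/.theanorc would look like this:

[global]
linker = cvm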
Max Fragment Pooling (MFP)¶
MFP is the computationally optimal way to avoid redundant calculations when making predictions with strided output (as arises from pooling). It requires more GPU-RAM (you may need to adjust the input size), but it can speed up predictions by a factor of 2 to 10. The larger the patch size (i.e. the more RAM you have), the faster. Compilation time is significantly longer.
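For an already trained model, MFP can typically be re-activated at load time. This sketch assumes your ELEKTRONN2 version supports the override_mfp_to_active argument of modelload():

model = nm.model.modelload(model_path, imposed_patch_size=ps,
                           override_mfp_to_active=True)  # assumed flag: re-enables MFP when loading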