lfcnn.training.multi_task package

Submodules

lfcnn.training.multi_task.grad_norm module

A keras.Model subclass implementing the GradNorm [1] training strategy for adaptive multi-task loss weighting.

[1]: Chen, Zhao, et al. “Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks.” International Conference on Machine Learning. 2018.

class lfcnn.training.multi_task.grad_norm.GradNorm(alpha=1.0, layers_name='last_shared', min_val=1e-07, **kwargs)[source]

Bases: Model

A Model subclass for adaptive loss weighting via GradNorm [1].

The created model must contain one or more layers whose names contain the string “last_shared”. The trainable variables of these layers are used to calculate the gradients of the individual loss terms. As suggested in the original paper, this is usually the last layer shared by all tasks. To include further shared layers, e.g. residual layers, give multiple layers names starting with “last_shared”.

Currently, this implementation does not support distributed training via tf.distribute.Strategy, e.g. multi-GPU training.

[1]: Chen, Zhao, et al. “Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks.” International Conference on Machine Learning. 2018.

Parameters:
  • alpha (float) – Symmetry factor as specified in original paper [1].

  • layers_name – Name of the layers used for the gradient norm calculation. By default, uses the last shared layers indicated by layer names containing “last_shared”. Otherwise, pass a string that is contained in the names of all layers you want to use, e.g. “shared”.

  • min_val – Minimum value to clip the loss weights to, preventing them from becoming zero.

  • **kwargs – kwargs passed to keras.Model init.
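
A minimal usage sketch (the layer and task names are illustrative, not prescribed by lfcnn); it assumes GradNorm is instantiated like a functional keras.Model, since the class only overrides compile(), train_step() and set_weights():

    from tensorflow import keras
    from lfcnn.training.multi_task.grad_norm import GradNorm

    inputs = keras.Input(shape=(32, 32, 3))
    x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)

    # Gradient norms are calculated w.r.t. the trainable variables of this layer.
    shared = keras.layers.Conv2D(
        32, 3, padding="same", activation="relu", name="last_shared_conv")(x)

    # Two task-specific heads branching off the shared representation.
    out_a = keras.layers.Conv2D(1, 1, name="task_a")(shared)
    out_b = keras.layers.Conv2D(3, 1, name="task_b")(shared)

    model = GradNorm(inputs=inputs, outputs=[out_a, out_b], alpha=1.5)
    model.compile(optimizer="adam", loss={"task_a": "mse", "task_b": "mae"})
    # model.fit(dataset, epochs=10)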

compile(**kwargs)[source]

Overwrite keras.Model compile().

set_weights(weights)[source]

Set the weights for the GradNorm model. Since the loss weights are added as trainable weights, they first have to be separated from the given weights.

train_step(data)[source]

Overwrite keras.Model train_step(), which is called in fit().
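
For illustration, the following sketches the core of a GradNorm loss-weight update as described in Chen et al. [1]; the function and variable names are hypothetical and the actual lfcnn train_step may differ in detail:

    import tensorflow as tf

    def grad_norm_weight_gradients(tape, losses, initial_losses,
                                   loss_weights, shared_vars, alpha):
        """Sketch of one GradNorm loss-weight update.

        tape: persistent GradientTape that recorded the forward pass.
        losses: current per-task loss tensors L_i.
        initial_losses: per-task losses recorded at the first step, L_i(0).
        loss_weights: trainable scalar weights w_i, one per task.
        shared_vars: trainable variables of the "last_shared" layer(s).
        """
        # Per-task gradient norms w.r.t. the shared variables. Since the
        # weighted loss is w_i * L_i, its gradient norm is w_i * ||dL_i/dtheta||.
        raw_norms = tf.stack([
            tf.linalg.global_norm(tape.gradient(loss_i, shared_vars))
            for loss_i in losses
        ])

        with tf.GradientTape() as weight_tape:
            norms = tf.stack(loss_weights) * raw_norms

            # Relative inverse training rates r_i = (L_i / L_i(0)) / mean(.).
            ratios = tf.stack(losses) / tf.stack(initial_losses)
            inverse_rates = ratios / tf.reduce_mean(ratios)

            # Target norms are treated as constants for the weight update.
            targets = tf.stop_gradient(
                tf.reduce_mean(norms) * inverse_rates ** alpha)

            # GradNorm loss: L1 distance between actual and target norms.
            grad_norm_loss = tf.reduce_sum(tf.abs(norms - targets))

        # Gradients used to update the trainable loss weights w_i.
        return weight_tape.gradient(grad_norm_loss, loss_weights)

In the original paper, the loss weights are additionally renormalized after each update so that they sum to the number of tasks; the min_val parameter above clips them from below to keep them positive.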

lfcnn.training.multi_task.multi_task_uncertainty module

A keras.Model subclass implementing the uncertainty-based training strategy [1] for adaptive multi-task loss weighting.

[1]: Kendall, Alex, Yarin Gal, and Roberto Cipolla. “Multi-task learning using uncertainty to weigh losses for scene geometry and semantics.” IEEE Conference on Computer Vision and Pattern Recognition. 2018.

class lfcnn.training.multi_task.multi_task_uncertainty.MultiTaskUncertainty(*args, **kwargs)[source]

Bases: Model

A Model subclass for Multi-Task Training with Uncertainty [1].

The loss is calculated as

L = sum_i 1/(2*sigma_i^2) * L_i + ln(sigma_i)

where sigma_i are trainable variables and L_i are the single-task losses. Substituting w_i = 1/(2*sigma_i^2), minimizing L is equivalent (up to an additive constant) to minimizing

L = sum_i w_i L_i - 0.5 ln(w_i)

where w_i are trainable variables.
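
As a small illustration (the variable names are not part of lfcnn), the reparameterized loss can be computed from trainable weights w_i as follows:

    import tensorflow as tf

    num_tasks = 2
    # One trainable weight w_i per task, initialized to 1 (i.e. sigma_i = 1/sqrt(2)).
    loss_weights = [tf.Variable(1.0, name=f"loss_weight_{i}") for i in range(num_tasks)]

    def total_loss(task_losses):
        """task_losses: list of scalar single-task losses L_i."""
        return tf.add_n([
            w * loss - 0.5 * tf.math.log(w)
            for w, loss in zip(loss_weights, task_losses)
        ])

    # Example: two task losses of 0.8 and 2.5 give 0.8 + 2.5 = 3.3 for w_i = 1.
    print(total_loss([tf.constant(0.8), tf.constant(2.5)]))  # approx. 3.3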

[1] Alex Kendall, Yarin Gal, and Roberto Cipolla: “Multi-task learning using uncertainty to weigh losses for scene geometry and semantics.” IEEE Conference on Computer Vision and Pattern Recognition. 2018.

Parameters:

**kwargs – kwargs passed to keras.Model init.

compile(**kwargs)[source]

Overwrite keras.Model compile().

Module contents

The LFCNN multi-task training strategies.

lfcnn.training.multi_task.get(strategy)[source]

Given a strategy name, returns a Keras model subclass.

Parameters:

strategy – Name of the strategy.

Returns:

Keras model subclass.
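
A hypothetical usage example; the accepted strategy name strings are assumptions based on the classes documented above:

    from lfcnn.training import multi_task

    # Retrieve a training strategy class by name and use it like a keras.Model subclass.
    strategy_cls = multi_task.get("GradNorm")
    print(strategy_cls)  # <class 'lfcnn.training.multi_task.grad_norm.GradNorm'>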