VGG

We provide an implementation and pretrained weights for the VGG models.

Paper: Very Deep Convolutional Networks For Large-Scale Image Recognition. [arXiv:1409.1556].

This code has been ported from the timm implementation.

class VGGConfig(name='', url='', nb_classes=1000, in_channels=3, input_size=(224, 224), layers=(), nb_features=4096, mlp_ratio=1.0, global_pool='avg', drop_rate=0.0, norm_layer='', act_layer='relu', crop_pct=0.875, interpolation='bilinear', mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), first_conv='features/0', classifier='head/fc')

Configuration class for VGG models.

Parameters:
  • name (str) – Name of the model.

  • url (str) – URL for pretrained weights.

  • nb_classes (int) – Number of classes for classification head.

  • in_channels (int) – Number of input image channels.

  • input_size (Tuple[int, int]) – Input image size (height, width).

  • layers (Tuple) – Tuple with the number of filters for each conv layer and "M" for each pooling layer.

  • nb_features (int) – Number of features in pre-classification head.

  • mlp_ratio (float) – Ratio for expanding nb_features in pre-classification head.

  • global_pool (str) – Type of global pooling to use.

  • drop_rate (float) – Dropout rate.

  • norm_layer (str) – Normalization layer. See norm_layer_factory() for possible values.

  • act_layer (str) – Activation function. See act_layer_factory() for possible values.

  • crop_pct (float) – Crop percentage for ImageNet evaluation.

  • interpolation (str) – Interpolation method for ImageNet evaluation.

  • mean (Tuple[float, float, float]) – Per-channel mean used by the preprocessing function. If x is an image with pixel values in [0, 1], the preprocessing function is (x - mean) / std.

  • std (Tuple[float, float, float]) – Per-channel standard deviation used by the preprocessing function; see mean.

  • first_conv (str) – Name of first convolutional layer. Used by create_model() to adapt the number of input channels when loading pretrained weights.

  • classifier (str) – Name of classifier layer. Used by create_model() to adapt the classifier when loading pretrained weights.
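
The following is a minimal sketch of how three of the fields above can be read: layers, mean/std and crop_pct. It uses plain Python and does not call the library. The helper names (summarize_layers, normalize_pixel, resize_target) are hypothetical. The sketch assumes that every integer in layers is a size-preserving convolution, that every "M" is a 2x2 max pool with stride 2 (as in the paper), and that crop_pct follows the timm convention of resizing to input_size / crop_pct before center-cropping. The layers value shown is configuration D (VGG-16) from the paper.

```python
import math

# Layer layout of VGG-16 (configuration D in the paper), used here only as an
# example value for the `layers` field.
VGG16_LAYERS = (64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M")


def summarize_layers(layers, input_size=(224, 224)):
    """Walk a `layers` tuple; return (nb_conv, nb_pool, out_h, out_w, out_channels)."""
    height, width = input_size
    nb_conv, nb_pool, channels = 0, 0, None
    for entry in layers:
        if entry == "M":
            # Assumption: each "M" is a 2x2 max pool with stride 2.
            height, width = height // 2, width // 2
            nb_pool += 1
        else:
            # Assumption: each integer is a size-preserving conv layer with
            # that many filters.
            channels = entry
            nb_conv += 1
    return nb_conv, nb_pool, height, width, channels


def normalize_pixel(pixel, mean, std):
    """Apply (x - mean) / std per channel to one pixel with values in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(pixel, mean, std))


def resize_target(input_size, crop_pct):
    """Size to resize to before center-cropping to `input_size` (assumed convention)."""
    return tuple(int(math.floor(side / crop_pct)) for side in input_size)


print(summarize_layers(VGG16_LAYERS))
print(resize_target((224, 224), 0.875))
```

Under these assumptions, a 224 x 224 input passes through 13 conv layers and 5 pooling layers, and the default crop_pct of 0.875 corresponds to resizing to 256 x 256 before cropping.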

class VGG(*args, **kwargs)

Class implementing a VGG network.

Paper: Very Deep Convolutional Networks For Large-Scale Image Recognition. [arXiv:1409.1556].

Parameters:
  • cfg (VGGConfig) – Configuration class for the model.

  • **kwargs – Additional keyword arguments, passed to tf.keras.Model.

call(x, training=False, return_features=False)

Forward pass through the full model.

Parameters:
  • x – Input to the model.

  • training (bool) – Whether the model runs in training mode (True) or inference mode (False).

  • return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.

Returns:

If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.
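
The return convention described above can be sketched with a small stand-in function. This is not the real model: toy_call, its arithmetic "stages" and the feature names stage_0, stage_1 and logits are all hypothetical, chosen only to show how a caller would unpack the two return shapes.

```python
def toy_call(x, training=False, return_features=False):
    """Stand-in that mimics the return convention of `call`; not the real model."""
    # `training` is accepted only to mirror the signature; it is unused here.
    features = {}
    # Hypothetical stages; the real model uses conv blocks and its own names.
    stages = [("stage_0", lambda v: v + 1), ("stage_1", lambda v: v * 2)]
    for name, stage in stages:
        x = stage(x)
        features[name] = x
    y = x - 3  # stand-in for the classifier head
    features["logits"] = y
    # Tuple (y, features) if requested, otherwise only y.
    return (y, features) if return_features else y


y_only = toy_call(1)
y, feats = toy_call(1, return_features=True)
print(y_only, y, list(feats))
```

Because the return type depends on return_features, callers need to unpack the result only when they passed return_features=True.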

property dummy_inputs: Tensor

Returns a tensor of the correct shape for inference.

property feature_names: List[str]

Names of the features returned by call when return_features=True.

forward_features(x, training=False, return_features=False)

Forward pass through the model, excluding the classifier layer. This function is useful if the model is used as a backbone for downstream tasks such as object detection.

Parameters:
  • x – Input to the model.

  • training (bool) – Whether the model runs in training mode (True) or inference mode (False).

  • return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.

Returns:

If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.
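
To illustrate why a forward pass without the classifier is useful, here is a toy sketch in plain Python. It assumes the full forward pass is the classifier applied to the output of forward_features; that relationship, the list-based "features" and the detection_head function are illustrative assumptions, not the library's implementation.

```python
def forward_features(x):
    """Stand-in feature extractor: everything except the classifier."""
    return [2 * v for v in x]


def classifier(features):
    """Stand-in classifier head that reduces features to one score."""
    return sum(features)


def call(x):
    # Assumed relationship: full forward pass = classifier(forward_features(x)).
    return classifier(forward_features(x))


def detection_head(features):
    """Hypothetical downstream head that consumes features, not class scores."""
    return max(features)


inputs = [1, 2, 3]
print(call(inputs))                              # classification path
print(detection_head(forward_features(inputs)))  # downstream path reuses the backbone
```

The downstream head reuses the same feature extractor and skips the classifier, which is the role forward_features plays when the model serves as a backbone.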