EfficientNet

We provide an implementation and pretrained weights for the EfficientNet family of models.

Paper: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. [arXiv:1905.11946].

The code and weights have been ported from the timm implementation. This means that some model weights have made the journey from TF (original weights from the Google Brain team) to PyTorch (timm library) and back to TF (tfimm port of timm).

The following models are available.

MobileNet-V2 models. These models correspond to the mobilenetv2_... models in timm.
- mobilenet_v2_{050, 100, 140}. These are MobileNet-V2 models with channel multiplier set to 0.5, 1.0 and 1.4 respectively.
- mobilenet_v2_{110d, 120d}. These are MobileNet-V2 models with (channel, depth) multipliers set to (1.1, 1.2) and (1.2, 1.4) respectively.
Original EfficientNet models. These models correspond to the tf_... models in timm.
- efficientnet_{b0, b1, b2, b3, b4, b5, b6, b7, b8}
EfficientNet AdvProp models, trained with adversarial examples. These models correspond to the tf_... models in timm.
- efficientnet_{b0, ..., b8}_ap
EfficientNet NoisyStudent models, trained via semi-supervised learning. These models correspond to the tf_... models in timm.
- efficientnet_{b0, ..., b7}_ns
- efficientnet_l2_ns_475
- efficientnet_l2
PyTorch versions of the EfficientNet models. These models use symmetric padding rather than the TF-style “same” padding used by the other variants. They correspond to the efficientnet_... models in timm.
- pt_efficientnet_{b0, ..., b4}
EfficientNet-EdgeTPU models, optimized for inference on Google’s Edge TPU hardware. These models correspond to the tf_... models in timm.
- efficientnet_es
- efficientnet_em
- efficientnet_el
EfficientNet-Lite models, optimized for inference on mobile devices, CPUs and GPUs. These models correspond to the tf_... models in timm.
- efficientnet_lite0
- efficientnet_lite1
- efficientnet_lite2
- efficientnet_lite3
- efficientnet_lite4
EfficientNet-V2 models. These models correspond to the tf_... models in timm.
- efficientnet_v2_b0
- efficientnet_v2_b1
- efficientnet_v2_b2
- efficientnet_v2_b3
- efficientnet_v2_s
- efficientnet_v2_m
- efficientnet_v2_l
EfficientNet-V2 models, pretrained on ImageNet-21k, fine-tuned on ImageNet-1k. These models correspond to the tf_... models in timm.
- efficientnet_v2_s_in21ft1k
- efficientnet_v2_m_in21ft1k
- efficientnet_v2_l_in21ft1k
- efficientnet_v2_xl_in21ft1k
EfficientNet-V2 models, pretrained on ImageNet-21k. These models correspond to the tf_... models in timm.
- efficientnet_v2_s_in21k
- efficientnet_v2_m_in21k
- efficientnet_v2_l_in21k
- efficientnet_v2_xl_in21k
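
The difference between the two padding conventions mentioned for the pt_efficientnet_... models can be sketched in a few lines. This is a minimal illustration, not code from this library: the helper names `tf_same_padding` and `symmetric_padding` are hypothetical, and the symmetric rule `(kernel - 1) // 2` assumes an odd kernel size and no dilation.

```python
import math


def tf_same_padding(size, kernel, stride):
    """Padding (before, after) along one axis under TF-style "same" padding."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    # Any odd remainder goes to the end (bottom/right) of the axis.
    return before, total - before


def symmetric_padding(kernel):
    """Padding (before, after) under PyTorch-style symmetric padding.

    Assumes an odd kernel size and dilation 1 (illustrative simplification).
    """
    pad = (kernel - 1) // 2
    return pad, pad


if __name__ == "__main__":
    for stride in (1, 2):
        print(
            f"stride={stride}:",
            "same =", tf_same_padding(224, 3, stride),
            "symmetric =", symmetric_padding(3),
        )
```

For stride 1 the two conventions agree. For stride 2 on an even input size, "same" padding pads only one side, so the convolution windows are shifted by one pixel relative to symmetric padding. This is why weights trained under one convention do not transfer exactly to the other.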

class EfficientNetConfig(name='', url='', nb_classes=1000, in_channels=3, input_size=(224, 224), stem_size=32, architecture=(), channel_multiplier=1.0, depth_multiplier=1.0, fix_first_last=False, nb_features=1280, drop_rate=0.0, drop_path_rate=0.0, norm_layer='batch_norm', act_layer='swish', padding='symmetric', crop_pct=0.875, interpolation='bicubic', mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), first_conv='conv_stem', classifier='classifier')

Configuration class for EfficientNet models.

Parameters:

name (str) – Name of the model.
url (str) – URL for pretrained weights.
nb_classes (int) – Number of classes for classification head.
in_channels (int) – Number of input image channels.
input_size (Tuple[int, int]) – Input image size (height, width).
stem_size (int) – Number of filters in first convolution.
architecture (Tuple[Tuple[str, ...], ...]) – Tuple of tuple of strings defining the architecture of residual blocks. The outer tuple defines the stages while the inner tuple defines the blocks per stage.
channel_multiplier (float) – Multiplier for channel scaling. One of the three dimensions of EfficientNet scaling.
depth_multiplier (float) – Multiplier for depth scaling. One of the three dimensions of EfficientNet scaling.
fix_first_last (bool) – Fix first and last block depths when multiplier is applied.
nb_features (int) – Number of features before the classifier layer.
drop_rate (float) – Dropout rate.
drop_path_rate (float) – Dropout rate for stochastic depth.
norm_layer (str) – Normalization layer. See norm_layer_factory() for possible values.
act_layer (str) – Activation function. See act_layer_factory() for possible values.
padding (str) – Type of padding to use for convolutional layers. Can be one of “same”, “valid” or “symmetric” (PyTorch-style symmetric padding).
crop_pct (float) – Crop percentage for ImageNet evaluation.
interpolation (str) – Interpolation method for ImageNet evaluation.
mean (Tuple[float, float, float]) – Defines preprocessing function. If x is an image with pixel values in (0, 1), the preprocessing function is (x - mean) / std.
std (Tuple[float, float, float]) – Per-channel standard deviation used in the preprocessing function (x - mean) / std.
first_conv (str) – Name of first convolutional layer. Used by create_model() to adapt the number of input channels when loading pretrained weights.
classifier (str) – Name of classifier layer. Used by create_model() to adapt the classifier when loading pretrained weights.
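
To make the roles of channel_multiplier, depth_multiplier and fix_first_last more concrete, here is a minimal sketch of how such multipliers are commonly applied. It follows the rounding convention of the reference EfficientNet code (channels rounded to a multiple of 8, depths rounded up). Whether this library applies exactly the same rounding is an assumption, and the helper names `scale_channels` and `scale_depths` are hypothetical.

```python
import math


def scale_channels(channels, multiplier, divisor=8):
    """Scale a channel count and round it to a multiple of `divisor`.

    Assumption: rounding rule of the reference EfficientNet code.
    """
    scaled = channels * multiplier
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Never round down by more than 10%.
    if rounded < 0.9 * scaled:
        rounded += divisor
    return rounded


def scale_depths(repeats, multiplier, fix_first_last=False):
    """Scale the number of blocks per stage, rounding up.

    With fix_first_last=True the first and last stage keep their depth.
    """
    last = len(repeats) - 1
    scaled = []
    for idx, num_blocks in enumerate(repeats):
        keep = fix_first_last and idx in (0, last)
        factor = 1.0 if keep else multiplier
        scaled.append(int(math.ceil(num_blocks * factor)))
    return scaled


if __name__ == "__main__":
    # Blocks per stage of the B0 baseline architecture.
    b0_repeats = [1, 2, 2, 3, 3, 4, 1]
    print(scale_channels(32, 1.4))
    print(scale_depths(b0_repeats, 1.2))
    print(scale_depths(b0_repeats, 1.2, fix_first_last=True))
```

Rounding channels to a multiple of 8 keeps layer widths hardware-friendly, while rounding depths up ensures that a multiplier slightly above 1.0 never removes a block.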

class EfficientNet(*args, **kwargs)

Generic EfficientNet implementation supporting depth and width scaling and flexible architecture definitions, including EfficientNet B0-B7.

Parameters:

cfg (EfficientNetConfig) – Configuration class for the model.
**kwargs – Arguments are passed to tf.keras.Model.

call(x, training=False, return_features=False)

Forward pass through the full model.

Parameters:

x – Input to the model.
training (bool) – Whether to run in training mode (True) or inference mode (False).
return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.

Returns:

If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.
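
The return convention above can be illustrated without loading a real model. In the sketch below, `ToyModel` is a hypothetical stand-in that only mimics the documented contract, and its feature names are invented for illustration. The helper `select_features` is also hypothetical. It should work with any model object that follows the same convention.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in following the documented call() contract."""

    # Invented names, for illustration only.
    feature_names = ["stem", "block_0", "logits"]

    def __call__(self, x, training=False, return_features=False):
        features = {}
        x = x * 2.0
        features["stem"] = x
        x = x + 1.0
        features["block_0"] = x
        y = x.mean(axis=(1, 2))  # pool over height and width
        features["logits"] = y
        return (y, features) if return_features else y


def select_features(model, x, names):
    """Run the model once and keep only the requested intermediate features."""
    y, features = model(x, training=False, return_features=True)
    missing = [name for name in names if name not in features]
    if missing:
        raise KeyError(f"Unknown feature names: {missing}")
    return y, {name: features[name] for name in names}


if __name__ == "__main__":
    model = ToyModel()
    batch = np.ones((2, 4, 4, 3), dtype=np.float32)
    output, feats = select_features(model, batch, ["block_0"])
    print(output.shape, {k: v.shape for k, v in feats.items()})
```

With a real model, the valid keys would come from the feature_names property rather than being hard-coded.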

property dummy_inputs: Tensor: Returns a tensor of the correct shape for inference.
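
As a rough sketch of what a correctly shaped and preprocessed input looks like, the snippet below builds a batch from the input_size and in_channels defaults and applies the (x - mean) / std formula from the configuration. It uses NumPy instead of TF tensors, and the channels-last layout (batch, height, width, channels) is an assumption. The helpers `dummy_batch` and `preprocess` are hypothetical and not part of this library.

```python
import numpy as np

# Default values taken from the EfficientNetConfig signature.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def dummy_batch(input_size=(224, 224), in_channels=3):
    """All-zeros batch of one image; channels-last layout is an assumption."""
    return np.zeros((1, *input_size, in_channels), dtype=np.float32)


def preprocess(images, mean=MEAN, std=STD):
    """Apply (x - mean) / std per channel to images with values in the unit range."""
    images = np.asarray(images, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    # mean and std broadcast over the trailing channel axis.
    return (images - mean) / std


if __name__ == "__main__":
    batch = dummy_batch()
    print(batch.shape)
    print(preprocess(batch)[0, 0, 0])
```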

property feature_names: List[str]: Names of the features returned when call() is invoked with return_features=True.

forward_features(x, training=False, return_features=False)

Forward pass through model, excluding the classifier layer. This function is useful if the model is used as input for downstream tasks such as object detection.

Parameters:

x – Input to the model.
training (bool) – Whether to run in training mode (True) or inference mode (False).
return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.

Returns:

If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features.

If return_features=False, we return only y.