ConvNeXt
We provide an implementation and pretrained weights for the ConvNeXt models.
Paper: A ConvNet for the 2020s. [arXiv:2201.03545].
Original PyTorch code and weights from Facebook Research.
This code has been ported from the timm implementation.
The following models are available.
Models trained on ImageNet-1k
convnext_tiny
convnext_small
convnext_base
convnext_large
Models trained on ImageNet-22k, fine-tuned on ImageNet-1k
convnext_tiny_in22ft1k
convnext_small_in22ft1k
convnext_base_in22ft1k
convnext_large_in22ft1k
convnext_xlarge_in22ft1k
Models trained on ImageNet-22k, fine-tuned on ImageNet-1k at 384 resolution
convnext_tiny_384_in22ft1k
convnext_small_384_in22ft1k
convnext_base_384_in22ft1k
convnext_large_384_in22ft1k
convnext_xlarge_384_in22ft1k
Models trained on ImageNet-22k
convnext_tiny_in22k
convnext_small_in22k
convnext_base_in22k
convnext_large_in22k
convnext_xlarge_in22k
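The model names above follow a regular pattern: a size, an optional resolution marker, and a suffix describing the training data. As a minimal sketch of that pattern, the helper below composes a name from a size and a variant key. The function `convnext_name` and the variant keys are hypothetical and not part of the library; they only reproduce the names listed above.

```python
# Hypothetical helper: reproduces the naming pattern of the model lists above.
# Neither the function nor the variant keys are part of the library.
TEMPLATES = {
    "in1k": "convnext_{size}",
    "in22ft1k": "convnext_{size}_in22ft1k",
    "in22ft1k_384": "convnext_{size}_384_in22ft1k",
    "in22k": "convnext_{size}_in22k",
}

# Sizes listed for each variant; the ImageNet-1k list has no xlarge model.
SIZES = {
    "in1k": ("tiny", "small", "base", "large"),
    "in22ft1k": ("tiny", "small", "base", "large", "xlarge"),
    "in22ft1k_384": ("tiny", "small", "base", "large", "xlarge"),
    "in22k": ("tiny", "small", "base", "large", "xlarge"),
}


def convnext_name(size, variant="in1k"):
    """Return the listed model name for a size and a training variant."""
    if variant not in TEMPLATES:
        raise ValueError(f"unknown variant: {variant!r}")
    if size not in SIZES[variant]:
        raise ValueError(f"size {size!r} is not listed for variant {variant!r}")
    return TEMPLATES[variant].format(size=size)


all_names = [convnext_name(s, v) for v in TEMPLATES for s in SIZES[v]]
```

Here `all_names` contains every name from the four lists, in the order in which they are listed.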
- class ConvNeXtConfig(name='', url='', nb_classes=1000, in_channels=3, input_size=(224, 224), patch_size=4, embed_dim=(96, 192, 384, 768), nb_blocks=(3, 3, 9, 3), mlp_ratio=4.0, conv_mlp_block=False, drop_rate=0.0, drop_path_rate=0.1, norm_layer='layer_norm_eps_1e-6', act_layer='gelu', init_scale=1e-06, crop_pct=0.875, interpolation='bicubic', mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), first_conv='stem/0', classifier='head/fc')
Configuration class for ConvNeXt models.
- Parameters:
name (str) – Name of the model.
url (str) – URL for pretrained weights.
nb_classes (int) – Number of classes for classification head.
in_channels (int) – Number of input image channels.
input_size (Tuple[int, int]) – Input image size (height, width)
patch_size (int) – Patchifying the image is implemented via a convolutional layer with kernel size and stride equal to patch_size.
embed_dim (Tuple) – Feature dimensions at each stage.
nb_blocks (Tuple) – Number of blocks at each stage.
mlp_ratio (float) – Ratio of the MLP hidden dimension to the embedding dimension.
conv_mlp_block (bool) – There are two equivalent implementations of the ConvNeXt block, using either (1) 1x1 convolutions or (2) fully connected layers. In PyTorch, option (2) also requires permuting channels, which is not needed in TensorFlow. We offer both implementations here because some timm models use (1) while others use (2).
drop_rate (float) – Dropout rate.
drop_path_rate (float) – Dropout rate for stochastic depth.
norm_layer (str) – Normalization layer. See norm_layer_factory() for possible values.
act_layer (str) – Activation function. See act_layer_factory() for possible values.
init_scale (float) – Initial value for layer scale weights.
crop_pct (float) – Crop percentage for ImageNet evaluation.
interpolation (str) – Interpolation method for ImageNet evaluation.
mean (Tuple[float, float, float]) – Defines preprocessing function. If x is an image with pixel values in (0, 1), the preprocessing function is (x - mean) / std.
std (Tuple[float, float, float]) – Defines preprocessing function.
first_conv (str) – Name of first convolutional layer. Used by create_model() to adapt the number of input channels when loading pretrained weights.
classifier (str) – Name of classifier layer. Used by create_model() to adapt the classifier when loading pretrained weights.
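The preprocessing defined by mean and std can be sketched in NumPy as follows. This is an illustration of the formula (x - mean) / std with the default values from the signature above, not the library's own preprocessing code. The function name `preprocess` is hypothetical, and the sketch assumes a channels-last image layout.

```python
import numpy as np

# Default values from the ConvNeXtConfig signature.
MEAN = np.array((0.485, 0.456, 0.406), dtype=np.float32)
STD = np.array((0.229, 0.224, 0.225), dtype=np.float32)


def preprocess(x, mean=MEAN, std=STD):
    """Normalize an image with pixel values in (0, 1).

    Assumes channels are on the last axis, so that mean and std
    broadcast over the height and width axes.
    """
    x = np.asarray(x, dtype=np.float32)
    return (x - mean) / std


# A uniformly gray image at the default input size (height, width, channels).
image = np.full((224, 224, 3), 0.5, dtype=np.float32)
normalized = preprocess(image)
```

A pixel whose channel values equal mean is mapped to zero, and the output has the same shape as the input.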
- class ConvNeXt(*args, **kwargs)
Class implementing a ConvNeXt network.
Paper: A ConvNet for the 2020s.
- Parameters:
cfg (ConvNeXtConfig) – Configuration class for the model.
**kwargs – Arguments are passed to tf.keras.Model.
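The conv_mlp_block option of ConvNeXtConfig chooses between two equivalent formulations of the block's MLP. The NumPy sketch below illustrates why a 1x1 convolution and a fully connected layer on the channel axis give the same result, and why a channels-first layout needs a permutation for the fully connected variant. It is a standalone illustration with made-up shapes and random weights, not the library's block implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, w, c_in, c_out = 2, 5, 5, 8, 32  # illustrative sizes only

x = rng.standard_normal((n, h, w, c_in))     # channels-last input
weight = rng.standard_normal((c_in, c_out))  # shared by both formulations
bias = rng.standard_normal(c_out)

# (2) Fully connected layer: acts on the last (channel) axis of each pixel.
dense_out = x @ weight + bias

# (1) 1x1 convolution: each output pixel is a linear map of the same input pixel.
conv_out = np.einsum("nhwc,cd->nhwd", x, weight) + bias

# Channels-first layout: the convolution works directly on the tensor,
# while the fully connected layer needs a permutation before and after.
x_cf = x.transpose(0, 3, 1, 2)
conv_cf = np.einsum("nchw,cd->ndhw", x_cf, weight) + bias[:, None, None]
dense_cf = (x_cf.transpose(0, 2, 3, 1) @ weight + bias).transpose(0, 3, 1, 2)

same_channels_last = np.allclose(dense_out, conv_out)
same_channels_first = np.allclose(dense_cf, conv_cf)
```

Both comparisons hold up to floating-point tolerance, so the choice between the two options affects how weights are stored and loaded rather than what the block computes.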
- call(x, training=False, return_features=False)
Forward pass through the full model.
- Parameters:
x – Input to the model.
training (bool) – Whether the model runs in training or inference mode.
return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.
- Returns:
If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features. If return_features=False, we return only y.
- property feature_names: List[str]
Names of features, returned when calling call with return_features=True.
- forward_features(x, training=False, return_features=False)
Forward pass through model, excluding the classifier layer. This function is useful if the model is used as input for downstream tasks such as object detection.
- Parameters:
x – Input to the model.
training (bool) – Whether the model runs in training or inference mode.
return_features (bool) – If True, we return not only the model output, but also a dictionary with intermediate features.
- Returns:
If return_features=True, we return a tuple (y, features), where y is the model output and features is a dictionary with intermediate features. If return_features=False, we return only y.
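For downstream tasks it helps to know the spatial size and channel count of the feature map after each stage. The sketch below estimates these from input_size, patch_size and embed_dim. It assumes, following the ConvNeXt paper, that the stem reduces resolution by patch_size and that each later stage starts with a 2x downsampling layer. The function `stage_shapes` is hypothetical and the shapes are given channels-last; the actual feature names and shapes should be checked against feature_names and the model output.

```python
def stage_shapes(input_size=(224, 224), patch_size=4, embed_dim=(96, 192, 384, 768)):
    """Estimate (height, width, channels) of the output of each stage.

    Assumption (from the ConvNeXt paper, not from this page): the stem
    downsamples by patch_size and every later stage halves the resolution.
    """
    height = input_size[0] // patch_size
    width = input_size[1] // patch_size
    shapes = []
    for index, channels in enumerate(embed_dim):
        if index > 0:
            # 2x downsampling at the start of every stage after the first.
            height, width = height // 2, width // 2
        shapes.append((height, width, channels))
    return shapes


default_shapes = stage_shapes()
shapes_384 = stage_shapes(input_size=(384, 384))
```

With the default configuration this gives strides of 4, 8, 16 and 32 relative to the input, the usual pyramid expected by detection heads.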