FastAI Course Lecture 3 Notes

Computer Vision
FastAI
Author

Kanav Sharma

Published

April 14, 2024

This code aims to build on our last lesson.

Previously, we obtained our data from the DuckDuckGo API and built our model around that.

This time, we will retrieve data from a Kaggle dataset, enhance our model, improve our understanding of different available pre-trained vision architectures in PyTorch using the timm library, and implement another model according to our requirements.

Level 1 : Repeat last lesson

1.1 : Download dataset from Kaggle

from google.colab import files

# Upload the Kaggle API key JSON file
uploaded = files.upload()

!pip install kaggle

!mkdir -p ~/.kaggle
!mv kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
Saving kaggle.json to kaggle.json
Requirement already satisfied: kaggle in /usr/local/lib/python3.10/dist-packages (1.5.16)

Download dataset :

Open the dataset on Kaggle, click the three vertical dots (ellipsis), then click Copy API command, and we are good to go.

!kaggle datasets download -d gpiosenka/cats-in-the-wild-image-classification

# Create a folder called "Big_Cat"
!mkdir -p Big_Cat

# Unzip the dataset into the "Big_Cat" folder
!unzip cats-in-the-wild-image-classification.zip -d Big_Cat

# Remove the zip file
!rm cats-in-the-wild-image-classification.zip
Downloading cats-in-the-wild-image-classification.zip to /content
 96% 118M/123M [00:01<00:00, 57.4MB/s]
100% 123M/123M [00:01<00:00, 65.0MB/s]
Archive:  cats-in-the-wild-image-classification.zip
  inflating: Big_Cat/EfficientNetB0-10-(224 X 224)-100.00.h5  
  inflating: Big_Cat/MobileNetV3 small-10-(224 X 224)-95.96.h5  
  inflating: Big_Cat/WILDCATS.CSV    
  inflating: Big_Cat/test/AFRICAN LEOPARD/1.jpg  
  inflating: Big_Cat/test/AFRICAN LEOPARD/5.jpg  
  inflating: Big_Cat/test/CARACAL/1.jpg  
  inflating: Big_Cat/test/CARACAL/5.jpg  
  inflating: Big_Cat/test/CHEETAH/1.jpg  
  inflating: Big_Cat/test/CHEETAH/5.jpg  
  inflating: Big_Cat/test/CLOUDED LEOPARD/1.jpg  
  inflating: Big_Cat/test/CLOUDED LEOPARD/5.jpg  
  inflating: Big_Cat/test/JAGUAR/1.jpg  
  inflating: Big_Cat/test/JAGUAR/5.jpg  
  inflating: Big_Cat/test/LIONS/1.jpg  
  inflating: Big_Cat/test/LIONS/5.jpg  
  inflating: Big_Cat/test/OCELOT/1.jpg  
  inflating: Big_Cat/test/OCELOT/5.jpg  
  inflating: Big_Cat/test/PUMA/1.jpg  
  inflating: Big_Cat/test/PUMA/5.jpg  
  inflating: Big_Cat/test/SNOW LEOPARD/1.jpg  
  inflating: Big_Cat/test/SNOW LEOPARD/5.jpg  
  inflating: Big_Cat/test/TIGER/1.jpg  
  inflating: Big_Cat/test/TIGER/5.jpg  
  inflating: Big_Cat/train/AFRICAN LEOPARD/001.jpg  
  inflating: Big_Cat/train/AFRICAN LEOPARD/236.jpg  
  inflating: Big_Cat/train/CARACAL/001.jpg  
  inflating: Big_Cat/train/CARACAL/236.jpg  
  inflating: Big_Cat/train/CHEETAH/001.jpg  
  inflating: Big_Cat/train/CHEETAH/235.jpg  
  inflating: Big_Cat/train/CLOUDED LEOPARD/001.jpg  
  inflating: Big_Cat/train/CLOUDED LEOPARD/229.jpg  
  inflating: Big_Cat/train/JAGUAR/001.jpg  
  inflating: Big_Cat/train/JAGUAR/238.jpg  
  inflating: Big_Cat/train/LIONS/001.jpg  
  inflating: Big_Cat/train/LIONS/228.jpg  
  inflating: Big_Cat/train/OCELOT/001.jpg  
  inflating: Big_Cat/train/OCELOT/233.jpg  
  inflating: Big_Cat/train/PUMA/001.jpg  
  inflating: Big_Cat/train/PUMA/236.jpg  
  inflating: Big_Cat/train/SNOW LEOPARD/001.jpg  
  inflating: Big_Cat/train/SNOW LEOPARD/231.jpg  
  inflating: Big_Cat/train/TIGER/001.jpg  
  inflating: Big_Cat/train/TIGER/237.jpg  
  inflating: Big_Cat/valid/AFRICAN LEOPARD/1.jpg  
  inflating: Big_Cat/valid/AFRICAN LEOPARD/5.jpg  
  inflating: Big_Cat/valid/CARACAL/1.jpg
  inflating: Big_Cat/valid/CARACAL/5.jpg  
  inflating: Big_Cat/valid/CHEETAH/1.jpg  
  inflating: Big_Cat/valid/CHEETAH/5.jpg  
  inflating: Big_Cat/valid/CLOUDED LEOPARD/1.jpg  
  inflating: Big_Cat/valid/CLOUDED LEOPARD/5.jpg  
  inflating: Big_Cat/valid/JAGUAR/1.jpg  
  inflating: Big_Cat/valid/JAGUAR/5.jpg  
  inflating: Big_Cat/valid/LIONS/1.jpg  
  inflating: Big_Cat/valid/LIONS/5.jpg  
  inflating: Big_Cat/valid/OCELOT/1.jpg  
  inflating: Big_Cat/valid/OCELOT/5.jpg  
  inflating: Big_Cat/valid/PUMA/1.jpg  
  inflating: Big_Cat/valid/PUMA/5.jpg  
  inflating: Big_Cat/valid/SNOW LEOPARD/1.jpg  
  inflating: Big_Cat/valid/SNOW LEOPARD/5.jpg  
  inflating: Big_Cat/valid/TIGER/1.jpg  
  inflating: Big_Cat/valid/TIGER/5.jpg  

Remove all files with the “.h5” extension and move images from the “valid” folder to their corresponding subfolders in the “train” folder

import os
import shutil

# Define the absolute paths of main, train and valid folders
big_cat_folder = '/content/Big_Cat/'
train_folder = '/content/Big_Cat/train'
valid_folder = '/content/Big_Cat/valid'

# Remove all files with ".h5" extension
!find {big_cat_folder} -type f -name '*.h5' -delete

# Create a dictionary to track image counts
image_counts = {}

# Get a list of subfolders in the train folder
train_subfolders = [f.path for f in os.scandir(train_folder) if f.is_dir()]

# Get a list of subfolders in the valid folder
valid_subfolders = [f.path for f in os.scandir(valid_folder) if f.is_dir()]

# Move images from valid to their corresponding subfolders in train
for subfolder in valid_subfolders:
    class_name = os.path.basename(subfolder)
    train_subfolder = os.path.join(train_folder, class_name)

    # Create the train subfolder if it doesn't exist
    if not os.path.exists(train_subfolder):
        os.makedirs(train_subfolder)

    # Initialize counts in the dictionary
    image_counts[f"{class_name}_VALID"] = len(os.listdir(subfolder))

    # Move images from valid to train subfolder and update counts
    for file in os.listdir(subfolder):
        file_path = os.path.join(subfolder, file)
        dest_path = os.path.join(train_subfolder, file)
        shutil.move(file_path, dest_path)


# Remove the empty valid subfolders
for subfolder in valid_subfolders:
    os.rmdir(subfolder)

# Count the images in each train subfolder so we know how much data we are training on
for subfolder in train_subfolders:
    class_name = os.path.basename(subfolder)
    image_counts[f"{class_name}_TRAIN"] = len(os.listdir(subfolder))

sorted_image_counts = dict(sorted(image_counts.items()))

# Print the sorted image counts dictionary
print("Sorted Image Counts:")
for key, value in sorted_image_counts.items():
    print(f"{key}: {value}")
Sorted Image Counts:
AFRICAN LEOPARD_TRAIN: 241
AFRICAN LEOPARD_VALID: 5
CARACAL_TRAIN: 241
CARACAL_VALID: 5
CHEETAH_TRAIN: 240
CHEETAH_VALID: 5
CLOUDED LEOPARD_TRAIN: 234
CLOUDED LEOPARD_VALID: 5
JAGUAR_TRAIN: 243
JAGUAR_VALID: 5
LIONS_TRAIN: 233
LIONS_VALID: 5
OCELOT_TRAIN: 238
OCELOT_VALID: 5
PUMA_TRAIN: 241
PUMA_VALID: 5
SNOW LEOPARD_TRAIN: 236
SNOW LEOPARD_VALID: 5
TIGER_TRAIN: 242
TIGER_VALID: 5
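
The same kind of per-class tally can be produced with only `pathlib` and `collections.Counter`. This is a minimal sketch, assuming the `<split>/<CLASS>/<image>` layout shown in the unzip log; `count_images_per_class` and `IMAGE_SUFFIXES` are hypothetical names introduced here, not part of fastai.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extensions worth counting

def count_images_per_class(split_dir):
    """Count image files in each class subfolder of split_dir (hypothetical helper)."""
    counts = Counter()
    for image_path in Path(split_dir).glob("*/*"):
        if image_path.is_file() and image_path.suffix.lower() in IMAGE_SUFFIXES:
            # The class label is the name of the folder that holds the image
            counts[image_path.parent.name] += 1
    return dict(sorted(counts.items()))
```

Calling `count_images_per_class('/content/Big_Cat/train')` after the move should give one entry per cat class, which is a quick way to spot an unbalanced class before training.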

Install latest FastAI Version

#hide
! [ -e /content ] && pip install -Uqq fastbook
! pip install timm

import fastbook
fastbook.setup_book()
import timm

#hide
from fastbook import *
from fastai.vision.widgets import *
from fastai.vision.all import *
[?25hRequirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (from timm) (2.2.1+cu121)
Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from timm) (0.17.1+cu121)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from timm) (6.0.1)
Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.10/dist-packages (from timm) (0.20.3)
Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from timm) (0.4.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (3.13.3)
Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (2023.6.0)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (4.66.2)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (4.10.0)
Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->timm) (24.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch->timm) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch->timm) (3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (3.1.3)
Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.105)
Requirement already satisfied: nvidia-cuda-runtime-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.105)
Requirement already satisfied: nvidia-cuda-cupti-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.105)
Requirement already satisfied: nvidia-cudnn-cu12==8.9.2.26 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (8.9.2.26)
Requirement already satisfied: nvidia-cublas-cu12==12.1.3.1 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.3.1)
Requirement already satisfied: nvidia-cufft-cu12==11.0.2.54 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (11.0.2.54)
Requirement already satisfied: nvidia-curand-cu12==10.3.2.106 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (10.3.2.106)
Requirement already satisfied: nvidia-cusolver-cu12==11.4.5.107 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (11.4.5.107)
Requirement already satisfied: nvidia-cusparse-cu12==12.1.0.106 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.0.106)
Requirement already satisfied: nvidia-nccl-cu12==2.19.3 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (2.19.3)
Requirement already satisfied: nvidia-nvtx-cu12==12.1.105 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (12.1.105)
Requirement already satisfied: triton==2.2.0 in /usr/local/lib/python3.10/dist-packages (from torch->timm) (2.2.0)
Requirement already satisfied: nvidia-nvjitlink-cu12 in /usr/local/lib/python3.10/dist-packages (from nvidia-cusolver-cu12==11.4.5.107->torch->timm) (12.4.127)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from torchvision->timm) (1.25.2)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in /usr/local/lib/python3.10/dist-packages (from torchvision->timm) (9.4.0)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch->timm) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface_hub->timm) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface_hub->timm) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface_hub->timm) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->huggingface_hub->timm) (2024.2.2)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch->timm) (1.3.0)
Installing collected packages: timm
Successfully installed timm-0.9.16
Mounted at /content/gdrive

verify_images() returns the paths of images that cannot be opened (for example, corrupt files); calling Path.unlink on each of them deletes those files.

path = Path('Big_Cat')

fns = get_image_files(path)
total_imagelength = len(fns)
failed = verify_images(fns)
failed_imagelength = len(failed)

failed.map(Path.unlink)
Image_Count_Dict = {"Total_Image_Count": total_imagelength, "Failed_Image_Count": failed_imagelength}
Image_Count_Dict
{'Total_Image_Count': 2439, 'Failed_Image_Count': 0}

We have a good number of images to train on.

1.2 : Prepare data for model training (Data Loaders, Data Augmentation, etc.).

Data Loaders

big_cat = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
dls = big_cat.dataloaders(path)

dls.valid.show_batch(max_n=8, nrows=2)
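
To see what `RandomSplitter(valid_pct=0.2, seed=42)` does conceptually, here is a rough sketch in plain Python. It is not fastai's implementation (which differs in detail); `random_split` and the placeholder file names are assumptions made for illustration. The point is that a fixed seed yields the same train/validation split on every run, which keeps error rates comparable between experiments.

```python
import random

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out valid_pct of them (sketch)."""
    rng = random.Random(seed)          # same seed -> same split on every run
    indices = list(range(len(items)))
    rng.shuffle(indices)
    cut = int(valid_pct * len(items))  # number of validation items
    valid_idx, train_idx = indices[:cut], indices[cut:]
    return [items[i] for i in train_idx], [items[i] for i in valid_idx]

files = [f"img_{i:03d}.jpg" for i in range(10)]  # placeholder file names
train, valid = random_split(files)
print(len(train), len(valid))  # prints: 8 2
```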


Data Augmentation

big_cat = big_cat.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
big_cat_dls = big_cat.dataloaders(path)
big_cat_dls.train.show_batch(max_n=8, nrows=2)


1.3 : Train Model

learn = vision_learner(big_cat_dls, resnet34, metrics=error_rate)
learn.fine_tune(5)
Downloading: "https://download.pytorch.org/models/resnet34-b627a593.pth" to /root/.cache/torch/hub/checkpoints/resnet34-b627a593.pth
100%|██████████| 83.3M/83.3M [00:00<00:00, 164MB/s]
epoch train_loss valid_loss error_rate time
0 1.837715 0.237122 0.088296 00:13
epoch train_loss valid_loss error_rate time
0 0.565063 0.141556 0.041068 00:18
1 0.431256 0.147855 0.051335 00:20
2 0.337753 0.119496 0.036961 00:17
3 0.268090 0.115800 0.032854 00:14
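
The error_rate column is a fraction of misclassified validation images, not a percentage, so 0.032854 means roughly 3.3%. The sketch below checks this by hand. It assumes the validation set holds int(0.2 * 2439) = 487 images (20% of the verified images counted earlier); the misclassification counts 43 and 16 are inferred from the logged fractions and are not printed anywhere in the notebook.

```python
def error_rate_from_counts(n_wrong, n_total):
    """Fraction of validation images that were misclassified."""
    return n_wrong / n_total

n_valid = int(0.2 * 2439)  # assumption: 20% of the 2439 verified images
for n_wrong in (43, 16):   # inferred counts that reproduce the logged values
    rate = error_rate_from_counts(n_wrong, n_valid)
    print(f"{n_wrong}/{n_valid} wrong -> error_rate {rate:.6f} ({rate:.1%})")
```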

Understand the structure of the model

learn.summary()
Sequential (Input shape: 64 x 3 x 224 x 224)
============================================================================
Layer (type)         Output Shape         Param #    Trainable 
============================================================================
                     64 x 64 x 112 x 112 
Conv2d                                    9408       True      
BatchNorm2d                               128        True      
ReLU                                                           
____________________________________________________________________________
                     64 x 64 x 56 x 56   
MaxPool2d                                                      
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
ReLU                                                           
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
ReLU                                                           
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
ReLU                                                           
Conv2d                                    36864      True      
BatchNorm2d                               128        True      
____________________________________________________________________________
                     64 x 128 x 28 x 28  
Conv2d                                    73728      True      
BatchNorm2d                               256        True      
ReLU                                                           
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
Conv2d                                    8192       True      
BatchNorm2d                               256        True      
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
ReLU                                                           
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
ReLU                                                           
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
ReLU                                                           
Conv2d                                    147456     True      
BatchNorm2d                               256        True      
____________________________________________________________________________
                     64 x 256 x 14 x 14  
Conv2d                                    294912     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
Conv2d                                    32768      True      
BatchNorm2d                               512        True      
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
ReLU                                                           
Conv2d                                    589824     True      
BatchNorm2d                               512        True      
____________________________________________________________________________
                     64 x 512 x 7 x 7    
Conv2d                                    1179648    True      
BatchNorm2d                               1024       True      
ReLU                                                           
Conv2d                                    2359296    True      
BatchNorm2d                               1024       True      
Conv2d                                    131072     True      
BatchNorm2d                               1024       True      
Conv2d                                    2359296    True      
BatchNorm2d                               1024       True      
ReLU                                                           
Conv2d                                    2359296    True      
BatchNorm2d                               1024       True      
Conv2d                                    2359296    True      
BatchNorm2d                               1024       True      
ReLU                                                           
Conv2d                                    2359296    True      
BatchNorm2d                               1024       True      
____________________________________________________________________________
                     64 x 512 x 1 x 1    
AdaptiveAvgPool2d                                              
AdaptiveMaxPool2d                                              
____________________________________________________________________________
                     64 x 1024           
Flatten                                                        
BatchNorm1d                               2048       True      
Dropout                                                        
____________________________________________________________________________
                     64 x 512            
Linear                                    524288     True      
ReLU                                                           
BatchNorm1d                               1024       True      
Dropout                                                        
____________________________________________________________________________
                     64 x 10             
Linear                                    5120       True      
____________________________________________________________________________

Total params: 21,817,152
Total trainable params: 21,817,152
Total non-trainable params: 0

Optimizer used: <function Adam at 0x7b5a37dbbeb0>
Loss function: FlattenedLoss of CrossEntropyLoss()

Model unfrozen

Callbacks:
  - TrainEvalCallback
  - CastToTensor
  - Recorder
  - ProgressCallback

Confusion Matrix

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()


Display the images with the highest loss, to get the picture 😊

interp.plot_top_losses(6, nrows=2, figsize=(18,4))


We can observe from both the confusion matrix and visual representation that the model is having difficulty differentiating between the Jaguar and the African Leopard. Even I find it challenging to distinguish between the two. 😵 So, we can let it be.
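
Reading the largest off-diagonal cells of a confusion matrix can also be done programmatically. The sketch below shows the idea with `collections.Counter`; the labels are hypothetical and only illustrate the mechanics, they are not results from this notebook. fastai offers a comparable `interp.most_confused(min_val=...)` method on the interpretation object.

```python
from collections import Counter

def most_confused(actuals, predictions, min_count=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    pairs = Counter(
        (actual, predicted)
        for actual, predicted in zip(actuals, predictions)
        if actual != predicted  # keep only the off-diagonal cells
    )
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_count]

# Hypothetical labels for illustration only, not results from this notebook
actuals     = ["JAGUAR", "JAGUAR", "AFRICAN LEOPARD", "TIGER", "JAGUAR", "PUMA"]
predictions = ["AFRICAN LEOPARD", "JAGUAR", "JAGUAR", "TIGER", "AFRICAN LEOPARD", "LIONS"]
print(most_confused(actuals, predictions, min_count=2))
```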

1.4 : Clean the data

#hide_output
cleaner = ImageClassifierCleaner(learn)
cleaner

Apply those changes

for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

Level 2 : Understand Computer Vision Architectures

timm is a wonderful library by Ross Wightman which provides state-of-the-art pre-trained computer vision models. It’s like Huggingface Transformers, but for computer vision instead of NLP.

2.1 : Download Data

Let’s download Ross’s GitHub repository, which is regularly updated with benchmark data for computer vision architectures. These benchmarks are computed on ImageNet.

! git clone --depth 1 https://github.com/rwightman/pytorch-image-models.git
%cd pytorch-image-models/results
Cloning into 'pytorch-image-models'...
remote: Enumerating objects: 572, done.
remote: Counting objects: 100% (572/572), done.
remote: Compressing objects: 100% (403/403), done.
remote: Total 572 (delta 222), reused 341 (delta 163), pack-reused 0
Receiving objects: 100% (572/572), 2.59 MiB | 4.87 MiB/s, done.
Resolving deltas: 100% (222/222), done.
/content/pytorch-image-models/results/pytorch-image-models/results
import pandas as pd

Benchmark_Result = pd.read_csv('results-imagenet.csv')
Benchmark_Result['model_org'] = Benchmark_Result['model']
Benchmark_Result['model'] = Benchmark_Result['model'].str.split('.').str[0]
Benchmark_Result.head(5)
model top1 top1_err top5 top5_err param_count img_size crop_pct interpolation model_org
0 eva02_large_patch14_448 90.052 9.948 99.048 0.952 305.08 448 1.0 bicubic eva02_large_patch14_448.mim_m38m_ft_in22k_in1k
1 eva02_large_patch14_448 89.970 10.030 99.012 0.988 305.08 448 1.0 bicubic eva02_large_patch14_448.mim_in22k_ft_in22k_in1k
2 eva_giant_patch14_560 89.786 10.214 98.992 1.008 1,014.45 560 1.0 bicubic eva_giant_patch14_560.m30m_ft_in22k_in1k
3 eva02_large_patch14_448 89.622 10.378 98.950 1.050 305.08 448 1.0 bicubic eva02_large_patch14_448.mim_in22k_ft_in1k
4 eva02_large_patch14_448 89.574 10.426 98.924 1.076 305.08 448 1.0 bicubic eva02_large_patch14_448.mim_m38m_ft_in1k

Let’s add a “family” column that will allow us to group architectures into categories with similar characteristics:

def get_data(part, col):
    # Join the benchmark timings with the ImageNet accuracy results on the model name
    df = pd.read_csv(f'benchmark-{part}-amp-nhwc-pt111-cu113-rtx3090.csv').merge(Benchmark_Result, on='model')
    # Seconds per sample is the inverse of samples per second
    df['secs'] = 1. / df[col]
    # Family = leading letters of the model name (raw string so that \d is not treated as a string escape)
    df['family'] = df.model.str.extract(r'^([a-z]+?(?:v2)?)(?:\d|_|$)')
    df = df[~df.model.str.endswith('gn')]
    df.loc[df.model.str.contains('in22'),'family'] = df.loc[df.model.str.contains('in22'),'family'] + '_in22'
    df.loc[df.model.str.contains('resnet.*d'),'family'] = df.loc[df.model.str.contains('resnet.*d'),'family'] + 'd'
    return df[df.family.str.contains('^re[sg]netd?|beit|convnext|levit|efficient|vit|vgg|swin')]

Inference_Data = get_data('infer', 'infer_samples_per_sec')
Inference_Data.head(5)
model infer_samples_per_sec infer_step_time infer_batch_size infer_img_size param_count_x top1 top1_err top5 top5_err param_count_y img_size crop_pct interpolation model_org secs family
12 levit_128s 21485.80 47.648 1024 224 7.78 76.526 23.474 92.872 7.128 7.78 224 0.900 bicubic levit_128s.fb_dist_in1k 0.000047 levit
13 regnetx_002 17821.98 57.446 1024 224 2.68 68.752 31.248 88.542 11.458 2.68 224 0.875 bicubic regnetx_002.pycls_in1k 0.000056 regnetx
15 regnety_002 16673.08 61.405 1024 224 3.16 70.280 29.720 89.530 10.470 3.16 224 0.875 bicubic regnety_002.pycls_in1k 0.000060 regnety
17 levit_128 14657.83 69.849 1024 224 9.21 78.490 21.510 94.012 5.988 9.21 224 0.900 bicubic levit_128.fb_dist_in1k 0.000068 levit
18 regnetx_004 14440.03 70.903 1024 224 5.16 72.402 27.598 90.826 9.174 5.16 224 0.875 bicubic regnetx_004.pycls_in1k 0.000069 regnetx
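
The regular expression that fills the family column is easier to follow on a few names in isolation. The sketch below applies the same pattern with the standard `re` module; `family_of` is a hypothetical helper and the model names are just examples. It covers only the first extraction step, not the later `_in22` and `d` suffix adjustments.

```python
import re

# Lazily match leading lowercase letters, keep an optional "v2",
# and stop at the first digit, underscore, or end of the name
FAMILY_PATTERN = re.compile(r'^([a-z]+?(?:v2)?)(?:\d|_|$)')

def family_of(model_name):
    """Return the leading alphabetic prefix of a model name, keeping a 'v2' suffix."""
    match = FAMILY_PATTERN.search(model_name)
    return match.group(1) if match else None

for name in ["levit_128s", "regnetx_002", "resnet50", "convnext_base", "efficientnetv2_s"]:
    print(name, "->", family_of(name))
```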

2.2 : Plot of all the architectures.

Here are the results for inference performance (see the last section for training performance). In this chart:

  • the x axis shows how many seconds it takes to process one image (note: it’s a log scale)

  • the y axis is the top-1 accuracy on ImageNet

  • the size of each bubble is proportional to the size of images used in testing

  • the color shows what “family” the architecture is from.

Hover your mouse over a marker to see details about the model. Double-click in the legend to display just one family. Single-click in the legend to show or hide a family.

import plotly.express as px
w,h = 1000,800

def show_all(Inference_Data, title, size):
    return px.scatter(Inference_Data, width=w, height=h, size=Inference_Data[size]**2, title=title,
        x='secs',  y='top1', log_x=True, color='family', hover_name='model_org', hover_data=[size])
show_all(Inference_Data, 'Inference', 'infer_img_size')

2.3 : Specific Architectures Plot

Let’s plot only a few selected architectures that we are likely to use in practice.

# Keep only the convnext, resnet, levit and beit families
keywords = ['convnext', 'resnet','levit','beit']

# Filter rows based on the exact keywords
Best_Model_Df = Inference_Data[Inference_Data['family'].isin(keywords)]

show_all(Best_Model_Df, 'Inference', 'infer_img_size')

2.4 : Family Connection Plot

Let’s add lines through the points of each family, to help see how they compare – but note that we can see that a linear fit isn’t actually ideal here! It’s just there to help visually see the groups.

subs = 'levit|resnetd?|regnetx|vgg|convnext.*|efficientnetv2|beit|swin'

def show_subs(Inference_Data, title, size):
    df_subs = Inference_Data[Inference_Data.family.str.fullmatch(subs)]
    return px.scatter(df_subs, width=w, height=h, size=df_subs[size]**2, title=title,
        trendline="ols", trendline_options={'log_x':True},
        x='secs',  y='top1', log_x=True, color='family', hover_name='model_org', hover_data=[size])
show_subs(Inference_Data, 'Inference', 'infer_img_size')
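
The trendlines in this plot are ordinary least-squares fits of accuracy against the logarithm of the time per image. A minimal sketch of that fit in plain Python follows, assuming a base-10 logarithm; `fit_log_x` is a hypothetical helper and the data points are made up for illustration, not taken from the benchmark files.

```python
import math

def fit_log_x(xs, ys):
    """Least-squares fit of y = slope * log10(x) + intercept."""
    log_xs = [math.log10(x) for x in xs]
    mean_x = sum(log_xs) / len(log_xs)
    mean_y = sum(ys) / len(ys)
    # Slope is the covariance of (log x, y) divided by the variance of log x
    sxy = sum((lx - mean_x) * (y - mean_y) for lx, y in zip(log_xs, ys))
    sxx = sum((lx - mean_x) ** 2 for lx in log_xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (secs, top1) points for illustration, not values from the benchmark
secs = [0.0001, 0.001, 0.01]
top1 = [70.0, 78.0, 86.0]
slope, intercept = fit_log_x(secs, top1)
print(f"top1 ~ {slope:.1f} * log10(secs) + {intercept:.1f}")
```

A positive slope here means each tenfold increase in time per image buys a fixed number of accuracy points, which is the trade-off the family lines make visible.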

We can conclude that ConvNeXt is a good go-to model when we have a decent GPU at our disposal, because it is more accurate than ResNet and takes less time than BEiT.

Level 3 : Build a model using ConvNeXt (base or tiny)

List all the base and tiny ConvNeXt models and choose the best one.

[model for model in timm.list_models('convnext*') if 'base' in model or 'tiny' in model]
['convnext_base',
 'convnext_tiny',
 'convnext_tiny_hnf',
 'convnextv2_base',
 'convnextv2_tiny']
learn_conv = vision_learner(dls, convnext_base, metrics=error_rate).to_fp16()
learn_conv.fine_tune(5)
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ConvNeXt_Base_Weights.IMAGENET1K_V1`. You can also use `weights=ConvNeXt_Base_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
epoch train_loss valid_loss error_rate time
0 1.130588 0.183650 0.045175 00:17
epoch train_loss valid_loss error_rate time
0 0.306945 0.164550 0.047228 00:20
1 0.253462 0.118687 0.026694 00:11
2 0.207020 0.103058 0.024641 00:11
3 0.175618 0.094374 0.020534 00:11
4 0.151438 0.092010 0.020534 00:13

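Compared to the resnet34 model, which had an error rate of about 3.2%, the convnext_base model demonstrates a significant improvement with an error rate of just 2.1% (0.020534 in the final epoch above; error_rate is reported as a fraction, not a percentage).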
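fastai's error_rate metric is a fraction (1 minus accuracy), so the values in the training table become percentages only after multiplying by 100. A minimal sketch of that conversion, using two values copied from the fine_tune output above (the helper name `to_percent` is made up for this example):

```python
# Error rates copied from the fine_tune output above:
# the single frozen epoch and the last of the five unfrozen epochs.
error_rates = {"frozen epoch": 0.045175, "final epoch": 0.020534}

def to_percent(fraction):
    """Convert a fraction in [0, 1] to a percentage rounded to 2 decimals."""
    return round(fraction * 100, 2)

for stage, err in error_rates.items():
    # Accuracy is simply the complement of the error rate.
    print(f"{stage}: error {to_percent(err)}%, accuracy {to_percent(1 - err)}%")
```

So the final model misclassifies roughly 2 out of every 100 validation images.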

Structure of the architecture
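
The Param # column and the output shapes in the summary below can be reproduced by hand. The sketch below assumes the usual ConvNeXt layer shapes: a 4x4 stride-4 stem convolution, 7x7 depthwise convolutions, a 4x expansion between the two Linear layers, and 2x2 stride-2 downsampling convolutions. These kernel sizes and strides are assumptions not stated in these notes; they are used here because they reproduce the reported counts. The helper names are made up for this example.

```python
def conv2d_params(c_in, c_out, kernel, groups=1, bias=True):
    """Parameters of a square-kernel Conv2d: weights plus optional biases."""
    return (c_in // groups) * c_out * kernel * kernel + (c_out if bias else 0)

def linear_params(n_in, n_out, bias=True):
    """Parameters of a Linear layer: weight matrix plus optional biases."""
    return n_in * n_out + (n_out if bias else 0)

def layernorm_params(n):
    """LayerNorm has one scale and one shift per normalized feature."""
    return 2 * n

# First stage of convnext_base: 128 channels (assumed layer shapes, see above).
stem = conv2d_params(3, 128, kernel=4)                      # 6272
depthwise = conv2d_params(128, 128, kernel=7, groups=128)   # 6400
norm = layernorm_params(128)                                # 256
expand = linear_params(128, 512)                            # 66048
project = linear_params(512, 128)                           # 65664
downsample = conv2d_params(128, 256, kernel=2)              # 131328

# Spatial size per stage for a 128 x 128 input:
# the stem divides by 4, each later downsampling step divides by 2.
size = 128 // 4
sizes = [size]
for _ in range(3):
    size //= 2
    sizes.append(size)

print(stem, depthwise, norm, expand, project, downsample)
print(sizes)  # [32, 16, 8, 4]
```

Each value matches a row of the summary, and the four spatial sizes match the 32 x 32, 16 x 16, 8 x 8 and 4 x 4 output shapes listed there.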

learn_conv.summary()
Sequential (Input shape: 64 x 3 x 128 x 128)
============================================================================
Layer (type)         Output Shape         Param #    Trainable 
============================================================================
                     64 x 128 x 32 x 32  
Conv2d                                    6272       True      
LayerNorm2d                               256        True      
Conv2d                                    6400       True      
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Permute                                                        
LayerNorm                                 256        True      
____________________________________________________________________________
                     64 x 32 x 32 x 512  
Linear                                    66048      True      
GELU                                                           
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Linear                                    65664      True      
____________________________________________________________________________
                     64 x 128 x 32 x 32  
Permute                                                        
StochasticDepth                                                
Conv2d                                    6400       True      
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Permute                                                        
LayerNorm                                 256        True      
____________________________________________________________________________
                     64 x 32 x 32 x 512  
Linear                                    66048      True      
GELU                                                           
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Linear                                    65664      True      
____________________________________________________________________________
                     64 x 128 x 32 x 32  
Permute                                                        
StochasticDepth                                                
Conv2d                                    6400       True      
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Permute                                                        
LayerNorm                                 256        True      
____________________________________________________________________________
                     64 x 32 x 32 x 512  
Linear                                    66048      True      
GELU                                                           
____________________________________________________________________________
                     64 x 32 x 32 x 128  
Linear                                    65664      True      
____________________________________________________________________________
                     64 x 128 x 32 x 32  
Permute                                                        
StochasticDepth                                                
LayerNorm2d                               256        True      
____________________________________________________________________________
                     64 x 256 x 16 x 16  
Conv2d                                    131328     True      
Conv2d                                    12800      True      
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Permute                                                        
LayerNorm                                 512        True      
____________________________________________________________________________
                     64 x 16 x 16 x 1024 
Linear                                    263168     True      
GELU                                                           
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Linear                                    262400     True      
____________________________________________________________________________
                     64 x 256 x 16 x 16  
Permute                                                        
StochasticDepth                                                
Conv2d                                    12800      True      
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Permute                                                        
LayerNorm                                 512        True      
____________________________________________________________________________
                     64 x 16 x 16 x 1024 
Linear                                    263168     True      
GELU                                                           
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Linear                                    262400     True      
____________________________________________________________________________
                     64 x 256 x 16 x 16  
Permute                                                        
StochasticDepth                                                
Conv2d                                    12800      True      
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Permute                                                        
LayerNorm                                 512        True      
____________________________________________________________________________
                     64 x 16 x 16 x 1024 
Linear                                    263168     True      
GELU                                                           
____________________________________________________________________________
                     64 x 16 x 16 x 256  
Linear                                    262400     True      
____________________________________________________________________________
                     64 x 256 x 16 x 16  
Permute                                                        
StochasticDepth                                                
LayerNorm2d                               512        True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Conv2d                                    524800     True      
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
Conv2d                                    25600      True      
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Permute                                                        
LayerNorm                                 1024       True      
____________________________________________________________________________
                     64 x 8 x 8 x 2048   
Linear                                    1050624    True      
GELU                                                           
____________________________________________________________________________
                     64 x 8 x 8 x 512    
Linear                                    1049088    True      
____________________________________________________________________________
                     64 x 512 x 8 x 8    
Permute                                                        
StochasticDepth                                                
LayerNorm2d                               1024       True      
____________________________________________________________________________
                     64 x 1024 x 4 x 4   
Conv2d                                    2098176    True      
Conv2d                                    51200      True      
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Permute                                                        
LayerNorm                                 2048       True      
____________________________________________________________________________
                     64 x 4 x 4 x 4096   
Linear                                    4198400    True      
GELU                                                           
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Linear                                    4195328    True      
____________________________________________________________________________
                     64 x 1024 x 4 x 4   
Permute                                                        
StochasticDepth                                                
Conv2d                                    51200      True      
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Permute                                                        
LayerNorm                                 2048       True      
____________________________________________________________________________
                     64 x 4 x 4 x 4096   
Linear                                    4198400    True      
GELU                                                           
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Linear                                    4195328    True      
____________________________________________________________________________
                     64 x 1024 x 4 x 4   
Permute                                                        
StochasticDepth                                                
Conv2d                                    51200      True      
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Permute                                                        
LayerNorm                                 2048       True      
____________________________________________________________________________
                     64 x 4 x 4 x 4096   
Linear                                    4198400    True      
GELU                                                           
____________________________________________________________________________
                     64 x 4 x 4 x 1024   
Linear                                    4195328    True      
____________________________________________________________________________
                     64 x 1024 x 4 x 4   
Permute                                                        
StochasticDepth                                                
____________________________________________________________________________
                     64 x 1024 x 1 x 1   
AdaptiveAvgPool2d                                              
AdaptiveMaxPool2d                                              
____________________________________________________________________________
                     64 x 2048           
Flatten                                                        
BatchNorm1d                               4096       True      
Dropout                                                        
____________________________________________________________________________
                     64 x 512            
Linear                                    1048576    True      
ReLU                                                           
BatchNorm1d                               1024       True      
Dropout                                                        
____________________________________________________________________________
                     64 x 10             
Linear                                    5120       True      
____________________________________________________________________________

Total params: 88,605,184
Total trainable params: 88,605,184
Total non-trainable params: 0

Optimizer used: <function Adam at 0x7b5a37dbbeb0>
Loss function: FlattenedLoss of CrossEntropyLoss()

Model unfrozen

Callbacks:
  - TrainEvalCallback
  - CastToTensor
  - MixedPrecision
  - Recorder
  - ProgressCallback
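
The parameter counts in the summary above can be reproduced by hand, which is a useful check that we understand what each layer does. The sketch below is not part of the original notebook. It assumes the usual ConvNeXt layout: a 7x7 depthwise convolution inside each block, a 2x2 convolution for downsampling between stages, and bias-free Linear layers in the fastai head. The helper names are hypothetical.

```python
def linear_params(n_in, n_out, bias=True):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + (n_out if bias else 0)


def conv2d_params(c_in, c_out, kernel, groups=1, bias=True):
    """Each output channel sees c_in / groups input channels through a kernel x kernel window."""
    return (c_in // groups) * c_out * kernel * kernel + (c_out if bias else 0)


def norm_params(n_features):
    """LayerNorm and BatchNorm1d each learn one scale and one shift per feature."""
    return 2 * n_features


# One block at width 512 (assumed 7x7 depthwise conv, then a 4x wider MLP)
print(conv2d_params(512, 512, kernel=7, groups=512))  # 25600
print(norm_params(512))                               # 1024
print(linear_params(512, 2048))                       # 1050624
print(linear_params(2048, 512))                       # 1049088

# Downsampling from 512 to 1024 channels (assumed 2x2 kernel)
print(conv2d_params(512, 1024, kernel=2))             # 2098176

# fastai head: pooled features (2048) -> 512 -> 10 classes, assumed bias-free
print(linear_params(2048, 512, bias=False))           # 1048576
print(linear_params(512, 10, bias=False))             # 5120
```

Every printed value matches the corresponding row of the summary, so most of the 88 million parameters sit in the Linear layers of the 1024-wide blocks.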

Let’s export the model for future reference

learn_conv.export('Big_Cat_Convnext_Model.pkl')
#learn_conv.export('/content/drive/MyDrive/Colab Notebooks/FastAI Course/Big_Cat_Convnext_Model.pkl')

Level 4 : Test the Model

Let’s test the model with an image

from fastai.vision.all import *
import gradio as gr
im = PILImage.create('/content/drive/MyDrive/Colab Notebooks/FastAI Course/SnowLeopard.jpg')
im.thumbnail((224,224))
im

learn_conv = load_learner('/content/drive/MyDrive/Colab Notebooks/FastAI Course/Big_Cat_Convnext_Model.pkl')
learn_conv.predict(im)
('CHEETAH',
 tensor(2),
 tensor([1.0988e-04, 8.7617e-05, 9.2564e-01, 2.9294e-06, 1.2592e-06, 1.6162e-06,
         2.6748e-02, 6.5111e-07, 4.7398e-02, 7.3348e-06]))
learn_conv.dls.vocab
['AFRICAN LEOPARD', 'CARACAL', 'CHEETAH', 'CLOUDED LEOPARD', 'JAGUAR', 'LIONS', 'OCELOT', 'PUMA', 'SNOW LEOPARD', 'TIGER']
categories = learn_conv.dls.vocab

pred, idx, probs = learn_conv.predict(im)
result = dict(zip(categories, map(float,probs)))
result
{'AFRICAN LEOPARD': 0.00010988322173943743,
 'CARACAL': 8.761714707361534e-05,
 'CHEETAH': 0.9256432056427002,
 'CLOUDED LEOPARD': 2.929433776444057e-06,
 'JAGUAR': 1.2592141729328432e-06,
 'LIONS': 1.6162448446266353e-06,
 'OCELOT': 0.026747871190309525,
 'PUMA': 6.511066317216319e-07,
 'SNOW LEOPARD': 0.04739758372306824,
 'TIGER': 7.334848760365276e-06}

Top 3 Predicted Cat Names with Highest Probability

sorted_result = dict(sorted(result.items(), key=lambda item: item[1], reverse=True))
top_classes = list(sorted_result.keys())[:3]
top_classes
['CHEETAH', 'SNOW LEOPARD', 'OCELOT']
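
The sorting step above can also be wrapped in a small helper that keeps each probability next to its label, which makes the ranking easier to read. This is only a sketch: `top_k_classes` is a hypothetical name, and the dictionary is copied from the `result` output shown earlier so that the example runs without the model.

```python
def top_k_classes(result, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    return sorted(result.items(), key=lambda item: item[1], reverse=True)[:k]


# Probabilities copied from the `result` dictionary printed above
result = {
    'AFRICAN LEOPARD': 0.00010988322173943743,
    'CARACAL': 8.761714707361534e-05,
    'CHEETAH': 0.9256432056427002,
    'CLOUDED LEOPARD': 2.929433776444057e-06,
    'JAGUAR': 1.2592141729328432e-06,
    'LIONS': 1.6162448446266353e-06,
    'OCELOT': 0.026747871190309525,
    'PUMA': 6.511066317216319e-07,
    'SNOW LEOPARD': 0.04739758372306824,
    'TIGER': 7.334848760365276e-06,
}

for label, prob in top_k_classes(result, k=3):
    print(f"{label}: {prob:.1%}")
# CHEETAH: 92.6%
# SNOW LEOPARD: 4.7%
# OCELOT: 2.7%
```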

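So our model predicted CHEETAH with a probability of about 93%. Note that the test file is named SnowLeopard.jpg: if the image really shows a snow leopard, the model got this one wrong, with SNOW LEOPARD only in second place at about 5%.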