An open-source implementation for fine-tuning DINOv2 by Meta.

Fine-tuning DINOv2 for downstream tasks

This repository is for training DINOv2 on downstream tasks.
It is not for self-supervised learning.

Table of Contents

  • Supported Features
  • Installation
  • Dataset Preparation
  • Training
  • Evaluation
  • TODO
  • License
  • Citation
  • Acknowledgement

Supported Features

  • Data Parallel
  • Class Balanced Loss
  • Rare Class Sampling
  • Select optimizer
  • Freeze/Unfreeze backbone

Installation

Install the required packages using requirements.txt. Because of xformers, it requires a recent version of PyTorch; however, you can use a different combination of xformers and PyTorch versions.

Using requirements.txt

pip install -r requirements.txt

Dataset Preparation

The script requires a dataset formatted as below.

Data
├── ...
├── Class4
│   ├── Img1.png
│   ├── Img2.png
│   ├── ...
├── Class5
│   ├── Img1.png
│   ├── Img2.png
│   ├── ...
├── ...
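
For reference, this layout (one sub-folder per class) is the same one torchvision.datasets.ImageFolder expects, so a dataset in this format can be sanity-checked with a few lines. This is only an illustrative sketch, not the repository's actual loading code:

from torchvision import datasets, transforms

# Resize and normalization values mirror the defaults documented under Training arguments.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

dataset = datasets.ImageFolder("Data", transform=transform)
print(dataset.classes)  # class names discovered from the folder names
print(len(dataset))     # total number of images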

Data preprocessing: Please run the following script to generate class_stats.json.

python tools/preprocess.py /path/to/your/dataset

Training

You can launch the training code by using:

bash train.sh

You can set your training arguments in config.py (an illustrative sketch of these fields follows the argument list below).
There is a setting for Rare Class Sampling (RCS), a technique for long-tailed classification motivated by DAFormer.
RCS samples rare classes more often during training; however, it carries the risk that the model does not see some classes. It is more suitable for multi-class classification.
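
For intuition, DAFormer-style RCS draws each class c with probability proportional to exp((1 - f_c) / T), where f_c is the class frequency and T is a temperature. The sketch below illustrates the idea; the parameter names and the exact implementation in this repository may differ.

import numpy as np

def rcs_class_probs(class_counts, temperature=0.1):
    """Rare Class Sampling probabilities (DAFormer-style sketch).

    Rarer classes get a higher chance of being sampled; a smaller
    temperature skews sampling more strongly toward rare classes.
    """
    freqs = np.asarray(class_counts, dtype=np.float64)
    freqs = freqs / freqs.sum()                   # per-class frequency f_c
    scores = np.exp((1.0 - freqs) / temperature)  # favor rare classes
    return scores / scores.sum()

# Example: class 2 is rare, so it is sampled far more often than its raw share.
print(rcs_class_probs([5000, 3000, 50]))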

Training arguments
  • batch_per_gpu (int): Number of samples per GPU in each forward step (default: 16).
  • num_gpu (int): Number of GPUs used for training (default: 1).
  • resize (tuple): The size to which input images are resized (default: (224, 224)).
  • mean (list): Mean normalization values for each channel in RGB format (default: [0.485, 0.456, 0.406]).
  • std (list): Standard deviation normalization values for each channel in RGB format (default: [0.229, 0.224, 0.225]).
  • optimizer (dict): Optimizer settings.
    • type: Optimizer type (default: 'SGD').
    • params: Additional optimizer parameters, such as momentum (default: 0.9).
    • learning_rate: Learning rates for different parts of the model.
      • head_lr: Learning rate for the head (default: 1e-3).
      • backbone_lr: Learning rate for the backbone (default: 1e-6).
  • scheduler (dict): Learning rate scheduler settings.
    • type: Scheduler type (default: 'linear').
    • params: Additional scheduler parameters like warmup ratio (default: 0.03).
  • do_eval (bool): Whether to perform evaluation during training (default: False).
  • num_train_epoch (int): Number of epochs for training (default: 100).
  • model (dict): Model architecture settings.
    • backbone: Backbone model type (default: 'dinov2_l').
    • head: Classification head type (default: 'single').
    • num_classes: Number of output classes (default: 3).
    • freeze_backbone: Whether to freeze the backbone during training (default: False).
  • loss (dict): Loss function settings.
    • loss_type: Type of loss function (default: 'CE_loss').
    • beta: Beta parameter for class-balanced loss (default: None).
    • gamma: Gamma parameter for focal loss (default: None).
  • dataset (dict): Dataset paths.
    • train: Training dataset settings.
      • data_root: Root directory of the training dataset.
    • eval: Evaluation dataset settings.
      • data_root: Root directory of the evaluation dataset.
  • max_checkpoint (int): Maximum number of checkpoints to keep (default: 1).
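
Put together, the defaults listed above might look roughly like the following dictionary. This is only an illustrative sketch of the documented fields, not the actual layout of config.py; in particular the scheduler parameter key name is assumed.

config = dict(
    batch_per_gpu=16,
    num_gpu=1,
    resize=(224, 224),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    optimizer=dict(
        type='SGD',
        params=dict(momentum=0.9),
        learning_rate=dict(head_lr=1e-3, backbone_lr=1e-6),
    ),
    scheduler=dict(type='linear', params=dict(warmup_ratio=0.03)),  # key name assumed
    do_eval=False,
    num_train_epoch=100,
    model=dict(
        backbone='dinov2_l',
        head='single',
        num_classes=3,
        freeze_backbone=False,
    ),
    loss=dict(loss_type='CE_loss', beta=None, gamma=None),
    dataset=dict(
        train=dict(data_root='/path/to/train/dataset'),
        eval=dict(data_root='/path/to/eval/dataset'),
    ),
    max_checkpoint=1,
)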

Note: The backbone learning rate is often set to be much smaller than the head learning rate to prevent overfitting the pretrained layers.
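
For example, these separate learning rates map naturally onto PyTorch parameter groups. The module names model.backbone and model.head below are illustrative assumptions, not necessarily the repository's actual attribute names:

import torch
import torch.nn as nn

# Illustrative stand-in: the real model wraps a pretrained DINOv2 backbone and a classification head.
class Classifier(nn.Module):
    def __init__(self, embed_dim=1024, num_classes=3):
        super().__init__()
        self.backbone = nn.Linear(embed_dim, embed_dim)  # placeholder for the DINOv2 backbone
        self.head = nn.Linear(embed_dim, num_classes)

model = Classifier()

# freeze_backbone=True would typically correspond to disabling backbone gradients:
# for p in model.backbone.parameters():
#     p.requires_grad = False

# Two parameter groups so the pretrained backbone is updated much more slowly than the head.
optimizer = torch.optim.SGD(
    [
        {"params": model.head.parameters(), "lr": 1e-3},      # head_lr default
        {"params": model.backbone.parameters(), "lr": 1e-6},  # backbone_lr default
    ],
    momentum=0.9,
)
print([group["lr"] for group in optimizer.param_groups])  # [0.001, 1e-06]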

Evaluation

You can evaluate your model by using:

bash eval.sh

The evaluation also reports top-k accuracy.
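
For reference, a prediction counts toward top-k accuracy if the true label appears among the model's k highest-scoring classes. A minimal sketch of the metric (the evaluation script's exact implementation may differ):

import torch

def topk_accuracy(logits, targets, k=1):
    """Fraction of samples whose true label is among the top-k predictions."""
    topk = logits.topk(k, dim=1).indices                 # (N, k) highest-scoring class indices
    correct = (topk == targets.unsqueeze(1)).any(dim=1)  # true label anywhere in the top k?
    return correct.float().mean().item()

# Tiny example with 2 samples and 3 classes.
logits = torch.tensor([[0.1, 0.7, 0.2],
                       [0.6, 0.3, 0.1]])
targets = torch.tensor([1, 1])
print(topk_accuracy(logits, targets, k=1))  # 0.5 (only the first sample's argmax is correct)
print(topk_accuracy(logits, targets, k=2))  # 1.0 (both true labels are within the top 2)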

TODO

  • Multi-label classification
  • Segmentation

License

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.

Citation

If you find this repository useful in your project, please consider giving a ⭐ and citing:

@misc{Dino-v2-Finetuning,
  author = {Yuwon Lee},
  title = {Dino-V2-Finetune},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/2U1/DINOv2-Finetune}
}

Acknowledgement

This project is based on DINOv2 by Meta AI.
