Multi-Label Knowledge Distillation

Existing knowledge distillation methods typically work by imparting the knowledge of output logits or intermediate feature maps from the teacher network to the student network, which is very successful in multi-class single-label learning.
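
For reference, the logits-based transfer mentioned above is commonly written as a temperature-softened KL divergence between the teacher's and the student's class distributions. The block below is a minimal pure-Python sketch of that classic single-label formulation, not code from the L2D repository. The function names (`log_softmax`, `kd_loss`) and the default temperature of 4.0 are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Return log-probabilities of a temperature-scaled softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration only; the temperature default
    is an assumption, not a value taken from the paper or repository.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [3.0, 1.0, 0.2]
    student = [2.0, 1.5, 0.1]
    print(kd_loss(student, teacher))
```

This loss assumes the classes compete for a single label per example, so the softmax sums to one across classes. That assumption does not hold when an image can carry several labels at once, which is the setting the paper targets.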

GitHub Link

The GitHub link is https://github.com/penghui-yang/l2d

Introduction

The GitHub repository "penghui-yang/L2D" contains the official implementation of the ICCV'23 paper titled "Multi-Label Knowledge Distillation." The project focuses on multi-label knowledge distillation and provides code to replicate the results. The repository includes the requirements for running the code, installation instructions, and a quick start guide for training on the MS-COCO dataset. It also explains how to use your own datasets and provides guidance on creating configuration files. The distillation process involves three parts: feature-based, label-wise embedding, and logits-based, each with corresponding parameters.

Content

The code should also be runnable with PyTorch versions other than the one listed in the repository's requirements.

You can train on MS-COCO with the default settings stored in ./configs/coco/resnet101_to_resnet34_l2d.py. You can also try your own distillers and other options by writing your own configuration files, following the repository's guidance on configuration files.

The repository shows the expected folder layouts for the Pascal VOC 2007, MS-COCO 2014, and NUS-WIDE datasets. The files train_anno.json and val_anno.json are in the folder ./appendix. All code for data processing is in the folder ./data, and you can replace it with your own code.

Configuration files are used to pass parameters to the program, and an example is provided in the folder ./configs. A distiller is split into three parts: a feature-based part, a label-wise embedding part, and a logits-based part. Each part has a balancing parameter lambda and corresponding parameters.
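
One plausible way to read the three-part split is as a weighted sum of loss terms, with one lambda per part. The sketch below assumes this reading. The names `DistillerWeights`, `lambda_ft`, `lambda_le`, `lambda_logits`, and `total_loss` are hypothetical and are not taken from the repository's configuration files, and adding the parts to a task loss is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class DistillerWeights:
    """Hypothetical balancing parameters, one per distiller part."""
    lambda_ft: float = 0.0      # feature-based part
    lambda_le: float = 0.0      # label-wise embedding part
    lambda_logits: float = 0.0  # logits-based part


def total_loss(task_loss, ft_loss, le_loss, logits_loss, weights):
    """Combine the task loss with the three weighted distillation terms.

    Setting a lambda to 0.0 switches the corresponding part off.
    """
    return (
        task_loss
        + weights.lambda_ft * ft_loss
        + weights.lambda_le * le_loss
        + weights.lambda_logits * logits_loss
    )


if __name__ == "__main__":
    # Example: use only the label-wise embedding part with weight 1.0.
    weights = DistillerWeights(lambda_le=1.0)
    print(total_loss(0.7, 0.3, 0.2, 0.1, weights))
```

Keeping each weight as a separate field makes it easy to disable a part by zeroing its lambda, so a single configuration shape can describe a feature-only, logits-only, or combined distiller.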
