Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, Jin Young Choi

arXiv | Github | Project Page

## Knowledge Distillation in a Nutshell

*Figure: The general process of knowledge distillation.*

Knowledge distillation denotes a method in which a small model is trained to mimic a pre-trained large model: the data is passed through the large model, and its outputs are transferred to the small model as training targets […]
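To make this concrete, here is a minimal sketch of the classic soft-target distillation loss (Hinton et al.), assuming a PyTorch setup; the function name, the temperature `T`, and the mixing weight `alpha` are illustrative choices, not this paper's method.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft-target KD loss sketch: KL divergence between temperature-softened
    teacher and student distributions, mixed with cross-entropy on hard labels.
    Hyperparameters T and alpha are illustrative assumptions."""
    # Soften both distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_soft_student = F.log_softmax(student_logits / T, dim=1)
    # Scale the KL term by T^2 to keep gradient magnitudes comparable.
    kd = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Standard supervised loss on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage: pass the same batch through both models; the teacher stays frozen.
# teacher.eval()
# with torch.no_grad():
#     t_logits = teacher(images)
# s_logits = student(images)
# loss = distillation_loss(s_logits, t_logits, labels)
```

The student thus learns from the teacher's full output distribution rather than from the hard labels alone, which is what "passing knowledge" from the large model to the small one means in practice.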