Fine-tuning

Thanks for sharing the code and your research.

Could you also share the fine-tuning code if available? I'm working on pre-training the network and then fine-tuning it on the same dataset using supervised learning.

For the fine-tuning step, should I remove the decoder and add linear layers after the remaining part of the network?
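
To make the question concrete, here is a rough sketch of what I have in mind. It assumes PyTorch and a classification task, and `TinyEncoder`, `FineTuneModel`, the layer sizes, and the number of classes are all hypothetical placeholders, not names or values from this repository. The idea is to keep the pre-trained encoder, drop the decoder, and train a linear head on the encoder's output.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the pre-trained encoder (not the repo's model)."""

    def __init__(self, in_dim: int = 32, latent_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class FineTuneModel(nn.Module):
    """Encoder with the decoder dropped and a linear head added on top."""

    def __init__(self, encoder: nn.Module, latent_dim: int, num_classes: int):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(latent_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(x))


encoder = TinyEncoder()
# Assumption: the pre-trained weights would be loaded into the encoder here
# (for example with encoder.load_state_dict) before fine-tuning starts.
model = FineTuneModel(encoder, latent_dim=16, num_classes=10)

# One supervised training step on random placeholder data.
x = torch.randn(4, 32)
y = torch.randint(0, 10, (4,))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

logits = model(x)  # shape: (batch, num_classes)
```

In this sketch both the encoder and the new head are updated during fine-tuning. Whether the encoder should instead be frozen is part of what I am unsure about.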