BEYOND BINARY REWARDS: TRAINING LMS TOREASON ABOUT THEIR UNCERTAINTY
https://gist.github.com/josherich/8a30dbf3d6ae0cae1048c3331f38fe80https://gist.github.com/josherich/8a30dbf3d6ae0cae1048c3331f38fe801引言与此担忧一致,研究表明,即使最初校准良好的大型语言模型(LLMs)在RL训练后也会变得过度自信(Lengetal.,2