Medium / Medium.com – Telegram

Medium / Medium.com

1.25K subscribers

106K links

Just the main page of medium.com, fresh from the oven

Deriving the DPO Objective Under the Plackett-Luce Model

#aifinetuning #directpreferenceoptimization #reinforcementlearning #languagemodels #languagemodeloptimization #rewardmodeling #bradleyterrymodel #plackettlucemodel

https://hackernoon.com/deriving-the-dpo-objective-under-the-plackett-luce-model

Learn how the Plackett-Luce model is used to derive the DPO objective.

17 views, 22:30