Measuring progress in dictionary learning for language model interpretability with board game models
Karvonen, Adam; Wright, Benjamin; Rager, Can; Angell, Rico; Brinkmann, Jannik; Smith, Logan Riggs; Verdun, Claudio Mayrink; Bau, David; Marks, Samuel