Improving language models by retrieving from trillions of tokens #2108

icoxfog417 opened this issue Dec 12, 2021 · 1 comment
In one sentence

A method that makes a pretrained language model scalable with respect to data. The input text is split into chunks, and the tokens in each chunk are conditioned not only on previously seen tokens but also on context retrieved by using the preceding chunk as a query. The dataset used for the vector search over contexts is kept separate from the training data and can be enlarged independently, which improves performance.

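A minimal sketch of the chunk-level retrieval step follows; this is not the paper's implementation. `embed` stands in for the frozen BERT encoder used in the paper, the brute-force L2 search stands in for its SCaNN index over the trillion-token database, and all names and sizes other than the 64-token chunk length are illustrative.

```python
# Sketch of RETRO-style chunked retrieval (illustrative, not the authors' code).
import numpy as np

CHUNK_LEN = 64  # tokens per chunk, as in the paper

def split_into_chunks(tokens, chunk_len=CHUNK_LEN):
    """Split a token sequence into contiguous chunks."""
    return [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]

def embed(chunk):
    """Placeholder frozen encoder: a deterministic unit-norm vector per chunk.
    The paper uses time-averaged embeddings from a frozen pre-trained BERT."""
    rng = np.random.default_rng(abs(hash(tuple(chunk))) % (2**32))
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

def retrieve_neighbours(query_chunk, db_vectors, db_chunks, k=2):
    """Return the k database chunks nearest to the query chunk
    (L2 distance on embeddings; the paper uses approximate search via SCaNN)."""
    q = embed(query_chunk)
    dists = np.linalg.norm(db_vectors - q, axis=1)
    idx = np.argsort(dists)[:k]
    return [db_chunks[i] for i in idx]

# Usage: the retrieval database is separate from the training data, so it can
# be grown independently. Each chunk attends to neighbours retrieved with the
# *previous* chunk as the query, so retrieval never leaks the tokens being
# predicted.
db_chunks = split_into_chunks(list(range(10_000)))
db_vectors = np.stack([embed(c) for c in db_chunks])
input_chunks = split_into_chunks(list(range(500, 700)))
for prev, cur in zip(input_chunks, input_chunks[1:]):
    neighbours = retrieve_neighbours(prev, db_vectors, db_chunks, k=2)
    # `cur` would attend to `neighbours` via chunked cross-attention.
```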

Paper link

https://arxiv.org/abs/2112.04426

Authors / Affiliation

Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre

  • DeepMind

Submission date (yyyy/MM/dd)

2021/12/08

Overview

Novelty / Differences

Method

Results

Comments
