Improving language models by retrieving from trillions of tokens #2108

icoxfog417 opened this issue Dec 12, 2021 · 1 comment
In one sentence

A method that makes a pretrained language model scalable with respect to data. The input text is split into chunks, and the tokens in each chunk are conditioned not only on previously seen tokens but also on context retrieved by using the preceding chunk as a query. The dataset used for the vector search over contexts is kept separate from the training data and can be enlarged independently, which improves performance.

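A minimal sketch of the chunk-level retrieval step follows; this is not the paper's implementation. `embed` stands in for the frozen BERT encoder used in the paper, the brute-force L2 search stands in for its SCaNN index over the trillion-token database, and all names and sizes other than the 64-token chunk length are illustrative.

```python
# Sketch of RETRO-style chunked retrieval (illustrative, not the authors' code).
import numpy as np

CHUNK_LEN = 64  # tokens per chunk, as in the paper

def split_into_chunks(tokens, chunk_len=CHUNK_LEN):
    """Split a token sequence into contiguous chunks."""
    return [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]

def embed(chunk):
    """Placeholder frozen encoder: a deterministic unit-norm vector per chunk.
    The paper uses time-averaged embeddings from a frozen pre-trained BERT."""
    rng = np.random.default_rng(abs(hash(tuple(chunk))) % (2**32))
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

def retrieve_neighbours(query_chunk, db_vectors, db_chunks, k=2):
    """Return the k database chunks nearest to the query chunk
    (L2 distance on embeddings; the paper uses approximate search via SCaNN)."""
    q = embed(query_chunk)
    dists = np.linalg.norm(db_vectors - q, axis=1)
    idx = np.argsort(dists)[:k]
    return [db_chunks[i] for i in idx]

# Usage: the retrieval database is separate from the training data, so it can
# be grown independently. Each chunk attends to neighbours retrieved with the
# *previous* chunk as the query, so retrieval never leaks the tokens being
# predicted.
db_chunks = split_into_chunks(list(range(10_000)))
db_vectors = np.stack([embed(c) for c in db_chunks])
input_chunks = split_into_chunks(list(range(500, 700)))
for prev, cur in zip(input_chunks, input_chunks[1:]):
    neighbours = retrieve_neighbours(prev, db_vectors, db_chunks, k=2)
    # `cur` would attend to `neighbours` via chunked cross-attention.
```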

Paper link

https://arxiv.org/abs/2112.04426

Authors / Affiliation

Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre

  • DeepMind

Submission date (yyyy/MM/dd)

2021/12/08

Overview

Novelty / Differences

Method

Results

Comments
