
Why does the generated sentence have little to do with the original? #4

Open
uhauha2929 opened this issue Apr 27, 2018 · 1 comment

@uhauha2929

I tested with a Chinese corpus, trying both word-level and character-level input. The generated sentences seem readable, but they differ substantially from the original text. I'm not sure why; is the model too simple?

@uhauha2929 uhauha2929 changed the title from "为什么生成的摘要和原文没有太大的关系?" ("Why does the generated summary have little to do with the original text?") to "Why does the generated sentence have little to do with the original?" Apr 27, 2018
@chen0040 (Owner) commented May 3, 2018

@uhauha2929 The text body and the summarized text use different vocabularies, which might explain what you observed. Also, max_sequence_length is applied to the text body: nothing more is read from the text body after max_sequence_length words have been consumed. One way to address the issue you mentioned is to use a single shared vocabulary for both the text body and the summarized text, or to read more text from the body by raising max_sequence_length. The language of the text body also matters; for example, Chinese requires a Chinese tokenizer, which I think may give a better result.
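To illustrate the first two suggestions, here is a minimal sketch (not the project's actual code; the function names and special tokens are made up for this example) of building one vocabulary shared by the text body and the summaries, and of the truncation that happens once max_sequence_length tokens have been read:

```python
from collections import Counter

def build_shared_vocab(body_texts, summary_texts, max_vocab_size=5000):
    """Build a single vocabulary covering both the text bodies and the summaries."""
    counter = Counter()
    for text in list(body_texts) + list(summary_texts):
        # Whitespace tokenization for illustration; Chinese text would need
        # a proper Chinese tokenizer (e.g. a word segmenter) instead.
        counter.update(text.split())
    vocab = {"<PAD>": 0, "<UNK>": 1}  # reserved ids for padding and unknown words
    for word, _ in counter.most_common(max_vocab_size - len(vocab)):
        vocab[word] = len(vocab)
    return vocab

def encode(text, vocab, max_sequence_length):
    """Map tokens to ids, keeping only the first max_sequence_length tokens."""
    ids = [vocab.get(tok, vocab["<UNK>"]) for tok in text.split()]
    return ids[:max_sequence_length]

vocab = build_shared_vocab(["the cat sat on the mat"], ["cat on mat"])
print(encode("the cat sat on the mat again", vocab, max_sequence_length=4))
```

With a shared vocabulary, summary tokens that also occur in the body map to the same ids in the encoder and decoder, and a small max_sequence_length makes it obvious how much of the body is simply never seen by the model.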
