
Dataset of 196,640 books in plain text

github.com Here’s a download link for all of bookcorpus as of Sept 2020 · Issue #27 · soskek/bookcorpus

You can download it here: https://twitter.com/theshawwn/status/1301852133319294976?s=21. It contains 18k plain text files. The results are very high quality. I spent about a week fixing the epub2txt...
