Dataset of 196,640 books in plain text
github.com Here’s a download link for all of bookcorpus as of Sept 2020 · Issue #27 · soskek/bookcorpus
You can download it here: https://twitter.com/theshawwn/status/1301852133319294976?s=21 It contains 18k plain text files. The results are very high quality. I spent about a week fixing the epub2txt...