Why isn't everyone talking about AI generated audiobooks?
I just listened to this AI generated audiobook and if it didn't say it was AI, I'd have thought it was human-made. It has different voices, dramatization, sound effects... The last I'd heard about this tech was a post saying Stephen Fry's voice was stolen and replicated by AI. But since then, nothing, even though it's clearly advanced incredibly fast. You'd expect more buzz for something that went from detectable as AI to indistinguishable from humans so quickly. How is it that no one is talking about AI generated audiobooks and their rapid improvement? This seems like a huge deal to me.
A lot of people just aren't aware of how fast AI is moving. AI voices were pretty meh earlier this year. A lot of people working in the audiobook/voice acting scene have been talking about this though.
I recommend everyone check out the YouTube channel "Two Minute Papers", which has been doing videos about AI papers for the last 10 years or so, to see how much AI progress has accelerated. Like 5 years ago the output of those image-generating AIs looked like LSD-infused dreams, and now it looks almost perfect.
Ah yes, Audio AI. I can't wait for this rapidly-approaching future where you literally won't be able to trust the validity of anything your senses tell you anymore
Or imagine politicians like Trump saying the most heinous stuff and then denying it saying it's fake or AI. How will people know? You won't even be able to trust your eyes or ears anymore.
Tech like this has been available for a number of years, and has most likely already been used against you. It's now getting available for the broader masses, but that might just be a blessing in disguise, since increased awareness will hopefully also make you suspicious of those cases that are already happening.
Yes, but you could tell they weren't real. They still needed real voice actors, real sound design, studios and stages and resources. Anyone with a halfway decent rig can fake shit to a very believable degree. Even with CGI you swear is fantastic, you see its fakeness once the novelty wears off
I want TTS made better with AI so that I won't need huge audiobooks filling up my phone. The epubs that I already have would serve as audiobooks when needed.
I have frequently used TTS for listening to epubs. I have, however, not noticed much battery drain... And it's not as enjoyable as listening to an audiobook read by a narrator you like, but it kind of works to a certain extent. So I too wish TTS would get better.
As someone who only consumes books in audiobook form this is great news for me, I tried to listen to some automatically generated audio books around 2 years ago and I found them horrible to listen to just because they sounded so off.
I'd love to be able to copy in the text of a book and get an actually listenable (is that a proper word?) audiobook out of the other side, for books that will simply never be recorded by actual people because they're too old or obscure.
I've been wanting to listen to the Pellucidar books for years but they just don't exist in audio format. Is there somewhere publicly available that I can do this?
I can't speak for OP but I do this as well. For me it's because I listen to them on the drive to/from dropping my kids off at school and I'll have it playing while I'm working or playing a game.
I like to read books before bed, but need darkness for a while before I have any chance of going to sleep, so me and my wife listen to 45min of audio book a night before going to sleep. Plus when we listen together there is no need to worry about getting ahead of each other and spoiling stuff.
I read books in other scenarios, but that ritual is by far the most time I have for reading, and the most consistent as well.
Personally I mostly use audio books instead of reading because I get eye strain a lot easier than I used to. I go to an eye specialist for unrelated issues yearly, so it’s not an issue with a wrong lens prescription. It’s not a problem when I’m doing a low attention task where I can look away frequently, but for reading it sucks.
Not rude at all. It's similar to the other responses people have given, but it is twofold really. Firstly, I just don't do well with sitting and reading a book. I get bored very quickly, can't concentrate on what is happening, and start re-reading sentences or pages over and over because I am not paying attention properly. Additionally, after only a couple of pages it will start putting me to sleep. I guess my attention span is just not sufficient for this form of media.
As a result I never read any books until I discovered audiobooks and my love for them. I honestly just disregarded books as a form of entertainment and thought they were a waste of time until I discovered this way to consume them, which wasn't until I was in my early 30s.
On top of that I now listen to them mostly at work, I work with industrial machines and the work is repetitive as fuck and having a book to listen to makes the time go a lot faster and in a lot more interesting manner. Consequently I now love books and will listen to between 6 and 10 hours a day and now listen to them when I'm doing things like cooking, cleaning or running when I am not at work.
I listen to a lot of audiobooks too, but I wouldn't listen to something like this.
Have you listened to the one OP posted? After a minute I'm sleeping. There's no emotion, no tension, nothing.
I can't stress enough how bad OP's sounds. Sure, it sounds natural when compared to what technology was capable of some time ago, but it's dead inside.
Good voice actors bring a book alive. This doesn't.
I hadn't actually when I originally wrote this comment as I wasn't somewhere that I could listen to audio (without being an obnoxious cunt and I won't be one of those people even if the majority of people seem to be OK with it xD)
However, I would agree: I couldn't listen to a whole book with this monotonous drone of a voice. But like you said, compared to what was produced by the text-to-speech type systems of the past this is miles ahead, and I certainly look forward to the technology progressing to a point where it is listenable for me.
I agree with good voice actors being essential for enjoyment, I think that is one of the reasons I fell in love with graphic audio productions over the last couple of years.
Yes I realise that and was over simplifying in this response but as I stated in another comment I would be more than happy to work on prompts for myself if it could generate something satisfactory to listen to.
The video posted by OP still sounds a bit "dead" so I don't think the tech is quite there yet but it is promising for the future the way it is headed.
That sounds pretty cool, though I'd be concerned it will suffer from the classic problem of current AI (...and humans, but that's by the by) of confident incorrectness. Like an automatic transmission can miss meanings and types of context that a human will spot, programmatically generating speech can probably mess up punctuation and flow - even the way a human reader sometimes will get part way through a sentence and realise they need to start again for it to come out right.
That said, I can't see it being a big problem for most works, just unfortunate here and there. For once it seems an AI application short on downsides! (Except for the usual economic ones for many people previously trained in the field.)
There was a fairly big 40K lore channel on YouTube with a rather good AI impersonation of David Attenborough's voice and narration style/scripting. However, I just went to check it, and it must have recently gotten hit with a DMCA and taken down. A shame really. Though I never got into 40K lore before, or the 40K franchise in general, I am a big fan of David Attenborough, and so that ended up really drawing me into a new literary universe. However, it was a big mistake by the YouTube creator to use the name and photo likeness of Attenborough in the branding, video titles, and thumbnail art on the channel. I think without pushing that line, the AI voice with a clear disclosure could have kept the channel under the legal radar.
From the pinned comments made here, this looks to be the same creator's new channel, now using a different voice, no longer based on any one real person:
I’ve been getting into audiobooks in a big way recently. This is interesting but somehow seems off to me. Maybe I’ll try listening to one and have my mind changed. We’ll see!
Audiobooks are offputting to me and I strongly prefer to read text, but this seems like a great thing overall for making books more accessible to people. More people experiencing a wider range of books is good.
Audiobooks have been a great coping mechanism for my ADHD, they've also made me a better driver.
For the latter, if I listen to my music I definitely feel a bit more aggressive, whereas if it's an audiobook (and I've given myself sufficient room), I'm much more forgiving.
For the former, I can mix them with menial tasks and it makes them so much more doable.
There are also a few AI-sung songs out there that are pretty good. Most of them sound pretty Autotuny, but to some extent, that can be a style. Aura, by Ghost, is a good example. If I didn't know it was AI, I would just think it was autotune.
The topic was "Why isn't everyone talking about AI generated audiobooks?". Which I answered. Maybe if you spent more time reading yourself you would have comprehended that.
I'm sympathetic to the view that artists should be paid for their work. Collectively, artists have produced so much, and these tech companies are funnelling all their work into a machine and recycling it into new works, and profiting off that, without any compensation for the people partially responsible for this new reality. I'm also not interested in people who argue "but actually it's not copying that's not how the technology works it's actually a really complic-" yeah I don't care. Without the artists you would have nothing.
BUT
Don't confuse the business practices that make this technology a reality with the technology itself. These tools are incredible, and will result in things that could have never existed previously. I just believe we need to have serious conversations about what they mean for our future.
Exactly! I would do unspeakable things for a tool that would let me pop an epub file in and let me tune the voices and audio effects to my liking. I always have some problem or another with the voice actors.