It should be reasonably trivial to watch the frames programmatically: the original programming will have mastered audio levels and consistent video compression settings, so any shift to an ad should stick out like a sore thumb.
So as long as things aren’t locked down to a DRM’d player, it should be possible to fingerprint the audio and video stream content and drop any inserted frames that don’t match.
If YouTube decides to mangle the original content to fight back… then maybe that’s finally the impetus people will need to switch platforms.
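For what it's worth, the "sore thumb" check on audio levels could be sketched roughly like this. It's a minimal sketch and assumes you already have one loudness measurement per segment (in dB) from some decoder, and that the ad makes up a minority of the segments; the function name and the 3 dB tolerance are made up for illustration, not taken from any real tool.

```python
from statistics import median

def flag_outlier_segments(loudness_db, tolerance_db=3.0):
    """Return indices of segments whose loudness strays from the median.

    loudness_db: one loudness value per segment (hypothetical input,
    e.g. measured by whatever decoder you use).
    tolerance_db: arbitrary illustrative threshold, not a tuned value.
    """
    # The median stands in for the level of the original programming,
    # assuming most segments belong to it rather than to the ad.
    baseline = median(loudness_db)
    return [
        i for i, level in enumerate(loudness_db)
        if abs(level - baseline) > tolerance_db
    ]

# Made-up numbers: segments 3 and 4 are noticeably louder than the rest.
levels = [-23.0, -22.5, -23.2, -16.0, -15.8, -23.1, -22.9]
suspect = flag_outlier_segments(levels)
```

Real content has quiet and loud passages of its own, so a check like this would probably need to be combined with a video-side signal before dropping anything.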
You wouldn't necessarily need to pay attention to the mastering and all that. It's probably easier to request the video twice from YouTube, detect the bits that don't change, and skip between timestamps to play only those bits. It might have a bit of a failure rate if the same ad is served twice, and YouTube could fight back by letting creators make slightly different versions of a video, but it's still better than nothing.
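A rough sketch of the "request it twice and keep what doesn't change" idea, assuming each fetch has already been cut into fixed-length segments and each segment reduced to some fingerprint (a hash, say). The function name, the 2-second segment length, and the string fingerprints are all hypothetical; only `difflib.SequenceMatcher` is a real standard-library piece.

```python
from difflib import SequenceMatcher

def shared_ranges(fetch_a, fetch_b, segment_seconds=2.0):
    """Return (start, end) times in fetch_a that also appear in fetch_b.

    fetch_a, fetch_b: lists of per-segment fingerprints from two separate
    requests for the same video (hypothetical input).
    segment_seconds: assumed fixed segment length.
    """
    # autojunk=False so frequently repeated fingerprints are not ignored.
    matcher = SequenceMatcher(None, fetch_a, fetch_b, autojunk=False)
    ranges = []
    for block in matcher.get_matching_blocks():
        if block.size == 0:
            continue  # the final block is always an empty sentinel
        start = block.a * segment_seconds
        end = (block.a + block.size) * segment_seconds
        ranges.append((start, end))
    return ranges

# Made-up fingerprints: "v*" is the video, "ad*" differs between fetches.
first = ["ad1", "ad1b", "v1", "v2", "v3", "ad2", "v4", "v5"]
second = ["ad9", "v1", "v2", "v3", "ad7", "ad8", "v4", "v5"]
playable = shared_ranges(first, second)
```

A player would then seek through `playable` in order. As noted above, this breaks if both fetches get the same ad, since that ad would show up as a shared range too.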
If YouTube decides to mangle the original content to fight back… then maybe that’s finally the impetus people will need to switch platforms.
Switch to where? Everything that's not just a different YouTube frontend is either shit or doesn't pay the creators. Federated FOSS sites aren't an option either, because once an influx of users from outside the tech bubble happens, server capacity will hit rock bottom.
The way the FOSS sites work is basically like torrents, though, so you get the whole "the more people watch a video, the faster it gets for everyone" effect. The primary upside is that each site serves as a guaranteed seeder.