I think it's a really cool idea. My first concern is about the amount of information that could be stored. I'm betting that syncing for certain services or use cases means transporting a significant amount of data.
You would definitely want to compress your message before adding it to the image. Something modern like Zstandard (zstd) gives very good compression ratios. Ideally, this type of storage is best suited for backing up private keys in places that you can locate later and that are unlikely to be either noticed or deleted.
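To make the compress-first step concrete, here is a minimal sketch. It uses Python's built-in zlib purely as a stand-in so it runs without extra installs; it is not zstd, and a real version would swap in a Zstandard binding. The `pack`/`unpack` names and the sample payload are made up for illustration.

```python
import zlib


def pack(message: bytes) -> bytes:
    """Compress a payload before it gets embedded in an image.

    zlib is a stand-in here (assumption); a zstd binding would slot in
    the same way.
    """
    return zlib.compress(message, level=9)


def unpack(blob: bytes) -> bytes:
    """Reverse of pack(): recover the original payload."""
    return zlib.decompress(blob)


# Hypothetical sync payload: repetitive text compresses very well.
sample = b'{"device": "laptop", "position": 1234}\n' * 500
packed = pack(sample)
print(len(sample), "bytes before,", len(packed), "bytes after")
```

Repetitive data like this shrinks a lot, but already-compressed or encrypted data will not, so compress before encrypting.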
Size could definitely be a concern. I've got no idea what my requirements will be, but I at least know what the constraints are. For a baseline, each image can be up to 256KB. In theory, you can have 10,000 playlists (each with an image), but I wouldn't actually want more than one, or a handful at most, so that it doesn't get in the way of the user's actual playlists. Of course, there's a good chance you wouldn't get all 256KB. They may let you send in arbitrary data, but I wouldn't be surprised to find that they strip anything that's not a valid JPEG. Also, if they re-compress after upload, that would be another hurdle.
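As a rough illustration of those constraints, the sketch below splits a payload across a small number of images. The 256KB cap comes from the numbers above; the limit of five images is just my stand-in for "a handful", and `split_payload` is a hypothetical helper. It also assumes the whole 256KB is usable, which probably won't hold once a valid JPEG has to wrap the data.

```python
MAX_IMAGE_BYTES = 256 * 1024  # per-image cap mentioned above
MAX_IMAGES = 5                # assumption: what "a handful" might mean


def split_payload(
    payload: bytes,
    per_image: int = MAX_IMAGE_BYTES,
    max_images: int = MAX_IMAGES,
) -> list[bytes]:
    """Split payload into chunks of at most per_image bytes each.

    Raises ValueError if it would need more than max_images images.
    """
    if per_image <= 0:
        raise ValueError("per_image must be positive")
    chunks = [
        payload[i:i + per_image] for i in range(0, len(payload), per_image)
    ]
    if len(chunks) > max_images:
        raise ValueError(
            f"payload needs {len(chunks)} images, limit is {max_images}"
        )
    return chunks


chunks = split_payload(b"x" * 600_000)
print(len(chunks), "images needed")
```

With these assumed limits the ceiling is 5 × 256KB = 1280KB in total, and less in practice if part of each image has to be real image data.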
But, generally speaking, this is a way to allow folks to write arbitrary data to the net that you can use for syncing (or whatever, but syncing in my case) without having to run a database or user accounts. And, once you can do that, it's game on.
Yeah, even with nice compression like zstd and newer encryption algorithms like the ones in SUPERCOP, you are still going to end up with quite a bit of data. It's always easier to sneak it into a binary or an image than into a side channel. And if the music is coming from an API, then obviously you lose a lot of the places you can hide things.