Of course the general consensus on reddit is "lemmy devs are clueless and dangerous". I'm pretty sure a lot of it is one guy with multiple alt accounts, tho. He has a Joe McCarthy attitude about lemmy because of one of the primary devs.
Can someone with more knowledge of the lemmy protocol/API shed some light on this? The way the linked post is written, it seems like some random angry guy just hates lemmy for whatever reason.
To me it seems like a complete BS argument. As far as I can tell, this tactic is possible with every service where users can provide content. Of course I can link to a site that reads users’ data. There’s basically no preventing this unless the (lemmy) clients provide their own modified browser that masks the user’s IP and other metadata.
You actually can prevent this easily with CSP (Content Security Policy). That header tells your browser which addresses it is allowed to load additional data from when visiting your site. It is an important tool for preventing cross-site scripting attacks: your browser should not load data from random sources when it is on your site.
Of course you would have to funnel all inline images through a site-local proxy that the browser is allowed to load data from.
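To make that a bit more concrete, here is a rough sketch of what such a policy could look like and how a browser would treat image URLs under it. The function names and the `lemmy.example` / `tracker.example` hosts are made up for illustration, and the same-origin check is a simplified model of `img-src 'self'`, not the full matching algorithm from the CSP spec.

```python
from urllib.parse import urljoin, urlsplit


def build_csp_header() -> tuple[str, str]:
    """Return a (name, value) pair for a policy that only allows same-origin loads."""
    directives = {
        "default-src": ["'self'"],
        "img-src": ["'self'"],    # inline images must come from the site itself
        "media-src": ["'self'"],  # same for audio/video embeds
    }
    value = "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )
    return ("Content-Security-Policy", value)


def allowed_by_img_src_self(page_url: str, image_url: str) -> bool:
    """Simplified model of img-src 'self': same scheme, host and port as the page.

    Assumption: real browsers apply a few extra rules (e.g. scheme upgrades),
    which are ignored here.
    """
    page = urlsplit(page_url)
    # Resolve relative URLs against the page first, like a browser would.
    image = urlsplit(urljoin(page_url, image_url))
    return (page.scheme, page.hostname, page.port) == (
        image.scheme,
        image.hostname,
        image.port,
    )


if __name__ == "__main__":
    print(": ".join(build_csp_header()))
    page = "https://lemmy.example/post/1"
    print(allowed_by_img_src_self(page, "/pictrs/image/a.png"))
    print(allowed_by_img_src_self(page, "https://tracker.example/pixel.png"))
```

With a policy like this, an embedded image pointing at a third-party host is simply never requested, which is why everything has to go through the local proxy.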
This has implications not only for security but also under the GDPR. Some jurisdictions consider IP addresses personal data. Sending them to e.g. the US without user consent would be a violation. I know it is stupid to consider IP addresses personal data, and it is stupid to consider a browser loading data as sending that personal data somewhere on the site's behalf. But there is a reason why a lot of websites, for example, only embed tweets after you explicitly allow it.
I think when you link images off-site on Reddit, Reddit still caches a preview and serves that to the user; the user actually has to click a link to go off the platform into the unknown. If we do embeds and such here, they’re loaded directly from off-site without user interaction.
Ergo your browser makes a request to a random, potentially dangerous server, and there isn’t much the average user can do to prevent that.
This is a valid privacy issue, and other fediverse projects like Mastodon already solve this. The problem is that by embedding an image, a poster can make every viewer's client send a network request to the poster's server, revealing information such as the viewer's IP address and browser. The solution is to proxy media through your instance, which is presumably trusted. This hides your IP address and browser information from the remote server. And as someone else mentioned here, a Content-Security-Policy can be used to ensure this attack isn't possible in a browser.
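For anyone wondering what "proxy media through your instance" might look like in practice, here is a minimal sketch of the rewriting step. To be clear about the assumptions: `INSTANCE_HOST` and `PROXY_PATH` are hypothetical (this is not an actual lemmy or Mastodon route), and the regex only handles the plain `![alt](url)` markdown form.

```python
import re
from urllib.parse import quote, urlsplit

INSTANCE_HOST = "lemmy.example"       # hypothetical instance hostname
PROXY_PATH = "/api/proxy/image?url="  # hypothetical endpoint, not a real route

# Matches simple markdown image embeds: ![alt text](url)
IMAGE_MD = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def proxy_images(markdown: str) -> str:
    """Rewrite off-site image embeds so they load via the instance's proxy."""

    def rewrite(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlsplit(url).hostname
        # Relative URLs and images already hosted on the instance stay as-is.
        if host is None or host == INSTANCE_HOST:
            return match.group(0)
        # Percent-encode the whole remote URL so it survives as a query value.
        return f"![{alt}]({PROXY_PATH}{quote(url, safe='')})"

    return IMAGE_MD.sub(rewrite, markdown)


if __name__ == "__main__":
    print(proxy_images("look ![cat](https://tracker.example/cat.png?u=1)"))
```

The instance would then fetch the remote image itself (and ideally cache it), so the remote server only ever sees the instance's IP address and not the reader's.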