And if so, why exactly? It says it's end-to-end encrypted. The metadata isn't. But what is metadata and is it bad that it's not? Are there any other problematic things?
I think I have a few answers for these questions, but I was wondering if anyone else has good answers/explanations/links to share where I can inform myself more.
They don't really need the actual contents of your messages if they have the associated metadata, which is not encrypted and gives them plenty of information.
So idk, I honestly don't see why I shouldn't believe them. Don't get me wrong though, I fully support the scepticism.
Metadata is all the content of a message besides the actual text content of the message (i.e. what you type). Examples would be the date and time it is sent, what users these messages were sent to / from, and the IP addresses of both parties. (The availability of metadata varies from messenger to messenger).
I like this example: If you only text your Aunt Sally, who lives in Alaska, twice per year to wish her a happy birthday and Christmas, just by looking at the metadata someone could infer the meaning of your messages, as well as your relationship to the person you're messaging. To a point this is true of any message you send.
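To make the Aunt Sally example concrete, here is a minimal sketch of how recurring dates could be pulled out of metadata alone. The record format, the names and the dates are all made up for illustration; no message text is involved anywhere.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical metadata records: (timestamp, sender, recipient).
# Note there is no message text here at all.
records = [
    ("2022-03-14T09:02:00", "me", "aunt_sally"),
    ("2022-12-25T10:15:00", "me", "aunt_sally"),
    ("2023-03-14T08:47:00", "me", "aunt_sally"),
    ("2023-12-25T11:30:00", "me", "aunt_sally"),
]

def recurring_dates(records, min_years=2):
    """Map each recipient to the (month, day) pairs on which they were
    contacted in at least `min_years` distinct years."""
    years_seen = defaultdict(set)
    for timestamp, _sender, recipient in records:
        t = datetime.fromisoformat(timestamp)
        years_seen[(recipient, t.month, t.day)].add(t.year)

    result = defaultdict(list)
    for (recipient, month, day), years in sorted(years_seen.items()):
        if len(years) >= min_years:
            result[recipient].append((month, day))
    return dict(result)

patterns = recurring_dates(records)
print(patterns)  # {'aunt_sally': [(3, 14), (12, 25)]}
```

Two messages a year, one always on December 25 and one always on the same day in March, is enough to guess "Christmas" and "birthday" without reading a single word.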
As for WhatsApp specifically, it being end-to-end encrypted doesn't really matter imo, as the application is not open source and is owned by an advertising / social media company. As long as the code is closed source, you cannot be sure:
That your messages are encrypted at all
That your encryption keys are kept on-device, and not plainly available to a centralized party
That the encryption the application is using is securely implemented
At least for applications handling truly sensitive information (for the average person only their messenger and browser), you should be using open source software. The easiest recommendations I can make are:
Browsers: Firefox, Thorium, Brave (with all the cryptocrap disabled)
That your messages are encrypted at all
That your encryption keys are kept on-device, and not plainly available to a centralized party
That the encryption the application is using is securely implemented
This is true, but something that should be noted is that, to my knowledge, no law enforcement agency has ever received the supposedly encrypted content of WhatsApp messages. Facebook Messenger messages are not E2E encrypted by default, and there have been several stories about Facebook being served a warrant for message content and providing it. This has, as I understand, not occurred for WhatsApp messages. It is possible, of course, that they do have some kind of access and only provide it to very high-level intelligence agencies, but there's no direct evidence of that.
I would personally say that it's more likely than not that WhatsApp message content is legitimately private, but I'd also agree that you should use something like Signal if you're genuinely concerned about this.
They'd better hide that evidence as best they can, or they would lose a useful source of information.
That's the whole game of intelligence: to stay a step ahead of the opponent, the opponent must believe it's safe so you can steal useful information. As soon as the breach is discovered, it ceases to be useful.
If you log into WhatsApp on another device, does your history show up?
If it does, that means they hold your encryption keys on their server. It's the only way this could work.
It's why with Signal you need to maintain your keys and keep backups. No one else has your keys, so logging in to other devices won't get history without that backup and the keys.
Works this way with encrypted XMPP too, of course.
How do I know other browsers/messengers actually include the code that is published when they arrive on my phone? Wouldn't it be possible to simply add tracking/malicious code outside of the open-source repository, build an APK from it and put that on the Play Store instead of the "clean" code on the repository?
You could compile the software yourself. The builds they publish are also reproducible, so any hidden malicious code would almost certainly be noticed in any popular application.
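The core of a reproducible-build check is just comparing your own build against the published one. A rough sketch, assuming you already have both files on disk (the function names and paths are placeholders, not any project's actual tooling):

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large binaries don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def builds_match(local_build_path, published_build_path):
    """True if the two files are byte-for-byte identical (same SHA-256)."""
    return sha256_of_file(local_build_path) == sha256_of_file(published_build_path)
```

In practice a signed APK from the store will differ from your own build in its signature data, so real verification tooling usually compares everything except that part rather than hashing the whole file like this.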
What use is this knowledge through metadata to them? Let's say I have no Facebook account and no other apps by Meta. There are no ads within WhatsApp. What do they gain by having this data about me?
They know your relationships with other people, and could infer things about you, which will be stored on their servers regardless of whether you have a Facebook account. I believe if you search for "shadow accounts" you can read more about that.
The biggest problem is that it uploads your entire contact list and thus social network to Facebook. That alone tells them a lot about who you are, and crucially, also leaks this information about your friends (whether they use it or not).
With contacts disabled it's a pain to use (last time I tried you couldn't add people or see names, but you could still write to people after they contacted you if you didn't mind them just showing up as a phone number).
It still collects metadata - who you text, when, from which WiFi - which reveals a lot. But if both you and your contact use it properly (backups disabled or E2E encrypted), your messaging content doesn't get leaked by default. They could ship a malicious version, and if someone reports your content it gets leaked, of course, but overall it's still much better than e.g. Telegram, which collects all of the above data AND doesn't have useful E2EE (you can enable it but few do, and the crypto is questionable).
It might be E2EE but it's not encrypted on your phone and it's closed source. How do you know they don't send the conversation data to their company? How do you know they don't get the encryption keys to decipher the messages for them?
How do you know they don't get the encryption keys to decipher the messages for them?
My guess is that they just capture keywords before you send it. They don't need to read the contents of the sent conversation when both parties to the conversation are using an app they own. They can detect keywords before sending, log and report them, then send the message encrypted. No need to retain encryption keys since they already extracted what they want.
Other apps may have code published in a repository, but the path from repository into the Play Store onto my phone is not clear. How do I know that they don't add extra tracking code on top during the build and release to the Play Store? With for example a popular alternate app, Signal?
Your address book is uploaded to Facebook servers when you use WhatsApp. And each time you interact, they know with whom, and they link this information with other profiles and users of Meta products.
E2E is not equal to Symmetric Encryption, which is the most private "one way" encryption meaning the user controls the data at the origin, and the messages can't be decrypted by anyone else.
WhatsApp is not the latter, so it is not private. Signal is symmetric, for example.
Care to elaborate? You can't just imply asymmetric encryption can be decrypted by 3rd parties and not explain how.
Also, I don't know exactly how Signal works, but I know that you don't need to share secrets externally to message someone, so how are they exchanging the symmetric keys without using asymmetric encryption to bootstrap it?
This is more of a "how encryption works" question, so I'll just defer to an article I found through Google which explains it more simply than I would:
"When someone sends a message to a contact over an app using the Signal protocol, the app combines the temporary and permanent pairs of public and private keys for both users to create a shared secret key that's used to encrypt and decrypt that message. Since generating this secret key requires access to the users' private keys, it exists only on their two devices. And the Signal protocol's system of temporary keys—which it constantly replenishes for each user—allows it to generate a new shared key after every message."
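The "shared secret key" idea in that quote is Diffie-Hellman key agreement. Below is a toy sketch of a single exchange using plain modular arithmetic. This is NOT the Signal protocol: Signal uses elliptic-curve keys and combines several key pairs per user, and the prime, generator and function names here are illustrative assumptions only, far too weak for real use.

```python
import hashlib
import secrets

# Toy parameters for illustration only; real systems use vetted
# elliptic-curve groups, not a small prime like this.
P = 2**127 - 1  # a Mersenne prime
G = 3

def keypair():
    """Return (private, public). The private value never leaves the device."""
    private = secrets.randbelow(P - 3) + 2  # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_key(my_private, their_public):
    """Derive a symmetric key from my private key and their public key."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

# Only the public halves are exchanged; both sides still arrive at the same key.
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
print(alice_key == bob_key)  # True
```

So the asymmetric key pairs are what let both sides agree on a symmetric key, and that symmetric key then encrypts the actual messages. Someone who only sees the public values going over the wire can't compute the key without one of the private values.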
Not sure what you mean? It's a Meta app. They can easily fingerprint your devices without needing an IP address, so a VPN doesn't matter at all. If you access any Meta services from the same device, they know exactly who you are.