I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.
This one is about why it was a mistake to call 1024 bytes a kilobyte. It's about a 20min read so thank you very much in advance if you find the time to read it.
This has been my pet rant for a long time, but I usually explain it .. almost exactly the other way around to you.
You can essentially start off with nothing using binary prefixes. IBM's first magnetic harddrive (the IBM 350 - you've probably seen it in the famous "forklifting it into a plane" photo) stored 5 million characters. Not 5*1024*1024 characters, 5,000,000 characters. This isn't some consumer-era marketing trick - this is 1956, when companies were paying half a million dollars a year (2023-inflated-adjusted) to lease a computer. I keep getting told this is some modern trick - doesn't it blow your mind to realise hdd manufacturers have been using base10 for nearly 70 years? Line-speed was always a lie base 10, where 1200 baud laughs at your 2^n fetish (and for that matter, baud comes from telegraphs, and was defined before computers existed), 100Mbit ethernet runs on a 25MHz clock, and speaking of clocks - kHz, MHz, MT/s, GT/s etc are always specified in base 10. For some reason no-one asks how we got 3GHz in between 2 & 4GHz CPUs.
As you say, memory is the trouble-maker. RAM has two interesting properties for this discussion. One is that it heavily favours binary-prefixed "round numbers", traditionally because no-one wanted RAM with un-used addresses because it made address decoding nightmarish (tl;dr; when 8k of RAM was usually 8x1k chips, you'd use the first 3 bits of the address to select the chip, and the other 10 bits as the address on the chip - if chips didn't use their entire address space you'd need to actually calculate the address map, and this calculation would have to run multiples of times faster than the cpu itself) . The second, is that RAM was the first place non-CSy types saw numbers big enough for k to start becoming useful. So for the entire generation that started on microcomputers rather than big iron, memory-flavoured-k were the first k they ever tasted.
I mean, hands up who had a computer with 8-64k of RAM and a cassette deck. You didn't measure the size of your stored program in kB, but in seconds of tape.
This shortcut than leaked into filesystems purely as an implementation detail - reading disk blocks into memory is much easier if you're putting square pegs into square holes. So disk sectors are specified in binary sizes to enable them to fit efficiently into memory regions/pages. For example, CP/M has a 128-byte disk buffer between 0x080 and 0x100 - and its filesystem uses 128-byte sectors. Not a coincidence.
This is where we start getting into stuff like floppy disk sizes being utter madness. 360k & 720k were 720 and 1440 512-byte sectors. When they doubled up again, we doubled 2800 512-byte sectors gave us 1440k - and because nothing is ever allowed to make sense (or because 1.40625M looks stupid), we used base10 to call this 1.44M.
So it's never been that computers used 1024-shaped-k's. It should be a simple story of "everything uses 1,000s, except memory because reasons". But once we started dividing base10-flavoured storage devices into base2-flavoured sectors, we lost any hope of this ever looking logical.
aside: the little-k thing. SI has a beautifully simple rule, capital letters for prefixes >1, small letters for prefixes <1. So this disambiguates between a millivolts (mV) and megavolts (MV).
But, and there's always a but. The kilogram was the first SI unit, before they'd really thought it through. So we got both a lower-case k breaking such a beautifully simple rule, and the kilogram as a base unit instead of a gram. The Kilogram is metric's "screw it, we'll do it live".
Luckily this is almost a non-issue in computing as a fraction of a bit never shows up in practice. But! If you had a system that took 1000 seconds to transfer one bit, you could call that a millibit per second, or mbps, and really mess things up.
Yeah, base ten really screws around with programming. You specifically have to use a decimal type if you really want to use it (for like finance or something), but it's much slower.