You've just invented a really shitty GPU, essentially.
Only some work is easily parallelized (and actually benefits from it), graphics being the classic example. So people have essentially done what you've described, except the cores aren't 8086s but are specialised for graphics math such as matrix and vector operations.
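(A tiny sketch in plain C of why that kind of math parallelizes so well, with a made-up function name: every output element is computed independently, so each one could go to a different core or GPU thread.)

```c
#include <stddef.h>

/* Scale-and-add two vectors: out[i] = a * x[i] + y[i], the kind of
 * per-element math graphics does constantly. No iteration reads another
 * iteration's result, so the work can be split across as many cores
 * (or GPU threads) as you like. */
void scale_and_add(float a, const float *x, const float *y,
                   float *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a * x[i] + y[i];
    }
}
```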
As for more standard workloads, we do something similar with server processors that use hundreds of weaker cores. But not all work benefits from the kind of parallelism that computer graphics does (or AI, blockchain, physics simulations, and so on), and some workloads actually run better with less parallelism.
So the short answer is that we already do what you describe; it's just that sometimes it's still faster to have a few complex cores than thousands of simple ones.
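A rough way to see why: Amdahl's law (the standard back-of-envelope formula, nothing specific to any one chip) says the serial part of a job caps your speedup no matter how many cores you throw at it. A little C sketch:

```c
#include <stdio.h>

/* Amdahl's law: if a fraction `serial` of the work can't be parallelized,
 * the best possible speedup on n cores is 1 / (serial + (1 - serial) / n). */
static double amdahl_speedup(double serial, double n_cores) {
    return 1.0 / (serial + (1.0 - serial) / n_cores);
}

int main(void) {
    /* A job that is 10% serial tops out at just under 10x, even with
     * vastly more cores. */
    printf("100 cores:    %.2fx\n", amdahl_speedup(0.10, 100));
    printf("10,000 cores: %.2fx\n", amdahl_speedup(0.10, 10000));
    return 0;
}
```

With 10% of the work stuck being serial, even 10,000 cores can't get you past a ~10x speedup, which is why a handful of fast cores often wins.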
How do you get data to all those cores? Think about it like this: imagine a restaurant with thousands of one-person tables packed as tightly as possible. How do you get customers to and from the tables, how do you get them their food, and how do you clear their empty plates?
In a modern processor design, a large fraction of the chip area is dedicated to solving exactly this problem: getting data onto the chip, caching it, and getting results back out to the rest of the computer.
If your chip is just a massive grid of identical cores, you're going to have serious trouble keeping them all fed with data to work on. You'd likely end up with a lot of the cores spending most of their time doing nothing but passing data along to the inner cores for processing.
If you're doing that, the advantage of having all those identical cores goes away, and you're better off replacing some of them with specialized hardware for the job (I/O ports and large, fast caches). Keep following that line of reasoning and you end up at the designs that are already out there!
Modern processors have a lot of optimizations that go beyond just doing math quickly. For example, branch prediction lets them guess which way a branch will go and start working on the likely path speculatively, before the real answer is even known.
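A rough illustration in C (the exact effect depends entirely on the CPU, so take it as a tendency rather than a promise): the branch below is cheap when its outcome is predictable and much more expensive when the data is in random order.

```c
#include <stddef.h>

/* Sums only the elements >= 128. The branch in the loop is the interesting
 * part: if `data` is sorted, the branch predictor guesses right almost every
 * time and the loop typically runs much faster than with the same values in
 * random order, even though the arithmetic is identical. */
long long sum_big_values(const int *data, size_t n) {
    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= 128) {   /* predictable if data is sorted */
            sum += data[i];
        }
    }
    return sum;
}
```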
Also, 8086s could be made smaller to go faster and run more efficiently, but they’d still run into thermal limits. What you’re describing is basically the strategy behind the gigahertz race that happened in the early 2000s. Eventually just going smaller and faster wasn’t enough because you couldn’t get the heat out fast enough.
Finally, writing code for thousands of parallel processors would be a bear, and the memory limitations of a 16-bit processor would add a lot of overhead when trying to meet the memory requirements of modern software.
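To put a number on that: the 8086 builds a 20-bit physical address out of a 16-bit segment and a 16-bit offset, so it tops out at 1 MB of address space. A little C sketch of that arithmetic (the helper name is made up for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* 8086 real-mode addressing: physical = segment * 16 + offset.
 * The result is 20 bits, so the whole address space is 2^20 bytes = 1 MB. */
static uint32_t physical_address(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

int main(void) {
    /* The largest address you can even form is 0xFFFF:0xFFFF = 0x10FFEF,
     * barely past 1 MB, while a single modern program routinely wants
     * gigabytes. */
    printf("0xFFFF:0xFFFF -> 0x%X\n", physical_address(0xFFFF, 0xFFFF));
    return 0;
}
```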
Edit: sorry, forgot what sub I was in. Basically, a 16-bit processor just can't address enough memory, and modern processors have a lot of clever tricks that go beyond just being smaller.
Also, even when you can in theory parallelize something, it's not always easy to figure out how to do it.
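For example, these two loops in C do roughly the same amount of work, but only the first can be split across cores as written; in the second, each step depends on the previous one, so parallelizing it means changing the algorithm (e.g. a parallel prefix sum), not just adding cores (function names are just illustrative):

```c
#include <stddef.h>

/* Easy to parallelize: each output depends only on its own input. */
void square_all(const int *in, int *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * in[i];
    }
}

/* Hard to parallelize as written: out[i] depends on out[i-1], so the
 * iterations can't simply be handed out to different cores. */
void running_total(const int *in, long long *out, size_t n) {
    long long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += in[i];
        out[i] = sum;
    }
}
```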
The other reason we don't make chips with thousands of tiny 8086s is that, if we're making new chips anyway, why not incorporate the ideas we've learned in the last 40 years instead of just copying old tech? Not using the newer designs would kinda be a waste.
And if you do incorporate those new ideas to make a chip with thousands of execution units on it, guess what: that chip already exists, and it's called a GPU.