The sword's power changes over time and as it racks up more kills. Soon, it gains a +1 to attack and damage. Then, it can become wreathed in flame as a bonus action. Then, it grants advantage on checks made to locate creatures. Then, its base power inverts and it can only kill non-evil creatures.
Do not tell the player about that last one. Insist to the player that it works exactly as you first described. The Paladin can kill innocent shopkeepers and little old ladies, but cannot kill this assassin working for the BBEG.
Will he question his own stab-first ask-later methods? Or will he turn evil without even noticing?
I personally hate this kind of twist. If you need to actively lie to your player, not just mislead with some clever wordplay, it always feels like you’re breaking trust.
The playstyle is stabbing random townsfolk on the off chance you kill a bad guy. Fuck that playstyle.
And for a lore reason, just have the sword be influenced by the morality of the wielder's actions. Stabbing random townsfolk is evil. The sword turns evil.
If you know that the sword can't hurt people that aren't evil, then stabbing randoms is by definition not evil because you can't hurt them.
I mean, yeah, it's hard metagaming and lots of folks wouldn't want this at their table, so chalk it up as a learning moment as a DM and figure out a good way to take it from them. The obvious one in this case is that the sword damages evil creatures rather than destroying them outright. Have our little metagaming pally stab a guy twice his level and get wrecked so he rethinks the practice. "Welp, you've stabbed the BBEG. He's stripped you and the party of your possessions and locked you in a dungeon. Boy, you're lucky he had somewhere to be or you'd be dead." This is only a clever metagame if you're in a video game where you know the level of the zone you're in and you know the full meta.
And even then, a simple "hey, we're an RP table and we try to keep meta to a minimum, so please reconsider this practice" or "hey, before you go stabbing everyone, do you know what level each of the characters is? Something to think about..." is the polite thing to do before you ruin their game based on the DM's mistake.
Attacking people is still upsetting even if they don't get hurt. There are many ways to harass people without hurting them, and I'd consider surprise Schrödinger shanking one of them. I don't know if I'd call that "evil" per se, but I'd definitely call it an asshole move.
Personally, as a DM, I wouldn't make the sword evil, but I might have it eventually repel the grasp of a Paladin who used it so flippantly rather than as a warrior of good.
First off, a sword that only destroys evil doesn't mean insta-kill. It just means you only deal a fatal blow if they're evil. You can just rule that it still damages good characters, so you lose basically all of your allies due to constant wounding.
Second, this is consequentialism vs. deontology. Is the morality of an act decided by the outcome or by the act itself? You have the consequentialist view that the action is okay because you know it can only kill an evil person. I argue that the sword's properties can change without you knowing, so this knowledge is just belief. Since the consequences cannot be truly known before the action takes place, the morality is decided by the action itself (deontology). Stabbing people at the start of every conversation is evil.
If I were doing this, I wouldn't describe the effects exactly (except the +1). At first, I would just tell them it misses every time they attack a non-evil character, and describe it being wreathed in flames. Then, after the swap, I'd still just tell them who it hits or misses. Both times, they have to figure out for themselves what the effect is (or that it changed).