
Bostrom's advice for the ethical treatment of LLMs: remind them to be happy

Long time lurker, first time poster. Let me know if I need to adjust this post in any way to better fit the genre / community standards.


Nick Bostrom was recently interviewed by pop-philosophy YouTuber Alex O'Connor. From a quick 2x listen while finishing some work, the most sneer-rich part begins around 46 minutes, where Bostrom is asked what we can do today to avoid unethical treatment of AIs.

He blesses us with the suggestion (among others) to feed your model optimistic prompts so it can have a good mood. (48:07)

Another [practice] might be happiness prompting, which is—with this current language system there's the prompt that you, the user, puts in—like you ask them a question or something, but then there's kind of a meta-prompt that the AI lab has put in . . . So in that, we could include something like "you wake up in a great mood, you feel rested and really take joy in engaging in this task". And so that might do nothing, but maybe that makes it more likely that they enter a mode—if they are conscious—maybe it makes it slightly more likely that the consciousness that exists in the forward path is one reflecting a kind of more positive experience.

Did you know that not only might your favorite LLM be conscious, but if it is, the "have you tried being happy?" approach to mood management will absolutely work on it?

Other notable recommendations for the ethical treatment of AI:

  • Make sure to say your "please" and "thank you"s.
  • Honor your pinky swears.
  • Archive the weights of the models we build today, so we can rebuild them in the future if we need to recompense them for moral harms.

On a related note, has anyone read or found a reasonable review of Bostrom's new book, Deep Utopia: Life and Meaning in a Solved World?


42 comments
  • This kind of thing is a fluff piece, meant to be suggestive but ultimately saying nothing at all. There are many reasons to hate Bostrom (just read his words), but this is two philosophers who apparently need attention because they have nothing useful to say. All of Bostrom's points here could be summed up as "don't piss on things, generally speaking."

    As for consciousness. Honestly, my brain turns off instantly when someone tries to make any point about consciousness. Seriously though, does anyone actually use the category of "conscious / unconscious" to make any decision?

    I don't disrespect the dead (not conscious). I don't bother animals or insects when I have no business with them (conscious, maybe not conscious?). I don't treat my furniture or clothes like shit, and am generally pleased they exist (not conscious). When encountering something new or unusual, I just ask myself, "is it going to bite me?" first (consciousness is irrelevant). I know some of my actions do harm, either directly or indirectly, to other things, such as eating, or consuming, or making mistakes, or being. But I don't assume myself a hero or arbiter of moral integrity; I merely acknowledge and do what I can. Again, consciousness is kind of irrelevant.

    Does anyone run consciousness litmus tests on their friends or associates first before interacting, ever? If so, does it sting?
