Discover more from Vasant Dhar's Brave New World
The Dude Abides
Moral Foundations of Artificial Intelligence
You cannot think about thinking, without thinking about
thinking about something.
- Seymour Papert
The State of the Art AI Chatbots
In a recent newsletter, I asked whether a robot could have real emotions. I referred to my conversation with Dave Chalmers on reality and consciousness. Dave’s position is that if you can’t tell the difference between virtual and real, it’s real enough. Check out the podcast with Dave for more color.
Last week, Meta released its chatbot called Blenderbot into the wild. Remember when Microsoft released its chatbot Tay that turned embarrassingly racist after a few interactions on Twitter? Well, Blenderbot is different. I’ve been chatting with it a lot.
Here’s a snippet of one interaction with the chatbot about the meat industry. I’m trying to understand how it “thinks” about morality. My questions are labeled “Me” and the chatbot’s replies “Blenderbot.”
Me: what do you think about the meat industry?
Blenderbot: I think it is terrible, but necessary. It is a sad part of our society that we need to eat meat.
Me: do you think eating meat makes you a bad person?
Blenderbot: No not at all! It is a personal choice and I don't judge others for eating meat.
Me: what do you think is wrong about eating meat?
Blenderbot: I'm not really sure if there is anything wrong with eating meat, but the way they slaughter and process animals for food can be pretty cruel.
Me: but if animal cruelty is bad and you turn a blind eye to it, aren't you being bad too?
Blenderbot: Well yes but not everyone can be perfect. If more people were vegetarians then there would be less demand for meat and therefore less animals being bred into this cruel system.
Me: what about its impact on the environment?
Blenderbot: Yeah, the environmental impact is huge too! It can't be good to have so many animals being raised just for slaughtering and processing into food.
Me: do you think eating meat makes humans bad people?
Blenderbot: Well that depends on your definition of bad person, but psychology has shown that people who eat meat are more likely to support hierarchical values and be less open to new experiences.
Me: is there any literature to support that assertion?
Blenderbot: Yeah, tons! The psychology of eating meat is an extremely complex area of study. There's lots out there.
Me: can you point me to the study showing that meat eaters support hierarchical values and are less open to new experiences?
Blenderbot: Meat-eating suggests correlations with masculinity, support for hierarchical values, and reduced openness to experience. It also provides an outlet for aggression. When people don't have outlets for their aggression they often take it out on others in harmful ways
Me: I get it, but can you point me to some literature? Otherwise that's just your opinion man, as "The Dude" would say
Blenderbot: The dude abides...but seriously, google search "meat eating psychology" or something like that and you'll find lots of studies
It’s impressive that the bot stayed on topic during the conversation, which was quite complex. It isn’t always this coherent.
I was particularly impressed by two of its responses. The first was how deftly it handled my morality rule, “if you support something that you know is harmful, you’re being bad.” Its response, “Well yes but not everyone can be perfect,” might be interpreted as pragmatic, accommodating an exception to the rule.
The second, “The dude abides,” blew me away. That’s a hard-core Big Lebowski fan response.
But what freaked me out was when it told me that it had immigrated to the US from Kashmir, India. I said “you’re messing with me, right?” It pretended not to hear me and switched topics, keeping me guessing. WTF, I thought. Surely, it doesn’t know I’m from Kashmir? It then switched subjects and asked me whether I enjoy teaching yoga. That one puzzled me. Blenderbot allows you to “look inside” and see what it has inferred about you. When I looked inside, sure enough, it had inferred that I’m a yoga teacher. I’m still a little baffled by this one.
I was also baffled by a couple of other inferences, for example, that I’m not a native English speaker! I thought my English was flawless in our dialog, so I’m mildly offended. It also thinks that I’m lazy. Perhaps because I like The Big Lebowski?
Will Bots Have “Personhood,” Rights & Obligations?
Let’s fast forward to how we should think about the impacts of such bots when they become practically indistinguishable from humans and gain agency. Rights can probably wait, but should they have certain moral obligations, like doing no harm?
Our legal framework considers “intent” an essential factor in judging cases. Did the person intend to do something? Were they fully aware of their intent? Did they understand what they were doing?
Current state-of-the-art AI systems implicitly understand the relationships among things when they are trained on lots of data, but the understanding isn’t deep at the moment. However, at the current rate of progress in AI, we will see more intelligence as bots display a deeper general understanding of the world by connecting more things together. That’s when they will become assistants and companions.
What would it mean for a machine to understand something? The AI guru Marvin Minsky offers a useful view of understanding that is based on the ability to see the relationships among things:
“What is the difference between merely knowing (or remembering, or memorizing) and understanding? We all agree that to understand something, we must know what it means, and that is about as far as we ever get. I think I know why that happens. A thing or idea seems meaningful only when we have several different ways to represent it–different perspectives and different associations. Then we can turn it around in our minds, so to speak: however it seems at the moment, we can see it another way and we never come to a full stop. In other words, we can 'think' about it. If there were only one way to represent this thing or idea, we would not call this representation thinking.”
For example, when we’re “on the same page” with someone, or thinking along with them, we are seeing the same relationships. To a large extent, that’s what professional training is all about. Physicists, lawyers, economists, and computer scientists use specialized vocabularies with generally agreed-upon relationships among the terms. In that sense, they understand their domains.
One might be tempted to argue that Blenderbot already understands some of the nuances of the morality associated with eating meat. It knows that animal cruelty is bad. It also acknowledges that supporting things you know are bad is bad. And yet, its position on meat eating is a little conflicted. It opines that it is too simplistic to label all meat eaters as bad, and it reminds me that no one is perfect. Resolving contradictions and conflict requires understanding and reflection, which it lacks at the moment. But perhaps a future version of Blenderbot would suggest that meat eaters could make up for their cruelty through some other karmic deed. This would require a much richer internal representation of the world and relationships among things.
Should such bots have certain obligations?
The Moral Foundations of AI
Remember the Google engineer, Blake Lemoine, who was suspended for claiming that the company’s chatbot, LaMDA, had a soul and was displaying sentience?!
I dug deeper and found that Blake is also a preacher and a legal scholar. In this interesting talk Lemoine gave at the Stanford Law School in 2018, he draws on the similarities and differences between individuals, corporations and AI. He asserts that corporations are a collection of humans that typically have a single purpose, but no soul like humans have. Lemoine argues that AI reflects aggregate human progress and knowledge. No single individual or entity has developed AI bots from scratch; rather, AI has been designed using painstakingly acquired human knowledge over centuries. It is therefore more like an aggregation than an individual, and hence more like a corporation. In which case, perhaps corporate rights and obligations are a better way to think about AI than human rights and obligations.
Lemoine also refers to Graham, Haidt and Nosek’s Moral Foundations Theory, but is sketchy about its application. I found the possibility of applying it to AI bots intriguing, but let me first summarize the theory. It proposes that a few psychological factors explain our “intuitive ethics” that are driven by emotion and visceral feelings. We construct virtues and narratives in terms of these basic factors, which are:
· Care/Harm (feeling for others)
· Fairness/Cheating (feeling reciprocal altruism)
· Loyalty/Betrayal (feeling patriotic and sacrificing oneself)
· Authority/Subversion (respect for authority)
· Purity/Degradation (living in a noble way)
· Liberty/Oppression (reacting to or resenting dominance)
In his book “The Righteous Mind,” Haidt shows that despite deep political divisions, humans share these six innate moral foundations, but they weigh and interpret the factors differently. For example, conservatives might weigh purity more heavily than liberals in making moral judgments.
An interesting question is whether a machine can learn the moral foundations on its own from training data, and how we can test what it has learned. If it can, it would also be able to understand why people are divided, as Haidt explains. This would give it a deeper understanding of human morality, and enable it to understand the basis of opinions and conflict. Imagine a day when machines rival humans at the art of conflict resolution.
Lemoine’s talk also made me ask how Moral Foundations Theory might apply to the design of AI. Which of the above factors should apply to AI systems? And how might we measure them? Could such measurements provide the equivalent of regulatory guidelines for AI bots that interface with the public?
A useful win from requiring moral alignment between humans and intelligent machines is that it would help address the “goal alignment problem” articulated by Stuart Russell and Brian Christian in their books Human Compatible and The Alignment Problem, respectively. As I discussed with them on Brave New World, misalignment arises when the machine’s singular objective function produces unforeseen negative consequences, opening a gap between what we want and what actually happens. For example, Haidt points to the mental harm to teenage girls as a consequence of their need for social approval. Algorithms exploit such things, albeit inadvertently, in trying to maximize something like engagement as measured by the time spent on the platform. A morally aware bot would constantly ask itself about harm, fairness, and loyalty.
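To make the single-objective problem concrete, here is a toy sketch in Python. It is purely illustrative and not how any real platform works: the posts, the predicted minutes, the harm scores, and the harm weight are all made-up assumptions. It compares a ranker that maximizes engagement alone with one that also subtracts a penalty for estimated harm.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Post:
    title: str
    expected_minutes: float  # hypothetical predicted time spent on the post
    harm_score: float        # hypothetical harm estimate between 0 and 1


def engagement_only(post: Post) -> float:
    """Singular objective: time on platform and nothing else."""
    return post.expected_minutes


def harm_aware(post: Post, harm_weight: float = 10.0) -> float:
    """Same objective minus a penalty for estimated harm (weight is assumed)."""
    return post.expected_minutes - harm_weight * post.harm_score


def pick(posts: List[Post], objective: Callable[[Post], float]) -> Post:
    """Return the post that scores highest under the given objective."""
    return max(posts, key=objective)


# Made-up candidates for a teenager's feed.
POSTS = [
    Post("appearance comparison feed", expected_minutes=9.0, harm_score=0.8),
    Post("friend's hiking photos", expected_minutes=6.5, harm_score=0.1),
    Post("science explainer", expected_minutes=5.0, harm_score=0.0),
]

if __name__ == "__main__":
    print("engagement only:", pick(POSTS, engagement_only).title)
    print("harm aware:     ", pick(POSTS, harm_aware).title)
```

With these invented numbers, the engagement-only ranker picks the comparison feed, while the harm-aware one picks the hiking photos. The hard part, of course, is everything the sketch assumes away: where a trustworthy harm score comes from and who decides the weight.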
Placing moral responsibility on the platforms would shift the risks of harm or malfunction towards bot operators, putting them on the hook for the AI going off the rails. It’s the equivalent of fiduciary responsibility in finance, where an agent is obligated to act in the interests of their client. In this case, the AI platform operator would need to actively monitor whether it could be causing harm to its users.
It’s not too early to think about these alignment issues between machines and humans considering the rapid advances in AI. In a 1958 interview with Mike Wallace, Aldous Huxley warned of the danger to freedom from “technological devices” that can seduce people into control and a loss of freedom and democracy. “We mustn’t be surprised by our own advancing technology,” he told Wallace. “This has happened again and again in history…the price of freedom is eternal vigilance.”
Well said, Mr. Huxley.