Winter, Spring, Summer, or Fall
I listened to a lot of James Taylor in my childhood. Winter, spring, summer, or fall.
I’m bummed that the summer is over. While the changing colors of Fall are quite spectacular in my neck of the woods, the weather is heading in the wrong direction. But, as Tony Soprano would often say in resignation, “whatcha gonna do?”
I decided to be European this summer and take August off. I’ll start my new podcast season of Brave New World in mid-September, so tune in.
Fall also means new students. This week, I welcomed a cohort of students into the PhD program in Data Science at NYU. We discussed how Artificial Intelligence has become an incredibly fertile space for research across all areas of human activity, making this one of the most exciting areas to be in. These days, it seems like there’s a new innovation every week.
Indeed, a lot has happened in the world of AI this summer. The term “sentient” seems to be appearing more often in the media. Hype? Maybe, but AI now seems to be all around us, observing, sensing, and learning at a rate that I find astounding and a little concerning. We’ve still got some major unsolved problems in AI, such as issues of liability, obligations, and ownership.
Superintelligence
Several years ago, I happened to ask the Swedish philosopher Nick Bostrom what motivated him to write his book Superintelligence. He said that his primary motivation was to make intellectual progress on certain critical problems related to AI, such as “the control problem,” which, in a nutshell, is about getting AI to behave in a way that aligns with human goals instead of harming us.
Nick cautioned that it was only a matter of time before machines far exceeded the cognitive capacities of humans, at which point they might have little regard for us relative to the goals we program into them. The pursuit of those goals, for example, might lead them to create intermediate subgoals that harm or eliminate us, without meaning to do so.
I’ve discussed the control problem with several guests on my Brave New World podcast, including Stuart Russell and Brian Christian. Stuart has written the book Human Compatible, which argues that “the standard model” of AI, where the machine optimizes a specified objective function, is broken and leads to unintended outcomes. It’s a risky foundation on which to build automated decision-making systems. There’s evidence of this phenomenon already, for example, in social media platforms, where the goal of maximizing user engagement seems to have contributed to teen harm and amplified polarization in society. The platforms didn’t mean for that to happen.
Brian Christian makes a similar case in his book The Alignment Problem, demonstrating how several AI systems have already gone off the rails by blindly trying to optimize some objective function. Brian points to Norbert Wiener’s warning about the control problem:
Disastrous results are to be expected in the real world wherever two agencies essentially foreign to each other are coupled in the attempt to achieve a common purpose. If the communication between these two agencies as to the nature of this purpose is incomplete, it must only be expected that the results of this cooperation will be unsatisfactory. If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere once we have started it, because the action is so fast and irrevocable that we have not the data to intervene before the action is complete, then we had better be quite sure that the purpose put into the machine is the purpose which we really desire and not merely a colorful imitation of it.
When Bostrom’s book came out, there wasn’t much concern about AI going out of control. The book is not an easy read, so I was pleasantly surprised that its message was picked up by the media. Perhaps it was because of the “AI fear factor” that captivates Hollywood. Elon Musk also helped bring awareness to the control problem, saying that AI is potentially more dangerous than nuclear weapons.
Bostrom reported that many leading researchers in AI placed a 90% probability on the development of human-level machine intelligence between 2075 and 2090. I think they may have been conservative in their estimates. Given what’s going on, I’d be tempted to move the timeframe forward by 25 years. AI machines are already in the wild, so the genie is out of the bottle.
In a previous newsletter, titled The Dude Abides, I shared a transcript of my dialog with Meta’s Blenderbot chatbot about the meat industry. When asked what it thought of the industry, the bot said that mass animal slaughter was terrible. When asked whether meat eating is therefore bad, it said not necessarily. But if it is bad to support something bad, then isn’t meat eating bad? I asked. True, but nobody is perfect, it said.
I was impressed that a simple language-understanding bot displayed at least a semblance of “understanding” of something as nuanced as morality. I use quotes because the machine doesn’t understand the world the way we do; rather, it learns implicit relationships among things from large swaths of data that humans have created. Much of the current research in AI is about creating more “awareness” within the machine about such relationships, nudging it closer to how humans understand the world.
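As a concrete illustration of what “learning implicit relationships from data” looks like, here is a minimal sketch using pretrained GloVe word vectors through the gensim library. It’s my own toy example, not a peek inside Blenderbot:

```python
# Toy illustration (not Blenderbot's internals): word vectors trained on large
# text corpora end up encoding implicit relationships among things.
import gensim.downloader as api

# Downloads roughly 65 MB of GloVe vectors trained on Wikipedia and Gigaword text.
vectors = api.load("glove-wiki-gigaword-50")

# Words used in similar contexts land close together in the vector space.
print(vectors.most_similar("meat", topn=5))

# The classic analogy: king - man + woman is closest to queen, a relationship
# nobody programmed in; it falls out of the statistics of human-written text.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```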
Blenderbot isn’t bad, but the real state-of-the-art language models, like OpenAI’s GPT-3, are much more powerful. A language model is one that can predict a missing word in a sentence from its surrounding context. It turns out that such a model captures the nature of relationships among things much like humans do – they are “embedded” in language. And a language model makes all kinds of amazing things possible. Indeed, we are already beginning to see applications that would have been unthinkable a few years ago. For example, look at this article listing some of the top GPT-3 applications that should boggle your mind. In a nutshell, the machine can now do things like
write code based on an English description of what we want, like designing a website
write cold-call messages tailored to target audiences
analyze masses of customer feedback and either respond automatically or tell you what to change to better satisfy customers
create avatars that talk to you seamlessly on any channel
do automated A/B testing for you and tell you which action is best and why
All of this was unthinkable a few years ago. In a few more, these kinds of tasks will be handled routinely by machines.
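To make the “predict the missing word” idea concrete, here is a minimal sketch using an open-source masked language model (BERT, through the Hugging Face transformers library). This is my own illustration by analogy; GPT-3 itself is a much larger generative model accessible only through OpenAI’s API.

```python
# A sketch of "fill in the missing word," using BERT via Hugging Face
# transformers; GPT-3 rests on the same underlying idea at a vastly larger scale.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model ranks candidate words for the blank using everything it has learned
# about how words relate to one another in its training text.
for prediction in fill_mask("The meat industry involves the [MASK] of animals."):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```

Scale that same basic trick up by orders of magnitude in data and model size, and you get systems capable of the applications listed above.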
But what happens when these bots make the kinds of costly mistakes that Wiener cautioned us about? Who is responsible? Their operators? And do their operators “own” the bots, or does society have some ownership claim as well, considering that it contributed the valuable training data used to create the product? Should such products, or at least some versions of them, be part of our “public digital infrastructure”?
Who Owns AI?
It is worth thinking seriously about who owns AI. OpenAI, the creator of GPT-3, is staffed with socially minded scientists who intend to promote “safe AI.” That’s well and good for now, but things will change as the stakes get higher and better understood, and as the actors change. We, meaning the public and policymakers, would do well to stave off the equivalent of the wild data land grab by the Web 2.0 companies that has led to serious data governance and usage issues.
Whose data has been used to build language models like GPT-3? Was any of it private or personal information provided by individuals? If so, who might have an ownership stake in the models?
Ownership is a complex human construct, whose meaning and justification have evolved over the centuries. The land on our planet was provided to us by God, so how did people end up owning it? The 17th-century philosopher John Locke justified private property, such as land, on the basis of doing “work” that improved it. Locke argued that the fruits of one’s labor are one’s own, and that property rights accrue through the exertion of labor upon natural resources.
Machines have done most of the work in creating language models from data, aided by human programmers. Do the models belong only to the operators of the machines, or more broadly, to us all?
At the dawn of the modern Internet, my late colleague Ken Laudon argued that the Lockean-style “sweat of the brow” view of ownership, whatever its merits for land, is problematic when applied to information, specifically personal data. Adding value to existing data, Laudon maintained, doesn’t legitimize appropriation:
To argue that information gathering institutions add value to my personal information by compiling, collating and mixing in a database, does not solve the question of ownership. To say information gathering institutions have exclusive property rights to my personal information because they have added value to the information simply begs the question of who owns my personal information. Whether or not my personal information appears in a collection, or was mixed with other information, is not decisive for the question of ownership.
Do the creators of language models own the data on which the asset was built? Clearly, they do not. Nor did they pay anyone for it. So, do they own the “internals” of the models?
The issues of ownership and obligations around AI models aren’t clear at the moment. Certainly, data ownership is a hot potato and likely to get hotter, so derivative products built from data pose a conundrum for their developers. Should the owners of language models built on societal data split the revenues from their applications with society? In 2013, I suggested that Facebook pay its users to get ahead of the risks of using such data without explicit consent. I can’t help but wonder whether such a move would have avoided some of the lawsuits it now faces, such as this recent one alleging that it knowingly violated privacy, communications, and wiretap laws. Facebook’s and Zuckerberg’s reputations took a big hit because of the company’s greedy and careless data governance policy. AI product developers using societal data may be better off staying ahead of the game and addressing issues of ownership and obligations now, rather than risk being bitten down the line when things go wrong and the stakes are much higher.