My Latest Podcast
My latest guest on Brave New World was Raghu Sundaram, until recently the Dean of the Stern School of Business at NYU. Under Raghu’s leadership, Stern moved up in the rankings from #13 in 2018 to #7 in 2024. That is a phenomenal achievement.
Raghu is now Senior Vice Chancellor and Head of Global Strategy for NYU, leading NYU’s rise as a global player in education, a path that was initiated by an earlier president, John Sexton. John was an early guest on Brave New World.
Raghu and I had a freewheeling conversation about higher education in America and around the globe, and what is required to create a great university. So, check it out.
Truth
I’m impressed with ChatGPT and Notebook LM by Google. I gave Notebook my podcast transcript with Daniel Kahneman and asked it to create a new conversation. It created a 10-minute summary, featuring a female and male voice, in which the woman explained the concepts of bias and noise to the man, whose role was to ask broader questions, such as the implications of human inconsistency in decision-making and whether AI could improve it.
I thought the conversation was very engaging and incredibly polished, with a sprinkling of “ahas” and “ums” to make it human. But there were some serious errors that I almost missed because I was so enthralled by the conversation. The average listener would not catch them. This is a serious issue, especially since the machine sounds so authoritative and convincing.
It made me ask a broader question: has truth become a casualty on the march towards intelligent machines?
Truth is central in human affairs and has also served as the underpinning of AI. Over my 45 years in the field, however, I have seen three “paradigm shifts” in AI, where the meaning of truth has changed with each shift, resulting in different kinds of machines. Ironically, even as machines have become smarter and more human-like, truth seems to have become an afterthought in the latest shift towards large language models (LLMs) on which applications such as ChatGPT and Notebook LM are based.
What does truth really mean? We use the word all the time, and yet, it has no universal definition. In modern English, the definition we use seems circular: truth is something that is factual, which, in turn, is something that is true. Despite the circularity, however, we have a good idea what the word means. In Hindi, the word for truth is Satya, which also stands for essence, and can have a spiritual connotation as well. In Chinese, the definition of truth is even more complex, as in “the correct reflection of objective things and their laws in the human mind.” Note the role of the observer, which contrasts with the English and Hindi definitions. Michael Pillsbury, author of The Hundred Year Marathon, remarked how American politicians were often frustrated by the Chinese translations of their statements, which were far afield from their intentions. The word “no,” for example, is anathema in Chinese, so a phrase like “no, that isn’t possible” might get translated into something like “additional views will be considered.”
Perhaps ChatGPT will make translations better. At the very least, it can provide a real-time comparison to keep the human translator as close as possible to the intent of the speaker. Who would have imagined such a world even a few years ago, in which AI keeps humans honest?
Truth in Artificial Intelligence
What does truth have to do with Artificial Intelligence?
Even though truth and intelligence are distinct, truth has always been central to AI. Interestingly, however, its meaning has changed over the last sixty years as AI has come closer to emulating human intelligence. The reason for these changes is that what counts as “knowledge” has changed.
When I got into the field in the late 70s, logic was a major part of the AI paradigm for representing knowledge. Truth plays a big role in logic. Just as the Newtonian calculus was invented for calculating the motion of objects, a similar calculus had been invented by philosophers and logicians for calculating the truth of statements. For a concise history of logic, I’d recommend the entry on truth in the Stanford Encyclopedia of Philosophy. The roots of logic go back to Aristotle’s syllogism, which connects premises to conclusions, such as “All men are mortal; Socrates is a man; therefore, Socrates is mortal.” Sherlock Holmes used such rules of logic all the time. He always started with a careful observation of all the facts of a case to determine whether a conjecture could be true. If additional data was needed, he acquired it.
The AI systems of the 70s and 80s were mostly based on variations of this type of logical reasoning. Common application areas were medical diagnosis, engineering design, and planning. The knowledge in such systems was acquired from existing theory or from human experts who had learned useful patterns for problem-solving through experience. The acquired knowledge was the equivalent of axioms, which were matched against the facts of a case in a Sherlock Holmes style of reasoning to ensure that the conclusions were “true.” Explanation consisted of tracing through the chain of reasoning between the observed data – such as the patient’s symptoms in the case of medical diagnosis – and the conclusions derived from applying knowledge to data – such as the diseases that could be causing the observations.
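To make this style of reasoning concrete, here is a minimal sketch in Python, my own illustration rather than code from any actual system of that era: rules pair premises with a conclusion, they are matched against the facts of a case, and the explanation is simply the chain of rules that fired.

```python
# A minimal forward-chaining rule engine in the spirit of 1970s-80s expert systems.
# Rules pair a set of premises with a conclusion; reasoning matches rules against
# known facts and records the chain of inferences as the "explanation."

RULES = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "shortness_of_breath"}, "order_chest_xray"),
    ({"man"}, "mortal"),  # Aristotle's syllogism, expressed as a rule
]

def forward_chain(facts):
    facts = set(facts)
    explanation = []  # the chain of reasoning, kept for tracing
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                explanation.append(f"{sorted(premises)} -> {conclusion}")
                changed = True
    return facts, explanation

if __name__ == "__main__":
    conclusions, trace = forward_chain({"fever", "cough", "shortness_of_breath"})
    print("Conclusions:", conclusions)
    print("Explanation:")
    for step in trace:
        print("  ", step)
```

The trace printed at the end is the explanation in the sense described above: every conclusion can be followed back through the rules and facts that produced it.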
However, expressing knowledge through logic and its notions of truth was very difficult. For one thing, it is impossibly difficult for humans to specify everything they know about a subject, let alone express it in logic or rules. Much of our knowledge is also tacit, and we seem to invoke it on demand in all kinds of creative ways. We know a lot more than we are able to articulate. In my last newsletter on The Damodaran Bot, for example, I discussed how the valuation guru Aswath Damodaran invokes all kinds of perspectives for valuing businesses without even being aware of it. Human reasoning is much too complex and heterogeneous to be captured by the top-down specification of relationships required by this paradigm.
The center of gravity of AI began to tilt towards machine learning in the late 80s and 90s, with the maturing of database technology, the emergence of the Internet, and the increasing abundance of observational and transactional data. The machine learning paradigm provided an easier way for the machine to acquire empirical knowledge about the world – from data, without cumbersome knowledge engineering. In the process, AI became much more statistical and adopted future predictive ability as the primary criterion for something to count as knowledge. Empirical data provided the “truth” that supervised the learning process.
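As a rough illustration of the shift, here is a generic sketch using scikit-learn, not any particular system: the “truth” now lives in the data that supervises learning, and knowledge is judged by how well the model predicts cases it has not yet seen.

```python
# Supervised learning in miniature: knowledge is judged by how well it
# predicts previously unseen cases, not by tracing a chain of logic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-in for observational or transactional data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# The held-out test set plays the role of "truth": it checks the claim
# that the model has actually learned something about the world.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Out-of-sample accuracy:", accuracy_score(y_test, model.predict(X_test)))
```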
This model of future prediction as the basis for knowledge has propelled AI to the kinds of capability that have emerged in Large Language Models (LLMs), on which systems like ChatGPT and Notebook LM are based. LLMs have learned to predict the next word in a sequence very well, trained on the vast amounts of freely available text on the Internet. What has guided them to converse so well is the collective human expression on the Internet, which serves as the truth from which they have learned to talk like us, even though much of it isn’t true according to the English definition of truth. Fact and truth have diverged. What has surprised everyone is the knowledge the LLMs have acquired in the process of learning to predict the next word, knowledge that can be easily repurposed for all kinds of things they were never designed or trained to do, such as explaining why a joke is funny, creating, summarizing, or interpreting a document or image, answering questions, or turning text into podcasts.
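A toy version of next-word prediction, nothing like the scale or architecture of a real LLM but the same objective in miniature, can be built from simple word-pair counts over a small corpus:

```python
# Toy next-word predictor: count word bigrams and predict the most frequent
# continuation. Real LLMs use neural networks trained on vast corpora, but the
# training objective (predict the next token) is the same in spirit.
from collections import Counter, defaultdict

corpus = (
    "truth is central in human affairs . "
    "truth is the underpinning of ai . "
    "truth is something that is factual ."
).split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    candidates = bigrams.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("truth"))  # -> "is"
```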
I call this capability “general intelligence.” It refers to an integrated set of essential mental skills, including verbal, spatial, numerical, and mechanical abilities, along with reasoning and common sense, that underpin performance across all mental tasks. There are no boundaries between expert knowledge and common sense, which blend seamlessly into thinking about anything.
General Intelligence: Machines With No Purpose
Unlike all previous machines that were designed with a purpose, modern AI has been designed without a purpose, other than to carry out a human-like conversation about anything. In my conversation with my podcast guest Sam Bowman last year, he explained how serendipity led to the emergence of general intelligence. Predicting the next word turned out to be at just the right level of difficulty: doing it well conversationally forced the machine to learn a large number of things about the world in general. In other words, a sufficiently deep understanding of the world, including common sense, seems necessary for language fluency.
Does the level of general intelligence in AI pass the famous Turing test? In this test, we must determine whether a response to a question is coming from a human or a machine. Some researchers question whether AI has passed this test, but it is hard to ignore the mounting evidence of its capability. At a recent conference, the operator of an investment management firm, whose analysts make long-term investment recommendations, reported that the analyses produced by ChatGPT for valuing companies were indistinguishable from those of its human analysts.
Does the business operator still need the analysts? Or should he trust the machine?
With a human analyst, there is some expectation that every effort was made to be truthful. Should we expect the same from the AI built on LLMs? I would think not, even though its outputs might be superior to those of most human analysts and lead to better results. This is a vexing problem for business leaders.
The larger problem is that machine intelligence and truth have parted ways on the march towards human-like general intelligence. The AI guru and 2024 Physics Nobel Laureate Geoff Hinton described the conundrum with some humor, equating AI with an alien species that has descended on Earth, but “we’re having a hard time taking it in because they speak such good English.” We are never sure whether they are telling us the truth.
Is there a way out of this conundrum?
One obvious strategy for increasing truthfulness is to create applications on top of the LLM that are explicitly designed to be truthful. Researchers are pursuing approaches that try to get the machine to reflect on the truthfulness of its answers and check them in multiple ways before sharing them. Another approach being pursued is to convert language into logic and “prove” that the logic yields a truthful answer before sharing the output with the user. Such approaches might work well enough to be practical, but the bottom line is that there are no guarantees of truth when we build on a foundation that isn’t designed to be truthful in the factual sense.
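A rough sketch of the first approach might look like the following, where ask_llm is a hypothetical stand-in for a call to whatever model you use, not a real library function: the model answers, then audits its own answer several times, and the answer is shared only if every check passes.

```python
# A sketch of a "reflect before you share" wrapper around an LLM.
# `ask_llm` is a hypothetical placeholder; swap in a real client
# (OpenAI, Anthropic, a local model, etc.) to use it for real.

def ask_llm(prompt: str) -> str:
    # Canned responses so the sketch runs end to end; replace with a real model call.
    return "NO" if "factual errors" in prompt else "Paris is the capital of France."

def answer_with_self_check(question: str, n_checks: int = 3) -> str:
    answer = ask_llm(f"Answer factually and concisely: {question}")

    # Ask the model to audit its own answer several times, independently.
    verdicts = []
    for _ in range(n_checks):
        verdict = ask_llm(
            "Does the following answer contain any factual errors? "
            f"Reply YES or NO only.\nQuestion: {question}\nAnswer: {answer}"
        )
        verdicts.append(verdict.strip().upper().startswith("NO"))

    # Share the answer only if every self-check passed; otherwise flag it.
    if all(verdicts):
        return answer
    return "The model could not verify its own answer; treat it with caution."

print(answer_with_self_check("What is the capital of France?"))
```

Even a wrapper like this only raises the odds of a truthful output; it cannot guarantee one, for exactly the reason stated above.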
It is ironic that as AI has gotten smarter, truth has become an afterthought.