Last December, I hosted a Talk at Google with Max Tegmark, a renowned scientific communicator and physicist at MIT who has accepted donations from Elon Musk to investigate the existential risk from advanced artificial intelligence.
The video recording is here. In it, Max discusses his thoughts on the fundamental nature of reality and what it means to be human in the age of AI.
A key learning is the ‘dual-use’ nature of AI systems – the same algorithms that we use to develop new image classification techniques or improve our video games can also be used for nefarious reasons (such as mass surveillance). It’s a challenging issue that should be discussed and managed actively within the research community.
*This talk was presented for Google’s Singularity Network.*
Max Tegmark is a renowned scientific communicator and cosmologist, and has accepted donations from Elon Musk to investigate the existential risk from advanced artificial intelligence. He was elected Fellow of the American Physical Society in 2012, won Science Magazine’s “Breakthrough of the Year” in 2003, and has over 200 publications, nine of which have been cited more than 500 times. He is also scientific director of the Foundational Questions Institute, wrote the bestseller Our Mathematical Universe, and is a Professor of Physics at MIT.
The crux of the AI Alignment (safety) movement is based upon two ideas:
- The orthogonality of intelligence and motivation in artificially intelligent systems
- I.e. “Intelligent” systems do not necessarily have human-like motivations
- Instrumental convergence within autonomously-pursued goal systems
- I.e. there exists an intermediary set of goals that any artificially intelligent system would almost certainly pursue, no matter their explicitly set “final” goal. These typically include objectives like goal-content integrity and cognitive enhancement
These principles were first proposed by Nick Bostrom here
Sound bites (and recent articles in New York Times/ Tech Crunch
) do a poor job of communicating these principles. The concern is not
that a conscious AI system will rise up and “take revenge” on humans. Instead, it’s that an advanced and sufficiently capable AI system will pursue a human-specified goal, and by doing so, will inadvertently damage systems/ entities/ institutions that humans value.
The poem begins as an old sorcerer departs his workshop, leaving his apprentice with chores to perform. Tired of fetching water by pail, the apprentice enchants a broom to do the work for him – using magic in which he is not yet fully trained. The broom performs the chores as enchanted and fills the sorcerer’s cauldron with water. However, the apprentice soon realizes that the broom has obeyed only too well. The broom continues to bring buckets and the floor becomes awash with water. The apprentice realizes that he cannot stop the broom because he does not know how.
In a moment of desperation, the apprentice splits the broom in two with an axe – but each of the pieces becomes a whole new broom that takes up a pail and continues fetching water, now at twice the speed. The broom continues to bring water and fills the room until all seems lost. In the final moments, the old sorcerer returns and quickly breaks the spell. The poem finishes with the old sorcerer’s statement that such powerful spirits should only be called by those that have mastered them.
In this story, the apprentice under-specifies the goal he wishes his enchanted broom to pursue, not realizing that the broom:
- Does not care about filling the room with water. It only cares about increasing the chances that the bucket is filled
- Does not wish to be shut off as that would reduce its ability to ensure that the bucket is filled
*”Consciousness” in AI systems is a philosophical question and is not requisite for large-scale, negative impact from AI systems.
On July 22nd, I hosted a discussion with Tim Urban at Google about the future of artificial intelligence. Tim spoke to his blog Wait but Why and his writing on the The Road to Superintelligence.
The video recording is here. This talk was presented for Google’s Singularity Network.
Bio: Tim Urban has become one of the Internet’s most popular writers. With wry stick-figure illustrations and occasionally epic prose on everything from procrastination to artificial intelligence, Urban’s blog, Wait But Why, has garnered millions of unique page views, thousands of patrons and famous fans like Elon Musk. Urban has previously written long-form posts on The Road to Superintelligence, and his recent TED talk on procrastination has more than 6 million views.
Video Recording: Here
On November 23rd, I had the opportunity to host a discussion with John Searle at Google’s Headquarters in Mountain View (see the video recording here). The discussion focused on the philosophy of mind and the potential for consciousness in artificial intelligence.
As a brief introduction, John Searle is the Slusser Professor of Philosophy at the University of California, Berkeley. He is widely noted for his contributions to the philosophy of language, philosophy of mind and social philosophy. Searle has received the Jean Nicod Prize, the National Humanities Medal, and the Mind & Brain Prize for his work. Among his notable concepts is the “Chinese room” argument against “strong” artificial intelligence.
Of special note, there is a question from Ray Kurzweil to John @38:51.
This Talk was presented for Google’s Singularity Network.
In his review of the hypothesized superintelligent agent, Bill Hibbard, principal author of the Vis5D, Cave5D and VisAD open source visualization systems, proposes a mathematical framework for reasoning about AI agents, discusses sources and risks of unexpected AI behavior, and presents an approach for designing superintelligent systems which may avoid unintended existential risk.
Following his initial description of the AI-environment framework, Hibbard begins by noting that a superintelligent agent may fail to satisfy the intentions of its designer when pursuing an instrumental behavior implicit to its final utility function. This said instrumental behavior, while unintended, could occur in order for the AI to preserve its own existence, to eliminate threats to itself and its utility function, or to increase its own efficiency and computing resources (see: Nick Bostrom’s paperclip maximizer).
Hibbard notes that several approaches to human-safe AI suggest designing intelligent machines to share human values so that actions we dislike, such as taking resources from humans, violate the AI’s motivations. However, humans are often unable to accurately write down their own values, and errors in doing so may motivate harmful instrumental AI action. Statistical algorithms may be able to learn human values by analyzing large amounts of human interaction data, but to accurately learn human values will require powerful learning ability. A chicken-and-egg problem for safe AI follows: learning human values requires powerful AI, but safe AI requires knowledge of human values.
Hibbard proposes a solution to this problem through a “first stage” superintelligent agent that is explicitly not allowed to act within the learning environment (thus refraining from unintended actions). The learning environment includes a set of safe, human-level surrogate AI agents, independent of the superintelligent agent, whose actions in composite mirror those of the superintelligent AI. As such, the superintelligent agent can observe humans, as well as their interactions with the surrogates and physical objects, and develop a safe environmental model from which it learns human values.
Hibbard’s mature superintelligent agent may still pose an existential threat (he specifically notes the dangers of military and economic competition), however, its utility function should assign nearly minimal value to human extinction. See his full discussion here!