Modeling Nash Equilibria in Artificial Intelligence Development

In his discussion of a theoretical artificial intelligence “arms race”, Nick Bostrom, Director of the Future of Humanity Institute at Oxford, presents a model of future AI research in which development teams compete to create the first general AI. Under the assumption that the first AI will be very powerful and transformative (a notably contested assumption, per the soft vs. hard takeoff debate), each team is highly incentivised to finish first. Bostrom argues that the level of safety precautions each development team will undertake arises as a reflection of broader policy parameters, specifically those relating to the allowed level of market concentration (i.e. the permitted consolidation of research teams) and information accessibility (i.e. the degree of intellectual property protection and algorithm secrecy).

In his work, Bostrom does not reach one specific conclusion regarding AI safety levels, but instead defines a set of Nash equilibria given various numbers of development teams and levels of information accessibility. Specifically, he notes that having additional development teams (and therefore reduced market concentration) may increase the likelihood of an AI disaster, especially if risk-taking matters more than skill in developing the AI. Increased information accessibility also increases risk: the more teams know of each other’s capabilities and methodologies, the greater the velocity of, and enmity in, development, and a greater equilibrium danger level follows accordingly.
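The intuition behind these equilibria can be illustrated with a toy Monte Carlo sketch. This is not Bostrom’s formal model: the uniform draws, the `risk_weight` parameter, and the rule that a disaster occurs with probability equal to the winner’s risk-taking are all illustrative assumptions. The sketch only shows the qualitative effect he describes — when risk-taking matters more than skill, adding teams raises the equilibrium danger level.

```python
import random

def disaster_probability(n_teams, risk_weight, trials=20_000, seed=0):
    """Estimate the chance an AI race ends in disaster (toy model).

    Each team draws a random skill and a random risk-taking level in [0, 1].
    Performance is a weighted blend of the two; the top performer wins the
    race, and disaster occurs with probability equal to the winner's
    risk-taking level.
    """
    rng = random.Random(seed)
    total_disaster_risk = 0.0
    for _ in range(trials):
        teams = [(rng.random(), rng.random()) for _ in range(n_teams)]
        # Winner maximizes the blend of skill and risk-taking.
        _, winner_risk = max(
            teams,
            key=lambda t: (1 - risk_weight) * t[0] + risk_weight * t[1],
        )
        total_disaster_risk += winner_risk
    return total_disaster_risk / trials

# With risk-taking weighted heavily, more competitors means the winner
# is, on average, a bigger risk-taker.
few = disaster_probability(2, risk_weight=0.8)
many = disaster_probability(10, risk_weight=0.8)
```

Running this with `risk_weight=0.8` shows `many` clearly exceeding `few`: the winner of a ten-team race is selected, in effect, for recklessness.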

Bostrom’s derivation is intended to spur discussions on AI governance design. See his original paper here!

The Relevance of a Singleton in Managing Existential Risk

The idea of a ‘Singleton’, a universal decision-making agency that maintains world order at the highest level, offers a functional means for discussing the implications of global coordination, especially as they relate to existential risk. In his 2005 essay, Nick Bostrom both introduces the term and provides elaboration regarding possible examples of a Singleton, the ways one could arise, and its ability to manage global catastrophes.

Bostrom notes that a Singleton may come into being in various forms, including, but not limited to, a worldwide democratic republic, a worldwide dictatorship, or an omnipotent superintelligent machine; the last of these is the least intuitive (and certainly the most closely tied to science fiction), but does, in certain forms, meet Bostrom’s definition of a Singleton.

One may note characteristics common to all forms of a Singleton. Its necessary powers include (1) the ability to prevent any threats (internal or external) to its own supremacy, and (2) the ability to exert control over the major features of its domain. A Singleton in ‘traditional government’ form may emerge if it is seen as necessary to curtail potentially catastrophic events. Historically, the two most ambitious efforts to create a world government (the League of Nations and the United Nations) grew directly out of crisis; the increasing power and ubiquity of future military capabilities (e.g. nuclear, nanotechnological, or AI weaponry) may likewise rapidly build support for a globally coordinated government. A Singleton in superintelligent-machine form may arise if a machine becomes powerful enough that no other entity could threaten its existence (possible through an uploaded consciousness or the ability to easily self-replicate), and if it holds universal monitoring, security, and cryptography technologies (plausible given the rapidly increasing volume of internet-connected devices).

Although not without disadvantages (touched on further in the paper), the creation of a Singleton would offer a method for management of existential risk. See Bostrom’s full discussion on the merits of a Singleton here!

Orthogonality & Instrumental Convergence in Advanced Artificial Agents (Bostrom)

In his review of the theoretical superintelligent will, Nick Bostrom, Director of the Future of Humanity Institute at Oxford, applies a framework for analyzing the relationship between intelligence and motivation in artificial agents, and posits a set of intermediate goals that almost any artificially intelligent system would pursue.

Specifically, Bostrom notes the orthogonality of intelligence (here described as the capacity for instrumental reasoning) and motivation, and hence reasons that any level of intelligence could be combined with any motivation or final goal; in this way, the two may be thought of as axes along which possible agents can freely vary. This idea, often concealed by the human bias towards anthropomorphizing non-sentient systems, implies that superintelligent systems may be motivated to pursue simple goals (such as counting grains of sand), impossibly complex ones (such as simulating the entire universe), or anything in between. They would not, however, inherently be motivated by characteristically human final goals, such as reproduction or the protection of offspring. High intelligence does not necessitate human motivations.

Bostrom ties this notion of orthogonality to the concept of instrumental convergence, noting that while artificially intelligent agents may have an infinite range of possible final goals, there are some instrumental (intermediate) goals that nearly any artificial agent will be motivated to pursue, because they are necessary for reaching almost any final goal. Examples of instrumental goals include cognitive enhancement and goal-content integrity. As to the former, nearly all agents would seek improvements in rationality and intelligence, as these improve an agent’s decision-making and make it more likely to achieve its final goal. As to the latter, an agent has a present instrumental reason to prevent alteration of its final goal, because it is more likely to realize that goal if it still values it in the future.

Bostrom synthesizes the two theses by warning that a superintelligent agent will not necessarily value human welfare, or moral behavior, if these interfere with the instrumental goals necessary for achieving its final goal.

See his full discussion here!

Are We Living in a Simulation?

Are we living in a simulation? Nick Bostrom, founder of the Future of Humanity Institute at Oxford, argues that this scenario is not only possible, but in fact likely (provided that humanity is able to develop the necessary technology before going extinct).

His line of reasoning is as follows: the number of simulations run by a civilization capable of running them would be very great, so if simulations are run at all, the number of simulated people would vastly exceed the number of unsimulated people, and the probability that we ourselves are among the simulated would be almost 1. It follows that one of two things must be the case: either the probability that such simulations are ever run is vanishingly small (practically null), or it is almost certain that we are living in a simulation.
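The arithmetic behind this step is simple to check. The sketch below is illustrative, not Bostrom’s own calculation: the simulation count and population sizes are made-up round numbers chosen only to show how quickly the simulated fraction approaches 1.

```python
def simulated_fraction(n_simulations, people_per_simulation, unsimulated_people):
    """Fraction of all observers who live inside a simulation."""
    simulated = n_simulations * people_per_simulation
    return simulated / (simulated + unsimulated_people)

# Assumed figures: a million full-population ancestor simulations,
# each with as many inhabitants as the one base reality.
frac = simulated_fraction(
    n_simulations=1_000_000,
    people_per_simulation=10**10,
    unsimulated_people=10**10,
)
```

With these assumed figures, the simulated observers outnumber the unsimulated a million to one, so the fraction is within one part in a million of 1 — which is the sense in which, if simulations are run at all, the probability that we are simulated is "almost 1".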

See his original paper, along with a filmed discussion!