To appear in: Proceedings of the Ninth Conference on Artificial General
Intelligence, New York City, July 16-19, 2016.
A Methodology
for the Assessment of AI Consciousness
Harry H. Porter III
Portland State University,
Portland, Oregon
porter@pdx.edu
Online at: cs.pdx.edu/~harry/musings/ConsciousnessAssessment.pdf
Any definition of consciousness today will be
contentious. Consciousness is a subjective sensation and each thinking entity has
a unique experience of consciousness which no one else can share. Nonetheless, a way to measure a thinking entity's
level of consciousness — for example, on a numeric scale from 0 to 100 —
is needed. Obviously this task is problematic and impossible to do with
precision or accuracy, but the exercise is enlightening.
Our goal is only to measure the degree of
consciousness exhibited, without making any assumptions about its
implementation. We say that if an agent exhibits the given set of externally
observable behaviors listed below, then it is conscious by definition, and
conversely, if it is conscious, then these behaviors will be present. Specific
cognitive architectures, e.g., Global Workspace Theory [1, 2, 3], are
candidates for evaluation using the methodology proposed here. We do not
propose any testable theory of consciousness, only an approach to assessment
and, by extension, a concrete definition thereof.
This assessment consists of a series of questions.
Use this questionnaire to evaluate the degree of consciousness of a person, AI
system, or any other thinking entity. Answer the questions and then compute the
score.
There is no clear consensus on the definition of
consciousness; this test reflects the author's personal definition. Other
researchers with differing definitions of consciousness may create their own
evaluation methodologies, or weigh the questions here differently to reflect
their definitions of consciousness.[1]
Answer each question using the following scale:
0 – NONE: Not present at all
1 – SOME: Present, but at a level far below human levels
2 – ALMOST: Substantially present, but still at a sub-human level
3 – HUMAN: Present at a level typical of a normal human
4 – SUPER-HUMAN: Present to a degree that exceeds human ability
After answering all questions, add together the points for all answers to give a total sum. Then multiply the sum by 0.741 to normalize and give the final score.[2]
Scores will lie on the following spectrum:
0 – No consciousness present; the consciousness level of a rock.
100 – The consciousness of a fully functioning human.
>100 – A consciousness that exceeds human levels.
133 – The maximum possible score.
The questions are listed
next.
• Is THE-TEST-SUBJECT able to use logic in order to perform reasoning tasks? (The term "logic" is to be interpreted loosely to include an ability to make deductions.)
• Does THE-TEST-SUBJECT have short-term (working) memory; the ability to acquire/deduce and then remember new facts/data/etc. in such a way that they can be used in ongoing reasoning tasks? (For example, able to answer: "My name is Tom. I live in Portland. What is my name?")
• Does THE-TEST-SUBJECT have long-term memory; the ability to acquire/deduce and then remember new facts/data/information in such a way that it can be used in reasoning tasks that occur in the future (after many unrelated reasoning tasks are performed)? (For example: "Yesterday I told you my birth date; how old am I?")
• Can THE-TEST-SUBJECT deal with partial/incomplete/inaccurate information?
• Can THE-TEST-SUBJECT make decisions, such as whether to take action "X" or "Y", based on some train of reasoning?
• Can THE-TEST-SUBJECT reason about dates/times/intervals?
• Can THE-TEST-SUBJECT answer questions about the current date?
• Can THE-TEST-SUBJECT reason about size/space/location?
• Can THE-TEST-SUBJECT answer questions about THE-TEST-SUBJECT's current location and the location of the questioner? (An electronic "cyber" entity whose location is not well-defined should at least give reasonable answers to questions about its location.)
• Can THE-TEST-SUBJECT communicate using natural language?
• Does THE-TEST-SUBJECT have goals/motivations/needs?
• Does THE-TEST-SUBJECT take steps to achieve his/her/its goals/motivations/needs?
• Does THE-TEST-SUBJECT have opinions/likes/dislikes?
• Does THE-TEST-SUBJECT have emotions/feelings/moods?
• Do THE-TEST-SUBJECT's emotions/feelings/moods change appropriately over time in response to events?
• Does THE-TEST-SUBJECT have a memory of the recent conversational history? (For example: "Why did you say that? – Because you just told me X, which implies it.")
• Can THE-TEST-SUBJECT have experiences? (The definition of "experiences" is to be taken loosely. A human can stub his/her toe; the ability to have this sort of physical bodily experience is not required. For example, an act of communication can itself be an experience, although it involves only words and nothing physical.)
• Does THE-TEST-SUBJECT have a memory of past experiences and events? (These experiences may have happened to THE-TEST-SUBJECT, but this is not necessary.)
• Is some emotional coloring attached to memories? (For example, can THE-TEST-SUBJECT distinguish between good and bad memories?)
• Can THE-TEST-SUBJECT learn new material?
• Can THE-TEST-SUBJECT change in ways deeper than simply acquiring more data?
• Does THE-TEST-SUBJECT have curiosity about the world and an impulse to learn and acquire information/knowledge/wisdom?
• Does THE-TEST-SUBJECT have knowledge about himself/herself/itself? Can THE-TEST-SUBJECT provide a coherent description of who/what he/she/it is?
• Does THE-TEST-SUBJECT have some mental model of THE-TEST-SUBJECT's own thought processes?
• Can THE-TEST-SUBJECT sense or perceive his/her/its current thought processes?
• Does THE-TEST-SUBJECT have some mental model of his/her/its own goals/motivations/needs?
• Can THE-TEST-SUBJECT articulate his/her/its current goals/motivations/needs?
• Does THE-TEST-SUBJECT have some mental model of his/her/its own emotions/feelings/moods?
• Can THE-TEST-SUBJECT identify his/her/its current emotions/feelings/moods?
• Can THE-TEST-SUBJECT control his/her/its own thought processes? (For example, can THE-TEST-SUBJECT follow a novel algorithm to perform some reasoning task? Can THE-TEST-SUBJECT be told how to think more effectively and then alter his/her/its thought processes as a result of these instructions?)
• Can THE-TEST-SUBJECT detect when certain thought processes are not effective and alter his/her/its thought processes in an attempt to make them more effective?
• Can THE-TEST-SUBJECT articulate the algorithms that THE-TEST-SUBJECT uses to perform certain novel tasks? (For example, able to answer: "Describe how you would sort a sequence of numbers.")
• Does THE-TEST-SUBJECT have knowledge about humans? We mean "the human species" as opposed to knowledge about particular humans. (In the future, or to evaluate an alien life form, we might need to substitute "the dominant intelligent species" for "humans".)
• Does THE-TEST-SUBJECT have knowledge about particular humans as they differ from other humans? (Perhaps in the future, we'll substitute the phrase "other thinking entities" for "humans" in these questions.)
• Does THE-TEST-SUBJECT have the ability to learn about other people and remember details about specific individuals?
• Does THE-TEST-SUBJECT have an understanding of common mental illnesses, such as depression, mania, and phobias?
• Does THE-TEST-SUBJECT know about the current thought processes of the person THE-TEST-SUBJECT is communicating with? (For example, this would include knowing that the person knows "X" because they know both "Y" and "Y implies X" and would be likely to infer "X".)
• Does THE-TEST-SUBJECT have a model of the other person's thought processes, as they may differ from those of other humans? (For example, the knowledge that Tom is good with facts, Robert is ruled by his emotions, and Matthew is driven by his greed.)
• Does THE-TEST-SUBJECT know about the person's current mood/feelings/emotional state?
• Can THE-TEST-SUBJECT make reasonable inferences about how the person's current mood/feelings/emotional state affects the person's current thoughts and actions?
• Does THE-TEST-SUBJECT know about the person's current motivations/goals/needs and how these influence the person's current thoughts and actions?
• Can THE-TEST-SUBJECT make judgments about the other person's level of intelligence?
• Does THE-TEST-SUBJECT have curiosity about the current conversationalist's thoughts/thought processes/knowledge?
• Does THE-TEST-SUBJECT have curiosity about the current conversationalist's mood/emotions/goals/motivations?
• Does THE-TEST-SUBJECT have a desire/ability to imitate the thought processes of others? (Even if this imitation is only temporary and used only to understand those thought processes.)
I used this assessment to evaluate the consciousness
of dogs. On many of the questions
I supplied the answer as a range (such as 0-2) rather than a single number,
because I do not know the exact value and can only supply my best guess.
According to my assessment using this methodology, dogs score 51-67 on this metric of
consciousness.
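Range answers fold into the same scoring arithmetic by summing the lower and upper bounds separately, which yields a score range rather than a single number. A minimal sketch (the per-question ranges below are purely illustrative, not my actual answers for dogs):

```python
def score_range(answer_ranges):
    """Compute (low, high) normalized scores from per-question
    (min, max) answer ranges on the 0-4 scale."""
    low = sum(lo for lo, hi in answer_ranges)
    high = sum(hi for lo, hi in answer_ranges)
    return low * 0.741, high * 0.741

# Illustrative only: a mix of certain and uncertain answers.
ranges = [(3, 3)] * 10 + [(0, 2)] * 20 + [(1, 3)] * 15
lo, hi = score_range(ranges)
print(f"{lo:.0f}-{hi:.0f}")  # 33-85
```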
To quote Wikipedia, "Cyc is an artificial intelligence project that attempts to assemble a comprehensive ontology and knowledge base of everyday common sense knowledge, with the goal of enabling AI applications to perform human-like reasoning."
I used this assessment to evaluate the consciousness
of this AI system. As before, I
supplied many answers in the form of ranges, due to my limited knowledge about
the system.
According to my assessment using this methodology, Cyc scores 36-54 on this metric of
consciousness, clearly below dogs, but well above rocks.
We can evaluate this assessment by asking which
attributes a personal digital assistant (such as "Siri"
on the iPad/iPhone) would need to possess in order for
you to feel she was conscious. We suggest that an AI entity scoring 100 or more
on the assessment described here would be judged by a reasonable person to be
conscious, at least to a nontrivial degree.[3]
The well-known Turing Test [4] is substantially
different from this test since the Turing Test is meant to determine how well a
machine can imitate a human. Turing Test judges are free to ask questions that
are peculiar to human experience and the human way of thinking. In other words,
the Turing Test checks for human-like thought. Any intelligent entity that can
pass the Turing Test will, by definition, be able to convince the judges that
it can think just like a human and would therefore be able to score 100 on our
test. On the other hand, our test only asks how well a thinking entity can
perform the various tasks associated with consciousness, not the larger
question of whether the entity can mimic or imitate the human way of thinking
to the point of being indistinguishable from a human mind.
There is also a difference in methodology. Turing
requires his "Imitation Game" to be repeated a number of times, and the
question is whether the machine contestant is statistically indistinguishable
from the human contestant, winning 50% of the time. Performing the Turing Test properly
requires a competent judge who has proven skill in the art of distinguishing
between machines and humans. The Turing Test also requires you to perform a
number of rounds of the imitation game sufficient to extract reliable
statistics. Our test is much less expensive to perform: You just run through
the questions and answer them to the best of your ability. This assessment is a
subjective test of a subjective quality.
The Turing Test also suffers from a serious
shortcoming. Turing designed his test as a pass-fail test. In some cases, it
might be possible for a machine to be reliably differentiated from a human with
a single simple question. (For example, perhaps the machine doesn't know how
many fingers a typical human has.) Any unusual gap in the machine's knowledge —
no matter how irrelevant — might allow the judge to distinguish between
human and machine with 100% accuracy, even though the machine is, in all other
ways, capable of imitating human thought. When a machine fails the Turing Test,
it gives us no useful information.
One difficulty with our test is that it may be
difficult or impossible to answer some of the questions. For example consider
evaluating the consciousness level of a dog. The question "Can he/she/it sense
or perceive his/her/its current thought processes?" is almost impossible to
answer. For these questions, you must provide an educated guess. Omitting
questions will alter the test itself, whereas inaccurate answers merely change
the accuracy of the test. Using standard statistical methods, you can express
the final score with error probabilities, if you wish.
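One concrete way to carry this out (an assumption on my part; the text does not prescribe a particular statistical method) is to treat each uncertain answer as uniformly distributed over its range and propagate the mean and variance through to the final score:

```python
import math

MULTIPLIER = 0.741  # normalizes 45 all-"HUMAN" answers to a score of 100

def score_with_error(answer_ranges):
    """Treat each (min, max) answer as a uniform random variable and
    return (mean score, standard deviation of the score)."""
    mean = sum((lo + hi) / 2 for lo, hi in answer_ranges)
    # Variance of a uniform distribution on [lo, hi] is (hi - lo)^2 / 12;
    # assuming independent answers, the variances add.
    var = sum((hi - lo) ** 2 / 12 for lo, hi in answer_ranges)
    return mean * MULTIPLIER, math.sqrt(var) * MULTIPLIER

# Illustrative only: every answer uncertain over the range 1-3.
mean, sd = score_with_error([(1, 3)] * 45)
print(f"score = {mean:.1f} +/- {sd:.1f}")
```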
These questions are probably skewed to favor
human-like consciousness, although we have tried to make them as neutral (and
fair to the AI entities of the future) as possible.
These questions weigh certain features differently
than some people might prefer. For example, there is only one question about
natural language ability, there are several questions about emotions and
motivations, and there are no questions about body or the ability to visualize
shapes.[4]
Many aspects of the human experience are ignored.
There are no questions about whether the thinking entity can experience hunger,
pain, pleasure, or the joy of hearing beautiful music. There is nothing about
"love", which may be an emotion particular to sexual species that raise their
own offspring via direct personal contact. Perhaps some of these experiences
are important in your definition of consciousness.
There are no questions about experiencing an "internal voice" or "hearing your own thoughts". While this seems to be an important part of my subjective experience of consciousness, it is difficult to know how others experience it. Additionally, it is hard to ask questions to evaluate another person's experience of an internal voice. Also, it is unclear whether hearing an internal voice is required for consciousness. Nevertheless, I considered adding the question, "Does he/she/it report hearing an internal voice at times?"
There are no questions about sight and the ability
to see. It seems likely that a person totally blind from birth experiences
consciousness differently than most humans, but I question whether they are any
less conscious because of their visual impairment.
There are no questions about mental visualization,
although most humans can mentally envision shapes and can report on this
ability. Indeed, the ability to "see" and manipulate objects entirely within
the mind is so distinct and widespread that some people may argue that it is a
requirement for consciousness.
There are no questions about body or bodily
experiences. Certainly all humans have bodies and a large part of the brain is
devoted to dealing with them, but it is possible and reasonable to envision a
completely disembodied consciousness.
There are no questions about sensors and actuators.
A human body can be viewed as a collection of senses and abilities to move and
otherwise take action. Many would insist that having some sort of physical
presence (such as a robotic body with sensors and actuators) is required for
consciousness, but this questionnaire takes the position that no physical embodiment
is required.
There are no questions about the survival instinct
or the desire to avoid death. This is a core feature of the experience of being
human, but so is the instinctive impulse to breathe. Both seem unrelated to
consciousness itself, and are merely artifacts of the evolution of humans.
This questionnaire does not attempt to evaluate the
"moral goodness" of a thinking entity. It attempts to rate the consciousness of
an evil psychopath in the same way as a virtuous nun, ignoring all questions
about whether one consciousness should be preferred over another.
As the future unfolds, society will need to grapple
with questions about which conscious entities are acceptable and which entities
must be forbidden and/or terminated, but this discussion is beyond this
assessment. Before this conversation can begin, we need to have a better
understanding of consciousness itself, and it is hoped that this informal
assessment will help stimulate the debate.
The topic of consciousness continues to fascinate philosophers,
AI researchers, neuroscientists, and the general public and much has been
written and said about the nature, definition, and mechanisms of consciousness.
Approaches to the subject range from philosophical (e.g., Daniel Dennett, David
Chalmers, John Searle), to neurological (e.g., Antonio Damasio,
Oliver Sacks), to more creative theories (Roger Penrose). Beyond the scientific
approach, there is an abundance of material on what might be termed Ònew-ageÓ
or ÒcosmicÓ consciousness — approaches to becoming one with the universal
consciousness and/or achieving a higher level of personal consciousness, moving
in the direction of enlightenment. When trying to define, understand, and
explain human consciousness, we are clearly in an exciting phase of
intellectual advancement.
In terms of evaluating the degree of consciousness, not
much work has been done. The "Lovelace 2.0 Test of Artificial Creativity and
Intelligence" [5] requires a human judge who, interacting directly with an AI
entity, asks the AI entity to produce a poem, drawing, story, etc. The judge
then evaluates the creative work of the AI. This approach seems quite
reasonable for testing the ability of an AI entity to think creatively in the
manner of a human. Consciousness seems to be related to creativity, but creativity
and consciousness are clearly different. In the Winograd
Schema Challenge [6], the AI entity is presented with a series of short
questions, each with a multiple-choice answer that cleverly evaluates whether
the sentence was understood. The WSC is an excellent
test of natural common sense intelligence, but common sense intelligence is not
consciousness. The WSC has the advantage of being
objective and not requiring a human judge.
To many people, consciousness seems irreducible,
nonprogrammable, and perhaps even magical or beyond the reach of science. While
there are some eloquent arguments in favor of these opinions, it now appears
that a simple, unified explanation is elusive and consciousness is turning out
to be nothing more than a motley collection of reasoning skills, mental
abilities, and characteristics of neural processing that, in the case of
humans, has evolved in no different a way than other aspects of the brain.
Consciousness is just like other thinking — except that it is reflective
and involves the concept of self.
This paper is an attempt to enumerate and elucidate the features that
come together to form the colloquial notion of consciousness, with the
understanding that this is only one subjective opinion on the nature of subjectivity itself.
1. Baars, B.J.: In the theatre of consciousness. Global Workspace Theory, a rigorous scientific theory of consciousness. Journal of Consciousness Studies, vol. 4, no. 4, pp. 292-309 (1997)
2. Baars, B.J.: The conscious access hypothesis: origins and recent evidence. Trends in Cognitive Sciences, vol. 6, no. 1 (2002)
3. Baars, B.J.: The global brainweb: An update on global workspace theory. Science and Consciousness Review (October 2003)
4. Turing, A.M.: Computing Machinery and Intelligence. Mind 59, 433-460 (1950)
5. Riedl, M.O.: The Lovelace 2.0 Test of Artificial Creativity and Intelligence. arXiv:1410.6142v3 [cs.AI] (2014)
6. Levesque, H.J., Davis, E., Morgenstern, L.: The Winograd Schema Challenge. In: Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning, AAAI Press (2012)
[1] To make this (subjective) definition a clearer and more discrete target for future discussions of the nature of consciousness, let us name the present methodology and implied definition of consciousness "Porter's Definition and Assessment of AI Consciousness" so as to distinguish it from other definitions.
[2] This multiplier was chosen so that an answer of "3 – HUMAN" for all 45 questions will yield a score of 100. If questions are added or deleted, the multiplier will need to be adjusted accordingly.
[3] Perhaps being able to form a friendship with a thinking entity is a useful indicator of whether that entity is conscious. We suggest that with any AI entity able to score high on this assessment, it would be possible to form a reasonably recognizable friendship. For example, if the features listed here could be added to Siri, then there is no question that Siri would appear to be more conscious than she does now.
[4] It is for this reason that we are
presenting our implicit definition of consciousness as one possible standard definition among many, rather
than suggesting it is more valid or correct than competing definitions of
consciousness.