Citation
Sun, Ron. “A Tutorial on CLARION 5.0.” Department of Cognitive Science. Rensselaer Polytechnic Institute. 6 Oct. 2009. http://www.sts.rpi.edu/~rsun/sun.tutorial.pdf
Summary / Assessment
CLARION (short for Connectionist Learning with Adaptive Rule Induction ON-Line) is a cognitive architecture used to simulate human cognition and behavior. Dr. Sun has led the development of CLARION at RPI. In the tutorial cited above, Dr. Sun provides an introduction to CLARION and its structure.
One important aspect of CLARION, and something that sets it apart from other cognitive architectures, is the way it models human knowledge. In CLARION, knowledge is split into implicit knowledge and explicit knowledge. The two types of knowledge are handled differently in the architecture, just as in real life. For example, implicit knowledge is not directly accessible, but it can be used during computation (just as tacit knowledge is not easily passed on in real life, but is used by people when solving problems). More specifically, in CLARION, implicit knowledge is modeled as a backpropagation neural network, which captures the distributed and subsymbolic nature of implicit knowledge. Explicit knowledge, on the other hand, is represented in a symbolic (localist) way: it is given a direct meaning, making it more accessible and interpretable. Explicit knowledge is further divided into “rules” and “chunks”. Rules govern how an agent interacts with its environment (for example: if the stove is hot, don’t touch the stove). Chunks are combinations of implicit dimension/value pairs that are tied together to form a mental concept, for example: table-1: (size, large) (color, white) (number-of-legs, four), where “table-1” is the mental concept and the dimension/value pairs are given in parentheses.
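To make the chunk and rule ideas a bit more concrete, here is a minimal Python sketch. The structures and names below (a plain dictionary for the chunk, a stove_rule function) are purely illustrative assumptions on my part, not CLARION's actual representation or API.

# A chunk ties implicit dimension/value pairs to one named, explicit concept.
table_1 = {
    "size": "large",
    "color": "white",
    "number-of-legs": "four",
}

# An explicit rule maps a condition on the current state to an action,
# e.g. "if the stove is hot, don't touch the stove".
def stove_rule(state):
    if state.get("stove-temperature") == "hot":
        return "do-not-touch-stove"
    return None

print(table_1["color"])                           # -> white
print(stove_rule({"stove-temperature": "hot"}))   # -> do-not-touch-stove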
A layered structure is used when describing CLARION's overall organization. Explicit knowledge forms the top level, while implicit knowledge forms the bottom level. The architecture is further divided into subsystems, each handling a specific aspect of human cognition. Each subsystem has a module in the explicit (top) level and a module in the implicit (bottom) level.
CLARION contains four subsystems: the Action-Centered Subsystem, the Non-Action-Centered Subsystem, the Motivational Subsystem, and the Meta-Cognitive Subsystem. The Action-Centered Subsystem (ACS) represents the part of the mind that controls an agent’s physical and mental actions (for example, manipulating objects in the real world and adjusting goals). The ACS uses input from the environment and other internal information to determine the best action to take. The Non-Action-Centered Subsystem (NACS) contains what can be considered general knowledge or “semantic memory”. This type of knowledge includes ideas, objects, and facts. In the NACS, the top level contains connections (or associative rules) that link declarative knowledge chunks, and the bottom level contains implicit declarative knowledge in the form of dimension/value pair networks. The Motivational Subsystem (MS) represents the part of cognition that supplies the reasons behind an agent’s actions. The MS describes the “drives” of the agent, which in turn determine the agent’s goals (for example, a drive may be to quench thirst, so the current goal structure would be centered on obtaining a source of water). The Meta-Cognitive Subsystem (MCS) is the main controller of cognition: it regulates the communication between all other subsystems. For example, it adds goals to the current goal structure in response to the drives formulated by the Motivational Subsystem, and it transfers information learned in the ACS to the NACS.
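The two-level organization can be pictured with a small structural sketch. Again, this is only a hypothetical skeleton of the idea (explicit rules consulted first, with an implicit network standing behind them), not CLARION's actual mechanism for integrating the two levels.

# Illustrative skeleton: each subsystem pairs an explicit top level (rules)
# with an implicit bottom level (here a stub function standing in for a
# trained backpropagation network).
class Subsystem:
    def __init__(self, name, explicit_rules, implicit_net):
        self.name = name
        self.explicit_rules = explicit_rules   # top level: symbolic rules
        self.implicit_net = implicit_net       # bottom level: subsymbolic mapping

    def act(self, state):
        for rule in self.explicit_rules:       # consult explicit rules first
            action = rule(state)
            if action is not None:
                return action
        return self.implicit_net(state)        # otherwise fall back to the network

acs = Subsystem(
    "action-centered",
    explicit_rules=[lambda s: "retract-hand" if s.get("stove") == "hot" else None],
    implicit_net=lambda s: "explore",          # placeholder for a trained network
)
print(acs.act({"stove": "hot"}))               # -> retract-hand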
As far as implications for design and HCI go: I think we can learn a great deal about human cognition from cognitive architectures, and we could leverage them to simulate how our HCI designs will function in the real world.
Tuesday, March 29, 2011
A Summary of "The Motivational and Meta-Cognitive Subsystems" from "A Tutorial on CLARION 5.0"
Citation
Sun, Ron. "Chapter 4: The Motivational and Meta-Cognitive Subsystems." "A Tutorial on CLARION 5.0." Department of Cognitive Science. Rensselaer Polytechnic Institute. 6 Oct. 2009. http://www.sts.rpi.edu/~rsun/sun.tutorial.pdf
Summary / Assessment
In this chapter, the Motivational Subsystem (MS) and the Meta-Cognitive Subsystem (MCS) of the CLARION architecture are described. The MS is concerned with an agent’s drives and their interactions (i.e. – why an agent does what it does and why it chooses any particular action over another). The MCS controls and regulates cognitive processes. The MCS accomplishes this, for example, by setting goals for the agent and by managing ongoing processes of learning and interactions with the surrounding environment.
Dr. Sun mentions that motivational and meta-level processes are required for an agent to meet the following criteria when performing actions: sustainability, purposefulness, focus, and adaptivity. Sustainability refers to an agent attending to basic needs for survival (e.g. – hunger, thirst, and avoiding danger). Purposefulness refers to an agent selecting activities that will accomplish goals, as opposed to selecting activities completely randomly. Focus refers to an agent’s need to concentrate its activities on fulfilling a specific purpose. Adaptivity refers to the need of an agent to adapt (i.e. – to learn) in order to improve its sustainability, purposefulness, and focus.
When modeling a cognitive agent, it is important to include the following considerations concerning drives. Proportional Activation: the activation of a drive should be proportional to the corresponding offset or deficit within the agent (such as the degree of the lack of nourishment). Opportunism: opportunities must be factored in when choosing between alternative actions (e.g. – the availability of water may lead an agent to choose drinking water over gathering food, provided that the food deficit is not too great). Contiguity of Actions: there should be a tendency to continue the current action sequence, to avoid the overhead of switching to a different action sequence (i.e. – to avoid “thrashing”). Persistence: actions taken to satisfy a drive should persist beyond minimum satisfaction. Interruption When Necessary: the ongoing action sequence should be interrupted when a more urgent drive arises. Combination of Preferences: preferences resulting from different drives can be combined to generate a higher-order preference; performing a “compromise candidate” action may not be best for any single drive, but may be best in terms of the combined preference.
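A rough way to picture proportional activation and the combination of preferences is the toy calculation below. The numbers and the simple weighted sum are my own assumptions for illustration, not CLARION's actual drive equations.

# Toy sketch: drive strength proportional to its deficit, preferences summed
# across drives, and the action with the highest combined preference chosen.
# A "compromise candidate" can win even if it is not the best for any one drive.
deficits = {"food": 0.4, "water": 0.7}            # 0 = satisfied, 1 = maximal deficit

action_benefit = {                                # how much each action helps each drive
    "gather-food": {"food": 1.0, "water": 0.0},
    "drink-water": {"food": 0.0, "water": 1.0},
    "go-to-river": {"food": 0.5, "water": 0.8},   # compromise candidate
}

def combined_preference(action):
    return sum(deficits[d] * action_benefit[action][d] for d in deficits)

best = max(action_benefit, key=combined_preference)
print(best, round(combined_preference(best), 2))  # -> go-to-river 0.76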
Specific drives are then discussed. Drives are segmented into three categories: low-level drives, high-level drives, and derived secondary drives. Low-level drives include physiological needs such as get-food, get-water, avoid-danger, get-sleep, and reproduce; they also include “saturation drives” such as avoid-water-saturation and avoid-food-saturation. High-level drives include “needs” such as belongingness, esteem, and self-actualization (and others from Maslow’s hierarchy of needs). Derived secondary drives include drives gradually acquired through conditioning (i.e. – associating a secondary goal with a primary drive) and externally set drives (e.g. – drives resulting from the desire to please superiors in a work environment).
Meta-cognition refers to one’s knowledge of one’s own cognitive processes. It also refers to the monitoring and orchestration of cognitive processes in the service of some concrete goal or objective. These concepts are operationalized within CLARION’s MCS through the following processes: 1) Behavioral Aims: setting goals and their reinforcements; 2) Information Filtering: determining the selection of input values from the environment; 3) Information Acquisition: selecting learning methods; 4) Information Utilization: reasoning; 5) Outcome Selection: determining the appropriate outputs; 6) Cognitive Modes: selecting explicit processing, implicit processing, or a combination thereof; and 7) Parameter Settings: such as parameters for learning capability (i.e. – intelligence level).
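As one concrete (and entirely hypothetical) picture of the “behavioral aims” process, the MCS can be imagined reading drive strengths from the MS and pushing a matching goal onto the goal structure. The drive-to-goal table and goal stack below are illustrative only, not CLARION's actual goal-setting mechanism.

# Hypothetical sketch of the MCS turning the strongest MS drive into a goal.
drive_strengths = {"get-water": 0.8, "get-food": 0.3, "avoid-danger": 0.1}

goal_for_drive = {
    "get-water": "find-water-source",
    "get-food": "find-food",
    "avoid-danger": "move-to-safety",
}

goal_stack = []                                            # current goal structure
strongest = max(drive_strengths, key=drive_strengths.get)
goal_stack.append(goal_for_drive[strongest])               # MCS sets the new goal
print(goal_stack)                                          # -> ['find-water-source']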
Sun, Ron. "Chapter 4: The Motivational and Meta-Cognitive Subsystems." "A Tutorial on CLARION 5.0." Department of Cognitive Science. Rensselaer Polytechnic Institute. 6 Oct. 2009. http://www.sts.rpi.edu/~rsun/sun.tutorial.pdf
Summary / Assessment
In this chapter, the Motivational Subsystem (MS) and the Meta-Cognitive Subsystem (MCS) of the CLARION architecture are described. The MS is concerned with an agent’s drives and their interactions (i.e. – why an agent does what it does and why it chooses any particular action over another). The MCS controls and regulates cognitive processes. The MCS accomplishes this, for example, by setting goals for the agent and by managing ongoing processes of learning and interactions with the surrounding environment.
Dr. Sun mentions that motivational and meta-level processes are required for an agent to meet the following criteria when performing actions: sustainability, purposefulness, focus, and adaptivity. Sustainability refers to an agent attending to basic needs for survival (i.e. – hunger, thirst, and avoiding danger). Purposefulness refers to an agent selecting activities that will accomplish goals, as opposed to selecting activities completely randomly. Focus refers to an agent’s need to focus its activities on fulfilling a specific purpose. Adaptivity refers to the need of an agent to adapt (i.e. – to learn) to improve its sustainability, purposefulness, and focus.
When modeling a cognitive agent, it is important to include the following considerations concerning drives. Proportional Activation: Activation of drives should be proportional to offsets or deficits within the agent (such as the degree of the lack of nourishment). Opportunism: Opportunities must be factored in when choosing between alternative actions (ex: availability of water may lead an agent to choose drinking water over gathering food, provided that the food deficit is not too high). Contiguity of Actions: A tendency to continue the current action sequence to avoid the overhead of switching to a different action sequence (i.e. – avoid “thrashing”). Persistence: Actions to satisfy a drive should persist beyond minimum satisfaction. Interruption When Necessary: Actions for a higher priority drive should be interrupted when a more urgent drive arises. Combination of Preferences: Preferences resulting from different drives could be combined to generate a higher-order preference. Performing a “compromise candidate” action may not be the best for any single drive, but is best in terms of the combined preference.
Specific drives are then discussed. Drives are segmented into three categories: Low-Level Drives, High-Level Drives, and Derived Secondary Drives. Low-Level drives include physiological needs such as: get-food, get-water, avoid-danger, get-sleep, and reproduce. Low-Level drives also include “saturation drives” such as: avoid-water-saturation, and avoid-food-saturation. High-Level drives include “needs” such as: belongingness, esteem, and self-actualization (and others from Maslow’s needs hierarchy). Derived Secondary Drives include: gradually acquired drives through conditioning (i.e. – associating a secondary goal to a primary drive), and externally set drives (i.e. – drives resulting from the desire to please superiors in a work environment).
Meta-cognition refers to one’s knowledge of one’s own cognitive process. It also refers to the monitoring and orchestration of cognitive processes in the service of some concrete goal or objective. These concepts are operationalized within CLARION’s MCS through the following processes: 1) Behavioral Aims: which set goals and their reinforcements, 2) Information Filtering: which determines the selection of input values from the environment, 3) Information Acquisition: which selects learning methods, 4) Information Utilization: which refers to reasoning, 5) Outcome Selection: or determining the appropriate outputs, 6) Cognitive Modes: or the selection of explicit processing, implicit processing, or combination thereof, and 7)Parameter Settings: such as parameters for learning capability (i.e. – intelligence level).
Saturday, January 8, 2011
Terry Winograd's Shift from AI to HCI
Introduction
In a more recent paper [1], Terry Winograd discusses the gulf between Artificial Intelligence and Human-Computer Interaction. He mentions that AI is primarily concerned with replicating the human mind, whereas HCI is primarily concerned with augmenting human capabilities. One question is whether or not we should use AI as a metaphor when constructing human interfaces to computers. Using the AI metaphor, the goal is for the user to attribute human characteristics to the interface and communicate with it just as if it were a human. There is also a divide in how researchers attempt to understand people. The first approach, what Winograd refers to as the “rationalistic” approach, attempts to model humans as cognitive machines within the workings of the computer. In contrast, the second approach, the “design” approach, focuses on modeling the interactions between a person and the surrounding environment.
During his career, Winograd shifted interests and crossed the gulf between AI and HCI. In his paper, he mentions that he started his career in the AI field, then rejected the AI approach, and subsequently ended up moving to the field of HCI. He writes “I have seen this as a battle between two competing philosophies of what is most effective to do with computers”. This paper looks at some of the work Winograd has done, and illustrates his shift between the two areas.
Winograd's Ph.D. Thesis and SHRDLU
In his Ph.D. thesis, entitled “Procedures as a Representation for Data in a Computer Program for Understanding Natural Language” [2], Winograd describes a software system called SHRDLU that is capable of carrying on an English conversation with its user. The system contains a simulation of a robotic arm that can rearrange colored blocks within its environment. The user enters into a discourse with the system and can instruct the arm to pick up, drop, and move objects. A “heuristic understander” is used by the software to infer what each command sentence means. Linguistic information about the sentence, information from other parts of the discourse, and general information are used to interpret the commands. Furthermore, the software asks the user for clarification if it cannot understand what an inputted sentence means.
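To give a flavor of this kind of interaction, here is a deliberately naive, hypothetical blocks-world command handler. Real SHRDLU went far beyond keyword matching, combining grammar, discourse history, and world knowledge, so treat this purely as a toy illustration.

# Toy blocks-world interpreter based on keyword matching; purely illustrative.
world = {"red block": "table", "green pyramid": "red block"}   # object -> what it rests on

def handle(command):
    command = command.lower()
    for obj in world:
        if obj in command:
            if "pick up" in command:
                return f"OK. (grasping the {obj})"
            if "where is" in command:
                return f"The {obj} is on the {world[obj]}."
    return "I don't understand. Which object do you mean?"     # ask for clarification

print(handle("Pick up the green pyramid."))   # -> OK. (grasping the green pyramid)
print(handle("Where is the red block?"))      # -> The red block is on the table.
print(handle("Grab the blue cube."))          # -> asks for clarification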
The thesis examines the issue of talking with computers. Winograd underscores the idea that it is hard for computers and humans to communicate, since computers communicate in their own terms; the means of communication is not natural for the human user. More importantly, computers aren't able to use reasoning to resolve ambiguity in natural language. Computers are typically supplied only with syntactic rules and do not use semantic knowledge to understand meaning. To address this problem, Winograd suggests giving computers the ability to use more knowledge. Computers need to have knowledge of the subject they are discussing, and they must be able to assemble facts in such a way that they can understand a sentence and respond to it. In SHRDLU, knowledge is represented in a structured manner, using a language that facilitates teaching the system about new subject domains.
SHRDLU is a rationalistic attempt to model how the human mind works: it seeks to replicate human understanding of natural language. Although this work is grounded in AI, there are clear implications for work in HCI. Interfaces that communicate naturally with their users feel familiar and have little to no learning curve. Donald Norman provides several examples of natural interfaces in his book “The Design of Future Things” [3]. One example that stands out is the tea kettle whistle, which offers natural communication that the water is boiling. The user does not need to translate the whistle from system terms into something he/she understands; it naturally signals that the water is ready.
Thinking machines: Can there be? Are We?
In “Thinking Machines” [4], Winograd aligns his prospects for artificial intelligence with those of AI critics. The critics argue that a thinking machine is a contradiction in terms: “Computers with their cold logic, can never be creative or insightful or possess real judgement”. He asserts that the philosophy that has guided artificial intelligence research lacks depth and is a “patchwork” of rationalism and logical empiricism. The technology used in conducting artificial intelligence research is not to blame; it is the underpinnings and basic tenets that require scrutiny.
Winograd supports his argument by identifying some fundamental problems inherent in AI. He discusses gaps of anticipation: in any realistically sized domain, it is nearly impossible to think of all situations and all combinations of events arising from those situations. The hope is that the body of knowledge built into the cognitive agent will be broad enough to contain the relevant general knowledge needed for success. In most cases, the body of knowledge contributed by the human element is still required, since it cannot be modeled exhaustively within the system. He also writes about the blindness of representation, which concerns language and the interpretation of language. As expounded upon in his Ph.D. thesis, natural language processing goes far beyond grammatical and syntactic rules. The ambiguity of natural language requires a deep understanding of the subject matter as well as the context. When we de-contextualize symbols (representations), they become ambiguous and can be interpreted in varying ways. Finally, he discusses the idea of domain restriction. Since there is a chance of ambiguity in representations, AI programs must be relegated to very restricted domains. Most domains, or at least the domains AI hopes to model, are not restricted (e.g., medicine, engineering, law). The corollary is that AI systems can give expected results only in simplified domains.
Thinking Machines offers some interesting and compelling arguments against the “rationalistic” approach. It supports the idea that augmenting human capabilities is far more feasible than attempting to model human intelligence. This is in line with the “design” approach (i.e., placing the focus on modeling interactions between a person and his/her surrounding environment).
Stanford Human-Computer Interaction Group
Winograd currently heads Stanford's Human-Computer Interaction Group [5]. The group is working on some interesting projects grounded in design. One such project, d.tools, is a hardware and software toolkit that allows designers to rapidly prototype physical interaction designs. Designers can use the physical components (controllers, output devices) and the accompanying software (called the d.tools editor) to form prototypes and study their behavior. Another project, named Blueprint, integrates program examples into the development environment. Program examples are brought into the IDE through a built-in web search. The main idea behind Blueprint is that it helps facilitate the prototyping and ideation process by allowing programmers to quickly build and compare competing designs. (More information on these projects can be found on the group's website, linked below.)
References
[1] Terry Winograd, Shifting viewpoints: Artificial intelligence and human–computer interaction, Artificial Intelligence 170 (2006) 1256–1258.
http://hci.stanford.edu/winograd/papers/ai-hci.pdf
[2] Winograd, Terry (1971), "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language," MAC-TR-84, MIT Project MAC, 1971.
http://hci.stanford.edu/~winograd/shrdlu/
[3] Norman, Donald (2007), The Design of Future Things, New York: Basic Books, 2007.
[4] Winograd, Terry (1991), "Thinking machines: Can there be? Are We?," in James Sheehan and Morton Sosna, eds., The Boundaries of Humanity: Humans, Animals, Machines, Berkeley: University of California Press, 1991, pp. 198-223. Reprinted in D. Partridge and Y. Wilks, The Foundations of Artificial Intelligence, Cambridge: Cambridge Univ. Press, 1990, pp. 167-189.
http://hci.stanford.edu/winograd/papers/thinking-machines.html
[5] The Stanford Human-Computer Interaction Group
http://hci.stanford.edu/