<Notes will be very sparse>
Chapter 1: Prospects and Limits of the Empirical Study of Expertise: An Introduction. Ericsson, Smith
- For chess, De Groot set up well-defined tasks for analyzing chess expertise: rather than watching players go through full games (which would be too diffuse in terms of the entire state space), he presented chess positions and asked players only to select the next move
- Determining the objectively best move isn’t exactly possible in general, though, because a chess position can’t be solved exactly due to complexity
- De Groot used “thinking aloud” experiments by players of different skill levels
- De Groot found that when using the thinking aloud approach with next move queries, experts and masters took around 10 minutes: “In the beginning, the players familiarized themselves with the chess position, evaluated the position for strengths and weaknesses, and identified a range of promising moves. Later they explored in greater depth the consequences of a few of those moves. On average, both masters and experts considered more than thirty move possibilities involving Black and White and considered three or four distinctly different first moves.”
- He found that masters and experts didn’t differ in their rollout depth
- The difference between the two groups was that masters generally mentioned the best move during familiarization, whereas experts found the best move later on. This implies that move selection in chess generally comes down not to improved computation but rather improved board-value representation. “De Groot (1978, p. 316) argued that mastery in ‘the field of shoemaking, painting, building, [or] confectionary’ is due to a similar accumulation of experiential linkings.”
- During tests on board memorization (exposure of 2-10 seconds), improved recall was linked to improved playing ability. Chase and Simon followed up on these experiments
- For random board configurations (not arrived at during natural play), recall between masters and novices was equivalent… “showing that the superior memory performance of the master depends on the presence of meaningful relations between the chess pieces, the kinds of relations seen in actual chess games.”
- Recall of piece location did not occur smoothly over time – there would be bursts which corresponded with logical chunking; masters were found to have different <larger> chunk sizes
- “Chase and Simon (1973) found that the number of chunks recalled by chess players at all skill levels was well within the limit of around 7 +/- 2 <so it seems not to be the case that masters are simply better at all recall tasks>. They attributed the difference in memory performance between strong and weak players to the fact that the more expert chess players were able to recognize more complex chunks, that is, chunks with a larger number of chess pieces per chunk.”
- Estimated 3,000 hours to be an expert, 30,000 to be a master
- Better expert memory in areas of expertise has been shown in many other domains. Experts may actually forget parts of the information, but usually only when that information is irrelevant (for example, forgetting symptoms that aren’t related to the diagnosis of a patient)
- (p.20) “The types of differences found in a wide range of domains of expertise are remarkably consistent with those originally noted by de Groot (1978) in the domain of chess. Expert performers tend to retrieve a solution method (e.g., next moves for a chess position) as part of the immediate comprehension of the task, whereas less experienced subjects have to construct a representation of the task deliberately and generate a step-by-step solution, as shown by research on physics problems (…) and algebra-word problems (…). Medical experts generate their diagnoses by studying the symptoms (forward reasoning), whereas less experienced medical students tend to check correctness of a diagnosis by inspecting relevant symptoms (backward reasoning) (Patel & Groen, chapter 4, this volume).”
- <Next paragraph> “On the same theme, expert performers have a body of knowledge that not only is more extensive than for nonexperts but is also more accessible (…). Whenever knowledge is relevant, experts appear to access it efficiently (…). The experts are therefore able to notice inconsistencies rapidly, and thus inconsistent hypotheses are rejected rapidly in favor of the correct diagnosis (…). On presentation, information in the problem is integrated with the relevant domain knowledge (Patel & Groen, chapter 4, this volume).”
- p.22 discusses domain specific memorization schemes
- In categorization of physics problems, experts categorized them based on solution methodology that could be applied, whereas novices categorized them based on superficial aspects of the problem, such as the types of objects being discussed.
- Studies of board recall in chess show that masters also utilize forms of long-term memory (not just short-term) in the task. Additionally, chunks are formed such that in many cases they overlap, so there are also encodings of how chunks relate to each other.
- Recall also depends on the task: as mentioned, doctors may forget symptoms irrelevant to diagnosis, and studies on programming show similar results
- Studying performance of experts in the lab can be difficult because tasks in the lab must match the tasks that the experts are experienced in
- Experts have faster response time, better ability to plan ahead, and better memory (all in the particular domain of expertise)
- Chase and Simon theory: (p.26)
- Difference in ability is related to immediate access to relevant knowledge (retrieving chess board positions/relevant chunking) (1973 – perception in chess)
- Theoretical account of how experts extract best moves from long-term memory
- Chunks serve as cues to activate best move recall
- “The chess masters’ richer vocabulary of chunks thus played a critical role in the storage and retrieval of superior chess moves.”
- Accounts focusing on practice and learning: (p.27)
- Improvement in a task often follows a power law <serious diminishing returns> (Newell & Rosenbloom, 1981). They also consider chunking here
- Fitts proposed 3 stages in skill acquisition:
- Cognitive: cognitive effort to understand the task and what parts to pay attention to
- Associative: “… making the cognitive process efficient to allow rapid retrieval and perception of required information.”
- Autonomous: “… performance is automatic, conscious cognition is minimal.”
- “First, it is important to distinguish between practice and mere exposure or experience. It is well known that learning requires feedback in order to be effective. Hence, in environments with poor or even delayed feedback, learning may be slow or nonexistent.”
- In some domains, performance never really improves, even after enormous amounts of practice – this is often the case when the domain is chaotic. Time spent doing something isn’t always a good measure of proficiency
- Accounts focusing on memory functioning: (p.28)
- “The Chase-Simon hypothesis that superior memory of the expert reflects the storage of more complex independent chunks in short-term memory has been seriously questioned, and most of the empirical evidence also suggests storage of interrelated information in long-term memory, as mentioned earlier.”
- Experts happen to develop excellent memory for the task of interest. However, someone who sets out just to develop that memory ability (with no improvement in the actual task itself) can reach the memory level of a master quite quickly
- There is a school of thought that holds that in the above situation, those who trained specifically for recall are using only short-term memory, whereas experts go through the loop of accessing long-term memory, but do it so quickly that it looks the same as short-term memory.
- Accounts focusing on the ability to plan and reason: (p.31)
- Chess masters can play “mental chess,” keeping track of the progress of a game simply by being told the move sequence. “This research raises the possibility that acquisition of expert-level chess skill involves the development of skilled memory for chess positions.”
- “Charness (1981) found that the depth to which a possible move sequence for a chess position was explored was closely related to the level of chess skill, at least for chess players at or below the level of chess experts.” <but I think I remember reading that there wasn’t much difference between experts and masters, oh immediately they say that is what de Groot found.>
- “One should also keep in mind that the task of searching for a move for a middle-game chess position is not designed to measure the capacity to make deep searches and hence may well reflect pragmatic criteria for sufficient depth of exploration to evaluate a prospective move.”
- “In the absence of a strict time constraint, there appears to be no clear limit to the depth to which a chess master can explore a position.” <due to the ability to play mental chess perfectly>
- Abilities of chess masters to play mental chess “… was consistent with the characteristics of skilled-memory theory (Chase & Ericsson, 1982; Ericsson & Staszewski, 1989).”
- In medical diagnoses, doctors must integrate evidence, not all of which may be available at the same time
- “The most effective approach to organizing the results across different domains of expertise is to propose a small number of learning mechanisms that can account for the development of similar performance characteristics in different domains within the limits of human informational capabilities. There is now overwhelming empirical support for the theory of acquisition of skill with mechanisms akin to those originally proposed by Chase and Simon (1973).” Chase and Simon themselves claimed that their proposal was just a preliminary attempt at a theory.
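
As a rough illustration of the power law of practice noted above (Newell & Rosenbloom, 1981), here is a minimal sketch assuming the common form T(N) = a · N^(−b), where T is the time to do the task on trial N. The constants `initial_time` and `learning_rate` are made up for illustration and do not come from the book.

```python
def practice_time(trial, initial_time=60.0, learning_rate=0.4):
    """Time needed on a given trial, assuming T(N) = a * N**(-b).

    initial_time (a) and learning_rate (b) are illustrative values only.
    """
    if trial < 1:
        raise ValueError("trial numbers start at 1")
    return initial_time * trial ** (-learning_rate)


def gain_from_doubling(trial, **kwargs):
    """Absolute time saved by doubling the amount of practice from `trial`."""
    return practice_time(trial, **kwargs) - practice_time(2 * trial, **kwargs)


if __name__ == "__main__":
    # Each doubling of practice cuts time by the same *ratio*,
    # so the absolute saving keeps shrinking.
    for n in (1, 10, 100, 1000):
        print(n, round(practice_time(n), 2), round(gain_from_doubling(n), 2))
```

Under this form, every doubling of practice yields the same proportional speed-up, which is why the absolute gains become tiny after a lot of practice.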
Chapter 2: Experts in Chess: The Balance Between Knowledge and Search. Charness
- “Because of its unique properties – particularly its rating scale [Elo] and its method of recording games – chess offers cognitive psychologists an ideal task environment in which to study skilled performance. It has been called a Drosophila, or fruit fly, for cognitive psychology (Charness, 1989; Simon & Chase, 1973).”
- Here, what is considered is “… the opportunity for trading off knowledge and search to reach a single goal: skilled play.”
- Also considers how computer chess works
- Research on chess found that between experts and masters, search size was about the same, but recall/chunking efficiency (not # of chunks) was better in masters. The conclusion therefore was “… that chess skill depended on a large knowledge base indexed through thousands of familiar chess patterns. They theorized that recognition drives move generation in search, enabling the skilled player to examine promising paths, but leaving the less skilled to wander down less productive paths.” <Better heuristic accuracy>
- “Nonetheless, further research has revealed some apparent flaws in a strictly recognition-based theory. Other studies have brought into question the notion that recall of briefly seen chess positions would depend on the type of short-term memory system simulated by Simon and Gilmartin (1973).” Masters were still better at move selection for unnatural board configurations (even though their recall and that of experts was the same). This, along with a few other results showed “… a simple recognition-association theory was inadequate to account for all the data.”
- “Both I (Charness, 1976) and Frey and Adesman (1976) demonstrated that when chess players recalled briefly seen positions, information was not retrieved from short term memory. My study showed virtually no interference when players had to perform interpolated processing between exposure to the chess position and recall… Clearly a more sophisticated view of skilled memory, such as that proposed by Chase and Ericsson (1982), Ericsson (1985), and Ericsson and Staszewski (1989), is needed to account for recall effects. These theorists have stressed the importance of domain-specific, easily activated, long-term-memory retrieval structures in recall performance.”
- In a longitudinal study, Charness retested a player after a 9-year delay, during which the player went from average tournament-level strength to international master. “DH [the player] showed virtually no change in search (depth, extent), but did show major changes in recall, evaluation, and chunking… The major changes seemed to be pattern-related… the significant factor in skilled chess play at the top levels is what is searched, not how exhaustively or deeply the search is conducted.”
- Masters are less impacted by time pressure than weaker players
- There is also literature on abacus calculation (Hatano, Miyake, & Binks, 1977) <I know that those skilled with the abacus can also do “mental calculation” and can keep track of bead positions and changes fully in their head, just as chess masters can>
- A questionnaire (partially dealing with openings) is a better predictor of chess ability than the recall task
- <Lots of discussion of size of chess, number of openings, middle, and endgame knowledge, other aspects of metagame, learning from books as opposed to direct play>
- “Incidental” serial memory: good players can often recall large portions of a game right after the match, and masters can sometimes recall entire games from months or years earlier.
- Game trajectories can be encoded partially in terms of openings, closings, and other logical chunks
- “It is probably fair to characterize much of human learning as pattern learning. An unanswered question is that of whether certain patterns are easier to learn (and model) than others. Both psychometric investigations and neuro-psychological research provide evidence that all processing is not the same: Some people are better at spatial tasks; others at verbal tasks.”
Chapter 4: The General and Specific Nature of Medical Expertise: A Critical Look. Patel, Groen
- “Two fundamental empirical findings in research on expert-novice comparisons have been the phenomena of enhanced recall and forward reasoning. The first refers to the fact that experts have superior memory skills in recognizing patterns in their domain of expertise. This is extensively reviewed by Ericsson and Smith (chapter 1, this volume). The second pertains to the finding that in solving routine problems in their domains, expert problem-solvers tend to work ‘forward’ from the given information to the unknown. With the exception of Anzai’s study (chapter 3, this volume <on reasoning of physics problems, I didn’t have time to read>), this is not so extensively treated in this volume, but it has been discussed at length in a recent article by Hunt (1989)…”
- For details, see the Hunt (1989) paper <turns out forward and backward have different meanings than what I am used to, and the type of planning I am considering at the moment is actually the backward style, as defined here>
- “It might be noted that the distinction is frequently made, perhaps more generally, in terms of goal-based (backward) versus knowledge-based (forward) heuristic search (e.g. Hunt, 1989).”
- “The distinction between forward and backward reasoning is closely related to another distinction between strong problem-solving methods, which are highly constrained by the problem-solving environment, and weak methods, which are only minimally constrained. As Hunt pointed out, the distinctions are logically independent. Forward reasoning, however, is highly error-prone in the absence of adequate domain knowledge because there are no built-in checks in the legitimacy of the inferences. Therefore, success in using forward reasoning is constrained by the environment because a great deal of relevant knowledge is necessary. Hence, it is a strong method for all practical purposes. In contrast, backward reasoning is slower and may make heavy demands on working memory (because one has to keep track of things as goals and hypotheses). It is, therefore, most likely to be used when domain knowledge is inadequate, in which case there is a need for a method of reasoning that is minimally hampered by this lack of knowledge. Hence, backward reasoning usually is a symptom of a weak method.”
- Here the focus isn’t on differences between experts and novices, but rather “… an emphasis on the factors determining accurate performance and the robustness of the recall and forward-reasoning phenomena under variations of these factors… these phenomena are not as closely related as was implied by what Ericsson and Smith (chapter 1, this volume) refer to as the original theory. Specifically, there appears to be a ceiling effect associated with the recall of clinical cases. Beyond that level, however, there continues to be a strong relation between diagnostic accuracy and the use of forward reasoning.”
- Development from novice to expert is a 3 stage process:
- “… development of adequate knowledge-structure representations.”
- learning what is relevant and irrelevant in a problem
- “… learning how to use these relevant representations in an efficient fashion”
- The study presented data in a very structured (non-naturalistic) manner
- To identify forward reasoning, they used some graph representation <although exactly how isn’t totally clear>
- “Forward reasoning corresponds to an oriented path from a fact to a hypothesis. Thus, forward-directed rules are identified whenever a physician attempts to generate a hypothesis from the findings in a case. Backward-directed rules correspond to an oriented path from a hypothesis to a fact.”
- They then asked other experts for causal rules explaining each case, and transformed them into production rules.
- Experts and “subexperts” (the next level below, but above “intermediate” – in this case it meant asking doctors questions about a medical issue outside their specialization) had the same recall, although diagnostic accuracy decreased
- An earlier study which seems to form the basis of this chapter (Patel & Groen, 1986) found that all cases where pure forward reasoning was used corresponded to correct diagnoses, and that in any case where forward reasoning was not used
- Those working outside of their domain of expertise used a combination of forward and backward reasoning
- <skipping a bit>
- In the problems studied here, recall (as was studied by De Groot, among others) was not an accurate metric of performance due to ceiling effects (experts and subexperts both had perfect recall, although their actual performance in diagnosis differed). There is actually a nonmonotonic relationship between recall and accuracy in these studies (there were 5 levels of expertise)
- Previous studies assumed recall, diagnostic accuracy, and forward reasoning were all correlated. “Thus, a theory that simply assumes that the development of expertise is related to the development of better representations cannot be true.”
- The findings argue against a couple of theories:
- Argues that medical diagnosis is not simply pattern recognition
- Argues against the idea that rules cannot be structured into some kind of hierarchy
- “Both of these theories posit a close relationship between chunk size in working memory and performance in problem-solving tasks. Hence, they predict a monotonically increasing relationship between recall and diagnostic accuracy, which as we have seen, does not hold.”
- Results argue in favor of the SOAR model (Laird, Rosenbloom, Newell 1985) (seems to be a pretty GOFAI model). It has its own chunking system, and allows for forward and backward reasoning
- Argue for 3 kinds of expertise:
- Generic: development of adequate representations (for example experts and subexperts had the same recall)
- Specific: <not really clear on the point they are making here>
- Domain-independent: weak methods – used when there is not sufficient base information, and information must be searched for. “In contrast, strong methods are more akin to decision making than to search and are highly dependent on an adequate knowledge base.”
- In the studies on physics problems, there is good evidence that problem solving is a mixture of forward and backward reasoning. Forward reasoning is used on routine parts of problems, and backward reasoning on “nonroutine situations”
- That is, backward reasoning can be used to “stitch together” a logical argument in situations that are difficult somehow (either because of lack of expertise, or because the problem is just hard)
- Argue that this is a form of generic expertise: being able to identify relevant parts of a problem, discard the rest, and use backward reasoning where there is a lack of expertise in that exact domain. This is how doctors making diagnoses outside of their field of expertise function
- “Intermediates conduct irrelevant searches, whereas experts do not. Novices do not conduct irrelevant searches simply because they do not have a knowledge base to search.”
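
The forward/backward distinction in these notes (a path from fact to hypothesis versus a path from hypothesis to fact) can be sketched with toy production rules. This is only an illustration under stated assumptions: the rules and findings below are invented, are not medical knowledge, and are not the rule sets used by Patel and Groen.

```python
# Hypothetical toy rules: (set of required facts) -> conclusion.
RULES = [
    ({"fever", "cough"}, "respiratory_infection"),
    ({"respiratory_infection", "chest_crackles"}, "pneumonia"),
]


def forward_chain(facts, rules):
    """Data-driven: keep firing rules from known facts until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, _pending=frozenset()):
    """Goal-driven: start from a hypothesis and look for facts that support it."""
    if goal in facts:
        return True
    if goal in _pending:  # already trying to prove this goal; avoid cycles
        return False
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(c, facts, rules, _pending | {goal}) for c in conditions
        ):
            return True
    return False


if __name__ == "__main__":
    findings = {"fever", "cough", "chest_crackles"}
    print(sorted(forward_chain(findings, RULES)))
    print(backward_chain("pneumonia", findings, RULES))
```

Note how the backward version has to carry the stack of open goals (`_pending`), which loosely mirrors the quoted point that backward reasoning makes heavier demands on working memory.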
Chapter 10: Techniques for Representing Expert Knowledge
<Lots of the stuff here falls under categories of classical AI, linear algebra dimension reduction, hierarchical clustering, just making a concrete note about one item of interest>
- “Indeed, some of the continuing research themes have to do with how the organization of concepts for an expert differs from that for a novice…”
- The nature of the problem requires asking questions about particular small, testable aspects of the task of interest (such as recalling chess positions, as opposed to playing through games of chess)
- Major issue is how to elicit and then describe expertise
- There are both direct methods (interviews, thinking out loud, observation of task performance, closed curves <see below>) and indirect methods (such as giving pairwise similarities and running them through MDS or hierarchical clustering)
![IMG_20140905_141807640~2[1]](https://aresearch.files.wordpress.com/2014/09/img_20140905_14180764021-e1409941222216.jpg?w=318&h=178)
- “Reitman (1976) asked a master of the game of Go to draw closed curves around related stones involved in a position in the game. Figure 10.4 illustrates several aspects of his responses. Two positions are displayed, with the master’s encircling of related stones. In addition, each stone bears a number that represents the ordinal position in which that stone was placed on the board in a recall task six months later <!>. Note that the recall order matches the closed curves to a remarkable degree: Nearly always, all stones of an encircled chunk were recalled before moving on to another chunk. This regularity of behavior supports claims for the validity of the information contained in the originally closed curves.”
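
The regularity described in the quote above (all stones of an encircled chunk recalled before moving on to another chunk) is something one could check mechanically. A minimal sketch, assuming the data is given as a recall order plus a stone-to-chunk mapping; the stone names and chunk labels here are invented, not taken from Figure 10.4.

```python
def chunk_switches(recall_order, chunk_of):
    """Count how often two consecutively recalled stones come from different chunks."""
    labels = [chunk_of[stone] for stone in recall_order]
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def recalled_chunk_by_chunk(recall_order, chunk_of):
    """True if every chunk was recalled as one uninterrupted run of stones."""
    n_chunks = len({chunk_of[stone] for stone in recall_order})
    # A perfectly chunked recall switches chunks exactly (n_chunks - 1) times.
    return chunk_switches(recall_order, chunk_of) == max(n_chunks - 1, 0)


if __name__ == "__main__":
    # Hypothetical stones and the closed curve (chunk) each one was drawn into.
    chunk_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
    print(recalled_chunk_by_chunk(["a1", "a2", "b1", "b2", "c1"], chunk_of))
    print(recalled_chunk_by_chunk(["a1", "b1", "a2", "b2", "c1"], chunk_of))
```

The number of extra switches beyond the minimum gives a crude measure of how far a recall order departs from the drawn chunks.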