The network properties of episodic graphs. Zhuang, Sreekumar, Belkin, Dennis. 2012? Poster?

  1. “We present statistical analyses of the small world properties for two particular types of episodic graphs. One is from the paragraph space of the Internet Movie Database (IMDb) and the other is from images collected as subjects engaged in their activities of daily living. We show that they have a small-world structure which is characterized by sparse connectivity, short average path lengths between nodes, and high global clustering coefficient. However, the degree distribution analyses show that they are not scale-free graphs.”
  2. Deals with models for episodic memory – how are context cues used?
  3. One idea is that memory forms a network and cues help guide search
  4. As an approximation, they analyze the graph structure of gossip stories on IMDb as well as images from SenseCam activity-logging cameras
  5. Seems like they use very simple bag of words and color histogram to measure closeness <I should look at this more carefully later>
    1. <Not clear on how they then picked neighbors but may be stochastic transitions with weights equal to similarities>
    2. <If I understand this methodology, I don’t think it’s what I would use>
  6. Heavily pruned edges; at least the IMDb dataset looks pretty similar to a random graph if pruning is not done
  7. High global clustering and short path lengths
    1. Small world structure, but no scale free property
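
The two small-world measures in these notes (global clustering and average path length) and the effect of edge pruning can be sketched on a toy graph. This is a minimal illustration, not the authors' pipeline: the similarity values, the threshold, and the function names are all made up here, and thresholding is only one assumed way of pruning edges.

```python
from collections import deque
from itertools import combinations


def build_graph(similarity, threshold):
    """Keep an undirected edge only where similarity exceeds the threshold."""
    nodes = sorted({n for pair in similarity for n in pair})
    adj = {n: set() for n in nodes}
    for (a, b), s in similarity.items():
        if s > threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def global_clustering(adj):
    """Transitivity: closed connected triples / all connected triples."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        for u, v in combinations(sorted(nbrs), 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0


def average_path_length(adj):
    """Mean BFS shortest-path length over all reachable ordered pairs."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy pairwise similarities between five "episodes" (made-up numbers).
similarity = {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.7, ("c", "d"): 0.6,
    ("d", "e"): 0.9, ("a", "e"): 0.1, ("b", "d"): 0.2, ("a", "d"): 0.1,
    ("b", "e"): 0.1, ("c", "e"): 0.2,
}

dense = build_graph(similarity, 0.0)   # no pruning: every pair is connected
pruned = build_graph(similarity, 0.5)  # pruning: only strong edges survive

print(global_clustering(dense), average_path_length(dense))
print(global_clustering(pruned), average_path_length(pruned))
```

Without pruning the toy graph is complete, so both measures are trivially 1; after pruning, clustering stays fairly high while paths get longer, which is the regime where the small-world question becomes meaningful.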

What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision. Malmaud, Huang, Rathod, Johnston, Rabinovich, Murphy. (NAACL?) 2015

Going off the Arxiv version

  1. “We present a novel method for aligning a sequence of instructions to a video of someone carrying out a task. In particular, we focus on the cooking domain, where the instructions correspond to the recipe. Our technique relies on an HMM to align the recipe steps to the (automatically generated) speech transcript. We then refine this alignment using a state-of-the-art visual food detector, based on a deep convolutional neural network. We show that our technique outperforms simpler techniques based on keyword spotting.”
  2. Most large knowledge bases are based on declarative facts like “Barack Obama was born in Hawaii”, but lack procedural information
  3. This is a complex problem and can involve many different types of media and information, but here they focus on aligning video to text
    1. They work on video, with the text being the recipe that the user uploaded with the video
  4. Instructional videos are a good place to start on these types of problems because they often are heavily annotated with speech in the video
  5. Process is as follows:
    1. Align instructional steps (recipe) with speech via HMM
    2. Refine this alignment using computer vision
  6. They “create a large corpus of 180k aligned recipe-video pairs, and an even larger corpus of 1.4M short video clips, each labeled with a cooking action and a noun phrase. We evaluate the quality of our corpus using human raters. Third, we show how we can use our methods to support applications such as within-video search and recipe auto-illustration.”
  7. Worked from a corpus of 180k videos (started from 7.5 mil and worked their way down)
  8. Separate text into 3 classes (very accurately, with just naive Bayes and bag of words): recipe, ingredient, non-recipe
  9. Use an in-house NLP processor similar to the Stanford parser
  10. Video annotation (the speech transcript) is provided automatically by YouTube, and the NLP processor is then applied to it
    1. This data isn’t high quality, and the system works better when a real transcript is provided, but using automatic transcripts gets them more data
  11. HMM has #states = #steps in recipe (you can only move forward in this HMM)
    1. Figuring out which parts of the dialogue are “non-recipe” is important for this and helps prevent premature transitions
  12. A simpler method is “keyword spotting”: look for verbs in the transcript, take a window around each occurrence, and look for simple noun/verb combinations
  13. They use both of the previous techniques together: HMM+keyword spotting
  14. Sometimes people verbally describe what they are going to do before they do it. To handle this, they use image recognition within a window (+/- seconds) around the point where an object was mentioned to figure out where the annotation should be aligned in the actual video
    1. Trained their own vision food-detector
  15. They downsample images to about the same size as OverFeat uses (~220×220)
  16. When asking actual people on MTurk, the hybrid HMM+keyword spotting method was rated as best
  17. <Here they are working from English language to a syntax tree and then doing alignment. I wonder if you can do something similar with motion primitives, which have also been used to learn generative grammars, to do an alignment?>
  18. Other related work
  19. Methods for making simple subject-verb-object-place sentences from video
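
The forward-only HMM alignment described above (one state per recipe step, transitions only to the same or the next step) can be sketched with Viterbi decoding. This is a minimal sketch under stated assumptions, not the paper's model: the word-overlap emission score, the equal stay/advance probabilities, the example recipe, and all function names are hypothetical, and the paper's handling of “non-recipe” speech is left out.

```python
import math


def tokenize(text):
    return set(text.lower().split())


def emission_logprob(step, segment, smoothing=0.1):
    """Hypothetical emission score: smoothed word overlap of step and segment."""
    overlap = len(tokenize(step) & tokenize(segment))
    return math.log(overlap + smoothing)


def align(steps, segments, stay_logprob=math.log(0.5), advance_logprob=math.log(0.5)):
    """Viterbi decoding over a forward-only chain: from step i go to i or i + 1."""
    n_steps, n_segs = len(steps), len(segments)
    neg_inf = float("-inf")
    score = [[neg_inf] * n_steps for _ in range(n_segs)]
    back = [[0] * n_steps for _ in range(n_segs)]
    score[0][0] = emission_logprob(steps[0], segments[0])  # must start at step 0
    for t in range(1, n_segs):
        for s in range(n_steps):
            stay = score[t - 1][s] + stay_logprob
            advance = score[t - 1][s - 1] + advance_logprob if s > 0 else neg_inf
            if advance > stay:
                best, back[t][s] = advance, s - 1
            else:
                best, back[t][s] = stay, s
            if best > neg_inf:
                score[t][s] = best + emission_logprob(steps[s], segments[t])
    # Trace back from the best-scoring final state.
    state = max(range(n_steps), key=lambda s: score[-1][s])
    path = [state]
    for t in range(n_segs - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]


steps = ["chop the onion", "fry the onion in butter", "add the rice and stir"]
segments = [
    "first chop one onion",
    "chop it finely",
    "now fry it in butter",
    "add rice",
    "and stir well",
]
print(align(steps, segments))  # step index assigned to each transcript segment
```

Because the chain can never move backward, a segment that weakly matches an earlier step cannot pull the alignment back once a later step has been entered, which is the property the notes highlight.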

Different Spatial Scales of Shape Similarity Representation in Lateral and Ventral LOC. Drucker, Aguirre. Cerebral Cortex 2009

Uses the same stimulus as earlier study of Fourier basis and shape

  1. “The results, indicating a coarse spatial coding of shape features in lateral LOC and a more focused coding of the entire shape space within ventral LOC, may be related to hierarchical models of object processing.”
  2. Small regions of cortex respond to different classes of images, and furthermore “… small regions of cortex contain populations capable of representing the entire space of images in a category”
  3. “A counterpoint to this apparent specialization has been the demonstration that information regarding object category is also contained in the distributed pattern of voxel responses across and between these specialized regions (…).”
  4. The results show there are also coarse-scale representations. “This type of representation might correspond to the ‘‘chorus of fragments’’ model of Edelman and Intrator (1997), where individual properties of objects are represented by separate neural populations.”
  5. “Our focus here is upon the representation of variations in stimulus identity within a simplified object category… In practice, the structure of a parameterized space of shapes can be recovered from human behavioral responses (e.g., reaction times or similarity judgments) …”
    1. This similarity may also be reflected in neural patterns, which is what they check out here
  6. “Does a similar system of neural representation exist within human visual cortex? The human lateral occipital complex (LOC) shows similar functional properties to those previously ascribed to IT structures in the macaque. This region responds more strongly when a viewer is presented with images of parseable objects, as opposed to images that have no 2- or 3- dimensional interpretation, and appears largely indifferent to the method of object perception, for example, objects may be defined by luminance, texture, motion, or stereo difference”
    1. For IT and macaque, see this
    2. Ventral LOC is also called posterior fusiform sulcus (pFS)
  7. “Two recent studies have demonstrated a relationship between the perceptual similarity and the distributed pattern of neural activity in LOC (Op de Beeck et al. 2008; Haushofer et al. 2008),”
  8. During fMRI scanning, subjects viewed 16 different shapes defined by radial frequency components (RFCs; a series of sine waves of various frequencies describing perturbations from a circle; Zahn and Roskies 1972; Fig. 1)

    1. Idea of RFCs actually being used in shape recognition was eventually “experimentally rejected” but it makes convenient stimuli, also totally abstracted with no categorical boundaries
  9. Shapes were modified by altering amplitude and phase of a particular frequency component
  10. Here adaptation to shape (which relies on neural habituation) is studied at the neural level
  11. “We asked in this study if the degree of recovery from neural habituation at different cortical sites was proportional to the transition in similarity between 2 stimuli.”
  12. “In this study, we investigated if the distributed pattern of response can inform as to the identity of stimulus variation within an object category;”
  13. “Continuous Neural Adaptation in Ventral LOC Is Proportional to Shape Similarity”
    1. No adaptation effects found in lateral LOC
    2. Magnitude of the change in shape matched linearly with the change in ventral LOC response
  14. “An alternative explanation for the proportional recovery from adaptation in ventral LOC is that the extreme stimuli (those from the corners of the stimulus space) may evoke a larger neural response generally (e.g., Kayaert et al. 2005). As the larger distance stimulus transitions tend to include these extreme stimuli to a greater extent, perhaps the apparent recovery from adaptation is actually a larger response to these extreme stimuli independent of an adaptation effect.”

    1. This is not the case, however, as the results show that “the proportional recovery from adaptation seen in ventral LOC indicates the presence of a population code for stimulus shape and cannot be attributed to a generally greater neural response to extreme stimuli.”
  15. “Distributed Pattern Responses Distinguish between Shapes”
  16. Use SVMs to analyze data at coarse spatial level, which worked well
  17. “The accuracy of the SVM analysis and the identified patch within lateral LOC indicates that the distributed voxel pattern of activity in that area carries information about shape. However, the pattern difference between shapes need not reflect the similarity of the stimuli or indeed have any particular structure. The SVM requires only that patterns be different in order to distinguish them—no assumptions about similarity structure are made or used.”
  18. “Within lateral LOC, the strongly discriminant responses seen in the SVM analysis were found to also reflect stimulus similarity consistently across subjects (t4 = 10.0, P = 0.001). In contrast, the distributed pattern of response in ventral LOC had a weaker correlation with the perceptual similarity of the stimuli (t4 = 1.2, P = 0.3) (Fig. 4A). The difference between these subregions of area LOC was significant (t4 = 11.4, P = 0.0003).”

    1. Mixed evidence for this being attributable strictly to retinotopic similarity
  19. “The RFC-Amplitude and RFC-Phase Axes Are Differentially Represented at Coarse and Fine Neural Scales”
  20. “Although the distributed neural similarity matrix measured from lateral LOC was strongly correlated with the stimulus similarity matrix, there appeared to be aspects of the structure of the neural response not evident in the stimulus matrix”
  21. Earlier studies on these shapes found phase and amplitude to be perceived as orthogonal and equally important, but that wasn’t completely replicated here. Results here say the dimensions are “equally perceptually salient”, but that they are not perceived equivalently
  22. “…both aspects of the stimulus space [amplitude, phase] are represented by the within-voxel population code within ventral LOC… A rather different result was observed for the distributed pattern of response within lateral LOC. There, the distributed pattern across subjects reflected the shapes primarily in terms of RFC-amplitude but not RFC-phase”
  23. “For example, clusters of neurons might represent the tightness of the ‘‘knobs’’ of the shapes (defined by RFC-amplitude) independent of the direction that those knobs point within the overall shape (defined by RFC-phase). RFC-amplitude and RFC-phase may be taken as similar to ‘‘feature’’ and ‘‘envelope’’ parameters of Op de Beeck et al. (2008), respectively; we thus contribute a similar finding in that features are represented in the distributed pattern in lateral LOC much more reliably than the overall shape envelope.”
  24. “Based upon the differential sensitivity to shape identity for the adaptation and distributed pattern methods, we argue that although both the lateral and ventral components of area LOC contain neural population codes for shape, the spatial scale of these representations differ. Specifically, the absence of a distributed pattern effect within ventral LOC is evidence for a homogeneous representation of the shape space, such that the average response of any one voxel does not differentiate between the shapes, whereas the presence of a distributed code and the absence of an adaptation effect in lateral LOC suggests that there is a heterogenous distribution of shape representation,”
  25. “Within ventral LOC, no meaningful tuning for the shape space can be identified: The amplitude of the response is no different for different shapes. This indicates that ventral LOC voxels are broadly tuned for shape identity. In contrast, lateral LOC voxels show relatively narrow tuning: there is a progressive decline in the response of a voxel for shapes more distant from the shape for which the voxel is best tuned (which was frequently a stimulus from the edges of the stimulus space). Moreover, lateral LOC voxels appear more narrowly tuned for the RFC-amplitude, as compared with the RFC-phase dimension of the shape space, consistent with our previous observation”
  26. “The narrow tuning observed in lateral LOC may also explain the absence of a linear adaptation response in this region to transitions in shape space. If a given voxel is narrowly tuned to a particular region of the shape space, then it may only show recovery from adaptation for stimulus transitions within its tuned area.”
  27. <Discussion>
  28. “By using a continuous carryover design, our study was capable of examining neural similarity both on a coarse, across-voxel scale by distributed pattern analysis, as well as on a fine, within-voxel scale using continuous neural adaptation. We can thus compare the information provided at distributed and focal levels.”
  29. “Unlike ventral LOC, the lateral portion of LOC did not show adaptation responses that were linearly related to shape similarity. We found that the narrow tuning of lateral LOC voxels could explain this finding, indicating that each particular voxel has a population of neurons that are tuned to one specific region of the shape space. Consequently, most of the transitions between stimuli would not induce neural adaptation within the voxel as they would be transitions between stimuli not within the voxel’s receptive field.”
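
The distributed-pattern logic in these notes (does the across-voxel pattern distance between two shapes track their distance in stimulus space, and along which axis?) can be illustrated with a toy calculation. Everything here is an assumption for illustration: the 4 x 4 amplitude-by-phase grid, the made-up voxel tuning (Gaussian over amplitude only, to mimic the lateral LOC result), and the use of a plain Pearson correlation over pairwise distances, which may differ from the statistic the paper used.

```python
import math
from itertools import combinations


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Assumed stimulus space: 4 amplitude steps x 4 phase steps = 16 shapes.
stimuli = [(a, p) for a in range(4) for p in range(4)]


def toy_pattern(amplitude, phase, n_voxels=8):
    """Made-up voxel pattern: each voxel is tuned to amplitude and ignores phase."""
    return [
        math.exp(-((amplitude - 3 * v / (n_voxels - 1)) ** 2))
        for v in range(n_voxels)
    ]


patterns = [toy_pattern(a, p) for a, p in stimuli]
pairs = list(combinations(range(len(stimuli)), 2))

# Pairwise distances between voxel patterns and along each stimulus axis.
neural_dist = [euclidean(patterns[i], patterns[j]) for i, j in pairs]
amp_dist = [abs(stimuli[i][0] - stimuli[j][0]) for i, j in pairs]
phase_dist = [abs(stimuli[i][1] - stimuli[j][1]) for i, j in pairs]

r_amp = pearson(neural_dist, amp_dist)
r_phase = pearson(neural_dist, phase_dist)
print(round(r_amp, 3), round(r_phase, 3))
```

In this toy setup the pattern distances correlate strongly with amplitude distance and not with phase distance, which is the qualitative signature the notes attribute to lateral LOC. An SVM, by contrast, would only need the patterns to differ, not to be ordered this way.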

Perceptual Similarity of Shapes Generated from Fourier Descriptors. Cortese, Dyre. Journal of Experimental Psychology. 1996.

  1. “A metric representation of shape is preserved by a Fourier analysis of the cumulative angular bend of a shape’s contour. Three experiments examined the relationship between variation in Fourier descriptors and judgments of perceptual shape similarity. Multidimensional scaling of similarity judgments resulted in highly ordered solutions for matrices of shapes generated by a Fourier synthesis of a few frequencies. Multiple regression analyses indicated that particular Fourier components best accounted for the recovered dimensions. In addition, variations in the amplitude and the phase of a given frequency, as well as the amplitudes of 2 different frequencies, produced independent effects on perceptual similarity. These results suggest that a Fourier representation is consistent with the perceptual similarity of shapes, at least for the relatively low-dimensional Fourier shapes considered.”
  2. Although many things are useful for object recognition (color, texture, etc) earlier work shows outline (contour) shape being the most important
  3. Mention approach for shape representation as having an alphabet of shape-piece prototypes that are then assembled – can be represented hierarchically or spatially in some other manner.
    1. Pinker was a proponent of this
  4. But there hasn’t been any real traction on this from a practical sense, as “The difficulty lies in representing the infinite variety of shapes with a small set of primitives. Typically, the parts are distinguished only by qualitative differences in shape.”
    1. Idea of geons, codons, but they don’t deal with metric variations, which seem important
    2. Marr also had idea of some form of decomposition
  5. An alternative is to use a system that doesn’t involve parsing an object into parts
    1. Fourier descriptors is one system from computer vision
  6. “In this method, given an arbitrary starting point on a closed contour, the function relating cumulative arc length to local contour orientation is expanded in a Fourier series”
    1. Has some nice properties, including that global shape characteristics can be determined just by the first few low-frequency terms; also it’s basically invariant to starting point
  7. Fourier descriptors were used in early computer vision and have been considered in biological vision as well
  8. One study found “…that approximately half of the visually responsive neurons in the inferior temporal cortex were selectively tuned to the frequency of FD stimuli”
    1. “… all frequencies were about equally represented, except for a reduced incidence of the frequency 64 cycles per perimeter.” Fits weren’t quite linear but were still good
  9. “In the present experiments, we tested this prediction [that FDs are related to categorization] by obtaining ratings of perceived shape similarity and subjecting them to multidimensional scaling”
  10. “…if, on the one hand, perceived shape similarity is related to variation in the amplitude and phase parameters of the contour, then vectors representing these Fourier components should account for the dimensions of the recovered similarity space. If, on the other hand, qualitative
    stimulus attributes are used to represent shape (e.g., smoothness, number of parts, or orientation), then vectors representing these qualities should account for the majority of variability in similarity judgments. For this reason, we also obtained ratings on a number of unidimensional scales representing qualitative aspects of the stimuli.”
  11. “In Experiment 1, we varied the amplitude and the phase of a single FD frequency. A Fourier representation of shape would predict that the perceptual similarity space should reflect variation of these two parameters. Also, because of the independence of amplitude and phase in a Fourier representation, we made an additional prediction: The amplitude and the phase of a given FD frequency should show independent effects on perceived similarity.”
  12. <Figure from the paper; screenshot not preserved>
  13. Participants were shown 45 pairs, and were told to rate them for similarity on a numeric scale, and then after that they rated each shape on 7 independent numeric scales (width, straightness, smoothness, # of parts, complexity, symmetry, orientation) – these criteria were intended to be alternatives for doing classification
  14. MDS using Euclidean distance on the similarity ratings – there was a sharp elbow with 2 dimensions
  15. <Figure from the paper; screenshot not preserved>
  16. This reproduces almost exactly the earlier figure (just rotated and flipped), “….which suggests that perceived dissimilarity is monotonically related to distance in a 2-D Euclidean space with, in this case, amplitude and phase as the two dimensions. Indeed, the relationship between distances in this space and perceived dissimilarities may be linear: A linear multidimensional-scaling analysis produced a 2-D solution with virtually the same pattern as that for the monotonic analysis”
  17. “…that the phase and the amplitude of Frequency 6 accounted for more variability in the judgments of similarity than did any of the unidimensional scales, with the exception of smoothness.”
  18. Experiment 2

  19. “Fourier theory also predicts another pattern of effects on similarity judgments: the independence of amplitude values at different frequencies. The purpose of Experiment 2 was to test this prediction…”
  20. Based on MDS “it appears that the perceived dissimilarities of these shapes are monotonically related to distance in a Fourier space, with amplitude of frequency 2 and amplitude of frequency 4 as the two dimensions”
  21. “Fitted vectors for the amplitudes of the two frequency components were found to be orthogonal (angular difference = 88.8°), suggesting that there were independent perceptual effects of variations in amplitude on two different frequencies. This observation, along with the observed independence of amplitude and phase in Experiment 1, is consistent with a representation of shape based on FDs.”
  22. Variation in amplitude of freq 2 was highly correlated with judgements of “width” and freq 4 was with “smoothness”
  23. Experiment 3

  24. “Experiment 3 tested the effects of variation in the phases of two different frequencies on judgments of similarity. As in Experiments 1 and 2, this was an investigation of the perceptual effects of variation in two parameters of the Fourier expansion. However, unlike the previous experiments, the parameters manipulated in this experiment did not exhibit independent effects on the shape of the contour, because the relative phases, and not the absolute phases, determined the shape”
  25. Here stimuli were constructed from frequencies of 4, 6, and 8 cycles/perimeter, with amplitudes held constant.
    1. Phases of frequencies 6 and 8 were varied independently (need an extra frequency of 4 around for the comparison to work)
  26. Here the stress plot from MDS didn’t have a clear elbow, but plotting with 2 dimensions put the items in a ‘U’ shape (a one-dimensional curve), implying a one-dimensional solution – the two phases are dependent
  27. “This relationship, taken together with the results of Experiments 1 and 2, which found a significant relationship between number of parts and amplitude of frequencies 4 and 6, suggests that object parsing may be related to the amplitude and the relative phase of frequencies in this range (4 to 8 cycles per perimeter).”
  28. “Of particular importance was the evidence found for independent perceptual effects for variations of amplitude and phase on a single frequency and for variations of amplitudes on two different frequencies. Both of these results predicted by a Fourier theory.”
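
To make the stimulus construction concrete, a contour can be synthesized from Fourier descriptors by treating the tangent direction as a function of arc length and integrating it. This is a rough sketch under assumptions rather than the paper's procedure: it uses one common convention (direction = t + sum of A cos(k t - phase), so zero amplitudes give a unit circle), sign and normalization conventions vary between authors, and the amplitude and phase values in the grid are hypothetical. Not every combination of components yields a closed contour.

```python
import math


def fd_contour(components, n_points=360):
    """Trace a contour whose tangent direction is t + sum of A * cos(k * t - phase).

    components: list of (frequency k, amplitude A, phase in radians).
    Returns n_points + 1 (x, y) points; the perimeter is fixed at 2 * pi.
    """
    step = 2 * math.pi / n_points
    x, y = 0.0, 0.0
    points = [(x, y)]
    for i in range(n_points):
        t = (i + 0.5) * step  # midpoint of this arc-length segment
        bend = sum(a * math.cos(k * t - ph) for k, a, ph in components)
        direction = t + bend
        x += math.cos(direction) * step
        y += math.sin(direction) * step
        points.append((x, y))
    return points


def closure_gap(points):
    """Distance between the first and last point (0 for a closed contour)."""
    return math.dist(points[0], points[-1])


circle = fd_contour([])                # no components: a unit circle
lobed = fd_contour([(6, 0.5, 0.0)])    # single frequency of 6 cycles/perimeter

# Experiment-1-style grid: vary amplitude and phase of one frequency.
# The specific values below are hypothetical.
grid = {
    (amp, phase): fd_contour([(6, amp, phase)])
    for amp in (0.2, 0.4, 0.6)
    for phase in (0.0, math.pi / 2, math.pi)
}
print(closure_gap(circle), closure_gap(lobed), len(grid))
```

Changing the amplitude makes the lobes more or less pronounced, while changing the phase of a single frequency only shifts where the lobes sit along the contour.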

Effects of Task Demands on the Responses of Color-Selective Neurons in the Inferior Temporal Cortex. Koida, Komatsu. NatNeuro 2007

  1. “Categorization and fine discrimination are two different functions in visual perception, and we can switch between these two functions depending on the situation or task demands. To explore how visual cortical neurons behave in such situations, we recorded the activities of color-selective neurons in the inferior temporal (IT) cortex of two monkeys trained to perform a color categorization task, a color discrimination task and a simple fixation task. Many IT neurons changed their activity depending upon the task, although color selectivity was well conserved. A majority of neurons showed stronger responses during the categorization task. Moreover, for the population of IT neurons as a whole, signals contributing to performing the categorization task were enhanced. These results imply that judgment of color category by color-selective IT neurons is facilitated during the categorization task and suppressed during the discrimination task as a consequence of task-dependent modulation of their activities.”
  2. “On the one hand, we are able to discriminate subtle differences in color; on the other hand, we often categorize similar colors into a single group, such as ‘red’ or ‘green’. How we utilize each of these two functions depends upon the demands of the situation or task: in some situations we respond similarly to colors A and B (categorization); in other situations we respond differently to the two colors (discrimination).”
    1. <Discrimination isn’t the opposite of categorization – it can be thought of as categorization where two items are simply placed in different categories…>
    2. Task switching happens in PFC, and there are neurons there that encode task rules, and change responses as soon as tasks change
  3. Question is whether neuronal activity in sensory cortices changes with changes in task rules
  4. “Many studies have shown that attention to specific locations or visual features modulates the activities of visual cortical neurons. However, it is generally believed that stimulus coding and neuronal responses within visual cortical areas are stable with respect to changes in a task rule, though few studies have directly examined this problem.”
  5. Here they examine whether color-selective neurons in inferior temporal (IT) cortex change their responses based on task. In one case monkeys have to make a categorical judgement, and in the other a fine discrimination of the same color stimulus
  6. “The IT cortex lies at the final stage of the ventral visual pathway, and it has strong mutual connections with the PFC. Single-unit recordings have shown that many color-selective neurons exist within the IT cortex…”
  7. Three tasks, all of which present a colored stimulus that the monkey fixated on:
    1. Categorization: classify color as closer to red or green, either by maintaining fixation or saccading
    2. Discrimination: choose which of two choice stimuli was the same color as the sample stimulus
    3. Passive viewing
    4. <Not yet clear how the discrimination worked; was it with a saccade to one of the two colors? If so, it seems it would have been better to have the categorization task also involve a saccade to one of two locations instead of either fixation or saccade>
  8. “We found that the activity of many IT neurons differed depending upon the task, although color selectivity was well conserved. For the population of IT neurons as a whole, color signals differentiating red versus green were enhanced during the categorization task. This suggests that judgment of color category by color-selective IT neurons was facilitated during the categorization task and suppressed during the discrimination task as a consequence of task dependent modulation of these neurons. These results suggest that the flow of color information from the IT cortex is strongly controlled by top-down signals representing the ongoing task rule presumably sent from the PFC.”

Categorical Clustering of the Neural Representation of Color. Brouwer, Heeger. JNeuro 2013.

  1. fMRI study where subjects viewed 12 colors while doing either a color-naming or a distractor task
  2. “A forward model was used to extract lower dimensional neural color spaces from the high-dimensional fMRI responses.”
  3. Visual areas V4v and VO1 showed clustering for the color-naming task but not for the distractor task
  4. “Response amplitudes and signal-to-noise ratios were higher in most visual cortical areas for color naming compared to diverted attention. But only in V4v and VO1 did the cortical representation of color change to a categorical color space”
  5. We can perceive thousands of colors but have only a handful of descriptive categories for colors, so we can see two different colors but would still potentially call it the same thing
  6. Inferotemporal cortex (IT) is believed to deal with categorization of color
  7. “…performing a categorization task alters the responses of individual color-selective neurons in macaque IT (…).”
  8. Similar colors cause overlapping patterns of neural activity, “… neural representations of color can be characterized by low-dimensional ‘neural color spaces’…”
  9. “Activity in visual cortex depends on task demands (…).”
    1. Use fMRI to study this
  10. “Forward model” is used to reduce fMRI signals to a lower dimensional space of bases
  11. “Normal color vision was verified by use of the Ishihara plates (Ishihara, 1917) and a computerized version of the Farnsworth–Munsell 100 hue scoring test (Farnsworth, 1957).”
    1. <Need to learn about this>
  12. “The 12 stimulus colors were defined in DKL (Derrington, Krauskopf and Lennie) color space… We chose the DKL space because it represents a logical starting point to investigate the neural representation of color in visual cortex. Although there is evidence for additional higher-order color mechanisms in visual cortex (Krauskopf et al., 1986), the color tuning of neurons in V1 can be approximated by linear weighted sums of the two chromatic axes of DKL color space (Lennie et al., 1990).”
  13. Color categorization task was done outside the scanner, involved putting 64 colors into one of 5 categories
  14. When in the fMRI scanner, there were two types of episodes. In one, subjects had to press one of 5 buttons to categorize the color (R, G, B, Y, or purple). The distractor task was a 2-back test (is the color the same as the color 2 steps ago)
  15. <details on fMRI processing>
  16. Used the forward model from this paper.
  17. “We characterized the color selectivity of each neuron as a weighted sum of six hypothetical channels, each with an idealized color tuning curve (or basis function) such that the transformation from stimulus color to channel outputs was one to one and invertible. Each basis function was a half-wave-rectified and squared sinusoid in DKL color space.”
  18. Assume voxel response is proportional to the number of responding neurons in that voxel
  19. Channel responses are collected in a matrix C (n x c, where n is the number of colors and c = 6 is the number of channels). They then did PCA on this to “extract neural color spaces from the high-dimensional space of voxel responses(…)”
  20. “According to the model, each color produces a unique pattern of responses in the channels, represented by a point in the six-dimensional channel space.  By fitting voxel responses to the forward model, we projected the voxel responses into this six dimensional subspace.”
    1. PCA <A competing method they previously used to do this analysis> did not work as well – had similar results but more variability because it tries to fit noise where the forward model throws it out
  21. To visualize the forward model, they ran PCA to project the 6D space to 2D (these 2 dimensions accounted for almost all the variance)
  22. “Reanalysis of the current data using PCA to reduce dimensionality directly from the number of voxels to two also yielded two-dimensional neural color spaces that were similar to those published previously. Specifically, the neural color spaces from areas V4v and VO1 were close to circular, whereas the neural color spaces of the remaining areas (including V1) were not circular, replicating our previously published results and supporting the previously published conclusions (Brouwer and Heeger, 2009).”
  23. Used many different clustering methods to see if colors labeled in the same color category had a more similar response than those in other categories
  24. On to results
  25. Subjects were pretty consistent where they put color class boundaries.  Blue and green were the most stable
  26. Subjects weren’t told category labels–basically that they were doing clustering–but still categories were intuitively identifiable and pretty stable
  27. Color clustering was strongest in VO1 and V4v, during the color-naming task. Responses from neighboring area V3 were more smoothly circular and therefore not as good at clustering.
  28. <Figure from the paper; screenshot not preserved>
  29. “The categorical clustering indices were significantly larger for color naming than diverted attention in all but one (V2) visual area (p < 0.001, nonparametric randomization test), but the difference between color naming and diverted attention was significantly greater in VO1 relative to the other visual areas (p < 0.01, nonparametric randomization test). One possibility is that all visual areas exhibited clustering of within-category colors, but that the categorical clustering indices were low in visual areas with fewer color-selective neurons, i.e., due to a lack of statistical power”
  30. “… no visual area exhibited categorical clustering significantly greater than baseline for the diverted attention task.”
  31. Manual clustering done by subjects matched the clustering derived from the neural data, except that in the neural data turquoise/cyan grouped with the blues, whereas people grouped it with the greens
  32. “Hierarchical clustering in areas V4v and VO1 resembled the perceptual hierarchy of color categories”
    1. In VO1 when doing color naming.
    2. The dendrogram resulting from the distractor task looks pretty much like garbage
  33. <Shame on the editor.  Use SNR without defining the abbreviation – I assume it's signal-to-noise ratio?>
  34. “Decoding accuracies from the current data set were similar; forward-model decoding and maximum-likelihood decoding were nearly indistinguishable.”
  35. <Between this and the similarity of the result of PCA, what does their forward model buy you?  Is it good because it matches results and is *less* general?>
  36. “… we propose that some visual areas (e.g., V4v and VO1) implement an additional color-specific change in gain, such that the gain of each neuron changes as a function of its selectivity relative to the centers of the color categories (Fig. 8C). Specifically, neurons tuned to a color near the center of a color category are subjected to larger gain increases than neurons tuned to intermediate colors”
    1. <It is only shown that doing this in simulation helps clustering, which is in the neural data, but they don’t show that the neural data specifically supports this over other approaches>
  37. “Task-dependent modulations of activity are readily observed throughout visual cortex, associated with spatial attention, feature-based attention, perceptual decision making, and task structure (Kastner and Ungerleider, 2000; Treue, 2001; Corbetta and Shulman, 2002; Reynolds and Chelazzi, 2004; Jack et al., 2006; Maunsell and Treue, 2006; Reynolds and Heeger, 2009). These task-dependent modulations have been characterized as shifting baseline responses, amplifying gain and increasing SNR of stimulus-evoked responses, and/or narrowing tuning widths. The focus in the current study, however, was to characterize task-dependent changes in distributed neural representations, i.e., the joint encoding of a stimulus by activity in populations of neurons.”
  38. <Need to read all references in section “Categorical specificity of areas V4v and VO1″>
  39. Lots of results show that V4 and nearby areas respond to chromatic stimuli.  Their previous paper (the one from 2009) showed that V4v and VO1 match the perceptual experience of color better than other regions do, but there aren’t many results dealing with “… the neural representation of color categories, the representation of the unique hues, or the effect of task demands on these representations”
  40. Previous EEG studies show that the differences in EEG when looking at one color and then another “…  appear to be lateralized, providing support for the influence of language on color  categorization, the principle of linguistic relativity, or Whorfianism (Hill and Mannheim, 1992; Liu et al., 2009; Mo et al., 2011). Indeed, language-specific terminology influences preattentive color perception. The existence in Greek of two additional color terms, distinguishing light and dark blue, leads to faster perceptual discrimination of these colors and an increased visual mismatch negativity of the visually evoked potential in native speakers of Greek, compared to native speakers of English (Thierry et al., 2009).”
    1. Here however, no evidence of lateralized categorical clustering from fMRI
  41. Neural research on Macaques and color, but there are differences in brain structure and sensitivities in photoreceptors between them and us so we need to keep that in mind when examining the results from animal experiments on color
  42. “We proposed a model that explains the clustering of the neural color spaces from V4v and VO1, as well as the changes in response amplitudes (gain) and SNR observed in all visual areas. In this model, the categorical clustering observed in V4v and VO1 is attributed to a color-specific gain change, such that the gain of each neuron changes as a function of its selectivity relative to the centers of the color categories.”
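
The gain proposal in items 36 and 42 can be played with in simulation. This is a minimal sketch under loud assumptions, not the authors' model: the von Mises-shaped tuning curves, the four category centres, the boost size, and the clustering index (mean between-category distance over mean within-category distance) are all hypothetical choices made here for illustration.

```python
import math

# Hypothetical category centres on the hue circle, in degrees (not from the paper).
CATEGORY_CENTERS = [0.0, 90.0, 180.0, 270.0]

def circ_kernel(a_deg, b_deg, kappa):
    """Von Mises-shaped bump: 1.0 when the angles match, smaller elsewhere."""
    return math.exp(kappa * (math.cos(math.radians(a_deg - b_deg)) - 1.0))

def gain(pref_deg, boost=1.0, kappa=4.0):
    """Baseline gain of 1 plus a boost that peaks at the nearest category centre."""
    return 1.0 + boost * max(circ_kernel(pref_deg, c, kappa) for c in CATEGORY_CENTERS)

def population_response(stim_deg, prefs, boost, tuning_kappa=2.0):
    """Response of every neuron to one hue: tuning curve times category-dependent gain."""
    return [gain(p, boost) * circ_kernel(stim_deg, p, tuning_kappa) for p in prefs]

def category(stim_deg):
    """Index of the nearest category centre, using circular distance."""
    return max(range(len(CATEGORY_CENTERS)),
               key=lambda i: math.cos(math.radians(stim_deg - CATEGORY_CENTERS[i])))

def clustering_index(stims, prefs, boost):
    """Mean between-category distance divided by mean within-category distance."""
    resp = [population_response(s, prefs, boost) for s in stims]
    within, between = [], []
    for i in range(len(stims)):
        for j in range(i + 1, len(stims)):
            d = math.dist(resp[i], resp[j])
            (within if category(stims[i]) == category(stims[j]) else between).append(d)
    return (sum(between) / len(between)) / (sum(within) / len(within))

prefs = [i * 10.0 for i in range(36)]  # 36 neurons tiling the hue circle
stims = [c + d for c in CATEGORY_CENTERS for d in (-30.0, 0.0, 30.0)]
index_uniform = clustering_index(stims, prefs, boost=0.0)  # no category-specific gain
index_boosted = clustering_index(stims, prefs, boost=1.0)  # gain peaks at category centres
```

Comparing `index_uniform` with `index_boosted` is the analogue of comparing the diverted-attention and color-naming conditions. As the aside in item 36 notes, a simulation like this only shows that such a gain change could affect clustering; it says nothing about whether the neural data favor it over alternatives.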


Color Categories in Thought and Language. Hardin, Maffi (eds). Book 1997.

Chapter 3: The Psychophysics of Color (Wooten, Miller)

  1. Having just two photopigments wouldn’t allow for the sort of color differentiation people have (1 just allows mono of course)
  2. On the other hand, in 1803 Thomas Young (whose idea later became the Young-Helmholtz theory) noted that if we had many different photopigments, we would have problems with spatial vision
  3. The compromise is 3 – still not perfect color differentiation of course, but works well for our needs
  4. Understanding the basic system of photopigments explains much about color vision
  5. So Helmholtz thought we had receptors for R,G,B but 19th century physiologist Hering asked, how can you get yellow from additive R,G,B?  You can’t.  He proposed the opponent-process model
    1. 4 primary hue sensations: RGB+yellow, and G is blue and yellow
    2. Furthermore, we can see blue, and green-blue, and red-blue, but there is no such thing as yellow-blue.  Likewise, green can go with Y or B but not R
    3. Based on these observations, proposed that yellow and blue are opposites in an opposing process, likewise for red, green
    4. What we see is based on how these two separate opposed processes react
  6. White is activated directly by stimulation, but black is only by contrast when there is nearby stimulation
  7. When R/G, Y/B channels are in equilibrium, only Bk and W channels are excited giving black, grey, white.
    1. But most stimuli don’t produce perfectly balanced chromatic channels, as well as Bk and W.  Simultaneous activation of chromatic and achromatic leads to perception of saturation, where pure red is at one end (for example) and pure black or white at the other, with maroon or pink respectively in between
  8. Although still not proven, single cell recordings seem to confirm Hering’s proposal, and there are opponent color cells.
  9. It turns out both camps were right in a way – there are three types of receptors as Young-Helmholtz predicted, but the way they relate to each other follows Hering’s model.  Pigment function is as follows:
    1. alpha: -B, +R
    2. beta: +Y, -G
    3. gamma: +Y, +R
  10. In 1955, Jameson and Hurvich quantified Hering’s formulation with algebraic expressions.  The idea is that ‘redness’ can be described as how much green must be added to make the color appear neither red nor green (likewise for Y/B)
  11. They then ran experiments asking what % of RGBY they saw in monochromatic light <how could it be monochromatic then??>  Subjects quickly found the task easy and reported red-greens or yellow-blues.  The numbers closely matched those that were predicted by the model (96%, although accuracy was still increasing by the end of the experiment so probably is even higher)
    1. There is also, however, some indication of nonlinearities in the actual data, whereas the model is strictly linear. Still, it's pretty accurate
  12. Everyone has different “elemental” (think “primary” although slightly different meaning, as elemental also includes Bk/W) colors.  What each person would define as perfect Y with no R or G differs.
  13. Further experiments had subjects verbally describe colors based on a restricted set of color names.  A color term is sufficient and necessary to describe a color if it is elemental.  Given a set of terms, the set is sufficient if it can be used to describe any color.  If a color term is necessary and is excluded from the set, then not all stimuli can be described fully
    1. For example, if unable to respond yellow, when presented with a yellowish color, and the only response is 25% red, then the incomplete description (need another 75%) means the terms available weren’t adequate
    2. By these means, it was determined orange was not fundamental because it was described successfully with Y,R
  14. In the Munsell system, there is an implied elemental purple because it is given equal status in the color wheel with the 4 elemental colors – the above experiment however, showed that purple also was not elemental
  15. Brightness is not the same as lightness and ranges from dim to bright, encoding stimulus intensity
  16. Experiments on monochromatic color require 2 stimulus fields to create contrast.  This is most commonly done experimentally as a disc surrounded by a ring of another neutral light
    1. This experiment showed grey is not elemental, while white and black are
  17. Later, there were quantitative versions of these experiments where 4 hue terms were given along with a saturation judgement
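
The percent-hue judgements in items 10 to 13 can be sketched with a small calculation. This is a minimal sketch, assuming the ratio rule I understand the Jameson and Hurvich hue coefficients to follow: each opponent response's share of the total chromatic response. The function names and the example response values are hypothetical.

```python
def hue_percentages(rg, yb):
    """Predicted hue naming from two opponent responses.

    rg > 0 means red, rg < 0 means green; yb > 0 means yellow, yb < 0 means blue.
    Returns percentages over the four hue terms; they sum to 100 unless the
    stimulus is achromatic (both responses zero).
    """
    pct = {"red": 0.0, "green": 0.0, "yellow": 0.0, "blue": 0.0}
    total = abs(rg) + abs(yb)
    if total == 0:
        return pct  # both channels in equilibrium: no hue at all
    pct["red" if rg > 0 else "green"] = 100.0 * abs(rg) / total
    pct["yellow" if yb > 0 else "blue"] = 100.0 * abs(yb) / total
    return pct

def described_percent(pct, allowed_terms):
    """How much of the hue sensation can be reported with a restricted vocabulary."""
    return sum(value for term, value in pct.items() if term in allowed_terms)

# A yellowish stimulus with a little red in it (hypothetical response values).
yellowish = hue_percentages(rg=1.0, yb=3.0)

# With "yellow" removed from the vocabulary only the red part can be reported.
without_yellow = described_percent(yellowish, {"red", "green", "blue"})
with_all_four = described_percent(yellowish, {"red", "green", "yellow", "blue"})
```

Here `without_yellow` comes out at 25 and `with_all_four` at 100, which mirrors the example in item 13.1: a description that stops at 25% means a necessary term was missing from the set.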

Chapter 8: Color Systems for Cognitive Research

  1. The common way to measure the number of colors we can recognize is to put two blocks of the same color adjacent to each other and then vary one of them until a difference is noticed (“just noticeable difference”/JND).
    1. If these colors aren’t touching but are separated a bit, the ability to differentiate possibly millions of colors goes down to thousands
  2. Systems of describing colors fall into two main categories:
    1. Those for describing the physical characteristics of the color
    2. Those describing the psychometric/perceptual color sensations
  3. Many aspects of psychometrics of color are tricky to measure.  Taking light blue and strong yellow, when the colors are far apart, people say the light blue is lighter, but close together they will say the strong yellow.  Are people paying attention to lightness, or whiteness?
    1. Similarly, lightness can only be defined relatively and not absolutely (although whiteness can be defined absolutely)
  4. Two main systems are Hering’s Natural Color System (based on his opponent-process model), and Munsell’s.
  5. Munsell

  6. Munsell was actually a painter, and there were a few things he did incorrectly
  7. Totally phenomenological
  8. Vertical axis for “Value” (lightness), and perpendicular axis of “Chroma” (color strength) and then a color wheel for hue <still not sure I really understand distinction between value and chroma, but a helpful description>
  9. “Due to the fact that the colors of maximal strength for different Hues vary both in Chroma and Value, the graphical representation becomes both irregular and skewed…”
    1. Because of this, he later left phenomenological approach and moved to psychophysics <what is the difference exactly?>
  10. Value goes from 0 to 10 (theoretically), but the ends of those ranges can’t be produced (purely reflective, absorptive) so samples used ranged from 0.5-9.5
  11. Chroma is how far from the grey axis
  12. Hue is the weirdest in his system, because of how he set it up.  He originally wanted 10 colors (to make a metric system), but that didn’t work out so he decided he would make 5 main hues.  There were lots of arbitrary decisions.  He started at a green that he felt was neither warm nor cold, and then found 4 other colors through trial and error that made neutral grey when put on a disk with this green and spun.  This yielded a red, yellow, purple, and blue
    1. Based on this, the color circle is considered to be partitioned into 5 visually equal parts <although nothing about the way it was done gives any reason to think this claim is true>
  13. He then ended up with the 10 colors he wanted by inserting additional hues midway between the 5 he started with, and then 10 more times each, which gave 100
    1. There is a pretty complex way of referring to color in his system
  14. After Munsell passed, technology allowed for analytical examination of his system, which turned up irregularities.  It was then revised through a tremendous undertaking.  Some seem to think the revision actually made it worse, by trying to smooth out <through distortions?> the original data, it made it less accurate
  15. The author of the chapter does not seem to like this system very much
  16. Natural Color System

  17. It was created because it was felt there were too many problems with existing systems (including Munsell’s)
  18. Munsell’s was stimulus based, but this is a theoretical/cognitive system <guess I’ll find out what that means more specifically>
  19. It was based on Hering’s approach – was based on 6 elemental colors (RGBYWBk)
  20. A color can’t resemble more than 4 elementary colors, so the unit of measurement is percent resemblance to the elementary color X
  21. When put in a color wheel, opposing colors are opposite.  The system wasn’t designed to fit in a wheel but it happens to do so nicely and is done this way because of convention
  22. Colors of a given hue in this system are rendered in an equilateral triangle.  Corners are (elementary) white, (elementary) black, and max chromaticness.  “Chromaticness” is similar to Munsell’s chroma, but Munsell’s is open-ended, whereas chromaticness has a closed range 0-100
  23. NCS color space is then a wheel where a triangle goes around, so it's a sort of diamond, or top-shape
  24. <Conceptually, I have a much easier time understanding this system than Munsell’s>
  25. People will usually place the same color in the same place in NCS color space with about 3-5% disagreement between them
  26. The notation is also simpler: 3 parts giving blackness, chromaticness, and hue.  Whiteness is left out because it would overdetermine the description: blackness, whiteness, and chromaticness sum to 100, so any 2 of them imply the 3rd
  27. Comparison of Munsell and NCS

  28. There is an NCS atlas that attempts to document the NCS color diamond <my word>, but the distinction between this and the Munsell atlas is that with Munsell, the atlas itself defines the colors (and is only valid in the lighting and viewing conditions specified for examining the atlas).  In NCS it's the 3-part description of the color that defines it
    1. “The Munsell system is thus dependent on its atlas, but NCS is not.”
  29. The 3 dimensions in Munsell and NCS do not mean the same thing, and there is not necessarily a straightforward way to go between them because of fundamental differences in what the variables are taken to mean (Munsell’s discriminative vs NCS’ descriptive)
  30. The color wheels used by the two are also not the same, with Munsell’s blue different from “unique” <elemental?> blue
  31. <rest of chapter goes on to things like color-word associations so I’m skipping that>
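
The NCS description in items 20 to 26 can be made concrete. This is a minimal sketch, assuming the usual NCS conventions that blackness, whiteness, and chromaticness sum to 100 and that a hue such as Y90R means 90% resemblance to red and 10% to yellow; the function names and the example color are hypothetical.

```python
def ncs_whiteness(blackness, chromaticness):
    """Whiteness implied by the two written values, since the three sum to 100."""
    if not (0 <= blackness <= 100 and 0 <= chromaticness <= 100):
        raise ValueError("blackness and chromaticness must lie in 0-100")
    whiteness = 100 - blackness - chromaticness
    if whiteness < 0:
        raise ValueError("blackness + chromaticness cannot exceed 100")
    return whiteness

def ncs_hue_resemblance(hue):
    """Split a hue such as 'Y90R' into percent resemblance to its two elementary hues."""
    if len(hue) == 1:                 # an elementary hue on its own, e.g. 'R'
        return {hue: 100}
    first, second = hue[0], hue[-1]
    amount = int(hue[1:-1])           # percent resemblance to the second hue
    return {first: 100 - amount, second: amount}

def elementary_resemblances(blackness, chromaticness, hue):
    """Percent resemblance to each elementary color; at most 4 entries, summing to 100."""
    result = {"S": blackness, "W": ncs_whiteness(blackness, chromaticness)}
    for elementary_hue, percent in ncs_hue_resemblance(hue).items():
        result[elementary_hue] = chromaticness * percent / 100
    return result

# Hypothetical example: blackness 20, chromaticness 30, hue Y90R.
example = elementary_resemblances(20, 30, "Y90R")
```

For this example the color resembles white 50%, black 20%, yellow 3%, and red 27%, which shows why the notation never needs more than 4 elementary colors and why whiteness can be left implicit.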

Mapping the stereotyped behaviour of freely-moving fruit flies. Berman, Choi, Bialek, Shaevitz. Journal of the Royal Society Interface 2014

  1. “A frequent assumption in behavioural science is that most of an animal’s activities can be described in terms of a small set of stereotyped motifs.”
  2. Create a system to analyze behavior and find that about half the time, behavior is based on 100 different stereotyped behavioral states
  3. Stereotypy – “that an organism’s behaviours can be decomposed into discrete, reproducible elements”
  4. Although animals can move in a really enormous space of movements, they are thought to restrict themselves to a small set of motions, made up of stereotyped actions (which may be specific to a time range, an individual throughout its life, or a species)
  5. “A discrete behavioural repertoire can potentially arise via a number of mechanisms, including mechanical limits of gait control, habit formation, and selective pressure to generate robust or optimal actions.”
  6. For the most part, stereotypy hasn’t been studied experimentally, mostly because of a “lack of a comprehensive and compelling mathematical framework for behavioural analysis”
    1. They introduce a system for doing so
  7. Most previous methods for characterizing behavior fall into one of two categories:
    1. Very coarse metrics (like mean velocity, count # times a barrier is crossed).  This makes analysis and data collection easy, but only captures a tiny amount of relevant information
    2. The other approach is to log behavior in terms of a number of different possible categories.  This can be done by hand or by machine.  The problem with this is it introduces bias (you find only what classes you defined in the first place, and assumes small number of discrete high-level behaviors)
  8. A better system would be one that works from the bottom up, starting directly with the data as opposed to operator-defined classes
  9. “The basis of our approach is to view behaviour as a trajectory through a high-dimensional space of postural dynamics. In this space, discrete behaviours correspond to epochs in which the trajectory exhibits pauses, corresponding to a temporally-extended bout of a particular set of motions. Epochs that pause near particular, repeatable positions represent stereotyped behaviours. Moreover, moments in time in which the trajectory is not stationary, but instead moves rapidly, correspond to non-stereotyped actions.”
  10. Specifically, based on the data found from fruit flies
    1. “These stereotyped behaviours manifest themselves as distinguishable peaks in the behavioural space and correspond to recognizably distinct behaviours such as walking, running, head grooming, wing grooming, etc.”
  11. The fly is isolated into a 200×200 window in each frame (full image has 40k pixels), and then rotated/translated/resized to get standard representation
  12. Nearly all variance (93%) in images can be represented by PCA down to 50D
  13. Use a spectrogram representation of postural dynamics based on Morlet continuous wavelet transform <?>
    1. “Although similar to a Fourier spectrogram, wavelets possess a multi-resolution time-frequency trade-off, allowing for a more complete description of postural dynamics occurring at several time scales”
  14. Embedding is made up of 25 frequency channels for the 50 eigenmodes, so each point in time is represented by 1,250D
    1. Because behavior is highly correlated, they speculate a much lower dimensional manifold lies inside this space that describes behavior
  15. The goal is to map the 1250D data to something much smaller where trajectories in that space pause when stereotyped behavior occurs. “This means that our embedding should minimise any local distortions.”<I don’t know why>
  16. So the approach they chose “reduces dimensionality by altering the distances between more distant points on the manifold.”
    1. PCA, MDS, and Isomap do exactly the opposite of this, prioritizing large-scale accuracy over local accuracy
  17. “t-Distributed Stochastic Neighbor Embedding (t-SNE)”, however, does satisfy their requirement
    1. “For t-SNE, the conserved invariants are related to the Markov transition probabilities if a random walk is performed on the data set.”  Transition probabilities between two time points are based on a Gaussian kernel over distance
    2. Minimizes “local distortions” <?>
  18. In the low-dimensional embedding, t-SNE uses transition probabilities analogous to those in the high-dimensional space, but proportional to a Cauchy/Student-t kernel of the points’ Euclidean distances
  19. Problem with t-SNE is quadratic memory use – they use importance sampling to subsample to 35k data points and then run on that
  20. A distance function is still needed.  They use KL-divergence
  21. They are able to embed data nicely in 2D – going to 3D leads to a 2% reduction of embedding cost function
  22. Get a probability density over behavior by convolving each point in embedded map with Gaussian
    1. Space has peaks, and trajectories pause at peak locations when conducting stereotyped behavior
  23. This 2D space is then decomposed by a “watershed transform”, where points are grouped together if hill-climbing from them leads to the same (local) maximum
  24. Peaks correspond to commonly defined behaviors, but here, everything is bottom-up from the data.  Also, nearby regions encode similar but distinct behavior
  25. As expected, many behaviors (like running) are confined to an orbit in the low-dimensional space
  26. Were able to pull apart distinguishing characteristics between male and female behavior
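
Items 22 and 23 (Gaussian-smoothed density, then grouping points by where hill-climbing ends) can be sketched on a toy grid. This is a minimal sketch, not the authors' implementation: the embedded points, grid size, and kernel width are hypothetical, and plain steepest ascent over the 8 neighbouring cells stands in for a real watershed transform.

```python
import math

def density_grid(points, size, sigma):
    """Sum of Gaussians centred on each embedded point, evaluated on a size x size grid."""
    grid = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            grid[y][x] = sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
                for px, py in points
            )
    return grid

def climb(grid, x, y):
    """Follow steepest ascent from (x, y) until no neighbouring cell is higher."""
    size = len(grid)
    while True:
        best = (x, y)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < size and 0 <= ny < size and grid[ny][nx] > grid[best[1]][best[0]]:
                    best = (nx, ny)
        if best == (x, y):
            return best          # local maximum: this cell's region label
        x, y = best

def watershed_labels(grid):
    """Label every cell with the (x, y) of the peak that hill-climbing reaches from it."""
    size = len(grid)
    return [[climb(grid, x, y) for x in range(size)] for y in range(size)]

# Two hypothetical clumps of embedded points, standing in for two stereotyped behaviours.
points = [(3, 3), (3, 4), (4, 3), (12, 12), (12, 11), (11, 12)]
grid = density_grid(points, size=16, sigma=1.5)
labels = watershed_labels(grid)      # indexed as labels[y][x]
peaks = {label for row in labels for label in row}
```

Each distinct value in `labels` is one region of the behavioural map, and a trajectory that lingers inside one region would be read as a bout of that behaviour. Choosing `sigma` matters: a wider kernel merges nearby peaks into fewer, coarser behaviours.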

Automated monitoring and analysis of social behavior in Drosophila. Dankert, Wang, Hoopfer, Anderson, Perona. Nature Methods 2009.

  1. Machine vision approach to studying aggression and courtship in flies.
    1. Computes location, orientation, and wing posture
  2. “Ethograms may be constructed automatically from these measurements, saving considerable time and effort.”
  3. “Both aggression and courtship consist of rich ensembles of stereotyped behaviors, which often unfold in a characteristic sequence.”
  4. Approach automates a task that was previously done by hand, so saves labor and is more objective, and may pick up on smaller or shorter-duration details
  5. Being able to study behavior in minute detail will allow better analysis of impact of genetics and environment on behavior
  6. Each video frame corresponds to 25 measurements/features like pose, velocity, wing pose, (size as well because it determines male/female, other special markings like a dot are used for male-male interaction) etc…
  7. Able to identify a number of different high-level actions/behaviors like fighting, courting (accuracy was 90+%)
  8. They then tested whether the system could pick up behavioral changes from suppressing neurons whose silencing had previously been shown to reduce aggressive behavior, and they were able to confirm the same result.  They looked at mutants as well
  9. System allows for visualization of behavior in a number of ways, such as by making histograms or graphs depicting transitions between behaviors with weights
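
The ethogram-style summaries in items 2 and 9 can be sketched from a sequence of per-frame behavior labels. This is a minimal sketch, assuming the classifier has already produced one label per frame; the label names and the sequence are hypothetical, not data from the paper.

```python
from collections import Counter, defaultdict

def bouts(frame_labels):
    """Collapse per-frame labels into a sequence of bouts (runs of the same label)."""
    sequence = []
    for label in frame_labels:
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence

def transition_probabilities(frame_labels):
    """For each behavior, the fraction of its bouts that are followed by each other behavior."""
    sequence = bouts(frame_labels)
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }

# Hypothetical per-frame output of the behavior classifier.
frames = ["walk", "walk", "chase", "chase", "lunge", "walk", "chase", "wing_ext", "walk"]

histogram = Counter(frames)                  # frames spent in each behavior
probs = transition_probabilities(frames)     # weighted edges of the transition graph
```

The `histogram` corresponds to the histogram view and `probs` to the weighted transition graph; comparing either between genotypes or conditions is the kind of analysis the automated measurements are meant to make cheap.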
