Note: This is only a partially written book, which doesn’t seem to have any content since 2008. I read it anyway because:
- It was free
- Focused on the type of optimization I am interested in (direct search)
- Even though it is unfinished, at 800+ pages it is quite long and extensive
- The document itself was quite readable
So anyway, here goes:
(p. 63) The correlation length measures how the autocorrelation function (a simple function) decreases and
summarizes the ruggedness of the landscape
(p. 71) “The effects of noise in optimization have been studied by various researchers; Miller and Goldberg
[1416, 1415], Lee and Wong , and Gurin and Rastrigin  are some them. Many global optimization
algorithms and theoretical results have been proposed wich can deal with noise. Some of them are, for
instance, specialized genetic algorithms [685, 2062, 2060, 1799, 1800, 1146],
Evolution Strategies [195, 100, 881], and
Particle Swarm Optimization [1606, 884] approaches.
–  J. Michael Fitzpatrick and John J. Grefenstette. Genetic algorithms in noisy environments.
Machine Learning, 3(2–3):101–120, October 1988. ISSN: 0885-6125 (Print)1573-0565 (Online).
doi:10.1007/BF00113893. Online available at http://www.springerlink.com/content/n6w328441t63q374/fulltext.pdf
–  Brad L. Miller and David E. Goldberg. Genetic algorithms, tournament selection, and the e?ects
of noise. IlliGAL Report 95006, Online available at http://citeseer.ist.psu.edu/86198.
html and http://www.illigal.uiuc.edu/pub/papers/IlliGALs/95006.ps.Z
–  Brad L. Miller and David E. Goldberg. Genetic algorithms, selection schemes, and the varying e?
ects of noise. Evolutionary Computation, 4(2):113–131, Summer 1996. ISSN: 1063-6560.
doi:10.1162/evco.19184.108.40.206. See also . Online available http://citeseer.ist.psu.edu/110556.html and
(p. 71) More discussion of when tolerance of noise/robustness is necessary, along with discussion of unstable
global optimum (needle) vs stable local optimum
(p. 76) Discussion of non-stationary optimization
(p. 78) No Free Lunch Theorem: it is impossible for any optimization algorithm to outperform random walks or
exhaustive enumerations on all possible problems. For every problem wehere a given method leads to good
results, we can construct a problem wehere the same method has the opposite effect. Doing exactly this is a
common method for comparing/benchmarking optimization algorithms.
(p. 79) Because of this, all useful optimization algorithms utilize some problem specific
knowledge/assumptions. This means that having many optimization methods can be useful because each may have
different assumptions and only one may be most suitable for any particular problem.
(p. 87) Applications of global optimization and conferences/workshops
——-ch2 [Evolutionary Algorithms]
——-ch3 [Genetic Algorithms]
——-ch4 [Genetic Programming]
——-ch5 [Evolution Strategies]
——-ch6 [Evolutionary Programming]
——-ch7 [Learning Classifier Systems]
(p. 233) A special case of production systems
——-ch8 [Ant Colony Optimization]
——-ch9 [Particle Swarm Optimization]
(p. 249) There is a population in the swarm, each individual can communicate with a subset of the swarm in
(p. 250) Each individual updates its position based on its velocity and then updates velocity based on its
own and neighbor’s information
…Ch is very incomplete
——-ch10 [Hill Climbing]
(p. 253) Hill climbing as a genetic algorithm w/population size 1
(p. 257) Raindrop method
——-ch11 [Random Optimization]
(p. 259) This just feels like a special case of hill climbing where the neighborhood is sampled from
——-ch12 [Simulated Annealing]
——-ch13 [Extremal Optimization]
(p. 269) Also based on physics, comes from “metaphor of thermal equilibria.” Based on idea of “self
organized criticality,” that large interactive systems evolve into a state where a change in one element can
lead to drastic changes all over the system. # of elements involved in the change is proportional to n^-t.
(p. 270) There is a modified version of this where individual genes can be modified if each contributes by
itself to fitness. Normally this is hard to do, but can measure the difference of fitness between 2 genomes
when one element is changed.
——-ch14 [Tabu Search]
(p. 273) Basic idea is to store a list of configurations tested and not re-test them. Can also try to store
properties of tested configurations instead of actual configurations, which can lead to algo. neglecting
(potentially good) areas of the search space that were so far untested
(p. 274) It is very similar to hill-climbing or simulated annealing
——-ch15 [Memetic + Hybrid Algorithms]
(p. 277) Basic idea is to combine GAs with some other method (such as simulated annealing) in concert to do
(p. 280) “Baldwin evolution” can be added by doing a local search to each new individual genome
——-ch16 [Downhill Simplex/Nedler Mead] *read more on this*
(p. 283) Uses only values of objective functions w/o derivative information, so is a direct search method
(p. 283) Downhill simplex optimization uses n+1 points in the Rn. These points form a polytope3, a
generalization of a polygone, in the n-dimensional space Nondegenerated simplexes, i.e., those where the set
of edges adjacent to any vertex form a basis in the Rn, have one important festure: The result of replacing a
vertex with its re?ection through the opposite face is again, a nondegenerated simplex (see Fig. 16.1.a). The
goal of downhill simplex optimization is to replace the best vertex of the simplex with an even better one or
to ascertain that it is a candidate for the global optimum . Therefore, its other points are constantly
?ipped around in an intelligent manner as we will outline in Section 16.3.
(p. 283) Gets stuck in local optima, so restarts are recommended
——-ch17 [State Space Search]
(p. 289) Output of function being searched is binary, not real. Either configuration is successfull or not.
(p. 289) State space search algos can be applied to optimization by thresholding the output of the function
(p. 289) All of the search functions described are deterministic
(p. misc) Talk about BFS/DFS/Iterative Deepening, random walks, A*
——-ch18 [Parraelization and Distribution]
——-ch19 [Maintaining the Optimal Set]
(p. 309) This is only for multi-objective optimization. Otherwise this is trivial
(p. 310) Methods of gridding and clustering to keep track of optimal elements
Rest of book is applications, background, implementation details
Update, recently heard about pattern search.