Simulated annealing is a randomized procedure for finding approximate solutions to optimization problems where greedy techniques fail because of local minima. (We will assume we are trying to find a minimum; to find a maximum we could just take the negative of the objective function.) Let S be a finite set and f(s) an objective function defined on S. Each point s in S has a set of neighbors, N(s). At each step, the simulated annealing procedure picks a neighbor of the current point at random and compares the value of the objective function at the neighbor to the value at the current point. If the value is an improvement, the neighbor replaces the current point. If not, the neighbor replaces the current point with probability exp(-d/T), where d is the (positive) difference in objective value and T is the current temperature.
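The acceptance rule described above can be sketched as a small function (a minimal sketch; the function name and signature are illustrative, not from any particular library):

```python
import math
import random

def accept(f_current, f_neighbor, temperature):
    """Decide whether to move to a neighbor: always accept an
    improvement; accept a worse point with probability exp(-d/T)."""
    d = f_neighbor - f_current
    if d <= 0:  # improvement (or tie): always accept
        return True
    return random.random() < math.exp(-d / temperature)
```

Note that as the temperature approaches zero, exp(-d/T) vanishes for any d > 0, so worse neighbors are essentially never accepted.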

The temperature controls the fraction of accepted steps that do not improve the objective function. It is gradually reduced according to a cooling schedule; the lower the temperature, the more the procedure behaves like a greedy algorithm.
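Putting the pieces together, here is a sketch of the full procedure on a toy problem. The geometric cooling schedule (multiplying T by a constant factor each step) and all parameter values are assumptions for illustration, not prescribed by the method:

```python
import math
import random

def simulated_annealing(s0, f, neighbors, t0=200.0, alpha=0.999,
                        steps=5000, seed=0):
    """Minimize f over a finite set, starting from s0, using a
    geometric cooling schedule (an assumed, common choice)."""
    rng = random.Random(seed)
    s, t = s0, t0
    best, best_val = s0, f(s0)
    for _ in range(steps):
        n = rng.choice(neighbors(s))
        d = f(n) - f(s)
        # Accept improvements always; worse moves with prob. exp(-d/T).
        if d <= 0 or rng.random() < math.exp(-d / t):
            s = n
        if f(s) < best_val:
            best, best_val = s, f(s)
        t *= alpha  # cool: uphill moves become rarer over time
    return best

# Toy problem: a bumpy function on {0, 1, ..., 99}, where a purely
# greedy descent from many starting points gets trapped in a local
# minimum; neighbors of s are the adjacent integers.
f = lambda s: (s - 70) ** 2 + 300 * math.sin(s / 3.0)
nbrs = lambda s: [x for x in (s - 1, s + 1) if 0 <= x <= 99]
print(simulated_annealing(0, f, nbrs))
```

Because the early high-temperature phase accepts many uphill moves, the search can climb out of the shallow local minima that would trap a greedy descent.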

References:

Simulated Annealing and Boltzmann Machines, by Emile Aarts and Jan Korst, Wiley, 1989

Finite Markov Chains and Algorithmic Applications, by Olle Häggström, Cambridge University Press, 2002

Probability and Algorithms, Chapter 2: Simulated Annealing, by Dimitris Bertsimas and John Tsitsiklis, National Academy Press, 1992