In this assignment you will learn some of the basics of scipy, a Python module for scientists. In addition, you will learn about optimization using simulated annealing.
A common problem in computer science (and indeed all of science) is the
optimization of some known function over some range of its possible inputs. For
instance, let's find the optimal way to enclose a rectangular region using N
total feet of fencing. A constraint is that we aren't allowed to subdivide a
foot of fencing. A priori, we know the answer will be a square. Let's write this
up as a function:
def fencing(l, w, n):
    assert 2 * (l + w) == n
    return l * w
[10] Let's find the optimal input to fencing when given a particular amount of fencing, n. Write a function called solve_fence(n) which, given a total amount of fencing n (which we will assume is an even number), will find the optimal length and width of the rectangular region. Use the naive approach of searching through the entire potential solution space. Assume the lengths and widths are non-negative.
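One possible shape for the brute-force search is sketched below. The name solve_fence comes from the problem statement; the loop structure (iterating over every integer length, with the width determined by the budget) is our own assumption about how to enumerate the space.

```python
def fencing(l, w, n):
    assert 2 * (l + w) == n
    return l * w

def solve_fence(n):
    """Brute-force: try every integer length from 0 to n // 2."""
    best, best_area = None, -1
    for l in range(n // 2 + 1):
        w = n // 2 - l              # 2 * (l + w) == n forces w once l is chosen
        if fencing(l, w, n) > best_area:
            best_area = fencing(l, w, n)
            best = (l, w)
    return best
```

As expected a priori, solve_fence(20) returns the square (5, 5).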
[5] Why is this type of solution a bad idea in general? Write your answer in a comment.
A better strategy would be to use what is called "gradient descent". This means that we choose a point to start at and determine if increasing or decreasing a parameter along a particular direction would improve our solution. We choose the best direction to move along and repeat until there is no way to improve the solution by moving in any direction. This can be done in any number of dimensions, but it's particularly simple for a one-dimensional problem.
[10] Write grad_fence(n) to implement gradient descent for the fencing problem. Start at a random point inside the solution space. Only consider integer lengths, as before.
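A sketch of one way this could look, assuming we step the length by one unit at a time and stop when neither neighbor improves the area (the use of random.randint for the starting point is our choice):

```python
import random

def fencing(l, w, n):
    assert 2 * (l + w) == n
    return l * w

def grad_fence(n):
    """1-D gradient descent (really ascent, since we maximize area)."""
    half = n // 2
    l = random.randint(0, half)          # random start inside the solution space
    while True:
        here = fencing(l, half - l, n)
        # Look at the two neighboring lengths that stay inside [0, half].
        neighbors = [c for c in (l - 1, l + 1) if 0 <= c <= half]
        best = max(neighbors, key=lambda c: fencing(c, half - c, n))
        if fencing(best, half - best, n) <= here:
            return (l, half - l)         # no neighbor improves: stop
        l = best
```

Because the area l * (n // 2 - l) has a single peak, this reaches the optimum from any starting point.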
[5] Why is this an improvement? What could be a potential problem with this approach for general problems? Write your answer in a comment.
Consider this: what if we have a function with two maxima, one global and one merely local? Let's look at a function that has these properties.
import numpy as np
import matplotlib.pyplot as plt

def f(x):
    return -2 * (x + 6) * (x + 2) * (x - 3) * x

x = np.linspace(-6, 3, 40)
plt.plot(x, f(x))
plt.show()
[10] Write grad_f(start, stop), where start and stop are the minimum and maximum values of x to consider. For simplicity, only consider integer values of x (so technically this is not "gradient descent", but the principle is the same). Consider the range from below the minimum root to above the maximum root.
[5] Run your code a few times. What do you notice? Why is this the case?
[5] Now run your code once with the range within 1 of one of the maxima. What do you see now? Now try the same trick with the other maximum. What happens?
Another optimization method called "Simulated Annealing" can help solve this problem. Annealing is the process by which glass, metal or other molten material cools. The speed at which it cools affects the properties of the material and can introduce banding. Sometimes, the material is reheated in the middle of the process to slow the cooling down further to add additional order to the cooling material. Much as genetic algorithms use principles of biology to generate results in computer science, simulated annealing is a technique that uses a simulated physical "heating/cooling" process to achieve results.
Simulated annealing is the continuous repetition of the following process. The temperature decreases. Then, the current guess of the global maximum is randomly changed based on the temperature. If the new guess is better than the old guess, this change is accepted. Otherwise, there is a chance that this new (worse) guess will be accepted. For a very high temperature, this probability should be close to 1, and for a very low temperature, it should be close to zero. (Of course, probabilities should never be negative or > 1). One formula we can use to give us the probability that a bad move is accepted is the following:
Goal: find x for which f(x) is maximized
Given: two values of x, old_x and new_x
If f(new_x) > f(old_x), accept.
Otherwise (f(new_x) <= f(old_x)), accept with this probability:
p(accept) = exp((f(new_x) - f(old_x)) / T)
Other formulas for p(accept) are also possible, as long as they have the correct asymptotic behavior (i.e. the probability goes to 1 for high temperatures and goes to 0 for low temperatures).
Note that as the temperature decreases, the system is less likely to accept a worse guess. This means that the algorithm gets less tolerant as time goes on and starts searching for the nearest local maximum as T gets small.
When running simulated annealing, the most important parameter is the temperature T. This variable measures how much randomness there is in the system. When the temperature is high, the parameter(s) change randomly. When the temperature is very low, the simulation has "annealed" and very little movement occurs. When the temperature hits 0, the algorithm is done and the answer is returned. You can also stop the algorithm early by limiting the number of steps it will take. However, you should make sure that there are enough steps for the temperature to get close to zero.
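The acceptance rule above can be packaged as a small helper. This is a sketch; the clamping of the exponent is our own defensive addition so math.exp can never overflow:

```python
import math
import random

def accept_move(old_val, new_val, T):
    """Decide whether to move from old_val to new_val at temperature T."""
    if new_val > old_val:
        return True                  # better guesses are always accepted
    if T <= 0:
        return False                 # frozen: never accept a worse guess
    # new_val <= old_val makes the exponent non-positive, so p is in [0, 1];
    # clamp anyway so math.exp cannot overflow on surprising inputs.
    p = math.exp(max((new_val - old_val) / T, -700.0))
    assert 0.0 <= p <= 1.0
    return random.random() < p
```

Note that ties (new_val == old_val) are accepted with probability exp(0) = 1.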
[30] Implement simulated annealing on the function f. Choose a good rate of temperature decrease. We found that multiplying by a fraction close to 1 worked well enough. Find good inputs so that the answer is almost always the correct solution (say, 9 out of 10 times).
WARNING: Simulated annealing can be extremely sensitive to parameter variations. These include the initial temperature, the rate of decrease of the temperature, the distance to move in the parameter space per step, etc. We strongly recommend that you litter your code with print statements initially until you are convinced that things are working properly. Make sure you search over a fixed parameter range, and if you go beyond the range, do something intelligent (like wraparound). Don't let the magnitude of your parameters get arbitrarily large! Make sure arguments to the exp function don't get too large; your code shouldn't crash partway through the simulation! You should also use assert statements to make sure that, e.g., probabilities are always in the range [0, 1].
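Putting the pieces together, one possible shape for the annealing loop is sketched below. The particular parameter values, the one-integer step, and the wraparound at the range boundaries are all our assumptions; you will need to tune your own.

```python
import math
import random

def f(x):
    return -2 * (x + 6) * (x + 2) * (x - 3) * x

def anneal_f(start, stop, T=100.0, cooling=0.99, max_steps=10000):
    """Simulated annealing over the integers in [start, stop]."""
    x = random.randint(start, stop)          # random initial guess
    for _ in range(max_steps):
        if T < 1e-6:
            break                            # effectively frozen: stop early
        new_x = x + random.choice((-1, 1))   # move one integer step
        if new_x < start:                    # wrap around rather than letting
            new_x = stop                     # the parameter leave the range
        elif new_x > stop:
            new_x = start
        delta = f(new_x) - f(x)
        # Accept improvements always; accept worse moves with the annealing
        # probability (exponent clamped so math.exp cannot overflow).
        if delta > 0 or random.random() < math.exp(max(delta / T, -700.0)):
            x = new_x
        T *= cooling                         # cool down by a fraction near 1
    return x
```

On the range [-7, 4] this usually returns -5, the integer at which f is globally maximized.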
Also, try running the annealing simulation starting from around the point where the local (but not global) maximum is. If things are working correctly, it should usually still end up at the global maximum. If not, you may be lowering your temperature too quickly, which defeats the purpose of the algorithm.
(See what we did there? :-))
NP-hard problems are essentially problems that are difficult to solve algorithmically, but where it is not difficult to verify that a solution is in fact a solution. You can learn more on this subject in courses like CS 21 and CS 38. For our purposes, it is sufficient to note that NP-hard problems are sufficiently complex that checking every possible solution is not feasible. In addition, a number of changes to the input of an NP-hard problem that each individually decrease the utility (i.e. the "desirability" of the output), can, when combined, lead to a higher overall utility. If you have a function that represents an NP-hard problem, it is likely to have many local maxima.
A classic NP-hard problem is the knapsack problem. In this problem, there is a collection of objects where each object has some weight and some value. Some objects may be duplicated i.e. there may be more than one of some kind of object. Your goal is to load up a knapsack such that the sum of the weights of all objects in the knapsack does not exceed the weight limit while optimizing for the combined value of the items in the knapsack. (There are variants of this problem, but we are not worried about them for this section.)
[15] Write the knapsack(lst, weight_budget) function. It should take a list of tuples, each of which represents a single object of the form (size, value). It should return a tuple of two items: a list of tuples representing the objects that go in the knapsack, and the total value of the items in that list. Note that your solution will have to search over nearly the entire search space to get the optimal answer. This is intended for now.
Here is an example of inputs for this function.
a = [(12,4),(1,2),(4,10),(1,1),(2,2)]
weight_a = 15
In this case, a is the list, and weight_a is the weight budget. The correct answer is ([(2, 2), (1, 1), (4, 10), (1, 2)], 15). (The order of the elements in the returned list is irrelevant, of course.)
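A brute-force sketch using itertools is shown below; the enumeration-by-subset-size strategy is our assumption about how to cover the search space. With the 15-item data set later in this part, this means 2^15 = 32768 subsets, which is still feasible.

```python
from itertools import combinations

def knapsack(lst, weight_budget):
    """Exhaustively try every subset of the items."""
    best_items, best_value = [], 0
    for r in range(len(lst) + 1):
        for subset in combinations(lst, r):
            weight = sum(w for w, _ in subset)
            value = sum(v for _, v in subset)
            if weight <= weight_budget and value > best_value:
                best_items, best_value = list(subset), value
    return best_items, best_value
```

On the example above, knapsack(a, 15) finds the subset with total value 15.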
[15] Now find a (possibly non-optimal) solution to the knapsack problem using simulated annealing. There are a number of ways to change your guess based on T. Find one that works for you. Your answer should usually match the one found by the brute-force search for the list of tuples a and weight_a.
Your function signature should be knapsack_anneal(all_items, max_weight, max_steps, T), where all_items is a list of tuples as it was in knapsack, max_weight is an integer (as is max_steps), and T should be a float.
You will also want to be clever with what the utility function is. Remember that knapsacks that try to fit more weight than the maximum are not allowed.
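One way to encode that constraint in the utility is sketched below. The specific penalty is our choice; any scheme that makes overweight knapsacks score worse than an empty knapsack would serve the same purpose.

```python
def utility(chosen, max_weight):
    """Score a candidate knapsack (a list of (weight, value) tuples)."""
    weight = sum(w for w, _ in chosen)
    value = sum(v for _, v in chosen)
    if weight > max_weight:
        return -(weight - max_weight)   # penalize by how far over budget we are
    return value
```

A negative score for overweight knapsacks lets the annealer pass through infeasible states at high temperature without ever reporting one as the answer.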
Since this solution method should run much faster than brute-force methods, you can test it on a more complex set of objects.
b = [(70,135), (73,139), (77,149), (80,150), (82,156),
(87,163), (90,173), (94,184), (98,192), (106,201),
(110,210), (113,214), (115,221), (118,229), (120,240)]
weight_b = 750
The best possible score for this data set has a weight of 749 and a value of 1458:
([(70, 135), (77, 149), (82, 156), (90, 173),
(94, 184), (98, 192), (118, 229), (120, 240)], 1458)
Don't worry about getting this exact value, but you should at least be close.
[5] SciPy is a library related to numpy and pandas that is specifically meant for science. As a brief taste of its potential, use scipy.optimize.basinhopping to solve for f from part A. You know what the answers should look like, so find some decent parameters. Be aware that the basinhopping algorithm looks for a global minimum, not a maximum, so you will need to take this into account. Make sure you print the output of the basinhopping function so you can see what the search results are.
By the way, the basinhopping algorithm isn't exactly simulated annealing, but it is in the same broad class of stochastic search algorithms.
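A minimal sketch of how this could look, minimizing -f since basinhopping looks for minima; the x0, niter, and stepsize values here are our guesses, and you should tune your own:

```python
from scipy.optimize import basinhopping

def f(x):
    return -2 * (x + 6) * (x + 2) * (x - 3) * x

def neg_f(x):
    # basinhopping minimizes, so hand it the negation of f.
    return -f(x[0])

result = basinhopping(neg_f, x0=[0.0], niter=100, stepsize=5.0)
print(result.x, -result.fun)   # location and value of the maximum of f
```

The maximum of f sits near x = -4.62 with f(x) around 254.6, so negating result.fun recovers the value you expect from part A.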
[10] Write whatever functions you need for scipy.optimize.basinhopping to be applicable to the knapsack problem. Note how much cleaner and easier this is. Test it on both knapsack problems from part B.
Note that you might want to write your own step function because knapsack is discrete and basinhopping works for continuous functions by default. Here is some example code to help.
class MyTakeStep(object):
    def __call__(self, x):
        # x is the array that represents which items are in the knapsack.
        # Each value in the array is 0 if the item is not present or 1 if it
        # is. Modify the array here in some random way.
        return x
Similarly, the function you want to optimize should take as its input an array of 0/1 values and compute the resulting value of the knapsack given that the 0 items were discarded and the 1 items were retained.
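That objective could be sketched as follows; rounding each entry of x to decide membership and returning a positive penalty for overweight knapsacks are both our assumptions:

```python
import numpy as np

b = [(70, 135), (73, 139), (77, 149), (80, 150), (82, 156),
     (87, 163), (90, 173), (94, 184), (98, 192), (106, 201),
     (110, 210), (113, 214), (115, 221), (118, 229), (120, 240)]

def neg_knapsack_value(x, items=b, max_weight=750):
    """Negative knapsack value of the 0/1 selection x (basinhopping minimizes)."""
    chosen = [items[i] for i in range(len(items)) if round(x[i]) >= 1]
    weight = sum(w for w, _ in chosen)
    value = sum(v for _, v in chosen)
    if weight > max_weight:
        return float(weight - max_weight)   # overweight: positive penalty
    return -float(value)
```

Minimizing this function is then equivalent to maximizing the value of a feasible knapsack.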
Running the basinhopping algorithm might give you runtime warnings. These are not usually a problem. You can pipe the warning (stderr) output to /dev/null if you want to ignore it.
[10] Play with the T and niter arguments. Find what parameters work well on knapsack_b from part B. This knapsack has a weight limit of 750. What is the best answer you get? Can you get the actual best value (1458)?
Submit the files lab6a.py, lab6b.py, and lab6c.py.