It's by no means polished, but it can be pretty fun to play around with: you can visualize how the iterates of different LP algorithms (described in sections 11 and 12 of the book) react to changes in the feasible region or objective just by dragging the vertices and constraints around.
If you go to https://lpviz.net/?demo, it will draw a polytope for you and click around the interface to show off some of the features. I'm constantly chipping away at it in my free time, so I welcome any feedback and suggestions!
- "Essentials of Metaheuristics" by Sean Luke https://cs.gmu.edu/~sean/book/metaheuristics/
- "Clever Algorithms" by Jason Brownlee https://cleveralgorithms.com/
Timefold uses the metaheuristic algorithms in these books (Tabu Search, Late Acceptance, Simulated Annealing, etc.) to find near-optimal solutions quickly from a score function (typically defined in a Java Stream-like/SQL-like syntax, so score calculation can be done incrementally, which improves score calculation speed).
You can see simplified diagrams of these algorithms in action in Timefold's docs: https://docs.timefold.ai/timefold-solver/latest/optimization....
Disclosure: I work for Timefold.
var score = 0;
for (var shiftA : solution.getShifts()) {
    for (var shiftB : solution.getShifts()) {
        // Penalize every ordered pair of distinct shifts assigned to
        // the same employee that overlap in time.
        if (shiftA != shiftB && shiftA.getEmployee() == shiftB.getEmployee() && shiftA.overlaps(shiftB)) {
            score -= 1;
        }
    }
}
return score;
usually takes shifts * shifts evaluations of overlaps; instead, we only check the shifts affected by the change (usually turning it from O(N^2) into O(1)). That being said, it might be useful for a move selector. I need to give it a more in-depth reading.
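To make the incremental idea concrete, here's a minimal sketch with toy classes (this is NOT Timefold's actual API, just an illustration of the principle): when one shift assignment changes, only the pairs involving the old and new shift can change the score, so you compute a delta instead of re-scoring everything.

```java
import java.util.List;

class IncrementalScoreSketch {
    // Toy shift: an employee id plus [start, end) hours on one day.
    record Shift(int employee, int start, int end) {
        boolean overlaps(Shift other) {
            return start < other.end && other.start < end;
        }
        boolean conflictsWith(Shift other) {
            return this != other && employee == other.employee && overlaps(other);
        }
    }

    // Naive: re-check every ordered pair, O(N^2) per evaluation.
    static int naiveScore(List<Shift> shifts) {
        int score = 0;
        for (Shift a : shifts)
            for (Shift b : shifts)
                if (a.conflictsWith(b)) score -= 1;
        return score;
    }

    // Incremental: replacing oldShift with newShift only affects pairs
    // that involve one of those two shifts. This loop is O(N); with an
    // index by employee it becomes effectively O(1).
    static int scoreDelta(List<Shift> shifts, Shift oldShift, Shift newShift) {
        int delta = 0;
        for (Shift other : shifts) {
            if (other == oldShift) continue;
            // Each conflicting pair is counted twice by the naive loop.
            if (oldShift.conflictsWith(other)) delta += 2;
            if (newShift.conflictsWith(other)) delta -= 2;
        }
        return delta;
    }
}
```

Applying `scoreDelta` to the old score gives the same result as re-running `naiveScore` on the changed solution, without touching the unaffected pairs.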
I've been wanting something like this for a long time, and I regret not knowing about the first edition.
If you are wondering why this is a more interesting problem than, say, sorting a list, the answer is that optimization algorithms are attempts at the ideal of a fully general problem solver. Instead of writing a program to solve the problem, you write a program to recognize what a solution would look like, which is often much easier, for example with a labeled dataset. Then you apply the optimization algorithm on your program. And that is how current AI is being done, with automatic differentiation and variants of Adam, but there are many other algorithms for optimization which may be better alternatives in some circumstances.
In practice that's basically the mindset, but full generality isn't technically possible because of the no free lunch theorem.
If so, I agree it is impossible for a fully general problem solver to find the optimal solution to a problem in a reasonable amount of time (unless P = NP, which is unlikely).
However, if a "good enough" solution that is only 1% worse than optimal works, then a fully general solver can do the job in a reasonable amount of time.
One such example of a fully general solver is Timefold; you express your constraints using plain old Java objects, so you can in theory do whatever you want in your constraint functions (you can even do network calls, but that is extremely ill-advised since that will drastically slow down score calculation speeds).
Disclosure: I work for Timefold.
There cannot be a guarantee of finding a solution within a given percentage of optimal for a fully general problem, since you would need to know the optimum to give such a guarantee (and since the problem is fully general, you cannot use the structure of the problem to reduce it).
Most constraint problems have many feasible solutions, and have a way to judge how much worse or better one solution is than another.
There are good and bad ways to write constraints.
One bad pattern is a score trap, where a clearly better solution has the same score as a clearly worse one.
For example, for shift scheduling, a solution with only 1 overlapping shift with the same employee is better than a solution with 2 overlapping shifts with the same employee.
A bad score function would penalize both solutions by 1, meaning the solver has no idea which of the two solutions is better.
A good score function would penalize the schedule with 1 overlapping shift with the same employee by 1, and the schedule with 2 overlapping shifts with the same employee by 2.
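Here's a toy sketch of that contrast (invented classes, not Timefold's constraint API): the trapped score function returns -1 whenever any overlap exists, while the graded one penalizes each overlapping pair, so removing a single overlap always improves the score and guides the search.

```java
import java.util.List;

class ScoreTrapSketch {
    record Shift(int employee, int start, int end) {}

    static boolean overlap(Shift a, Shift b) {
        return a.employee == b.employee && a.start < b.end && b.start < a.end;
    }

    // Score trap: -1 if there is ANY overlap, regardless of how many.
    // A solver cannot tell a schedule with 1 overlap from one with 5.
    static int trappedScore(List<Shift> shifts) {
        for (int i = 0; i < shifts.size(); i++)
            for (int j = i + 1; j < shifts.size(); j++)
                if (overlap(shifts.get(i), shifts.get(j))) return -1;
        return 0;
    }

    // Better: -1 per overlapping pair, so fixing one overlap at a time
    // is visible to the solver as progress.
    static int gradedScore(List<Shift> shifts) {
        int score = 0;
        for (int i = 0; i < shifts.size(); i++)
            for (int j = i + 1; j < shifts.size(); j++)
                if (overlap(shifts.get(i), shifts.get(j))) score -= 1;
        return score;
    }
}
```

With one overlapping pair versus two, the trapped function scores both schedules -1, while the graded one scores them -1 and -2 respectively.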
The class of problems I am talking about is the class of problems where you can assign a score to a possible solution, with limited score traps.
Timefold makes no guarantees about finding a solution in reasonable time (but unless you've done something terribly wrong or have a truly massive dataset, it finds a good solution really quickly 99.99% of the time). Instead, you set the termination condition of the solver; it could be time-based (say 60 minutes), unimproved time spent (solve until no better solution is found for 60 minutes), or the first feasible solution (other termination conditions can be set as well).
It says that, averaging over all possible objective functions, no optimization algorithm is better than the median.
If you don't know anything about the objective, then the probability that Timefold is better than a randomly chosen optimization algorithm is 50%.
The key is the point about averaging over all possible objective functions. You presumably don't expect to be even close to successful on such a general set.
So if I understand correctly, it states that every optimization algorithm must choose an order in which to evaluate solutions, and the faster it evaluates the "best" solution, the "better" the algorithm.
For example, say the search space is {"a", "b", "c"} and the best solution is "c". Then there are 3! (6) ways to evaluate all solutions:
"a", "b", "c"
"a", "c", "b"
"b", "a", "c"
"b", "c", "a"
"c", "a", "b"
"c", "b", "a"
(of course, in a real problem, the search space is much, MUCH larger).
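A tiny sketch of the intuition (my own illustration, not from the book): if you measure an evaluation order by how many evaluations it takes to reach the best solution, and the best solution is equally likely to be anywhere, every fixed order has the same average cost.

```java
import java.util.List;

class EvaluationOrderSketch {
    // Number of evaluations a fixed order needs before hitting the best solution.
    static int evaluationsToFind(List<String> order, String best) {
        return order.indexOf(best) + 1;
    }

    // Average cost of an order, assuming the best solution is uniformly
    // likely to be any element of the search space.
    static double averageCost(List<String> order, List<String> space) {
        int total = 0;
        for (String best : space) total += evaluationsToFind(order, best);
        return (double) total / space.size();
    }
}
```

For the {"a", "b", "c"} space above, every one of the 6 orders averages exactly 2 evaluations, which is the no-free-lunch flavor of the argument: without knowledge of where the best solution tends to be, no ordering is better than another.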
The way Timefold tackles this is move selectors. In particular, you can design custom moves that use knowledge of your particular problem, which in turn affects when certain solutions are evaluated. Additionally, you can design custom phases that run your own algorithm and then pass the solution found to local search. One of the projects we are currently working on is the neighborhood API, to make it much easier to write your own custom moves. That being said, for most problems that humans deal with, you don't need to configure Timefold and can just work with the defaults (which is best-fit followed by a late-acceptance local search with change and swap moves).
For what it's worth: metaheuristic solvers tend to be better than linear solvers for shift scheduling and vehicle routing, but worse for bin packing. But they still work, and are still fast.
The algorithm descriptions are clear, the visualizations are great, and, as someone who does a lot of ML work, they cover a lot of (important) topics beyond just what is covered in your standard ML book. This is especially refreshing if you're looking for thinking around optimization that is not just gradient descent (which has been basically the only mainstream approach to optimization in ML for two decades now).
I've known a few people to complain that the code examples are in Julia, but, as someone who doesn't know Julia, if you have any experience doing quantitative programming it should be trivial to convert the Julia examples to your favorite language for implementation (and frankly, I'm a bit horrified that so many "smart" people interested in these sorts of topics seem trapped into reasoning in one specific language).
Optimization is such a rich field and should be of interest to any computer scientist who would describe themselves as "interested in solving hard problems" rather than just applying a well known technique to a specific class of hard problems.
Yes, but even if you are only interested in pragmatically solving problems, off-the-shelf solvers for various optimisation problems are a great toolkit to bring to bear.
Reformulating your specific problem as eg a mixed integer linear programming problem can often give you a quick baseline of performance. Similar for SMT. It also teaches you not to be afraid of NP. And it teaches you a valuable lesson in separation of specification (= your formulation of the problem) and how to compute the solution (= whatever the solver does), which can be applicable in other domains, too.
I use multi-armed bandits a lot at work, and I wonder what other algorithms I should look into. It's hard to read the name of an algorithm or category and get a sense of whether or not it's applicable. It seems like the general process of "which path out of N options is the best?" can be reframed in many ways depending on contextual details such as how large N is, what the feedback latency is, and what the constraints around evaluation and exploration are.
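For readers unfamiliar with the bandit framing, here's a minimal epsilon-greedy sketch (my own toy version, not any particular library's API): track the running mean reward of each of N arms, explore a random arm with probability eps, and otherwise exploit the current best.

```java
import java.util.Random;

class EpsilonGreedySketch {
    final double[] sums;  // total reward observed per arm
    final int[] pulls;    // number of times each arm was pulled
    final double eps;     // exploration probability
    final Random rng;

    EpsilonGreedySketch(int arms, double eps, long seed) {
        this.sums = new double[arms];
        this.pulls = new int[arms];
        this.eps = eps;
        this.rng = new Random(seed);
    }

    // With probability eps pick a random arm; otherwise the best mean so far.
    int selectArm() {
        if (rng.nextDouble() < eps) return rng.nextInt(sums.length);
        int best = 0;
        for (int i = 1; i < sums.length; i++)
            if (mean(i) > mean(best)) best = i;
        return best;
    }

    double mean(int arm) {
        return pulls[arm] == 0 ? 0.0 : sums[arm] / pulls[arm];
    }

    void update(int arm, double reward) {
        sums[arm] += reward;
        pulls[arm] += 1;
    }
}
```

The contextual details mentioned above map directly onto which variant fits: large N pushes toward structured arms (contextual bandits), high feedback latency toward batched updates, and hard exploration constraints toward conservative policies like UCB with tuned bounds.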
There is even a paper in ITOR about this: https://onlinelibrary.wiley.com/doi/abs/10.1111/itor.12001
Unfortunately, there are some "research bubbles" in the community of people who only work with these shady metaheuristics and keep citing each other. It has gotten to the point that I still frequently see people at conferences using such metaheuristics and believing they are fine, only to get criticized by someone in the audience later.
Alg4Opt covers more topics, providing the motivation behind the algorithm, sometimes a basic derivation, and a concrete implementation. It has citations in the margin for more info.
Nocedal and Wright will go more in-depth on derivation, proving theorems, etc. Implementations are pseudocode, and fewer topics are covered.
Picking on what I'd consider one of the major workhorse methods of continuous constrained optimization: Interior Point Methods get a 2-3 page, very high-level summary in this book. Nocedal and Wright give an entire chapter (~25 pages) on the topic (which of course is probably still insufficient detail to implement anything like a competitive solver).
Numerical Optimization (Jorge Nocedal, Stephen J. Wright).