I do theoretical computer science research on heuristic algorithms for the TSP. My main contribution is my yet-to-be-published paper “Theoretical Basis for TSP Heuristics” which outlines new methods of analysis for explaining when TSP heuristic algorithms find the optimal tour and when they do not.
My hypothesis is that existing heuristics in the k-opt family of algorithms are actually much better than historically thought (data somewhat supports this because the LKH algorithm is currently state of the art for solving large instances of the Traveling Salesman Problem). What is novel about my research is that it exposes under what conditions 2-opt, 3-opt, and k-opt will or will not find the optimal tour. This is important because as far as I know it’s the first framework for being able to compare instances between algorithms. The questions I’m trying to answer are: “What instances of the TSP are hard? Are they hard for all algorithms? Or at least: are they hard for all algorithms in the k-opt family of algorithms? Is it the case that these hard instances are hard for all algorithms (they overlap) or are they only hard for certain algorithms so running a different algorithm in parallel will solve the instances sub-exponentially (they do not overlap)?”
The NP vs P Problem
The reason I began studying TSP Heuristics, is that in college I was exposed to the NP vs P problem that was announced as one of the Millennium Prize Problems which carries a $1,000,000 prize. A math professor from a Calculus 3 course I took in high school told me about how he would work on an open problem related to his PhD thesis in his free time in airports and places where he had to wait. I took up this practice for a while, but in 2013 in between startups I worked on it full time for a while and began taking it seriously.
Since then it’s become my goal to solve the NP vs. P problem – I will probably never achieve it given all the computer science luminaries that have worked on it and have failed to solve it, but man is it fun to try.
My approach has been to analyze NP-complete problems, because solving one of those would topple the whole class. The only NP-complete problems I ever had any intuition for were the Traveling Salesman Problem and 3-SAT, but with SAT I always ended up going in circles.
The Non-Euclidean Fully Connected TSP
So I decided to focus on the TSP – since I wanted to solve the most generic version of the problem, I focus on the discrete non-euclidean TSP. This is important because it means the adversary can set the edge costs arbitrarily and they don’t need to be related as they are in the Euclidean TSP. I only focus on fully connected graphs because it’s easier to reason about when you can always assume two cities are connected, and besides if you can solve the fully connected case you can solve the non fully connected case by putting infinity as edge weights on the missing edges.
After years of research, I eventually rediscovered 2-opt, then found it in the literature and did a deep dive. I then found the Lin-Kernighan algorithm and Helsguan’s famed implementation. The effectiveness of these algorithms led me to a question that nobody seemed to be asking: “When do these algorithms find the optimal tour and when don’t they?” I’ve answered that question in my paper, and the question I’m working on now is: “Can we tweak a Lin-Kernighan or k-opt like algorithm to get an exponential speedup that will always find the optimal tour?”
My Hypothesis on the NP vs. P Problem
88% of experts in the CS Theory Community believe NP ≠ P, though no one has a proof. After studying the TSP for years, I’m convinced that there are fast algorithms. I think the major problem is with algorithms like Lin-Kernighan no one knows how to analyze them – and the question is just how fast are they and do they always find the optimal tour? That’s been a focus of my research.
Some CS Theory Resources
What is the NP vs P Problem?
The best, most consumable explanation of the NP vs P problem I’ve seen is by Jade Tan-Holmes from Up and Atom on YouTube.
If you’re up for it and want the formal definition – the official problem statement from the Clay Mathematics institute for the Millennium Prize Problems is thorough but a hard read. If you don’t have a CS background or degree, MIT News did a pretty good layman’s explanation.
Brilliant also has a public page on NP vs P which gets a little more technical (read accurate) but written clearly.
Some resources and explanations from LMU. David S. Johnson’s NP-Completeness Columns – I had to do some digging to find David S. Johnson’s columns.
MIT OpenCourseWare’s take on NP vs P – good explanation on complexity & algorithmic reductions.
Scott Aaronson’s comprehensive survey of the NP vs P problem.
David S. Johnson’s TSP Course Slides
The YouTube channel Reducible has the best most up-to-date overview of approaches to the TSP.
It’s over 15 years old, but “The Traveling Salesman Problem: A Computational Study” is still the most comprehensive book on the TSP. It’s a must read if you’re seriously interested in the problem.
For the latest TSP performance records and test instances head over to University of Waterloo’s TSP page.
Amazon put on a contest for solving the TSP a few years back. Bill cook was one of the winners (with Held & Helsgaun) and did a write up.
2021 Bill Cook talk on the history & solutions to the TSP
The Concorde TSP project – open source code that is state of the art for solving large tsp instances. (And a python concorde wrapper)
A Julia wrapper for LKH by Changhyun Kwon
“The Traveling Salesman Problem: A Case Study in Local Optimization” by David S. Johnson & Lyle A. McGeoch – all about local search algorithms for the tsp, pretty good, comprehensive read.
Lecture slides on state of the art local search methods with some implementation details by Thomas Stutzle
Learning About the TSP for Newbies
Basic discrete math introduction to the TSP
TSP State of the Art
The LKH Algorithm is the current state of the art algorithm for finding good tours.
Held–Karp algorithm is the best known exact algorithm for the TSP (though it takes exponential space).
Concorde is the current best tsp open source solver.
The Karlin-Klein-Gharan Algorithm is the best approximation algorithm for the TSP (at least the metric TSP).
Selected Recent Papers on the TSP
“Smoothed Analysis of the 2-Opt Heuristic for the TSP: Polynomial Bounds for Gaussian Noise” by Bodo Manthey & Rianne Veenstra
“Improved Smoothed Analysis of 2-Opt for the Euclidean TSP” by Bodo Manthey and Jesse van Rhijn
“Smoothed Analysis of the 2-Opt Algorithm for the General TSP” by Matthias Englert, Heiko Roglin, & Berthold Vocking
Classic Papers on the TSP
“A Method for Solving Traveling-Salesman Problems” by G. A. Croes. One of the original 2-opt papers.
“The Traveling-Salesman Problem” by Merrill M. Flood. One of the original 2-opt papers.
“Computer Solutions of the Traveling Salesman Problem” by Shen Lin. The original 3-opt paper.
“An Effective Heuristic Algorithm for the Traveling-Salesman Problem” by S. Lin and B. W. Kernighan. The original Lin-Kernighan algorithm paper.
“An Effective Implementation of the Lin-Kernighan Traveling Salesman Heuristic” by Keld Helsgaun. The original LKH paper.
“A dynamic programming approach to sequencing problems” by Michael Held and Richard M. Karp. The original Held-Karp algorithm paper. I couldn’t find a free version of the original paper online – if you have one or have a link email me and I’ll update this page.
“On the Complexity of Local Search for the Traveling Salesman Problem” & “Some Examples of Difficult Traveling Salesman Problems” by Christos H. Papadimitriou and Kenneth Steiglitz (and a refutation of Papdimitriou and Steiglitz’s paper’s conclusions about hardness)
Collection of Interesting Articles
1985 LA Times article interviewing Shen Lin about solving the TSP for the capitals of the 48 contiguous states. (Lin of Bell Labs who invented 3-opt & k-opt).