Last Update: 2021/08/20
Current Version: v0.2 (P)
This is a brief survey of machine learning for combinatorial optimization, a fascinating and significant topic. We mainly collect relevant papers from top conferences, classify them into several categories by method type, and note additional information for each.
Omissions and errors are inevitable given the limits of the author's knowledge; feedback on anything you find is welcome.
- Survey
- Analysis
- End-to-end
  - RNN/Attention-based
  - GNN-based
  - Other
- Local Search
- B&B-based
## Survey

**On Learning and Branching: a Survey**
- Publication: TOP 2017
- Keyword: Branch
- Link: paper

**Boosting Combinatorial Problem Modeling with Machine Learning**
- Publication: IJCAI 2018
- Keyword: ML
- Link: arXiv

**A Review of Combinatorial Optimization with Graph Neural Networks**
- Publication: BigDIA 2019
- Keyword: GNN
- Link: paper

**Learning Graph Matching and Related Combinatorial Optimization Problems**
- Publication: IJCAI 2020
- Keyword: GNN, Graph Matching
- Link: paper

**Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking**
- Publication: IEEE ACCESS 2020
- Keyword: GNN, Computer Network
- Link: paper

**A Survey on Reinforcement Learning for Combinatorial Optimization**
- Publication: arXiv 2020
- Keyword: RL
- Link: arXiv

**Reinforcement Learning for Combinatorial Optimization: A Survey**
- Publication: arXiv 2020
- Keyword: RL
- Link: arXiv

**Graph Learning for Combinatorial Optimization: A Survey of State-of-the-Art**
- Publication: Data Science and Engineering 2021
- Keyword: GNN
- Link: arXiv

**Combinatorial optimization and reasoning with graph neural networks**
- Publication: arXiv 2021
- Keyword: GNN
- Link: arXiv

## Analysis

**Learning to Branch**
- Publication: ICML 2018
- Keyword: Branch
- Link: arXiv

**Approximation Ratios of Graph Neural Networks for Combinatorial Problems**
- Publication: NeurIPS 2019
- Keyword: GNN
- Link: arXiv

**On Learning Paradigms for the Travelling Salesman Problem**
- Publication: NeurIPS 2019 (Workshop)
- Keyword: RL vs SL
- Link: arXiv

**On the Difficulty of Generalizing Reinforcement Learning Framework for Combinatorial Optimization**
- Publication: ICML 2021 (Workshop)
- Keyword: GNN
- Link: arXiv
## End-to-end

### RNN/Attention-based

**Pointer Networks**
- Publication: NeurIPS 2015
- CO-problem: finding planar convex hulls, computing Delaunay triangulations, TSP
- ML-type: SL
- Component: LSTM, Seq2Seq, Attention
- Innovation: Ptr-Net not only improves over sequence-to-sequence with input attention, but also generalizes to variable-size output dictionaries.
- Link: arXiv
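The pointer mechanism is easy to sketch: the decoder scores every encoder state, and the softmax over those scores *is* the output distribution, so the "vocabulary" is the input itself and grows with it. A minimal pure-Python sketch with toy dimensions; the random `W1`, `W2`, `v` stand in for learned parameters:

```python
import math
import random

def pointer_attention(enc_states, dec_state, W1, W2, v):
    # u_j = v . tanh(W1 @ e_j + W2 @ d): one score per INPUT position
    scores = []
    for e in enc_states:
        u = sum(vk * math.tanh(
                sum(W1[k][i] * e[i] for i in range(len(e))) +
                sum(W2[k][i] * dec_state[i] for i in range(len(dec_state))))
                for k, vk in enumerate(v))
        scores.append(u)
    mx = max(scores)                       # stable softmax over the scores
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]           # distribution over input positions

random.seed(0)
d = 4                                                    # toy hidden size
enc = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(5)]
dec = [random.uniform(-1, 1) for _ in range(d)]
W1 = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
W2 = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
v = [random.uniform(-1, 1) for _ in range(d)]

p = pointer_attention(enc, dec, W1, W2, v)
assert len(p) == len(enc) and abs(sum(p) - 1.0) < 1e-9
```

The key property is visible in the shapes: five inputs produce a five-way distribution, without any fixed output layer.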
**Neural Combinatorial Optimization with Reinforcement Learning**
- Publication: ICLR 2017
- CO-problem: TSP
- ML-type: RL (Actor-Critic)
- Component: LSTM, Seq2Seq, Attention
- Innovation: Presents a framework for tackling combinatorial optimization problems using neural networks and reinforcement learning.
- Link: arXiv

**Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems**
- Publication: AAAI 2021
- CO-problem: VRP, TSP
- ML-type: RL (REINFORCE)
- Component: Self-Attention
- Innovation: Proposes a Multi-Decoder Attention Model (MDAM) that trains multiple diverse policies, which increases the chance of finding good solutions compared with methods that train only one policy; a customized beam search strategy fully exploits the diversity of MDAM.
- Link: arXiv

**Learning Improvement Heuristics for Solving Routing Problems**
- Publication: TNNLS 2021
- CO-problem: TSP, VRP
- ML-type: RL (n-step Actor-Critic)
- Component: Self-Attention
- Innovation: Proposes a self-attention-based deep architecture as the policy network to guide the selection of the next solution.
- Link: arXiv

**The Transformer Network for the Traveling Salesman Problem**
- Publication: arXiv 2021
- CO-problem: TSP
- ML-type: RL
- Component: Transformer
- Innovation: Adapts the Transformer architecture, originally developed for natural language processing, to the combinatorial TSP.
- Link: arXiv

**Matrix Encoding Networks for Neural Combinatorial Optimization**
- Publication: arXiv 2021
- CO-problem: TSP
- ML-type: RL (REINFORCE)
- Component: Attention
- Innovation: MatNet encodes the matrix-style relationship data found in many CO problems.
- Link: arXiv
### GNN-based

**Learning Combinatorial Optimization Algorithms over Graphs**
- Publication: NeurIPS 2017
- CO-problem: MVC, MAXCUT, TSP
- ML-type: RL (Q-learning)
- Component: GNN (structure2vec, S2V)
- Innovation: Combines reinforcement learning and graph embedding: the learned greedy policy behaves like a meta-algorithm that incrementally constructs a solution, with each action determined by the output of a graph embedding network capturing the current state of the solution.
- Link: arXiv
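The greedy meta-algorithm in this line of work can be sketched independently of the learned model: repeatedly score every remaining node with a value function and add the best one to the partial solution. Below, a hand-coded `q_hat` (the number of still-uncovered edges a node would cover, i.e. a degree heuristic) stands in for the S2V network on a toy minimum-vertex-cover instance; this is a sketch of the construction loop, not of the learned embedding:

```python
# Greedy construction in the style of S2V-DQN: at each step, score every
# remaining node with a value function and add the argmax to the solution.

def q_hat(graph, partial, node):
    # surrogate Q-value: number of still-uncovered edges this node covers
    return sum(1 for u, v in graph
               if node in (u, v) and u not in partial and v not in partial)

def greedy_construct(graph, nodes):
    cover = set()
    while any(u not in cover and v not in cover for u, v in graph):
        best = max((n for n in nodes if n not in cover),
                   key=lambda n: q_hat(graph, cover, n))
        cover.add(best)              # action = add the highest-scoring node
    return cover

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
cover = greedy_construct(edges, range(5))
assert all(u in cover or v in cover for u, v in edges)
```

Swapping `q_hat` for a trained network is exactly what turns this generic greedy loop into a learned heuristic.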
: arXivReinforcement Learning for Solving the Vehicle Routing Problem
Publication
: NeurIPS 2018CO-problem
: VRPML-type
: RLComponent
: GNN (GCN)Innovation
: This paper presents an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning.Link
: arXivAttention, Learn to Solve Routing Problems!
Publication
: NeurIPS 2018CO-problem
: TSP, VRPML-type
: RL (REINFORCE+rollout baseline)Component
: Attention, GNNInnovation
: This paper proposes a model based on attention layers with benefits over the Pointer Network and we show how to train this model using REINFORCE with a simple baseline based on a deterministic greedy rollout, which we find is more efficient than using a value function.Link
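The rollout baseline is the part worth internalizing: the advantage of a sampled solution is its cost minus the cost of a greedy (argmax) rollout, so no value network is needed. A toy sketch on a two-solution "routing" problem; the `costs` table and the single-softmax policy are illustrative stand-ins, not the paper's model:

```python
import math
import random

costs = [3.0, 1.0]                   # costs of two candidate tours (toy)

def probs(theta):
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [x / s for x in z]

def reinforce_step(theta, rng, lr=0.5):
    p = probs(theta)
    a = 0 if rng.random() < p[0] else 1            # sampled rollout
    b = max(range(len(p)), key=lambda i: p[i])     # greedy baseline rollout
    adv = costs[a] - costs[b]                      # advantage vs. baseline
    for i in range(len(theta)):                    # descend adv * grad log p(a)
        g = (1.0 if i == a else 0.0) - p[i]
        theta[i] -= lr * adv * g

rng = random.Random(1)
theta = [0.0, 0.0]
for _ in range(200):
    reinforce_step(theta, rng)
assert probs(theta)[1] > 0.9     # policy concentrates on the cheaper tour
```

Note that when the sample and the greedy rollout coincide the advantage is zero, so updates naturally fade as the policy becomes confident.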
**Learning to Solve NP-Complete Problems - A Graph Neural Network for Decision TSP**
- Publication: AAAI 2019
- CO-problem: TSP
- ML-type: SL
- Component: RNN, GNN
- Innovation: Proposes a GNN model in which edges (embedded with their weights) communicate with vertices.
- Link: arXiv

**Learning a SAT Solver from Single-Bit Supervision**
- Publication: ICLR 2019
- CO-problem: SAT
- ML-type: SL
- Component: GNN, LSTM
- Innovation: NeuroSAT enforces both permutation invariance and negation invariance.
- Link: arXiv

**End to end learning and optimization on graphs**
- Publication: NeurIPS 2019
- CO-problem: MAXCUT
- ML-type: UL
- Component: GNN
- Innovation: Proposes ClusterNet for decision-focused learning: include a differentiable solver for a simple proxy of the true, difficult optimization problem, and learn a representation that maps the difficult problem onto the simpler one (relax + differentiate).
- Link: arXiv

**Efficiently Solving the Practical Vehicle Routing Problem: A Novel Joint Learning Approach**
- Publication: KDD 2020
- CO-problem: VRP
- ML-type: SL, RL (REINFORCE + rollout baseline)
- Component: GCN
- Innovation: GCN-NPEC is based on a graph convolutional network with node features (coordinates and demand) and edge features (real distances between nodes) as input. Separate decoders decode the two representations, and the output of each decoder supervises the other. Training combines the supervised and reinforcement learning manners.
- Link: paper

**Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning**
- Publication: NeurIPS 2020
- CO-problem: JSSP
- ML-type: RL (PPO)
- Component: GNN
- Innovation: Exploits the disjunctive graph representation of the JSSP and proposes a GNN-based scheme to embed the states encountered during solving.
- Link: arXiv

**Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time**
- Publication: arXiv 2020
- CO-problem: TSP, VRP
- ML-type: RL
- Component: GNN
- Innovation: Develops a framework for any combinatorial optimization problem over graphs that can be formulated as a single-player game defined by states, actions, and rewards, including minimum spanning tree, shortest paths, TSP, and VRP, without expert knowledge.
- Link: arXiv

**Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning**
- Publication: AAAI 2020 (Workshop)
- CO-problem: TSP
- ML-type: Hierarchical RL
- Component: GNN
- Innovation: GPNs build upon Pointer Networks by introducing a graph embedding layer on the input, which captures relationships between nodes. To approximate solutions to constrained problems such as the TSP with time windows, hierarchical GPNs (HGPNs) are trained with RL to learn a hierarchical policy that finds an optimal city permutation under constraints.
- Link: arXiv

**A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs**
- Publication: arXiv 2021
- CO-problem: Directed Acyclic Graph scheduling, Graph Edit Distance, Hamiltonian Cycle Problem
- ML-type: RL (PPO)
- Component: GNN, ResNet, Attention
- Innovation: Develops a bi-level framework with an upper-level learning method that optimizes the graph (e.g., adds, deletes, or modifies edges), fused with a lower-level heuristic algorithm that solves on the optimized graph. This bi-level approach simplifies learning on the original hard CO problem and effectively mitigates the demand for model capacity.
- Link: arXiv

**SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems**
- Publication: arXiv 2021
- CO-problem: VRP, JSSP
- ML-type: RL (DQN, MCTS)
- Component: GNN
- Innovation: Learn offline with a DQN, then search online with MCTS.
- Link: arXiv
**Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing**
- Publication: NeurIPS 2020
- CO-problem: VRP
- ML-type: RL
- Component: /
- Innovation: Policy evaluation with mixed-integer optimization: at the policy evaluation step, the action selection problem from each state is formulated as a mixed-integer program, combining the combinatorial structure of the action space with the neural architecture of the value function by adapting the branch-and-cut approach.
- Link: arXiv

### Other

**Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search**
- Publication: NeurIPS 2018
- CO-problem: MIS, MVC, MC, SAT
- ML-type: SL
- Component: GNN (GCN)
- Innovation: A graph convolutional network is trained to estimate, for each vertex, the likelihood that it is part of the optimal solution. The network is designed and trained to synthesize a diverse set of solutions, which enables rapid exploration of the solution space via tree search.
- Link: arXiv
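The vertex-likelihood idea can be sketched without the network: given per-vertex scores (here a hand-written `scores` list standing in for GCN output), one greedy branch adds vertices in descending score order while keeping the set independent; the paper's tree search explores many such branches in parallel:

```python
# Probability-guided greedy construction for maximum independent set:
# vertices are taken in descending score order while independence holds.
# The scores are a placeholder for the GCN's per-vertex likelihood map.

def greedy_mis(n, edges, scores):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    chosen = set()
    for v in sorted(range(n), key=lambda v: -scores[v]):
        if not adj[v] & chosen:          # keep the set independent
            chosen.add(v)
    return chosen

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]     # a 4-cycle
scores = [0.9, 0.2, 0.8, 0.1]                # stand-in GCN output
mis = greedy_mis(4, edges, scores)
assert mis == {0, 2}
```

On the 4-cycle, the two highest-scoring non-adjacent vertices form the optimal independent set, which is exactly what the guided greedy branch recovers.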
## Local Search

**Learning to Perform Local Rewriting for Combinatorial Optimization**
- Publication: NeurIPS 2019
- CO-problem: JSSP, VRP
- ML-type: RL (Actor-Critic)
- Component: LSTM
- Innovation: NeuRewriter learns a policy to pick heuristics and rewrite local components of the current solution, iteratively improving it until convergence. The policy factorizes into a region-picking and a rule-picking component, each parameterized by a neural network trained with actor-critic RL.
- Link: arXiv
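The rewrite loop itself is simple to sketch. Below, exhaustive scoring of adjacent swaps stands in for the learned region-picking and rule-picking networks, on a toy sequence objective (sum of position times weight, which is minimized by sorting weights in descending order):

```python
# NeuRewriter-style loop: repeatedly pick a local region of the solution,
# apply a rewriting rule, keep the result, and stop at convergence. The
# best-gain scan below stands in for the two learned picker networks.

def cost(seq):
    # toy objective: sum of (position * weight); descending order is optimal
    return sum((i + 1) * w for i, w in enumerate(seq))

def rewrite_until_converged(seq):
    seq = list(seq)
    while True:
        # "region-picker": adjacent pair whose swap improves cost the most
        best_i, best_gain = None, 0.0
        for i in range(len(seq) - 1):
            swapped = seq[:i] + [seq[i + 1], seq[i]] + seq[i + 2:]
            gain = cost(seq) - cost(swapped)
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:               # converged: no improving rewrite
            return seq
        # "rule-picker": here the only rule is an adjacent swap
        seq[best_i], seq[best_i + 1] = seq[best_i + 1], seq[best_i]

out = rewrite_until_converged([4, 3, 1, 2])
assert out == [4, 3, 2, 1]
```

The learned version replaces the exhaustive scan with network-scored choices, which is what makes the approach scale beyond toy neighborhoods.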
**A Learning-based Iterative Method for Solving Vehicle Routing Problems**
- Publication: ICLR 2020
- CO-problem: VRP
- ML-type: RL (REINFORCE)
- Component: /
- Innovation: "Learn to Improve" (L2I)
- Link: paper
**Exploratory Combinatorial Optimization with Reinforcement Learning**
- Publication: AAAI 2020
- CO-problem: MAXCUT
- ML-type: RL (DQN)
- Component: GNN (MPNN)
- Innovation: ECO-DQN combines S2V-DQN with reversible actions, observation tuning, and intermediate rewards.
- Link: arXiv
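The reversible-action idea is what distinguishes ECO-DQN from one-shot construction: flipping a vertex across the cut can always be undone, so the agent keeps acting, even through zero-gain moves, and returns the best state it has visited. A sketch where the exact flip gain stands in for the Q-network:

```python
# ECO-DQN-style exploration sketch on MAXCUT: actions reversibly flip a
# vertex between the two sides; the agent keeps exploring and returns the
# best cut observed. The exact flip gain stands in for learned Q-values.

def cut_value(edges, side):
    return sum(1 for u, v in edges if side[u] != side[v])

def flip_gain(edges, side, x):
    # change in cut value if vertex x switches sides
    return sum((1 if side[u] == side[v] else -1)
               for u, v in edges if x in (u, v))

def explore(edges, n, steps=8):
    side = [0] * n                       # start from an arbitrary state
    best = cut_value(edges, side)
    for _ in range(steps):
        x = max(range(n), key=lambda v: flip_gain(edges, side, v))
        side[x] ^= 1                     # reversible action: flip vertex x
        best = max(best, cut_value(edges, side))
    return best

cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
assert explore(cycle5, 5) == 4           # optimal cut of a 5-cycle
```

Because no action is final, the agent can wander through equal-value states instead of getting stuck the moment greedy improvement runs out.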
**Rewriting By Generating: Learn To Solve Large-Scale Vehicle Routing Problems**
- Publication: ICLR 2021
- CO-problem: VRP
- ML-type: Hierarchical RL (REINFORCE)
- Component: LSTM, K-means, PCA
- Innovation: Inspired by the classical idea of divide-and-conquer, presents a Rewriting-by-Generating (RBG) framework with hierarchical RL agents to solve large-scale VRPs: a rewriter agent refines the customer division globally, and an elementary generator infers regional solutions locally.
- Link: paper

## B&B-based

**Learning to Search in Branch and Bound Algorithms**
- Publication: NeurIPS 2014
- CO-problem: MILP
- ML-type: RL (Imitation Learning)
- Component: /
- Innovation: Learns branch-and-bound search by imitation learning (a node selection policy and a node pruning policy).
- Link: paper

**Learning to Branch in Mixed Integer Programming**
- Publication: AAAI 2016
- CO-problem: MIP
- ML-type: RL (Imitation Learning)
- Component: /
- Innovation: Proposes the first successful ML framework for variable selection in MIP.
- Link: paper

**Learning to Run Heuristics in Tree Search**
- Publication: IJCAI 2017
- CO-problem: MIP
- ML-type: SL
- Component: classifier
- Innovation: Central to this approach is the use of machine learning to predict whether a heuristic will succeed at a given node.
- Link: arXiv

**Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning**
- Publication: AAAI 2019
- CO-problem: MIS, MAXCUT
- ML-type: RL (Q-learning)
- Component: /
- Innovation: A generic deep-RL approach for obtaining variable orderings that tighten the bounds obtained with relaxed and restricted decision diagrams (DDs).
- Link: arXiv

**Exact Combinatorial Optimization with Graph Convolutional Neural Networks**
- Publication: NeurIPS 2019
- CO-problem: MILP
- ML-type: RL (Imitation Learning)
- Component: GNN (GCN)
- Innovation: Proposes a graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs.
- Link: arXiv
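The bipartite encoding this model relies on is easy to sketch: one node per variable, one per constraint, an edge for every nonzero coefficient, and alternating "half-convolutions" between the two sides. Below, a single mean-aggregation pass with placeholder features (objective coefficients and right-hand sides) stands in for the learned layers:

```python
# Variable-constraint bipartite encoding sketch: a single mean-aggregation
# pass from constraint nodes to variable nodes. Features are minimal
# placeholders for the richer ones used by learned branching models.

# toy MILP:  max x0 + 2*x1   s.t.  x0 + x1 <= 1,  x1 <= 1   (x binary)
obj = [1.0, 2.0]
rows = [{0: 1.0, 1: 1.0}, {1: 1.0}]      # constraint -> {variable: coeff}
rhs = [1.0, 1.0]

var_feat = [[c] for c in obj]            # variable nodes: objective coeff
con_feat = [[b] for b in rhs]            # constraint nodes: right-hand side

def half_conv(targets, sources, edges):
    # mean-aggregate source features into each target along bipartite edges
    out = []
    for t, feat in enumerate(targets):
        nbr = [sources[s][0] for s, w in edges(t)]
        out.append(feat + [sum(nbr) / len(nbr) if nbr else 0.0])
    return out

# constraint -> variable pass: each variable sees its constraints' rhs
var_edges = lambda j: [(i, rows[i][j]) for i in range(len(rows)) if j in rows[i]]
var_repr = half_conv(var_feat, con_feat, var_edges)

assert var_repr[0] == [1.0, 1.0]         # x0 appears in constraint 0 only
assert var_repr[1] == [2.0, 1.0]         # x1 averages the rhs of both rows
```

The learned version interleaves several such passes in both directions with trained weights; the point here is only the graph construction, which is identical for any MILP.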
**Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization**
- Publication: AAAI 2021
- CO-problem: TSP, Portfolio Optimization Problem
- ML-type: RL (DQN, PPO)
- Component: GNN (GAT), Set Transformer
- Innovation: Proposes a general hybrid approach based on DRL and constraint programming (CP) for solving combinatorial optimization problems, built on a dynamic programming formulation that acts as a bridge between the two techniques.
- Link: arXiv

**Solving Mixed Integer Programs Using Neural Networks**
- Publication: arXiv 2021
- CO-problem: MIP
- ML-type: RL (Imitation Learning)
- Component: GNN (bipartite graph)
- Innovation: Neural Diving + Neural Branching
- Link: arXiv
---

**Reinforcement Learning for (Mixed) Integer Programming: Smart Feasibility Pump**
- Publication: ICML 2021 (Workshop)
- CO-problem: MIP
- Link: arXiv

**Learning Local Search Heuristics for Boolean Satisfiability**
- Publication: NeurIPS 2019
- CO-problem: SAT
- Link: paper

**Reinforcement Learning on Job Shop Scheduling Problems Using Graph Networks**
- Publication: arXiv 2020
- CO-problem: JSSP
- Link: arXiv

**A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing**
- Publication: AAAI 2020 (Workshop)
- CO-problem: 3D-BPP
- Link: arXiv

**A Reinforcement Learning Approach to Job-shop Scheduling**
- Publication: IJCAI 1995
- CO-problem: JSSP
- Link: paper

**A Reinforcement Learning Environment For Job-Shop Scheduling**
- Publication: arXiv 2021
- CO-problem: JSSP
- Link: arXiv

**Online 3D Bin Packing with Constrained Deep Reinforcement Learning**
- Publication: AAAI 2021
- CO-problem: 3D-BPP
- Link: arXiv

**A Data-Driven Approach for Multi-level Packing Problems in Manufacturing Industry**
- Publication: KDD 2019
- CO-problem: 3D-BPP
- Link: ACM DL

**Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method**
- Publication: arXiv 2017
- CO-problem: 3D-BPP
- Link: arXiv
**Meta-Learning-based Deep Reinforcement Learning for Multiobjective Optimization Problems**
- Publication: arXiv 2021
- CO-problem: TSP
- Link: arXiv

**Dynamic Job-Shop Scheduling Using Reinforcement Learning Agents**
- Publication: ROBOT AUTON SYST 2000
- CO-problem: JSSP
- Link: paper