Papers on ML4CO

Gemini Light
wtfly2018@163.com

Last Update: 2021/08/20 | Current Version: v0.2 (P)

Preface

This is a brief survey of machine learning for combinatorial optimization, a fascinating and significant topic. We mainly collect relevant papers from top conferences, classify them into several categories by method type, and note additional information for each entry.

Omissions and errors are inevitable given the limits of the author's knowledge. Feedback on anything you find is welcome.

To-Do List

Combinatorial Optimization Problems

Categories of Machine Learning

Components of Neural Networks

Contents

Survey
Analysis
End-to-end
  RNN/Attention-based
  GNN-based
  Other
Local Search
B&B-based

Survey

  1. On Learning and Branching: a Survey

    • Publication: TOP 2017
    • Keyword: Branch
    • Link: paper
  2. Boosting Combinatorial Problem Modeling with Machine Learning

    • Publication: IJCAI 2018
    • Keyword: ML
    • Link: arXiv
  3. A Review of Combinatorial Optimization with Graph Neural Networks

    • Publication: BigDIA 2019
    • Keyword: GNN
    • Link: paper
  4. Learning Graph Matching and Related Combinatorial Optimization Problems

    • Publication: IJCAI 2020
    • Keyword: GNN, Graph Matching
    • Link: paper
  5. Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking

    • Publication: IEEE ACCESS 2020
    • Keyword: GNN, Computer Network
    • Link: paper
  6. A Survey on Reinforcement Learning for Combinatorial Optimization

    • Publication: arXiv 2020
    • Keyword: RL
    • Link: arXiv
  7. Reinforcement Learning for Combinatorial Optimization: A Survey

    • Publication: arXiv 2020
    • Keyword: RL
    • Link: arXiv
  8. Graph Learning for Combinatorial Optimization: A Survey of State-of-the-Art

    • Publication: Data Science and Engineering 2021
    • Keyword: GNN
    • Link: arXiv
  9. Combinatorial optimization and reasoning with graph neural networks

    • Publication: arXiv 2021
    • Keyword: GNN
    • Link: arXiv

Analysis

  1. Learning to Branch

    • Publication: ICML 2018
    • Keyword: Branch
    • Link: arXiv
  2. Approximation Ratios of Graph Neural Networks for Combinatorial Problems

    • Publication: NeurIPS 2019
    • Keyword: GNN
    • Link: arXiv
  3. On Learning Paradigms for the Travelling Salesman Problem

    • Publication: NeurIPS 2019 (Workshop)
    • Keyword: RL vs SL
    • Link: arXiv
  4. On the Difficulty of Generalizing Reinforcement Learning Framework for Combinatorial Optimization

    • Publication: ICML 2021 (Workshop)
    • Keyword: GNN
    • Link: arXiv

End-to-end

RNN/Attention-based

  1. Pointer Networks

    • Publication: NeurIPS 2015
    • CO-problem: Finding planar convex hulls, computing Delaunay triangulations, TSP
    • ML-type: SL
    • Component: LSTM, Seq2Seq, Attention
    • Innovation: Ptr-Net not only improves over sequence-to-sequence models with input attention, but also generalizes to variable-size output dictionaries. (A minimal sketch of this pointer attention follows this list.)
    • Link: arXiv
  2. Neural Combinatorial Optimization with Reinforcement Learning

    • Publication: ICLR 2017
    • CO-problem: TSP
    • ML-type: RL (Actor-critic)
    • Component: LSTM, Seq2Seq, Attention
    • Innovation: This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning.
    • Link: arXiv
  3. Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems

    • Publication: AAAI 2021
    • CO-problem: VRP, TSP
    • ML-type: RL (REINFORCE)
    • Component: Self-Attention
    • Innovation: This paper proposes a Multi-Decoder Attention Model (MDAM) to train multiple diverse policies, which effectively increases the chance of finding good solutions compared with existing methods that train only one policy. A customized beam search strategy is designed to fully exploit the diversity of MDAM.
    • Link: arXiv
  4. Learning Improvement Heuristics for Solving Routing Problems

    • Publication: TNNLS 2021
    • CO-problem: TSP, VRP
    • ML-type: RL (n-step Actor-Critic)
    • Component: Self-Attention
    • Innovation: This paper proposes a self-attention-based deep architecture as the policy network to guide the selection of the next solution.
    • Link: arXiv
  5. The Transformer Network for the Traveling Salesman Problem

    • Publication: arXiv 2021
    • CO-problem: TSP
    • ML-type: RL
    • Component: Transformer
    • Innovation: This paper proposes to adapt the recent successful Transformer architecture originally developed for natural language processing to the combinatorial TSP.
    • Link: arXiv
  6. Matrix Encoding Networks for Neural Combinatorial Optimization

    • Publication: arXiv 2021
    • CO-problem: ATSP, FFSP
    • ML-type: RL (REINFORCE)
    • Component: Attention
    • Innovation: MatNet is capable of encoding the matrix-style relationship data found in many CO problems.
    • Link: arXiv
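
A mechanism shared by the constructive models above is a decoder that scores every input position at each step and picks the next node from the resulting distribution. Below is a minimal sketch of the additive pointer attention used by Pointer Networks, with a mask over already-visited nodes; it is an illustrative reconstruction (the module and tensor names are mine), not code from any of the papers above.

```python
import torch
import torch.nn as nn

class PointerAttention(nn.Module):
    """Additive attention that scores each input position,
    as in Pointer Networks: u_j = v^T tanh(W1 e_j + W2 d)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.w_enc = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.w_dec = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, enc, dec, visited):
        # enc: (batch, n, hidden) encoder states, one per input node
        # dec: (batch, hidden) current decoder state
        # visited: (batch, n) bool mask, True for nodes already chosen
        scores = self.v(torch.tanh(self.w_enc(enc) + self.w_dec(dec).unsqueeze(1))).squeeze(-1)
        scores = scores.masked_fill(visited, float("-inf"))  # never revisit a node
        return torch.softmax(scores, dim=-1)  # distribution over input positions

if __name__ == "__main__":
    att = PointerAttention(hidden_dim=16)
    enc, dec = torch.randn(2, 5, 16), torch.randn(2, 16)
    visited = torch.zeros(2, 5, dtype=torch.bool)
    visited[:, 0] = True  # pretend node 0 was already chosen
    print(att(enc, dec, visited))  # rows sum to 1; node 0 gets probability 0
```

The entries above differ mainly in what surrounds this step: supervised training against ground-truth tours (Ptr-Net) versus REINFORCE or actor-critic gradients on tour length (most of the rest).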

GNN-based

  1. Learning Combinatorial Optimization Algorithms over Graphs

    • Publication: NeurIPS 2017
    • CO-problem: MVC, MAXCUT, TSP
    • ML-type: RL (Q-learning)
    • Component: GNN (structure2vec, S2V)
    • Innovation: This paper proposes a unique combination of reinforcement learning and graph embedding. The learned greedy policy behaves like a meta-algorithm that incrementally constructs a solution, with each action determined by the output of a graph embedding network capturing the current state of the solution. (See the toy construction loop at the end of this list.)
    • Link: arXiv
  2. Reinforcement Learning for Solving the Vehicle Routing Problem

    • Publication: NeurIPS 2018
    • CO-problem: VRP
    • ML-type: RL
    • Component: GNN (GCN)
    • Innovation: This paper presents an end-to-end framework for solving the Vehicle Routing Problem (VRP) using reinforcement learning.
    • Link: arXiv
  3. Attention, Learn to Solve Routing Problems!

    • Publication: ICLR 2019
    • CO-problem: TSP, VRP
    • ML-type: RL (REINFORCE+rollout baseline)
    • Component: Attention, GNN
    • Innovation: This paper proposes a model based on attention layers with benefits over the Pointer Network, and shows how to train it using REINFORCE with a simple baseline based on a deterministic greedy rollout, which is more efficient than using a value function.
    • Link: arXiv
  4. Learning to Solve NP-Complete Problems - A Graph Neural Network for Decision TSP

    • Publication: AAAI 2019
    • CO-problem: TSP
    • ML-type: SL
    • Component: RNN, GNN
    • Innovation: This paper proposes a GNN model where edges (embedded with their weights) communicate with vertices.
    • Link: arXiv
  5. Learning a SAT Solver from Single-Bit Supervision

    • Publication: ICLR 2019
    • CO-problem: SAT
    • ML-type: SL
    • Component: GNN, LSTM
    • Innovation: NeuroSAT enforces both permutation invariance and negation invariance.
    • Link: arXiv
  6. End to end learning and optimization on graphs

    • Publication: NeurIPS 2019
    • CO-problem: MAXCUT
    • ML-type: UL
    • Component: GNN
    • Innovation: This paper proposes CLUSTERNET, a new approach to this decision-focused learning problem: include a differentiable solver for a simple proxy of the true, difficult optimization problem, and learn a representation that maps the difficult problem onto the simpler one (relax + differentiate).
    • Link: arXiv
  7. Efficiently Solving the Practical Vehicle Routing Problem: A Novel Joint Learning Approach

    • Publication: KDD 2020
    • CO-problem: VRP
    • ML-type: SL, RL (REINFORCE+rollout baseline)
    • Component: GCN
    • Innovation: The GCN-NPEC model is based on the graph convolutional network (GCN), taking node features (coordinates and demand) and edge features (the real distances between nodes) as input. Separate decoders decode the representations of the two features, and the output of each decoder supervises the other. The paper also proposes a training strategy that combines reinforcement learning with supervised learning.
    • Link: paper
  8. Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

    • Publication: NeurIPS 2020
    • CO-problem: JSSP
    • ML-type: RL (PPO)
    • Component: GNN
    • Innovation: This paper exploits the disjunctive graph representation of JSSP and proposes a Graph Neural Network based scheme to embed the states encountered during solving.
    • Link: arXiv
  9. Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time

    • Publication: arXiv 2020
    • CO-problem: TSP, VRP
    • ML-type: RL
    • Component: GNN
    • Innovation: This paper develops a framework to solve any combinatorial optimization problem over graphs that can be formulated as a single-player game defined by states, actions, and rewards, including minimum spanning tree, shortest paths, the traveling salesman problem, and the vehicle routing problem, without expert knowledge.
    • Link: arXiv
  10. Combinatorial Optimization by Graph Pointer Networks and Hierarchical Reinforcement Learning

    • Publication: AAAI 2020 (Workshop)
    • CO-problem: TSP
    • ML-type: Hierarchical RL
    • Component: GNN
    • Innovation: GPNs build upon Pointer Networks by introducing a graph embedding layer on the input, which captures relationships between nodes. Furthermore, to approximate solutions to constrained problems such as the TSP with time windows, hierarchical GPNs (HGPNs) are trained with RL to learn a hierarchical policy that finds an optimal city permutation under constraints.
    • Link: arXiv
  11. A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

    • Publication: arXiv 2021
    • CO-problem: Directed Acyclic Graph scheduling, Graph Edit Distance, Hamiltonian Cycle Problem
    • ML-type: RL (PPO)
    • Component: GNN, ResNet, Attention
    • Innovation: This paper develops a bi-level framework: an upper-level learning method optimizes the graph (e.g., adding, deleting, or modifying edges), fused with a lower-level heuristic algorithm that solves on the optimized graph. This bi-level approach simplifies learning on the original hard CO problem and effectively mitigates the demand for model capacity.
    • Link: arXiv
  12. SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems

    • Publication: arXiv 2021
    • CO-problem: VRP, JSSP
    • ML-type: RL (DQN, MCTS)
    • Component: GNN
    • Innovation: Learn offline (a DQN is trained from collected experience) + search online (the learned network guides MCTS at decision time).
    • Link: arXiv
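
To ground the construction-style approaches above (S2V-DQN in particular), here is a toy version of the embed-score-extend loop on minimum vertex cover. The weights are random stand-ins for Q-learned parameters and the embed function is only a cartoon of structure2vec; all names and sizes are illustrative, not taken from any paper above.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(adj, x, w1, w2, rounds=3):
    """Toy structure2vec-style embedding: each round mixes a node's own
    feature with the sum of its neighbours' current embeddings."""
    mu = np.zeros((adj.shape[0], w2.shape[1]))
    for _ in range(rounds):
        mu = np.maximum(0.0, x @ w1 + adj @ mu @ w2)  # ReLU(x W1 + A mu W2)
    return mu

def greedy_cover(adj, w1, w2, q):
    """Greedily build a vertex cover: re-embed the graph after every step
    and add the highest-scoring node, as in S2V-DQN's greedy policy."""
    n = adj.shape[0]
    covered = np.zeros((n, 1))  # per-node flag: already in the solution?
    solution = []
    while (adj * ((covered + covered.T) == 0)).any():  # some edge uncovered
        scores = (embed(adj, covered, w1, w2) @ q).ravel()
        if solution:
            scores[solution] = -np.inf  # never pick a node twice
        best = int(np.argmax(scores))
        solution.append(best)
        covered[best] = 1.0
    return solution

if __name__ == "__main__":
    n, d = 8, 4
    adj = (rng.random((n, n)) < 0.3).astype(float)
    adj = np.triu(adj, 1)
    adj = adj + adj.T  # random undirected graph
    w1, w2, q = rng.normal(size=(1, d)), rng.normal(size=(d, d)), rng.normal(size=(d, 1))
    print("greedy cover:", greedy_cover(adj, w1, w2, q))
```

In the actual method the parameters are trained with n-step Q-learning so that a node's score estimates the marginal value of adding it to the partial solution.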

Other

  1. Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

    • Publication: NeurIPS 2020
    • CO-problem: VRP
    • ML-type: RL
    • Component: /
    • Innovation: Policy evaluation with mixed-integer optimization. At the policy evaluation step, they formulate the action selection problem from each state as a mixed-integer program, in which they combine the combinatorial structure of the action space with the neural architecture of the value function by adapting the branch-and-cut approach.
    • Link: arXiv

Local Search

  1. Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search

    • Publication: NeurIPS 2018
    • CO-problem: MIS, MVC, MC, SAT
    • ML-type: SL
    • Component: GNN (GCN)
    • Innovation: This paper proposes a model whose central component is a graph convolutional network that is trained to estimate the likelihood, for each vertex in a graph, of whether this vertex is part of the optimal solution. The network is designed and trained to synthesize a diverse set of solutions, which enables rapid exploration of the solution space via tree search.
    • Link: arXiv
  2. Learning to Perform Local Rewriting for Combinatorial Optimization

    • Publication: NeurIPS 2019
    • CO-problem: JSSP, VRP
    • ML-type: RL (Actor-Critic)
    • Component: LSTM
    • Innovation: NeuRewriter learns a policy to pick heuristics and rewrite local components of the current solution, iteratively improving it until convergence. The policy factorizes into a region-picking component and a rule-picking component, each parameterized by a neural network trained with actor-critic RL. (A toy improvement loop in this spirit follows this list.)
    • Link: arXiv
  3. A Learning-based Iterative Method for Solving Vehicle Routing Problems

    • Publication: ICLR 2020

    • CO-problem: VRP

    • ML-type: RL (REINFORCE)

    • Component: /

    • Innovation: “Learn to Improve” (L2I)

      • hierarchical framework: separate heuristic operators into two classes, namely improvement operators and perturbation operators. At each state, the class is chosen first, then an operator within that class. Focusing RL on the improvement operators only makes learning from the current solution easier.
      • ensemble method: trains several RL policies at the same time, but with different state input features.
    • Link: paper


  4. Exploratory Combinatorial Optimization with Reinforcement Learning

    • Publication: AAAI 2020
    • CO-problem: MAXCUT
    • ML-type: RL (DQN)
    • Component: GNN (MPNN)
    • Innovation: ECO-DQN combines S2V-DQN, Reversible Actions, Observation Tuning and Intermediate Rewards.
    • Link: arXiv
  5. Rewriting By Generating: Learn To Solve Large-Scale Vehicle Routing Problems

    • Publication: ICLR 2021
    • CO-problem: VRP
    • ML-type: Hierarchical RL (REINFORCE)
    • Component: LSTM, K-means, PCA
    • Innovation: Inspired by the classical idea of Divide-and-Conquer, this paper presents a novel Rewriting-by-Generating (RBG) framework with hierarchical RL agents to solve large-scale VRPs. RBG consists of a rewriter agent that refines the customer division globally and an elementary generator that infers regional solutions locally.
    • Link: paper

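The improvement-style methods in this section (NeuRewriter, L2I, ECO-DQN, RBG) all wrap a learned move-picking policy around a loop of local rewrites. The sketch below shows that loop for TSP with 2-opt as the only rewrite rule and a uniform-random picker standing in for the trained policy; the function names and the greedy acceptance rule are my own simplifications.

```python
import math
import random

def tour_length(tour, dist):
    """Total length of the closed tour under the distance matrix."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt(tour, i, k):
    """The classic 2-opt rewrite: reverse the segment tour[i:k+1]."""
    return tour[:i] + tour[i:k + 1][::-1] + tour[k + 1:]

def improve(tour, dist, steps=500, seed=0):
    """Iterative improvement: a move-picking policy proposes a local rewrite,
    and improving rewrites are accepted. The papers above replace the random
    picker with a neural policy trained by RL on the achieved improvement."""
    rng = random.Random(seed)
    cur, cur_len = tour, tour_length(tour, dist)
    for _ in range(steps):
        i, k = sorted(rng.sample(range(len(cur)), 2))
        cand = two_opt(cur, i, k)
        cand_len = tour_length(cand, dist)
        if cand_len < cur_len:  # greedy acceptance of improving moves
            cur, cur_len = cand, cand_len
    return cur, cur_len

if __name__ == "__main__":
    random.seed(42)
    pts = [(random.random(), random.random()) for _ in range(12)]
    dist = [[math.dist(a, b) for b in pts] for a in pts]
    tour = list(range(len(pts)))
    print("initial length:", round(tour_length(tour, dist), 3))
    _, best_len = improve(tour, dist)
    print("improved length:", round(best_len, 3))
```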

B&B-based

  1. Learning to Search in Branch and Bound Algorithms

    • Publication: NeurIPS 2014
    • CO-problem: MILP
    • ML-type: RL (Imitation learning)
    • Component: /
    • Innovation: This paper proposes an approach that learns branch-and-bound search via imitation learning (a node selection policy and a node pruning policy).
    • Link: paper
  2. Learning to Branch in Mixed Integer Programming

    • Publication: AAAI 2016
    • CO-problem: MIP
    • ML-type: RL (Imitation Learning)
    • Component: /
    • Innovation: This paper proposes the first successful ML framework for variable selection in MIP. (A toy imitation-learning sketch of variable selection follows this list.)
    • Link: paper
  3. Learning to Run Heuristics in Tree Search

    • Publication: IJCAI 2017
    • CO-problem: MIP
    • ML-type: SL
    • Component: classifier
    • Innovation: Central to this approach is the use of machine learning (ML) to predict whether a heuristic will succeed at a given node.
    • Link: arXiv
  4. Improving Optimization Bounds using Machine Learning: Decision Diagrams meet Deep Reinforcement Learning

    • Publication: AAAI 2019
    • CO-problem: MIS, MAXCUT
    • ML-type: RL (Q-learning)
    • Component: /
    • Innovation: This paper proposes an innovative and generic approach based on deep reinforcement learning for obtaining an ordering for tightening the bounds obtained with relaxed and restricted DDs (decision diagrams).
    • Link: arXiv
  5. Exact Combinatorial Optimization with Graph Convolutional Neural Networks

    • Publication: NeurIPS 2019
    • CO-problem: MILP
    • ML-type: RL (Imitation learning)
    • Component: GNN (GCN)
    • Innovation: This paper proposes a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs.
    • Link: arXiv
  6. Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization

    • Publication: AAAI 2021
    • CO-problem: TSP, Portfolio Optimization Problem
    • ML-type: RL (DQN, PPO)
    • Component: GNN (GAT), Set Transformer
    • Innovation: This paper proposes a general hybrid approach, based on DRL and CP (Constraint Programming), for solving combinatorial optimization problems. The core of the approach is a dynamic programming formulation that acts as a bridge between the two techniques.
    • Link: arXiv
  7. Solving Mixed Integer Programs Using Neural Networks

    • Publication: arXiv 2021

    • CO-problem: MIP

    • ML-type: RL (Imitation Learning)

    • Component: GNN (Bipartite graph)

    • Innovation: Neural Diving + Neural Branching

      • Neural Diving learns a deep neural network to generate multiple partial assignments of the integer variables; the resulting smaller MIPs over the unassigned variables are solved with SCIP to construct high-quality joint assignments.
      • Neural Branching learns a deep neural network to make variable selection decisions in branch-and-bound to bound the objective value gap with a small tree.
    • Link: arXiv

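A recurring pattern in this section is imitation learning for branching: an expensive expert (e.g., full strong branching) labels which variable to branch on at sampled B&B nodes, and a cheap model is trained to reproduce that choice. The sketch below illustrates the idea on synthetic data; expert_scores, the feature shapes, and the hidden linear "expert" are hypothetical stand-ins, where real systems use solver-derived features and richer models such as the GCN of entry 5.

```python
import numpy as np

rng = np.random.default_rng(1)

def expert_scores(features):
    """Stand-in for an expensive expert such as full strong branching:
    a fixed hidden linear rule that the learner must imitate."""
    secret = np.array([2.0, -1.0, 0.5])
    return features @ secret

def collect_dataset(n_nodes=500, n_vars=10, n_feats=3):
    """At each (synthetic) B&B node, record the candidate-variable features
    and which variable the expert would branch on."""
    X = rng.normal(size=(n_nodes, n_vars, n_feats))
    y = expert_scores(X).argmax(axis=1)  # expert's branching choice per node
    return X, y

def train_policy(X, y, lr=0.5, epochs=300):
    """Softmax regression over the candidate variables at each node, so the
    learned weights rank variables the way the expert does."""
    n_nodes, n_vars, n_feats = X.shape
    w = np.zeros(n_feats)
    onehot = np.eye(n_vars)[y]
    for _ in range(epochs):
        logits = X @ w  # (n_nodes, n_vars): one score per candidate variable
        logits -= logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        grad = ((p - onehot)[:, :, None] * X).sum(axis=1).mean(axis=0)
        w -= lr * grad  # cross-entropy gradient step
    return w

if __name__ == "__main__":
    X, y = collect_dataset()
    w = train_policy(X, y)
    agreement = ((X @ w).argmax(axis=1) == y).mean()
    print("agreement with expert:", agreement)  # far above the 0.1 chance level
```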

Appendix

Template


    • Publication:
    • CO-problem:
    • ML-type:
    • Component:
    • Innovation:
    • Link: arXiv -

    • Publication:
    • CO-problem:
    • Link: arXiv

To Be Classified

  1. Reinforcement Learning for (Mixed) Integer Programming: Smart Feasibility Pump

    • Publication: ICML 2021 (Workshop)
    • CO-problem: MIP
    • Link: arXiv
  2. Learning Local Search Heuristics for Boolean Satisfiability

    • Publication: NeurIPS 2019
    • CO-problem: SAT
    • Link: paper
  3. Reinforcement Learning on Job Shop Scheduling Problems Using Graph Networks

    • Publication: arXiv 2020
    • CO-problem: JSSP
    • Link: arXiv
  4. A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing

    • Publication: AAAI 2020 (Workshop)
    • CO-problem: 3D-BPP
    • Link: arXiv
  5. A Reinforcement Learning Approach to Job-shop Scheduling

    • Publication: IJCAI 1995
    • CO-problem: JSSP
    • Link: paper
  6. A Reinforcement Learning Environment For Job-Shop Scheduling

    • Publication: arXiv 2021
    • CO-problem: JSSP
    • Link: arXiv
  7. Online 3D Bin Packing with Constrained Deep Reinforcement Learning

    • Publication: AAAI 2021
    • CO-problem: 3D-BPP
    • Link: arXiv
  8. A Data-Driven Approach for Multi-level Packing Problems in Manufacturing Industry

    • Publication: KDD 2019
    • CO-problem: 3D-BPP
    • Link: ACM DL
  9. Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method

    • Publication: arXiv 2017
    • CO-problem: 3D-BPP
    • Link: arXiv
  10. Meta-Learning-based Deep Reinforcement Learning for Multiobjective Optimization Problems

    • Publication: arXiv 2021
    • CO-problem: TSP
    • Link: arXiv
  11. Dynamic Job-Shop Scheduling Using Reinforcement Learning Agents

    • Publication: ROBOT AUTON SYST 2000
    • CO-problem: JSSP
    • Link: paper