Please cite our papers below if our dataset or competition aids you in your research.

@inproceedings{ramamonjison-etal-2022-augmenting,
  title = "Augmenting Operations Research with Auto-Formulation of Optimization Models From Problem Descriptions",
  author = "Ramamonjison and others",
  booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track",
  month = dec,
  year = "2022",
  address = "Abu Dhabi, UAE",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2022.emnlp-industry.4",
  pages = "29--62",
}


@InProceedings{pmlr-v220-ramamonjison23a,
  title = 	 {NL4Opt Competition: Formulating Optimization Problems Based on Their Natural Language Descriptions},
  author =       {Ramamonjison, Rindranirina and Yu, Timothy and Li, Raymond and Li, Haley and Carenini, Giuseppe and Ghaddar, Bissan and He, Shiqi and Mostajabdaveh, Mahdi and Banitalebi-Dehkordi, Amin and Zhou, Zirui and Zhang, Yong},
  booktitle = 	 {Proceedings of the NeurIPS 2022 Competitions Track},
  pages = 	 {189--203},
  year = 	 {2022},
  editor = 	 {Ciccone, Marco and Stolovitzky, Gustavo and Albrecht, Jacob},
  volume = 	 {220},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28 Nov--09 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v220/ramamonjison23a/ramamonjison23a.pdf},
  url = 	 {https://proceedings.mlr.press/v220/ramamonjison23a.html},
  abstract = 	 {The Natural Language for Optimization (NL4Opt) Competition was created to investigate methods of extracting the meaning and formulation of an optimization problem based on its text description. Specifically, the goal of the competition is to increase the accessibility and usability of optimization solvers by allowing non-experts to interface with them using natural language. We separate this challenging goal into two sub-tasks: (1) recognize and label the semantic entities that correspond to the components of the optimization problem; (2) generate a meaning representation (i.e. a logical form) of the problem from its detected problem entities. The first task aims to reduce ambiguity by detecting and tagging the entities of the optimization problems. The second task creates an intermediate representation of the linear programming (LP) problem that is converted into a format that can be used by commercial solvers. In this report, we present the LP word problem dataset and shared tasks for the NeurIPS 2022 competition. Furthermore, we present the winning solutions. Through this competition, we hope to bring interest towards the development of novel machine learning applications and datasets for optimization modeling.}
}

Please also see some of our lab’s relevant works:

@inproceedings{ramamonjison-etal-2023-latex2solver,
    title = "{L}a{T}e{X}2{S}olver: a Hierarchical Semantic Parsing of {L}a{T}e{X} Document into Code for an Assistive Optimization Modeling Application",
    author = "Ramamonjison, Rindra  and
      Yu, Timothy  and
      Xing, Linzi  and
      Mostajabdaveh, Mahdi  and
      Li, Xiaorui  and
      Fu, Xiaojin  and
      Han, Xiongwei  and
      Chen, Yuanzhe  and
      Li, Ren  and
      Mao, Kun  and
      Zhang, Yong",
    editor = "Bollegala, Danushka  and
      Huang, Ruihong  and
      Ritter, Alan",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-demo.45",
    doi = "10.18653/v1/2023.acl-demo.45",
    pages = "471--478",
    abstract = "We demonstrate an interactive system to help operations research (OR) practitioners convert the mathematical formulation of optimization problems from TeX document format into the solver modeling language. In practice, a manual translation is cumbersome and time-consuming. Moreover, it requires an in-depth understanding of the problem description and a technical expertise to produce the modeling code. Thus, our proposed system TeX2Solver helps partially automate this conversion and help the users build optimization models more efficiently. In this paper, we describe its interface and the components of the hierarchical parsing system. A video demo walk-through is available online at \url{http://bit.ly/3kuOm3x}",
}

Winners of NL4Opt

For more details of the final standings, access the leaderboard. You can also find details on how the winners were determined.

Subtask 1

1st place: Infrrd AI Lab

Team members: JiangLong He, Mamatha N., Shiv Vignesh, Deepak Kumar, Akshay Uppal
Affiliation: Infrrd
Method: Ensemble learning with text augmentation and segment shuffling
Test F1-score: 0.939

2nd place: mcmc

Team members: Kangxu Wang, Ze Chen, Jiewen Zheng
Affiliation: OPD
Method: Ensemble learning with fast gradient method and sentence retokenization
Test F1-score: 0.933

3rd place: PingAn-zhiniao

Team members: Qi Zeng
Affiliation: PingAn International Smart City
Method: Global pointer and multi-head decoder
Test F1-score: 0.932

4th place: Long

Team members: Jiayu Liu, Longhu Qin, Yuting Ning, Tong Xiao, Shangzi Xue, Zhenya Huang, Qi Liu, Enhong Chen, Jinze Wu
Affiliation: BDAA-BASE
Method: Ensemble learning with adversarial training and post processing
Test F1-score: 0.931

5th place: VTCC-NLP

Team members: Xuan-Dung Doan
Affiliation: Viettel Cyberspace Center, Viettel Group
Method: Ensemble learning
Test F1-score: 0.929

Subtask 2

1st place: UIUC-NLP

Team members: Neeraj Gangwar, Nickvash Kani
Affiliation: University of Illinois Urbana-Champaign
Method: Tagged input and decode all-at-once strategy
Test Accuracy: 0.899

2nd place: Sjang

Team members: Sanghwan Jang
Affiliation: POSTECH
Method: Learn entity tag embedding and data augmentation
Test Accuracy: 0.878

3rd place: Long

Team members: Jiayu Liu, Longhu Qin, Yuting Ning, Tong Xiao, Shangzi Xue, Zhenya Huang, Qi Liu, Enhong Chen, Jinze Wu
Affiliation: BDAA-BASE
Method: Prompt re-design + data augmentation + adversarial training
Test Accuracy: 0.867

4th place: PingAn-zhiniao

Team members: Xiuyuan Yang, Yixiu Wang
Affiliation: PingAn International Smart City
Method: Data augmentation and dropout regularization
Test Accuracy: 0.866

5th place: Infrrd AI Lab

Team members: JiangLong He, Mamatha N., Shiv Vignesh, Deepak Kumar, Akshay Uppal
Affiliation: Infrrd
Method: Multitask training with input preprocessing
Test Accuracy: 0.780

News

2023-02-28

We have released the test set for both the generation and NER tasks.
We encourage you to evaluate your new methods against the results reported above.

2022-11-25

The winners have been announced (see results).
We invite you to register for a “virtual only pass” at NeurIPS 2022 and join our online workshop. Our winners and some invited teams will present talks, and further details can be discussed during our poster session. The workshop will be held on Dec. 8th, 01:00-04:00 UTC (Dec. 7th, 17:00-20:00 PST).
Thank you all for your hard work and participation. We look forward to seeing you at our online workshop or in-person poster session.

2022-10-26

We will announce the winners on November 4th. The deadline for submitting your training scripts is October 28th.

2022-09-16

We will be hosting two additional Zoom Q&A sessions (focusing on subtask-2).
- The first session will be on September 22nd, 11am-12pm PST; the second will be on September 22nd, 10-11pm PST.
- Please check your email for links to both Zoom sessions

2022-08-25

Subtask-2’s baseline has been updated to use spans instead of obj_declaration and const_declaration in the dictionaries. See discussion 32 for more details. Please pull this code if you are planning on using the baseline code.

2022-08-22

For the subtask-1 baseline, the default value of the max_length argument for the input has been updated from 100 to 200. You should also update this value for your own model. Please pull the latest version from our repo if you plan on running the baseline code.
TBA: similar changes will be announced for the subtask-2 baseline model.

2022-08-18

Due to CUDA compatibility problems with older libraries, we have decided to switch the GPU instance for evaluation to an RTX 6000 (14 vCPUs, 46 GiB RAM).
We have changed the evaluation schedule to twice per week, on Wednesday and Friday. Please ensure that your submission is uploaded to the Google Drive folder by 8am PST on those days.

2022-08-09 Announcements

We have created an example submission folder on Google Drive (Link) using the baseline models for both subtasks.
The slides from the Zoom Q&A sessions are available in the competition GitHub repo (Link).

2022-07-26 Announcements

We will begin evaluating the submissions for both tasks after July 26 12pm PST
- A GPU instance on Lambda Labs will be used to test your submissions. The instance has 1x A6000 (48GB) with 14 vCPUs and 100 GiB RAM (you can test your script by creating an instance and running it on the dev data).
- Please ensure your script finishes execution in a reasonable amount of time (<10 min), and refrain from downloading unnecessarily large files to the local file system.
- We will upload the shell output evaluation.out to your Google Drive directory, so you can debug your code before the next submission.
We will be hosting two Zoom Q&A Sessions
- The first session will be on August 9th, 1-2pm PST; the second will be on August 9th, 10-11pm PST.
- Please check your email for links to both Zoom sessions

2022-07-22 Announcements

Please check whether your team has received an email with a link to the Google Drive folder used for evaluating your submissions.
- More instructions for the evaluations on the test set will be available in the upcoming week
- If you received more than one email with different Google Drive links, please check and use the one that is accessible to you.
Revisions for Sub-task 1 dataset: As described in the Data repository, we have removed white space tokens from the NER dataset. The held-out test set has also been updated to the same format.

2022-07-05 Announcements

The data and starter-kits have been released!
The organizers are planning to host a Q&A session by the end of the month. If you are running into issues with your submission or have any other questions, please feel free to join us. More details will be shared closer to the date!
Keep an eye out on your email for the instructions for submission in the upcoming week! The submission portal opens on July 15th.

Important Dates

The following dates use the anywhere on earth (AoE) time zone:

Event	Date
Competition kickoff. Registration opens and participants can download the starter kit and the training/validation datasets.	July 1st, 2022
Submission available. The leaderboard and forum are opened, and the submissions are accepted.	July 15th, 2022
Deadline for registration	October 8th, 2022
Deadline for submission.	October 15th, 2022
Winners notification. Winning teams are notified and instructed to provide information that will be included in the workshop report.	November 4th, 2022
Report submission deadline.	November 20th, 2022
NeurIPS competition workshop.	December 2022

Introduction

The Natural Language for Optimization (NL4Opt) NeurIPS 2022 competition aims to improve the accessibility and usability of optimization solvers. Converting a text description of a problem into a proper formulation for an optimization solver is often time-consuming and can require an operations research expert. The participants will investigate methods of automating this process so that it is more efficient and accessible to non-experts. This competition aims to answer the following question: can learning-based natural language interfaces be leveraged to convert linear programming (LP) word problems into a format that solvers can understand?

This competition presents two challenges for ML: (1) detect problem entities from the problem description and (2) generate a precise meaning representation of the optimization formulation.

Challenges

The competition is split into two main tasks that are related but tackled independently. Participants can compete in either or both of these challenges, and the top 5 submissions for each task will be awarded prizes (see the Prizes section below).

The two inter-related tasks are to find an intelligent solution to:

detect problem entities from the problem statement,
generate a precise meaning representation of the optimization formulation.

Sub-task 1 - named entity recognition

The goal of this task is to recognize and label the semantic entities that correspond to the components of the optimization problem. Solutions take as input an optimization description in the form of a word problem that describes the decision variables, the objective, and a set of constraints. The multi-sentence word problem input exhibits a high level of ambiguity due to the variability of the linguistic patterns, problem domains, and problem structures. This first task aims to reduce the ambiguity by detecting and tagging the entities of the optimization problems, such as the objective name, decision variable names, or the constraint limits. This is a preliminary step to simplify the second sub-task and can be seen as a preprocessing task.
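
As a rough illustration of what the tagging step produces, the sketch below converts token-level BIO tags into entity spans. The sentence, the BIO scheme, and the tag names (VAR, LIMIT, CONST_DIR) are assumptions made for this example and may not match the label set used in the dataset.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags into (label, start, end) spans; end is exclusive."""
    spans = []
    start, label = None, None
    # A trailing "O" sentinel flushes a span that runs to the end of the input.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((label, start, i))
        if tag.startswith(("B-", "I-")):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


# Hypothetical word-problem fragment with hypothetical tag names.
tokens = "A farmer has at most 100 acres to grow wheat and corn".split()
tags = ["O", "O", "O", "B-CONST_DIR", "I-CONST_DIR", "B-LIMIT",
        "O", "O", "O", "B-VAR", "O", "B-VAR"]

spans = bio_to_spans(tags)
entities = [(label, " ".join(tokens[s:e])) for label, s, e in spans]
```

Here `entities` pairs each detected label with its surface text, which is the kind of labeled input the second sub-task builds on.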

Metric: F1 score

Relevant resources:

Review Article - Chinese Named Entity Recognition. Liu et al., Neurocomputing, 2022.
A Survey on Deep Learning for Named Entity Recognition. Li et al., IEEE Transactions on Knowledge and Data Engineering, 2022.
MultiCoNER Competition. SemEval 2022 Task 11.
Hugging Face Tutorial. Hugging Face.
PyTorch Tutorial - Named Entity Recognition Tagging. Stanford Blog.
Keras Tutorial - Named Entity Recognition using Transformers. Keras.
Tutorial - How to Fine-Tune BERT for Named Entity Recognition (NER). Skimai.

Sub-task 2 - generating the precise meaning representation

The goal of this task is to take as input the problem description, the labeled semantic entities, and the order mapping of variable mentions, and to formulate the precise meaning representation. This meaning representation will be converted into a format that solvers can understand. The solutions will be evaluated on the canonical form, and the conversion scripts from our pilot study have been released as part of the starter kit. We welcome you to create your own meaning representation or to use the representation and conversion scripts provided in the starter kit.
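
To make the idea of converting a meaning representation into a canonical form more concrete, here is a minimal sketch. The dictionary layout, the field names (`direction`, `terms`, `operator`, `limit`), and the convention of minimizing the objective and writing every constraint as a·x <= b are assumptions chosen for illustration; they are not the starter kit's format or its conversion scripts.

```python
from typing import Dict, List, Tuple


def to_canonical(objective: Dict, constraints: List[Dict],
                 variables: List[str]) -> Tuple[List[float], List[List[float]]]:
    """Return (objective row, constraint rows [a_1, ..., a_n, b]) with a.x <= b.

    Only "<=" and ">=" constraints are handled in this sketch.
    """
    # Assumed convention: minimize, so a "max" objective is negated.
    sign = -1.0 if objective["direction"] == "max" else 1.0
    obj_row = [sign * objective["terms"].get(v, 0.0) for v in variables]

    rows = []
    for c in constraints:
        # Multiply ">=" constraints by -1 to turn them into "<=" constraints.
        flip = -1.0 if c["operator"] == ">=" else 1.0
        row = [flip * c["terms"].get(v, 0.0) for v in variables]
        row.append(flip * c["limit"])
        rows.append(row)
    return obj_row, rows


# Hypothetical problem: maximize 3*wheat + 2*corn
# subject to wheat + corn <= 100 and wheat >= 10.
variables = ["wheat", "corn"]
objective = {"direction": "max", "terms": {"wheat": 3, "corn": 2}}
constraints = [
    {"terms": {"wheat": 1, "corn": 1}, "operator": "<=", "limit": 100},
    {"terms": {"wheat": 1}, "operator": ">=", "limit": 10},
]

obj_row, rows = to_canonical(objective, constraints, variables)
```

Fixing the variable order up front is what makes two differently worded but equivalent formulations comparable row by row.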

Metric: Declaration-level mapping accuracy

Relevant resources:

Natural Language Processing with Transformers: Building Language Applications with Hugging Face. Tunstall et al., (O’Reilly Media, 2022)
Hugging Face Tutorial. Hugging Face.
Using different decoding methods for language generation with Transformers. Alexander et al., (Colab notebook).
Constrained Language Models Yield Few-Shot Semantic Parsers. Shin et al., ACL Anthology.
The Power of Prompt Tuning for Low-Resource Semantic Parsing. Schucher et al., arXiv.
Text-to-Table: A New Way of Information Extraction. Wu et al., ACL Anthology.
CodeBERT: A Pre-Trained Model for Programming and Natural Languages. Feng et al., Findings of the Association for Computational Linguistics: EMNLP 2020.
Few-Shot Semantic Parsing with Language Models Trained On Code. Shin et al., arXiv.
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. Scholak et al., ACL Anthology.

For more information regarding the type of data in each sub-task and the resources provided to help you get started, visit the Getting Started page of our website.

Metrics

Sub-task 1 (named entity recognition): The solutions will be evaluated on their achieved micro-averaged F1 score:

\[\text{F1} = \frac{2 \times P \times R}{P + R},\]

where $P$ and $R$ are the precision and recall, respectively, micro-averaged over all entity types.
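
The formula can be sketched in a few lines of code. The sketch below pools true positives, predictions, and gold entities across all documents and entity types before computing precision and recall, which is the usual meaning of micro-averaging. Representing an entity as a `(label, start, end)` tuple and requiring an exact match are assumptions for this example; the official scorer may differ in its details.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (label, start, end), a hypothetical representation


def micro_f1(gold: List[Set[Span]], pred: List[Set[Span]]) -> float:
    """Micro-averaged F1 over exact-match entity spans."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# One document: two of three predictions match the three gold entities.
gold = [{("VAR", 9, 10), ("VAR", 11, 12), ("LIMIT", 5, 6)}]
pred = [{("VAR", 9, 10), ("LIMIT", 5, 6), ("LIMIT", 0, 1)}]

score = micro_f1(gold, pred)  # P = R = 2/3, so F1 = 2/3
```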

Sub-task 2 (generation): The solutions will be evaluated using an application-specific metric since the task is motivated by the need to precisely formulate the optimization problem. The models will be benchmarked based on the declaration-level mapping accuracy given by:

\[\text{Acc} = 1-\frac{\sum_{i=1}^{N}\left(\text{FP}_{i} + \text{FN}_{i}\right)}{\sum_{i=1}^{N}D_{i}},\]

where $N$ is the number of LP problems in the test set. For each LP problem $i$, $D_{i}$ is the number of ground-truth declarations. The false positive $\text{FP}_{i}$ is the number of non-matched predicted declarations whereas the false negative $\text{FN}_{i}$ denotes the number of excess unmatched ground-truth declarations. In other words, false negatives are counted when there are more ground-truth declarations than predicted declarations. A false positive is counted when there is a predicted declaration that does not match any ground-truth declaration.
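
One possible reading of this metric is sketched below. It assumes each declaration can be reduced to a hashable value (plain strings here, purely for illustration) and that matching means exact equality. Following the description above, false positives count predicted declarations with no matching ground-truth declaration, and false negatives count how many more ground-truth declarations there are than predicted ones. The official evaluation script may match declarations differently.

```python
from collections import Counter
from typing import Hashable, Sequence


def declaration_accuracy(gold: Sequence[Sequence[Hashable]],
                         pred: Sequence[Sequence[Hashable]]) -> float:
    """Acc = 1 - sum_i(FP_i + FN_i) / sum_i(D_i) over all problems."""
    total_errors = 0
    total_gold = 0
    for g, p in zip(gold, pred):
        remaining = Counter(g)  # ground-truth declarations not yet matched
        fp = 0
        for decl in p:
            if remaining[decl] > 0:
                remaining[decl] -= 1
            else:
                fp += 1  # predicted declaration with no ground-truth match
        fn = max(len(g) - len(p), 0)  # excess ground-truth declarations
        total_errors += fp + fn
        total_gold += len(g)
    return 1 - total_errors / total_gold


# Two hypothetical problems with declarations written as strings.
gold = [["max 3x+2y", "x+y<=10", "x>=0"], ["min x", "x>=5"]]
pred = [["max 3x+2y", "x+y<=12"], ["min x", "x>=5"]]

acc = declaration_accuracy(gold, pred)  # FP = 1, FN = 1, D = 5, so Acc = 0.6
```

Note that a model predicting many spurious declarations can drive this value below zero, since false positives are not bounded by the number of ground-truth declarations.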

Prizes

A total monetary prize of $22,000 USD will be awarded. The top 5 submissions for each task will be awarded the following prizes:

1st place: $6,000
2nd place: $3,000
3rd place: $1,000
4th and 5th place: $500 each

All participants will receive a certificate of participation. Winners will be invited to give talks at the workshop.