Puneet S. Bagga

Hi! I'm currently a Staff Associate I (Visiting Researcher/Student) at Columbia IEOR, generously hosted by Dr. Tianyi Peng. Previously, I completed my undergrad in Computer Science at Georgia Tech, working with Dr. Arthur Delarue and Dr. Jean Pauphilet, and spent four months after graduating continuing my research.

I currently work as an engineer at Mercury, where I work on international wires and support the AI Agents project. Previously, I worked on fine-tuning LLMs for agentic behavior at Software Applications Incorporated (acquired by OpenAI), and as a machine learning intern at Mercury, where I trained gradient-boosted decision trees to automate spend limit requests.

I am also an avid photographer, shooting mostly nature, landscapes, and cityscapes. I was recently selected as a finalist for a solo exhibition at a New York City art gallery, and you can view my photography here. I also enjoy biking and the outdoors (I have visited 15 of the 63 national parks).

I am looking for Fall 2026 graduate opportunities! Feel free to reach out.

Email / Columbia Email / CV / Scholar / GitHub / LinkedIn

Research Interests

My research lies at the intersection of machine learning and optimization, with a focus on Neural Combinatorial Optimization and optimization-driven approaches to data- and compute-efficient ML.

E-GEO: A Testbed for Generative Engine Optimization in E-Commerce
Puneet S. Bagga, Vivek F. Farias, Tamar Korkotashvili, Tianyi Peng, Yuhang Wu (Authors listed alphabetically)
arXiv, 2025
arXiv / code

With the rise of large language models (LLMs), generative engines are becoming powerful alternatives to traditional search, reshaping retrieval tasks. In e-commerce, for instance, conversational shopping agents now guide consumers to relevant products. This shift has created the need for generative engine optimization (GEO)--improving content visibility and relevance for generative engines. Yet despite its growing importance, current GEO practices are ad hoc, and their impacts remain poorly understood, especially in e-commerce. We address this gap by introducing E-GEO, the first benchmark built specifically for e-commerce GEO. E-GEO contains over 7,000 realistic, multi-sentence consumer product queries paired with relevant listings, capturing rich intent, constraints, preferences, and shopping contexts that existing datasets largely miss. Using this benchmark, we conduct the first large-scale empirical study of e-commerce GEO, evaluating 15 common rewriting heuristics and comparing their empirical performance. To move beyond heuristics, we further formulate GEO as a tractable optimization problem and develop a lightweight iterative prompt-optimization algorithm that can significantly outperform these baselines. Surprisingly, the optimized prompts reveal a stable, domain-agnostic pattern--suggesting the existence of a "universally effective" GEO strategy. Our data and code are publicly available at https://github.com/psbagga17/E-GEO.

Learning Knapsack Decision Rules using Permutation-Invariant Neural Architectures
Puneet S. Bagga, Arthur Delarue, Jean Pauphilet
In preparation

Solving the Quadratic Assignment Problem using Deep Reinforcement Learning
Puneet S. Bagga, Arthur Delarue
arXiv, 2023
arXiv / code

The Quadratic Assignment Problem (QAP) is an NP-hard problem that has proven particularly challenging to solve: unlike other combinatorial problems like the traveling salesman problem (TSP), which can be solved to optimality for instances with hundreds or even thousands of locations using advanced integer programming techniques, no methods are known to exactly solve QAP instances of size greater than 30. Solving the QAP is nevertheless important because of its many critical applications, such as electronic wiring design and facility layout selection. We propose a method to solve the original Koopmans-Beckmann formulation of the QAP using deep reinforcement learning. Our approach relies on a novel double pointer network, which alternates between selecting a location in which to place the next facility and a facility to place in the previous location. We train our model using A2C on a large dataset of synthetic instances, producing solutions with no instance-specific retraining necessary. Out of sample, our solutions are on average within 7.5% of a high-quality local search baseline, and even outperform it on 1.2% of instances.
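For readers unfamiliar with the problem, here is a minimal sketch of the Koopmans-Beckmann objective mentioned above: given a flow matrix between facilities and a distance matrix between locations, the cost of an assignment is the sum of flow times distance over all facility pairs. The function names (`qap_cost`, `brute_force_qap`) and the tiny 3x3 instance are illustrative assumptions, not taken from the paper, and the brute-force solver is only a reference for toy sizes, not the reinforcement learning method described in the abstract.

```python
from itertools import permutations


def qap_cost(flow, dist, assignment):
    """Koopmans-Beckmann cost (without the optional linear term).

    assignment[i] is the location that facility i is placed in.
    """
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


def brute_force_qap(flow, dist):
    """Enumerate all n! assignments; only feasible for very small n."""
    n = len(flow)
    best = min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))
    return list(best), qap_cost(flow, dist, best)


# Hypothetical toy instance (symmetric flows and distances, zero diagonals).
flow = [
    [0, 3, 1],
    [3, 0, 2],
    [1, 2, 0],
]
dist = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]

if __name__ == "__main__":
    assignment, cost = brute_force_qap(flow, dist)
    print("best assignment:", assignment, "cost:", cost)
```

Because the enumeration grows as n!, this kind of exhaustive search stops being practical almost immediately, which is one way to see why heuristic and learned approaches are attractive for the QAP.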

Template from Jon Barron's website.