I gave a brief presentation about the college application problem at a research fair hosted by our department. English subtitles are included.

### College Application on arXiv

In advance of a conference I will be attending with my labmates next month in Jeju, my thesis advisor and I have posted a preliminary version of “The College Application Problem” on arXiv. Here’s the abstract:

This paper considers the maximization of the expected maximum value of a portfolio of random variables subject to a budget constraint. We refer to this as the optimal college application problem. When each variable's cost, or each college's application fee, is identical, we show that the optimal portfolios are nested in the budget constraint, yielding an exact polynomial-time algorithm. When colleges differ in their application fees, we show that the problem is NP-complete. We provide four algorithms for this more general setup: a branch-and-bound routine, a dynamic program that produces an exact solution in pseudopolynomial time, a different dynamic program that yields a fully polynomial-time approximation scheme, and a simulated-annealing heuristic. Numerical experiments demonstrate the algorithms' accuracy and efficiency.

Rolling updates to the paper as well as various slideshows presenting the research are available on GitHub.

### The School Location Problem

I’ve spent a few days thinking about a facility location problem that we might call the *school location problem.* The goal is to place *n* schools to serve *m* families, such that the *furthest* distance between any family and their *nearest* school is *minimized.* This problem is *almost* a k-means or ellipsoid fitting problem, but its unusual “minimin” makes it computationally challenging.

In the video below, and associated Jupyter notebook (Github, HTML viewer), I discuss the formulation of this problem, show to express it as a mixed-integer second-order convex program, and then solve a small instance using the Julia/JuMP/Juniper/SCS stack.

### College Application as Portfolio Optimization

As I approach graduation, I have narrowed the scope of my thesis research to a particular subproblem arising in equilibrium model of the college admissions market which turns out to be quite interesting on its own: the *optimal college application strategy.* For each school, you are given a utility value, your probability of admission, and the cost of submitting an application. Given a fixed budget to spend on application fees, to which schools should you apply?

We can think of the college application strategy as a *portfolio optimization* problem. Given a selection of random variables, you are to purchase the subset that maximizes a payoff function subject to a budget constraint. However, unlike the traditional Markowitz model, the portfolio is not continuously allocated: You cannot apply to half of one school and half of another. Moreover, students care not about the *aggregate* performance of the portfolio (for example, the expected sum of stock prices minus their variance times a risk coefficient) but about the performance of its *best* asset: If you get into your favorite school, it doesn’t matter what happens at your second and third choice.

These two modifications significantly increase the difficulty of the problem. In the special case where all colleges have identical application fees, which is equivalent to the Korean admissions system in which you are only allowed to apply to *h* = 3 schools, the intuitive algorithm that picks the top *h* schools after discounting their utility values by the probability of rejection can be off by a factor of *h.* There is an exact, polynomial-time algorithm, but it is somewhat subtle. In the general setup with arbitrary application fees, we can show that the problem is NP-complete. Nonetheless, we can produce an exact solution in pseudopolynomial time using dynamic programming, and there also exists a fully polynomial-time approximation scheme.

A pre-alpha version of my work is now on Github. My immediate next steps are to improve the documentation of the Julia implementation and produce benchmarks to confirm the theoretical performance guarantees on time complexity, memory usage, and approximation error. Going forward, there are further generalizations of the problem and alternative framings that may warrant further exploration.