Pokémon team optimization

I grew up as a big Pokémon fan, watching the anime on TV (even the movies), playing it on my Gameboy color (which was Pikachu-themed too), collecting the cards. I had figurines, posters, blankets, plushies, everything. Truth be told, it was mostly that once my family figured out I liked it, they had just found an endless resource of Christmas and birthday gifts. And 6-year-old me was not complaining.

I do remember the games fondly. I played generations I-IV religiously. However, I was soured by the soft reboot of generation V, and as I was on the cusp of teenagehood, I kind of gave up on it. Fast forward 10 years and I rediscovered the joy of it: playing through my favorite games again as an adult was a blast!

However I quickly realized I had lost some of the magic by not being a kid anymore, and it took me a bit to realize why. I was not only much older than the target audience, but I was also working as a scientist. I was just constantly trying to min-max my way through the game. I was always hunting for Pokémon with better abilities, better type coverage, analyzing synergy between moves… If you’ve ever played a mainline Pokémon game before, you must know how utterly unnecessary this is. Twenty years ago, I would have just powered through on Blastoise or Typhlosion alone. But adult-me likes to have a balanced team, always ready for any encounter. So, at the peak of this frenzy, I decided to build a little tool to help myself over-engineer a team. And in this post, I’ll guide you through how I did it.

Formulating the problem as a Mixed-Integer Problem (MIP)

Before we can move on to the formulation part, what is it exactly that we’re trying to optimize?

If you’ve read this far, I assume you at least are familiar with what Pokémon is. At its core, it’s a turn-based game where you have a party of at most 6 “POcKEt MONsters” (or, Pokémon), which fight for you. There are currently 1025 Pokémon split into 9 generations. Each Pokémon belongs to one or two types (Fire, Water, Grass, etc.) and can learn up to 4 moves, each of which has its own type. Each type has its own strengths and weaknesses against other types. For example, water is strong against fire, so a Pokémon using a water attack against a fire Pokémon gets a 2x multiplicative factor.

Pokemon type matchup table — Pokémon type matchup table - Aussie Evil and OmegaFallon, CC0, via Wikimedia Commons

Each Pokémon also has stats, like attack, defense, speed, etc., but in this post we’ll only consider the overall base stat as an indicator of a strong Pokémon. There are a lot of other important details I won’t get into (e.g., there is a Same Type Attack Bonus (STAB), a physical/special split, etc.) but these all contribute to making the gameplay rich and complex. For now, I’m interested in finding the best teams of Pokémon such that:

the total base stat is as high as possible,
for each type $A$ , the team has a Pokémon resistant to it.

In optimization terms, 1. is called the objective while 2. is a constraint. Other notable constraints of the problem are:

a Pokémon can be chosen at most once,
the total amount of chosen Pokémon is at least 1 and at most 6.

We could solve some variation of the same problem by swapping/combining 2. with

“for each type $A$ , the team has a Pokémon super effective to it”.

Ideally, you would want a more complex version 2. which would be

“for each type $A$ , the team has a Pokémon resistant to it which can learn a super effective attack against it”.

But that’s a lot more complicated so for now let’s stick with 2. To recap, we have an objective and some constraints, what we need now are decision variables which will encode the solution. Mathematically, we can denote by $x_n$ whether a Pokémon labeled by $n$ is in the team or not. This is a boolean variable, i.e., $x_n \in \{0, 1\}$ . Focusing on the simple constraints for now, the basic problem to solve is

\mathrm{argmax}_{x} \sum_n b_n x_n~, \text{ such that } 1 \leq \sum_n x_n \leq 6 \text{ and } x_n \in \{ 0, 1 \}~.

The first part is the maximisation of the overall base stat, where $b_n$ is the base stat of Pokémon $n$ . The second part just tells us that a team must have between 1 and 6 Pokémon.

Optimization of linear problems 101

Before adding more complexity to this model, we can use this simpler example to illustrate how these kinds of problems are usually solved in operations research (OR). This is a rich area, with decades of research and a lot of industrial interest in it: routing, scheduling, stocking are all examples of complex problems which can be solved with these methods.

First of all, solving the problem with integer constraints is hard, so let’s relax that assumption for now. The problem is then a pure linear optimization problem (LP): all objectives and constraints are linear expressions in the decision variables. Geometrically, linear expressions match to shapes like lines, planes, etc. which means that:

constraints each define a region of feasibility whose boundary is a hyperplane (a line in 2D, a plane in 3D, etc.),
the intersection of all constraints define a restricted region of feasibility in which points are possible solutions.

Feasibility region illustration — Example: if you have only two decision variables which must be positive and the constraint $x + y \leq 3$ , this is what the feasibility region looks like. It is the intersection of the region below the line $x + y = 3$ with the regions $x, y \geq 0$ .

Since the objective is also linear, the surfaces on which it is constant (also called isosurfaces) are also geometrically matched to lines, planes, etc. When you move that surface along a direction or another, the objective either increases or decreases monotonically. If you combine these two facts, you can find the optimum quite easily: slide the hyperplane of constant objective value over the region of feasibility, in the direction where the value increases. Eventually, you will reach the edge of the surface, which will be either a single vertex or a whole parallel edge. This is where the optimal solution lives.

Simplex method illustration — Let’s take as objective $y - x$ , which is constant along lines parallel to the dashed lines in black, increasing when $x$ decreases and $y$ increases. The colored dashed line is that isosurface with maximal value which also intersects the feasibility region. This is done on the vertex $(0,3)$ , which is then the optimum.

This concept is behind the simplex algorithm used to solve linear problems. The basic idea of this algorithm is to explore the edge of the feasibility region, vertex to vertex, until it finds an optimum.

Does this also work with the integer constraint?

Unfortunately, no. Points on the edge of the region might be candidate optima to the objective, but there is generally no reason for them to obey the integer constraint. The candidates for this would be found inside the feasibility region (instead of at the edge) which is a much larger region to explore. Instead of doing that, what you can do is branch and bound:

solve the relaxed LP and obtain a solution: if the integer constraint is obeyed, you’re done!
Otherwise, pick one of the non-integer variables $N < x < N+1$ , and define two constraints $x \leq N$ and $x \geq N+1$ . For each of these constraints, define the LP with the extra constraint and go back to 1.

This algorithm continues recursively until either a solution obeying integer constraints is found, or no solutions are found. If you want to make sure to find a global optimum, you can also keep going after you’ve found a valid candidate solution, and use that objective to trim the recursive tree from branches which just won’t get any better!

Adding non-linear constraints

So far I only focused on simple constraints like the number of Pokémon in the team, which are linear and easy to visualize graphically, but we are going to be a bit more ambitious than that! Of course, in a LP, all constraints must be linear. For type-based constraints, let’s denote by $t_{An}$ the matrix entry containing the damage factor of type $A$ against Pokémon $n$ . The constraint 2. is equivalent to the statement $\min_n(x_n t_{An}) \leq 0.5$ (this means at least one selected Pokémon is resistant to this type). But here comes the difficulty! The operation $\min{}$ is not linear. Thankfully there is a trick for this situation using auxiliary decision variables $y_{An} \in \{0, 1\}$ . Take this system of constraints: ¹

\begin{align*} x_n t_{An} + M \times (y_{An} - 1) & \leq 0.5 \text{ with } M \gg 1~, \\ \sum_n y_{An} & \geq 1~. \end{align*}

The second constraint basically means that for all $A$ , at least one of $y_{An}$ must be 1. Since $M$ is taken very large (compared to the other numbers), then whenever $y_{An} = 0$ , the first constraint is automatic. However, since $y_{An} = 1$ for some $n$ , then for that value $x_n t_{An} \leq 0.5$ . And voilà, the Pokémon will be resistant to the type $A$ !

There is one tiny flaw in this method. Whichever Pokémon sets $y_{An} = 1$ might not be one of the selected Pokémon. What we want is not only that $y_{An} = 1$ but also that $x_n = 1$ at the same time (so either both 1 or both 0). To make this happen, we need more auxiliary decision variables $z_{An}$ with the constraints:

\begin{align*} z_{An} & \leq x_n~,\\ z_{An} & \leq y_{An}~,\\ z_{An} & \geq x_n + y_{An} - 1~. \end{align*}

This might sound complicated at first, but we can very easily summarize the possible values of each variable in the following table

$x_n$	$y_{An}$	$z_{An}$
0	0	0
0	1	0
1	0	0
1	1	1

When either variable is 0, then the first two constraints impose that $z_{An} = 0$ and the last one is automatic. But if both variables are 1, the last constraint imposes that $z_{An} = 1$ . So really $z = x \land y$ (logical AND operation). That’s great, this means that if we replace the condition $\sum_n y_{An} \geq 1$ by $\sum_n z_{An} \geq 1$ , then the Pokémon which will be resistant to type $A$ will also be in the team!

Getting our hands dirty with PuLP

After the theory comes some practice! In this section, I’ll show you briefly how to actually solve the model we just defined. For this I used the Python library PuLP but I can also recommend Google’s excellent OR-tools library.

Of course, the first thing you need for this project is a dataset of Pokémon data with all the necessary information to use in the constraints. I chose this one from Kaggle. To define a maximization problem and its decision variables, all you need to do is

import pulp

def solve_pokemon_team(number_pkmn: int, number_types: int, types_matrix: list[list[float]], pkmn_base_stat: list[int]):
    prob = pulp.LpProblem("Pokemon_Team_Optimization", pulp.LpMaximize)

    # Define the boolean variables
    x = pulp.LpVariable.dicts("x", range(number_pkmn), cat="Binary")
    y = pulp.LpVariable.dicts("y", (range(number_types), range(number_pkmn)), cat="Binary")
    z = pulp.LpVariable.dicts("z", (range(number_types), range(number_pkmn)), cat="Binary")

    # the rest of the code will go here...

Constraints or objective can be added to the prob object like this:

# Define the size constraint
prob += pulp.lpSum(x[i] for i in range(number_pkmn)) == size_team, "Team Size"

# Define the base total sum objective
prob += pulp.lpSum(pkmn_base_stat[i] * x[i] for i in range(number_pkmn)), "Maximal base total"

Notice the difference between all the sums involved:

lpSum is an abstract index sum over a generator of constraints terms,
prob += constraint means this constraint should be obeyed during the maximization process.

Of course, for more complex constraints like the type-based one, you can programmatically do it

# Define the weakness sum bound for each type
m = 100
for a, type_col in enumerate(types_matrix):
    prob += pulp.lpSum(z[a][i] for i in range(number_pkmn)) >= 1  # Overall constraint for each type
    for i in range(number_pkmn):
        prob += z[a][i] <= x[i], f"Constraint z 1 for {a},{i}"
        prob += z[a][i] <= y[a][i], f"Constraint z 2 for {a},{i}"
        prob += z[a][i] >= x[i] + y[a][i] - 1, f"Constraint z 3 for {a},{i}"
        prob += x[i] * pkmn_base_stat[i] <= 0.5 + m * (1 - y[a][i]), f"Weakness {type_col} for pokemon {i}"

And finally, once your problem is well-defined, you can call the solver to find a solution

out_code = prob.solve() # will be 1 if successful
# Indices of Pkmn in the solution
pkmn_in_team = [i for i in range(number_pkmn) if pulp.value(x[i]) == 1]

The function pulp.value is useful to get the value of any decision variable for the solution found.

Results and closing words

So what kind of team does this give?

The basic answer to this question is pretty boring: the optimum team is entirely composed of legendary and pseudo-legendary Pokémon across generations, with: ²


Mewtwo (#150)	Tyranitar (#248)	Metagross (#376)	Kyogre (#382)	Groudon (#383)	Rayquaza (#384)
Psychic	Rock, Dark	Steel, Psychic	Water	Ground	Dragon, Flying

This makes sense of course, these Pokémon tend to have the highest base stat. In terms of types, if you refer back to the type matchup table above, you can see the two types with the most resistance are Dragon and Steel. So Metagross and Rayquaza already account for a chunk of the resistance we need to satisfy the constraints. If you look by rows, you can see some types encounter very few resistance in type matches, so any good team should have at least:

one Dark Pokémon to resist Ghost (Tyranitar),
one Fighting or Ground Pokémon to resist Rock (Groudon).

Now, if you remove legendary Pokémon from the pool, you end up with the second most logical choice, the pseudo-legendaries. These are Pokémon whose base stat is significantly larger than most Pokémon yet they don’t officially have the legendary title. For example, the team I got with these constraints (without legendaries) is


Gyarados (#130)	Tyranitar (#248)	Metagross (#376)	Slaking (#289)	Salamence (#373)	Garchomp (#445)
Water, Flying	Rock, Dark	Steel, Psychic	Normal	Dragon, Flying	Dragon, Ground

In this team, Tyranitar, Metagross, Salamence and Garchomp are all pseudo-legendary (the other two are very strong ordinary Pokémon). Notice how here, Slaking is a pure Normal Pokémon which contributes almost no resistance (only to Ghost types), which means all the other Pokémon take care of the constraints and Slaking was mostly chosen for its high base stat. This is a really good illustration of how the model is blind to effects it doesn’t know about. Slaking might have a high base stat, but its ability also prevents him from acting 50% of the time, effectively neutering the advantage.

Okay let’s do this one more time, but this time without any legendaries, pseudo-legendaries, and with at most one starter to be realistic (I needed an extra constraint for that, you can try to figure out how to write it out yourself!)


Gyarados (#130)	Swampert (#260)	Gardevoir (#282)	Slaking (#289)	Aggron (#306)	Lucario (#448)
Water, Flying	Water, Ground	Fairy, Psychic	Normal	Steel, Rock	Steel, Fighting

This is an interesting team: most Pokémon are from gen. III (4 of them), the other two being from gen. I and gen. IV. This makes sense, out of the first four generations, gen. III is famous for introducing some well-typed high base stat Pokémon which will be preferred by our model. It’s easy to see which Pokémon contributes the most to which resistance using our decision variables

Pokémon	Type resistance
Gyarados	Fire, Ground, Steel, Water
Swampert	Electric
Gardevoir	Dragon, Fighting, Psychic
Slaking	Ghost
Aggron	Fairy, Flying, Ice, Normal, Poison
Lucario	Bug, Dark, Grass, Rock

From this table, you can see that Gardevoir, Lucario and Aggron are carrying the team resistance-wise! Gyarados is also contributing a lot, although a generic Water Pokémon would have also helped about the same here. You could easily swap Swampert and Slaking for about any other Pokémon (sacrificing some of the base stat of course) as long as you cover Ghost and Electric types.

This project was overall a fun showcase for OR methods. There are a lot more constraints you can consider before reaching the true perfect Pokémon team, but already the ones we used distilled some very strong combinations which would take you through any game without difficulty. If you are interested in customizing your own team, the CLI tool in the Github repo is also able to complete a partial team, optimizing the remaining seats for you!

Footnotes

In such expressions, if an index appears without a sum, assume it means there is one constraint per value of that index. ↩
All Pokémon sprites in this article are from pokeapi.co. ↩