Week 12: Differentiable Nash equilibrium

Hi, I have a question regarding the equations shown in slide 5 of the first lecture of week 12:

We first define the Nash equilibrium as follows, where \(\theta\) is the minimizer and \(\phi\) the maximizer.

Capture_1.JPG

But in the following lines we define the differentiable Nash equilibrium as follows:

Capture_2.JPG

But if the second derivative of the loss with respect to \(\theta\) is > 0, doesn't that mean that the loss is minimized at \(\theta^*\), not maximized as we state in the previous expression?

Player G (Generator): "I want to minimize the loss using the only parameter I control, theta, which lies in the set BIG_THETA."

Player D (Discriminator/Adversary): "I want to maximize the loss using the only parameter I control, phi, which lies in the set BIG_PHI."

So DNE is defined:

1) as "classic" Nash Equilibrium, i.e.
for every theta that is in the set BIG_THETA
for every phi that is in the set BIG_PHI
it holds that
loss(theta := theta_star, phi := phi ) <= loss(theta := theta_star, phi := phi_star) <= loss(theta := theta, phi := phi_star)

In this position both players say the following:

Player G (Generator): "While phi stays at phi_star, I can't pick any theta from the set BIG_THETA that makes the loss lower than loss(theta := theta_star, phi := phi_star). So I am happy with what I have, and I will not make any more moves with the parameter that I control, theta."

Player D (Discriminator/Adversary): "While theta stays at theta_star, I can't pick any phi from the set BIG_PHI that makes the loss bigger than loss(theta := theta_star, phi := phi_star). So I am happy with what I have, and I will not make any more moves with the parameter that I control, phi."

And this is why it is called an equilibrium: at the point (theta := theta_star, phi := phi_star) neither player wants to change their decision, i.e. the players stick to their choices (player G with theta_star and player D with phi_star).

2) The second part of the definition of DNE is stationarity: both gradients (with respect to theta and with respect to phi) need to be zero. Also, because this is a locally optimal point, the curvature at the optimal point (theta := theta_star, phi := phi_star) needs to go up in theta (player G wants to minimize, hence wants a strictly convex function with respect to theta at the optimal point) and go down in phi (player D wants to maximize, hence wants a strictly concave function with respect to phi at the optimal point).
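To make both parts concrete, here is a small numerical sketch. The toy loss theta^2 - phi^2 + theta*phi and the candidate point (0, 0) are my own assumptions for illustration, not something from the slides, and the derivatives are approximated with finite differences for scalar parameters only.

```python
import numpy as np

def loss(theta, phi):
    # Hypothetical toy loss (an assumption, not the loss from the lecture).
    return theta**2 - phi**2 + theta * phi

def first_derivative(f, x, h=1e-4):
    # Central finite difference for f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

def second_derivative(f, x, h=1e-4):
    # Central finite difference for f''(x).
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

theta_star, phi_star = 0.0, 0.0
grid = np.linspace(-2.0, 2.0, 401)
value_at_eq = loss(theta_star, phi_star)

# Part 1: loss(theta_star, phi) <= loss(theta_star, phi_star) <= loss(theta, phi_star)
d_cannot_increase = bool(np.all(loss(theta_star, grid) <= value_at_eq))
g_cannot_decrease = bool(np.all(value_at_eq <= loss(grid, phi_star)))

# Part 2: zero gradients, curvature up in theta, curvature down in phi
grad_theta = first_derivative(lambda t: loss(t, phi_star), theta_star)
grad_phi = first_derivative(lambda p: loss(theta_star, p), phi_star)
curv_theta = second_derivative(lambda t: loss(t, phi_star), theta_star)
curv_phi = second_derivative(lambda p: loss(theta_star, p), phi_star)

stationary = abs(grad_theta) < 1e-8 and abs(grad_phi) < 1e-8
print(d_cannot_increase, g_cannot_decrease, stationary, curv_theta > 0, curv_phi < 0)
```

For this toy loss all five checks come out true. The grid check only covers the interval [-2, 2], so it illustrates the inequality rather than proving it for the whole sets.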

Hope it helps! And I hope that what I have written is okay :))

If I am wrong please correct me!

If you want, we can talk it over. Email me at:
milos.novakovic@epfl.ch

I did not ask the question, but I think the explanation is clear. However, I believe the question is more about the notation, and indeed it seems like there is a little error. When the Hessian is positive definite, it means the function is "curved upwards", hence it increases when \(\theta\) changes. Therefore, it is an equilibrium for the player that tries to minimize the function (hence the \(\phi\) player).

If you look at the equation on the first line, I think it is not the same as what you say in your first sentence:

@milos5 said:
Player G (Generator): "I want to minimize the loss using the only parameter I control, theta, which lies in the set BIG_THETA."

Player D (Discriminator/Adversary): "I want to maximize the loss using the only parameter I control, phi, which lies in the set BIG_PHI."
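On the "curved upwards" point, a quick way to see what a positive definite Hessian implies when the parameter is a vector. This is only a sketch: the 2x2 matrix below is a made-up example, and the loss is taken as a pure quadratic in one player's parameter with the other player's parameter held fixed.

```python
import numpy as np

# Hypothetical Hessian of the loss with respect to one player's 2-D parameter
# (an assumption for illustration, not taken from the slides).
hessian = np.array([[2.0, 0.5],
                    [0.5, 1.0]])

def loss_in_param(x):
    # Quadratic loss with a stationary point at x = 0.
    return 0.5 * x @ hessian @ x

# A symmetric matrix is positive definite iff all its eigenvalues are > 0.
eigenvalues = np.linalg.eigvalsh(hessian)
is_positive_definite = bool(np.all(eigenvalues > 0))

# Moving away from the stationary point in random directions increases the loss.
rng = np.random.default_rng(0)
directions = rng.normal(size=(100, 2))
loss_at_zero = loss_in_param(np.zeros(2))
always_increases = all(loss_in_param(0.1 * d) > loss_at_zero for d in directions)

print(is_positive_definite, always_increases)
```

So with a positive definite Hessian, the player who controls that parameter cannot lower the loss by moving away, which is the condition that matters for whichever player is the minimizer.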
