Generating High-Quality Images
Alternative Losses: Least Squares GAN and Wasserstein GAN
Wasserstein GAN, Arjovsky et al. (2017), Gulrajani et al. (2017)
The Earth-Mover (EM) or Wasserstein-1 distance
$$W(P_r, P_\theta) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y)\sim\gamma}\big[\, \|x - y\| \,\big]$$
where $\Pi(P_r, P_g)$ denotes the set of all $\gamma(x, y)$ whose marginals are $P_r$ and $P_g$, respectively.
According to Kantorovich-Rubinstein duality,
$$W(P_r, P_\theta) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim P_r}\big[f(x)\big] - \mathbb{E}_{x \sim P_\theta}\big[f(x)\big]$$
where the supremum is over all 1-Lipschitz functions $f : \mathcal{X} \to \mathbb{R}$.
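In WGAN the critic $D$ plays the role of $f$: it is kept (approximately) 1-Lipschitz, and the generator is trained to shrink the estimated distance. One way to write the resulting objective, following Arjovsky et al. (2017), is
$$\min_{G} \max_{\|D\|_L \le 1} \; \mathbb{E}_{x \sim P_r}\big[D(x)\big] - \mathbb{E}_{z \sim p(z)}\big[D(G(z))\big],$$
which motivates the training recipe listed below.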
No log in the losses. We do not apply a sigmoid at D's output; the critic returns an unbounded score.
Weight clipping is applied to D's parameters (to keep the critic approximately Lipschitz).
D is trained more often than G (several critic updates per generator update).
Use RMSProp instead of Adam.
Very low learning rate (α = 0.00005).
(Improved WGAN, Gulrajani et al., 2017) An alternative to clipping weights: penalize the norm of the gradient of the critic with respect to its input (see the sketch below).
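A minimal PyTorch sketch of one WGAN training iteration under these settings. The toy MLP shapes, `z_dim = 100`, and the random `real_batch` are placeholder assumptions, not the slides' code; the hyperparameters (RMSProp, lr = 5e-5, n_critic = 5, clip = 0.01, λ = 10 for the gradient penalty) follow the slide and the cited papers.

```python
import torch

z_dim, batch_size, n_critic, clip_value = 100, 64, 5, 0.01

# Toy networks (placeholder shapes); note: no sigmoid at the critic output.
critic = torch.nn.Sequential(torch.nn.Linear(784, 256), torch.nn.ReLU(),
                             torch.nn.Linear(256, 1))
generator = torch.nn.Sequential(torch.nn.Linear(z_dim, 256), torch.nn.ReLU(),
                                torch.nn.Linear(256, 784))

# RMSProp instead of Adam, very low learning rate (alpha = 0.00005).
opt_d = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
opt_g = torch.optim.RMSprop(generator.parameters(), lr=5e-5)

real_batch = torch.randn(batch_size, 784)  # stand-in for a batch of real data


def gradient_penalty(critic, real, fake, lam=10.0):
    """(Improved WGAN) Penalize the critic's gradient norm at points
    interpolated between real and fake samples, instead of clipping weights."""
    eps = torch.rand(real.size(0), 1)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return lam * ((grad.norm(2, dim=1) - 1) ** 2).mean()


# Train the critic more often than the generator (n_critic steps per G step).
for _ in range(n_critic):
    z = torch.randn(batch_size, z_dim)
    fake = generator(z).detach()
    # No log in the losses: maximize E[D(real)] - E[D(fake)].
    loss_d = -(critic(real_batch).mean() - critic(fake).mean())
    # WGAN-GP alternative: loss_d += gradient_penalty(critic, real_batch, fake)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # Weight clipping keeps the critic approximately Lipschitz.
    for p in critic.parameters():
        p.data.clamp_(-clip_value, clip_value)

# One generator step: maximize E[D(G(z))], i.e. minimize -E[D(G(z))].
z = torch.randn(batch_size, z_dim)
loss_g = -critic(generator(z)).mean()
opt_g.zero_grad()
loss_g.backward()
opt_g.step()
```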