Use log_softmax instead of softmax

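In theory, log_softmax is equivalent to applying softmax and then log. In practice, the two-step version can produce inf and nan, because very small probabilities underflow to 0 and log(0) is -inf.
For example: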

import torch
import torch.nn.functional as F

a = torch.tensor([0.01, 999])
b = F.softmax(a, dim=0)        # tensor([0., 1.])  the small entry underflows to 0
torch.log(b)                   # tensor([-inf, 0.])
c = F.log_softmax(a, dim=0)    # tensor([-998.9900,    0.0000])
torch.exp(c)                   # tensor([0., 1.])

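Reason: the formula for log_softmax is $\ln \frac{e^{x_i}}{e^{x_1}+\dots+e^{x_n}}$.
During computation, the numerator and denominator are first divided by $e^M$, where $M$ is the maximum of the $x_i$, which gives $\ln \frac{e^{x_i-M}}{e^{x_1-M}+\dots+e^{x_n-M}}$.
By the rules of logarithms, this becomes $x_i-M-\ln(e^{x_1-M}+\dots+e^{x_n-M})$.
Every exponent $x_k-M$ is at most 0, so no term can overflow, and the sum $e^{x_1-M}+\dots+e^{x_n-M}$ is never less than 1, because the term with $x_k=M$ equals $e^0=1$. The logarithm is therefore always finite.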
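
The derivation above can be sketched in plain Python. This is only an illustration of the max-subtraction trick, not PyTorch's actual implementation; `stable_log_softmax` is a hypothetical helper name, and only the standard `math` module is assumed.

```python
import math

def stable_log_softmax(xs):
    """Hypothetical helper: log-softmax of a list of floats via max subtraction."""
    m = max(xs)
    # Each exponent x - m is <= 0, so math.exp cannot overflow.
    # The term for the maximum is exp(0) = 1, so total >= 1 and log(total) is finite.
    total = sum(math.exp(x - m) for x in xs)
    return [x - m - math.log(total) for x in xs]

result = stable_log_softmax([0.01, 999.0])
print(result)  # approximately [-998.99, 0.0]
```

Here `math.exp(0.01 - 999.0)` underflows to 0.0, but that is harmless: it only enters the sum, which is still 1.0, and the small value itself is recovered from `x - m` rather than from a logarithm of 0.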