The Gaussian error linear unit (GELU) is defined as:

`gelu(x) = x * P(X <= x)`

where `X ~ N(0, 1)`, i.e. `gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2)))`.

GELU weights inputs by their value, rather than gating inputs by their sign as in ReLU.
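As a minimal sketch, the exact formula above can be written directly with the standard-library error function (this is an illustration in Python, not the package's own implementation):

```python
import math

def gelu(x):
    # Exact GELU: x * P(X <= x) for X ~ N(0, 1), with the standard
    # normal CDF expressed via erf: 0.5 * (1 + erf(x / sqrt(2))).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
```

For large positive `x` the output approaches `x` (the CDF approaches 1), while large negative inputs are smoothly damped toward 0 rather than hard-clipped as in ReLU.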

## See also

Other activations: `activation_elu()`, `activation_exponential()`, `activation_hard_sigmoid()`, `activation_leaky_relu()`, `activation_linear()`, `activation_log_softmax()`, `activation_mish()`, `activation_relu()`, `activation_relu6()`, `activation_selu()`, `activation_sigmoid()`, `activation_silu()`, `activation_softmax()`, `activation_softplus()`, `activation_softsign()`, `activation_tanh()`