# Idea
The sigmoid function is also known as the logistic function or the inverse logit function.
The sigmoid function $\sigma(x)=\frac{1}{1+e^{-x}}$ is frequently used in neural networks because its derivative can be written in terms of the function's own output, $\sigma(x)(1-\sigma(x))$, so it is cheap to compute during backpropagation once the forward value is known.
The sigmoid function can be written in two equivalent forms:
$\sigma(x)=\frac{1}{1+e^{-x}}$
$\sigma(x)=\frac{e^{x}}{e^{x}+1}$
$\frac{1}{1+e^{-x}}=
\frac{1}{1+e^{-x}} \frac{e^{x}}{e^{x}}
=\frac{e^{x}}{e^{x}+1}
$
Since $\frac{e^x}{e^x} = 1$, we're in essence just multiplying $\frac{1}{1+e^{-x}}$ by 1, so the two forms are equal.
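One practical use of having two equivalent forms is avoiding floating-point overflow. The sketch below is only an illustration, not something the derivation depends on: it assumes plain Python floats and the standard `math` module, and the function name `sigmoid` is chosen here for the example.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid, picking whichever algebraic form cannot overflow."""
    if x >= 0:
        # 1 / (1 + e^{-x}): here e^{-x} <= 1, so exp() stays small.
        return 1.0 / (1.0 + math.exp(-x))
    # e^{x} / (e^{x} + 1): here e^{x} < 1, so exp() stays small.
    ex = math.exp(x)
    return ex / (ex + 1.0)

print(sigmoid(0.0))
print(sigmoid(-1000.0))  # the first form would need math.exp(1000.0), which overflows
print(sigmoid(1000.0))
```

Both branches return the same value wherever both can be evaluated; the branch only decides which exponential is computed.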
The derivative of the sigmoid function $\sigma(x)$ is the sigmoid function $\sigma(x)$ multiplied by $1 - \sigma(x)$.
$\sigma(x)=\frac{1}{1+e^{-x}}$
$\sigma'(x)=\frac{d}{dx}\sigma(x)=\sigma(x)(1-\sigma(x))$
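As a quick sanity check of this identity, the following sketch compares $\sigma(x)(1-\sigma(x))$ with a central finite-difference estimate of the slope. The helper names (`sigmoid_prime`, `central_difference`) and the step size `h` are assumptions made for this illustration.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x: float) -> float:
    s = sigmoid(x)        # forward value, computed once
    return s * (1.0 - s)  # the derivative needs no further exponentials

def central_difference(f, x: float, h: float = 1e-5) -> float:
    # Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-2.0, 0.0, 3.0):
    analytic = sigmoid_prime(x)
    numeric = central_difference(sigmoid, x)
    print(f"x={x:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two columns should agree to many decimal places, which is what makes the closed form convenient in backpropagation: the forward pass already holds `s`.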
Before we begin, here's a reminder of how to find the derivatives of exponential functions.
$\frac{d}{dx}e^x = e^x$
$\frac{d}{dx}e^{-3x^2 + 2x} = (-6x + 2)e^{-3x^2 + 2x}$
And here's the [[chain rule]]: $\frac{d}{dx} \left[ f(g(x)) \right] = f'\left[g(x) \right] * g'(x)$
Example: Find the derivative of $f(x) = (x^2 + 1)^3$:
$
\begin{aligned}
f'(x) &= 3(x^2 + 1)^{3-1} * 2x^{2-1}\\
&= 3(x^2 + 1)^2(2x) \\
&= 6x(x^2 + 1)^2
\end{aligned}
$
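The same example can be spelled out in code by keeping the outer and inner functions separate. This is a minimal sketch; the names `inner`, `outer`, and `chain_rule_derivative` are introduced here purely for illustration.

```python
def inner(x):
    return x**2 + 1          # g(x) = x^2 + 1

def inner_prime(x):
    return 2 * x             # g'(x) = 2x

def outer(u):
    return u**3              # f(u) = u^3

def outer_prime(u):
    return 3 * u**2          # f'(u) = 3u^2

def chain_rule_derivative(x):
    # d/dx f(g(x)) = f'(g(x)) * g'(x)
    return outer_prime(inner(x)) * inner_prime(x)

def simplified_derivative(x):
    # The simplified result from the worked example: 6x(x^2 + 1)^2
    return 6 * x * (x**2 + 1) ** 2

for x in (-3, 0, 2, 5):
    assert chain_rule_derivative(x) == simplified_derivative(x)

print(chain_rule_derivative(2))
```

With integer inputs the comparison is exact, so the assertions confirm that the unsimplified chain-rule product and the simplified polynomial agree.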
## Derivative via chain rule
Line 2 of the sigmoid derivation below uses the chain rule, with outer function $u^{-1}$ and inner function $u = 1+e^{-x}$.
$
\begin{aligned}
\frac{d}{dx} \sigma(x) &= \frac{d}{dx} \left[ \frac{1}{1+e^{-x}} \right] =\frac{d}{dx}(1+e^{-x})^{-1} \\
&=-1*(1+e^{-x})^{-2}(-e^{-x}) \\
&=\frac{-e^{-x}}{-(1+e^{-x})^{2}} \\
&=\frac{e^{-x}}{(1+e^{-x})^{2}} \\
&=\frac{1}{1+e^{-x}} \frac{e^{-x}}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \frac{e^{-x} + (1 - 1)}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \frac{(1 + e^{-x}) - 1}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \left[ \frac{(1 + e^{-x})}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right] \\
&=\frac{1}{1+e^{-x}} \left[ 1 - \frac{1}{1+e^{-x}} \right] \\
&=\sigma(x) (1-\sigma(x)) \\
\end{aligned}
$
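The algebra after the differentiation step rests on one fact: $\frac{e^{-x}}{1+e^{-x}} = 1 - \sigma(x)$. The sketch below checks that fact, and the equality of the pre- and post-simplification forms, at a few sample points. The function names and the tolerance are assumptions chosen for this illustration.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def after_differentiation(x: float) -> float:
    # Form reached right after differentiating: e^{-x} / (1 + e^{-x})^2
    return math.exp(-x) / (1.0 + math.exp(-x)) ** 2

def final_form(x: float) -> float:
    # Form reached at the end of the derivation: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    # The "+ (1 - 1)" step works because this factor equals 1 - sigma(x).
    second_factor = math.exp(-x) / (1.0 + math.exp(-x))
    assert math.isclose(second_factor, 1.0 - sigmoid(x), rel_tol=1e-9)
    assert math.isclose(after_differentiation(x), final_form(x), rel_tol=1e-9)

print("intermediate and final forms agree at all sample points")
```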
## Derivative via quotient rule
[[quotient rule]]: If $f(x) = \frac{g(x)}{h(x)}$, then $f'(x) = \frac{g'(x)h(x) - h'(x)g(x)}{(h(x))^2}$.
Example: Find the derivative of $f(x) = \frac{3x}{1 + x}$:
$
\begin{aligned}
f'(x) &= \frac{(\frac{d}{dx}(3x))*(1+x) - (\frac{d}{dx}(1+x)) * (3x)} {(1+x)^2} \\
&= \frac{3(1 + x) - 1(3x)}{(1+x)^2} \\
&= \frac{3 + 3x - 3x}{(1+x)^2} \\
&= \frac{3}{(1+x)^2}
\end{aligned}
$
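The worked example can be checked exactly with rational arithmetic. This sketch assumes Python's standard `fractions` module; the function names are made up for the illustration, and $x = -1$ is avoided because the denominator vanishes there.

```python
from fractions import Fraction

def quotient_rule_example(x: Fraction) -> Fraction:
    g, g_prime = 3 * x, Fraction(3)   # g(x) = 3x,    g'(x) = 3
    h, h_prime = 1 + x, Fraction(1)   # h(x) = 1 + x, h'(x) = 1
    # f'(x) = (g'(x) h(x) - h'(x) g(x)) / h(x)^2
    return (g_prime * h - h_prime * g) / h**2

def simplified(x: Fraction) -> Fraction:
    # The simplified result from the worked example: 3 / (1 + x)^2
    return Fraction(3) / (1 + x) ** 2

for x in (Fraction(0), Fraction(2), Fraction(-1, 2), Fraction(7, 3)):
    assert quotient_rule_example(x) == simplified(x)

print(quotient_rule_example(Fraction(2)))
```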
Line 2 of the sigmoid derivation below uses the quotient rule with $g(x) = 1$ and $h(x) = 1+e^{-x}$, so $g'(x) = 0$ and $h'(x) = -e^{-x}$.
$
\begin{aligned}
\frac{d}{dx} \sigma(x) &= \frac{d}{dx} \left[ \frac{1}{1+e^{-x}} \right] \\
&=\frac{(0)(1 + e^{-x}) - (-e^{-x})(1)}{(1 + e^{-x})^2} \\
&=\frac{e^{-x}}{(1 + e^{-x})^2} \\
&=\frac{1}{1+e^{-x}} \frac{e^{-x}}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \frac{e^{-x} + (1 - 1)}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \frac{(1 + e^{-x}) - 1}{1+e^{-x}} \\
&=\frac{1}{1+e^{-x}} \left[ \frac{(1 + e^{-x})}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right] \\
&=\frac{1}{1+e^{-x}} \left[ 1 - \frac{1}{1+e^{-x}} \right] \\
&=\sigma(x) (1-\sigma(x)) \\
\end{aligned}
$
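The quotient-rule route can also be sketched as a small generic helper applied to the sigmoid's numerator and denominator. The helper `quotient_rule` and its argument names are hypothetical, introduced only to mirror the formula; they do not come from any library.

```python
import math

def quotient_rule(g, g_prime, h, h_prime, x: float) -> float:
    # f'(x) = (g'(x) h(x) - h'(x) g(x)) / h(x)^2
    return (g_prime(x) * h(x) - h_prime(x) * g(x)) / h(x) ** 2

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime_quotient(x: float) -> float:
    return quotient_rule(
        g=lambda t: 1.0,                    # numerator g(x) = 1
        g_prime=lambda t: 0.0,              # g'(x) = 0
        h=lambda t: 1.0 + math.exp(-t),     # denominator h(x) = 1 + e^{-x}
        h_prime=lambda t: -math.exp(-t),    # h'(x) = -e^{-x}
        x=x,
    )

for x in (-2.0, 0.0, 1.5):
    s = sigmoid(x)
    assert math.isclose(sigmoid_prime_quotient(x), s * (1.0 - s), rel_tol=1e-9)

print(sigmoid_prime_quotient(0.0))
```

Both routes, chain rule and quotient rule, land on the same value, as the derivations show symbolically.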
# References
- [Data science: Neural networks: Deriving the sigmoid derivative via chain and quotient rules](https://hausetutorials.netlify.app/posts/2019-12-01-neural-networks-deriving-the-sigmoid-derivative/)