Deep Learning Fundamentals
Functions used in Andrew Ng's deep learning course
The sigmoid function
$$
\text{For } x \in \mathbb{R}^n \text{, } \text{sigmoid}(x) = \text{sigmoid}\begin{pmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{pmatrix} = \begin{pmatrix}
\frac{1}{1+e^{-x_1}} \\
\frac{1}{1+e^{-x_2}} \\
\vdots \\
\frac{1}{1+e^{-x_n}}
\end{pmatrix}\tag{1}
$$
Python implementation (a sketch following equation (1)):

```python
import numpy as np

def sigmoid(x):
    # Elementwise sigmoid, as in equation (1)
    return 1 / (1 + np.exp(-x))
```
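A quick check with the function above (the input vector here is just an illustration):

```python
print(sigmoid(np.array([1, 2, 3])))
# [0.73105858 0.88079708 0.95257413]
```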
Sigmoid gradient (derivative)
Writing s = sigmoid(x) and differentiating with respect to x gives ds/dx = s * (1 - s):
$$
\text{sigmoid\_derivative}(x) = \sigma'(x) = \sigma(x)\,(1 - \sigma(x))\tag{2}
$$
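This identity follows in one step from the chain rule:
$$
\sigma'(x) = \frac{d}{dx}\left(\frac{1}{1+e^{-x}}\right) = \frac{e^{-x}}{(1+e^{-x})^2} = \frac{1}{1+e^{-x}} \cdot \left(1 - \frac{1}{1+e^{-x}}\right) = \sigma(x)\,(1 - \sigma(x))
$$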
```python
def sigmoid_derivative(x):
    # Gradient of the sigmoid, using the identity in equation (2)
    s = sigmoid(x)
    return s * (1 - s)
```
The image2vector function
For example, in computer science, an image is represented by a 3D array of shape (length, height, depth = 3). However, when you read an image as the input of an algorithm, you convert it to a vector of shape (length × height × 3, 1). In other words, you "unroll", or reshape, the 3D array into a 1D vector.
Convert the image array into a single column vector:

```python
def image2vector(image):
    # Unroll a (length, height, depth) array into a (length*height*depth, 1) column vector
    return image.reshape(image.shape[0] * image.shape[1] * image.shape[2], 1)
```
Output (with an 18-element test image, e.g. a 3 × 3 × 2 array as in the course assignment):

```
v.shape: (18, 1)
image2vector(image):
[[0.67826139]
 [0.29380381]
 [0.90714982]
 [0.52835647]
 [0.4215251 ]
 [0.45017551]
 [0.92814219]
 [0.96677647]
 [0.85304703]
 [0.52351845]
 [0.19981397]
 [0.27417313]
 [0.60659855]
 [0.00533165]
 [0.10820313]
 [0.49978937]
 [0.34144279]
 [0.94630077]]
```

The same 18 values reshaped into a (1, 18) row vector instead:

```
[[0.67826139 0.29380381 0.90714982 0.52835647 0.4215251  0.45017551
  0.92814219 0.96677647 0.85304703 0.52351845 0.19981397 0.27417313
  0.60659855 0.00533165 0.10820313 0.49978937 0.34144279 0.94630077]]
```
Normalizing rows (normalizing row vectors to unit length)
Another common technique we use in Machine Learning and Deep Learning is to normalize our data. It often leads to better performance because gradient descent converges faster after normalization. Here, by normalization we mean changing x to x/∥x∥ (dividing each row vector of x by its norm):
$$
\frac{x}{\|x\|}
$$
For example, if
$$
x =
\begin{bmatrix}
0 & 3 & 4 \\
2 & 6 & 4
\end{bmatrix}\tag{3}
$$
then
$$
\|x\| = \texttt{np.linalg.norm(x, axis=1, keepdims=True)} = \begin{bmatrix}
5 \\
\sqrt{56}
\end{bmatrix}\tag{4}
$$
and
$$
x\_normalized = \frac{x}{\|x\|} = \begin{bmatrix}
0 & \frac{3}{5} & \frac{4}{5} \\
\frac{2}{\sqrt{56}} & \frac{6}{\sqrt{56}} & \frac{4}{\sqrt{56}}
\end{bmatrix}\tag{5}
$$
Note that you can divide matrices of different sizes and it works fine: this is called broadcasting and you’re going to learn about it in part 5.
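A minimal illustration of that broadcasting behavior, using the matrix from equation (3): dividing a (2, 3) array by its (2, 1) column of row norms stretches the column across each row.

```python
import numpy as np

x = np.array([[0., 3., 4.],
              [2., 6., 4.]])                      # shape (2, 3)
norms = np.linalg.norm(x, axis=1, keepdims=True)  # shape (2, 1): [[5.], [7.4833...]]
print(x / norms)                                  # broadcast: (2, 3) / (2, 1) -> (2, 3)
```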
Exercise: Implement normalizeRows() to normalize the rows of a matrix. After applying this function to an input matrix x, each row of x should be a vector of unit length (meaning length 1).
```python
# Normalize the row vectors of a matrix to unit length
def normalizeRows(x):
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)  # per-row L2 norms, shape (n, 1)
    return x / x_norm                                  # broadcasting: (n, m) / (n, 1)

# Test input reconstructed from the printed output below
x = np.array([[1., 2., 3.],
              [3., 4., 0.],
              [1., 1., 1.]])
print(normalizeRows(x))
```
Output:

```
[[0.26726124 0.53452248 0.80178373]
 [0.6        0.8        0.        ]
 [0.57735027 0.57735027 0.57735027]]
```
Implement the L1 and L2 loss functions
$$
L_1(\hat{y}, y) = \sum_{i=0}^{m}\left|y^{(i)} - \hat{y}^{(i)}\right|\tag{6}
$$
$$
L_2(\hat{y}, y) = \sum_{i=0}^{m}\left(y^{(i)} - \hat{y}^{(i)}\right)^2\tag{7}
$$
```python
def L1(yhat, y):
    # L1 loss, per equation (6)
    return np.sum(np.abs(y - yhat))

def L2(yhat, y):
    # L2 loss, per equation (7)
    return np.sum((y - yhat) ** 2)
```
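A quick check with the small test vectors used in the course assignment:

```python
yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])
print(L1(yhat, y))  # 1.1
print(L2(yhat, y))  # 0.43
```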
Some commonly used basic math
Further reading: basic mathematics for machine learning, and LaTeX math formula syntax.
Matrix product: dot
For 2-D arrays, np.dot(a, b) is just ordinary matrix multiplication ab.
```python
import numpy as np

# Inputs reconstructed from the printed output below
a = np.array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]])
print(np.dot(a, a))  # each entry is a row of a dotted with a column of a
```
Output:

```
[[ 6 12 18]
 [ 6 12 18]
 [ 6 12 18]]
```
Matrix outer product: outer
np.outer(x, y) first flattens both arguments, then multiplies every element of x by every element of y; for an m × n matrix x and a p × q matrix y, the result has shape (mn, pq):
$$
x \otimes y =
\begin{bmatrix}
x_{11} & \dots & x_{1n} \\
\vdots & \ddots & \vdots \\
x_{m1} & \dots & x_{mn}
\end{bmatrix}
\otimes
\begin{bmatrix}
y_{11} & \dots & y_{1q} \\
\vdots & \ddots & \vdots \\
y_{p1} & \dots & y_{pq}
\end{bmatrix}
=
\begin{bmatrix}
x_{11}y_{11} & \dots & x_{11}y_{1q} & \dots & x_{11}y_{pq} \\
\vdots & & \vdots & & \vdots \\
x_{mn}y_{11} & \dots & x_{mn}y_{1q} & \dots & x_{mn}y_{pq}
\end{bmatrix}\tag{8}
$$
```python
# Matrix outer product: np.outer
# Inputs reconstructed from the printed output below
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([1, 2, 1, 2, 1, 2])
print(np.outer(x, y))  # result[i, j] = x[i] * y[j]
```
Output:

```
[[ 1  2  1  2  1  2]
 [ 2  4  2  4  2  4]
 [ 3  6  3  6  3  6]
 [ 4  8  4  8  4  8]
 [ 5 10  5 10  5 10]
 [ 6 12  6 12  6 12]]
```
Elementwise product
The `*` operator multiplies arrays elementwise, broadcasting shapes where needed under NumPy's broadcasting rules:
```python
import numpy as np

# Inputs reconstructed from the printed output below;
# b has shape (3, 1) and is broadcast across the columns of a
a = np.array([[1, 2, 3],
              [2, 2, 2],
              [3, 3, 3]])
b = np.array([[1],
              [2],
              [3]])
print(a * b)
```
Output:

```
[[1 2 3]
 [4 4 4]
 [9 9 9]]
```
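For reference, `a * b` on NumPy arrays is the same elementwise product as `np.multiply(a, b)`; reusing the arrays above:

```python
print(np.array_equal(a * b, np.multiply(a, b)))  # True
```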