Definitions

Given two matrices $A=(a_{i,j})\in \mathbb{R}^{m\times n}$ and $B=(b_{i,j})\in \mathbb{R}^{m\times n}$:

1. Hadamard product (elementwise product)

The Hadamard product multiplies the corresponding entries of two matrices that have the same dimensions; the result has those dimensions as well.

$$A \circ B = \begin{bmatrix} a_{1,1}b_{1,1} & a_{1,2}b_{1,2} & \cdots & a_{1,n}b_{1,n} \\ a_{2,1}b_{2,1} & a_{2,2}b_{2,2} & \cdots & a_{2,n}b_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1}b_{m,1} & a_{m,2}b_{m,2} & \cdots & a_{m,n}b_{m,n} \end{bmatrix}$$
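As a small illustration of the definition above, the following sketch computes a Hadamard product with NumPy. The matrices `A` and `B` are made-up example values, not taken from the text.

```python
import numpy as np

# Two example matrices with the same shape (2 x 3); the values are arbitrary.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.array([[10.0, 20.0, 30.0],
              [40.0, 50.0, 60.0]])

# For NumPy arrays, `*` (equivalently np.multiply) multiplies entry by entry,
# which is exactly the Hadamard product when the shapes match.
hadamard = A * B
print(hadamard)
```

Note that `A @ B` would be the ordinary matrix product instead, and it is not even defined for these two shapes.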

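Note:
[Figure 1]
Jacques-Salomon Hadamard (born 8 December 1865 in Versailles, France; died 17 October 1963 in Paris) was a French mathematician. He proved the prime number theorem, which states that $\pi(n)$ is asymptotic to $\frac{n}{\ln n}$ as $n$ tends to infinity, where $\pi(n)$ is the number of primes not exceeding $n$.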

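Note:
The Chinese transliteration used here is 阿达马 rather than 哈达玛 because, according to online sources, the initial "h" is silent in French.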

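2. Kronecker product

The Kronecker product is the tensor product of two matrices, and the result is a block matrix. Here $B$ does not need to have the same shape as $A$: for $A\in\mathbb{R}^{m\times n}$ and $B\in\mathbb{R}^{p\times q}$, the result is an $mp\times nq$ matrix.

$$A \otimes B = \begin{bmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,n}B \\ a_{2,1}B & a_{2,2}B & \cdots & a_{2,n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1}B & a_{m,2}B & \cdots & a_{m,n}B \end{bmatrix}$$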
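The block structure can be seen in a short NumPy sketch using `np.kron`. The matrices below are arbitrary example values, chosen with different shapes on purpose.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])        # shape (2, 2)
B = np.array([[0, 1, 2]])     # shape (1, 3), deliberately different from A

# Block (i, j) of the result is A[i, j] * B,
# so the result has shape (2 * 1, 2 * 3) = (2, 6).
K = np.kron(A, B)
print(K.shape)
print(K)
```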

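Note:
[Figure 2]
Kronecker was a German mathematician who made notable contributions to algebra and algebraic number theory, and in particular to the theory of elliptic functions.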

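Summary:

  • Hadamard product: multiplies corresponding entries and requires the two matrices to have the same dimensions.
  • Kronecker product: produces a block matrix and does not require the two matrices to have the same dimensions.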

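Basic formulas

Let $\vec{a}, \vec{b}, \vec{c}, \vec{x}$ be $n$-dimensional vectors and let $A, B, C, X$ be $n\times n$ square matrices. Then the formulas below hold.

Note:
$\vec{a}, \vec{b}, \vec{c}$ are constant vectors
$\vec{x}$ is the variable vector
$A, B, C$ are constant matrices
$X$ is the variable matrix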

1. Basic formula for vector derivatives

$$\frac{\partial (\vec{a}^T \vec{x})}{\partial \vec{x}} = \frac{\partial (\vec{x}^T \vec{a})}{\partial \vec{x}} = \vec{a}$$
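One way to sanity-check the identity $\partial(\vec{a}^T\vec{x})/\partial\vec{x}=\vec{a}$ is to compare it with a finite-difference gradient. The sketch below is only an illustration: the helper `numerical_gradient`, the step size, and the random test vectors are assumptions introduced here, not part of the original text.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        grad.flat[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
a = rng.standard_normal(4)   # constant vector
x = rng.standard_normal(4)   # point at which the gradient is evaluated

# f(x) = a^T x, so the gradient should equal a regardless of x.
grad = numerical_gradient(lambda v: a @ v, x)
print(np.allclose(grad, a, atol=1e-6))
```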

2. Basic formulas for matrix derivatives

In the formulas below, $\vec{a} \otimes \vec{b}$ between two vectors denotes the outer product $\vec{a}\vec{b}^T$.

$$\frac{\partial (\vec{a}^T X \vec{b})}{\partial X} = \vec{a} \vec{b}^T = \vec{a} \otimes \vec{b} \in \mathbb{R}^{n \times n}$$

$$\frac{\partial (\vec{a}^T X^T \vec{b})}{\partial X} = \vec{b} \vec{a}^T = \vec{b} \otimes \vec{a} \in \mathbb{R}^{n \times n}$$

$$\frac{\partial (\vec{a}^T X \vec{a})}{\partial X} = \frac{\partial (\vec{a}^T X^T \vec{a})}{\partial X} = \vec{a} \otimes \vec{a}$$

$$\frac{\partial (\vec{a}^T X^T X \vec{b})}{\partial X} = X (\vec{a} \otimes \vec{b} + \vec{b} \otimes \vec{a})$$
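The matrix formulas can be checked numerically in the same spirit. The sketch below tests two of them, $\partial(\vec{a}^T X \vec{b})/\partial X=\vec{a}\vec{b}^T$ and $\partial(\vec{a}^T X^T X \vec{b})/\partial X=X(\vec{a}\vec{b}^T+\vec{b}\vec{a}^T)$, against central differences. The helper name, the random inputs, and the tolerance are illustrative assumptions.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference gradient of a scalar function f; same shape as X."""
    grad = np.zeros_like(X)
    for i in range(X.size):
        step = np.zeros_like(X)
        step.flat[i] = eps
        grad.flat[i] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(1)
n = 3
a, b = rng.standard_normal(n), rng.standard_normal(n)
X = rng.standard_normal((n, n))

# d(a^T X b)/dX, expected to equal the outer product a b^T
grad_bilinear = numerical_gradient(lambda M: a @ M @ b, X)
expected_bilinear = np.outer(a, b)

# d(a^T X^T X b)/dX, expected to equal X (a b^T + b a^T)
grad_gram = numerical_gradient(lambda M: a @ M.T @ M @ b, X)
expected_gram = X @ (np.outer(a, b) + np.outer(b, a))

print(np.allclose(grad_bilinear, expected_bilinear, atol=1e-5))
print(np.allclose(grad_gram, expected_gram, atol=1e-5))
```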

3. Derivatives of composite functions

$$\frac{\partial [(\mathbf{A} \vec{x} + \vec{a})^T \mathbf{C} (\mathbf{B} \vec{x} + \vec{b})]}{\partial \vec{x}} = \mathbf{A}^T \mathbf{C} (\mathbf{B} \vec{x} + \vec{b}) + \mathbf{B}^T \mathbf{C}^T (\mathbf{A} \vec{x} + \vec{a})$$

$$\frac{\partial (\vec{x}^T \mathbf{A} \vec{x})}{\partial \vec{x}} = (\mathbf{A} + \mathbf{A}^T) \vec{x}$$

$$\frac{\partial [(\mathbf{X} \vec{b} + \vec{c})^T \mathbf{A} (\mathbf{X} \vec{b} + \vec{c})]}{\partial \mathbf{X}} = (\mathbf{A} + \mathbf{A}^T)(\mathbf{X} \vec{b} + \vec{c}) \vec{b}^T$$

$$\frac{\partial (\vec{b}^T \mathbf{X}^T \mathbf{A} \mathbf{X} \vec{c})}{\partial \mathbf{X}} = \mathbf{A}^T \mathbf{X} \vec{b} \vec{c}^T + \mathbf{A} \mathbf{X} \vec{c} \vec{b}^T$$
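A finite-difference check is a convenient way to catch a misplaced transpose in formulas like these. The sketch below verifies $\partial[(A\vec{x}+\vec{a})^T C (B\vec{x}+\vec{b})]/\partial\vec{x} = A^T C (B\vec{x}+\vec{b}) + B^T C^T (A\vec{x}+\vec{a})$ and $\partial(\vec{x}^T A\vec{x})/\partial\vec{x}=(A+A^T)\vec{x}$. All names and random inputs are assumptions made for illustration.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        grad.flat[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(2)
n = 3
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))
a, b, x = (rng.standard_normal(n) for _ in range(3))

def bilinear(v):
    # (A v + a)^T C (B v + b)
    return (A @ v + a) @ C @ (B @ v + b)

def quadratic(v):
    # v^T A v
    return v @ A @ v

expected_bilinear = A.T @ C @ (B @ x + b) + B.T @ C.T @ (A @ x + a)
expected_quadratic = (A + A.T) @ x

grad_bilinear = numerical_gradient(bilinear, x)
grad_quadratic = numerical_gradient(quadratic, x)

print(np.allclose(grad_bilinear, expected_bilinear, atol=1e-5))
print(np.allclose(grad_quadratic, expected_quadratic, atol=1e-5))
```

Because `C` is a random non-symmetric matrix here, dropping the transpose on $C$ in the second term would make the first check fail.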

Elementwise operations

1. Elementwise functions

If $f$ is a function of one variable, then:

  • the elementwise vector function is:

$$f(\vec{x}) = (f(x_1),f(x_2),\cdots,f(x_n))^T$$

  • the elementwise matrix function is:

$$f(\mathbf{X}) = \begin{bmatrix} f(x_{1,1}) & f(x_{1,2}) & \cdots & f(x_{1,n}) \\ f(x_{2,1}) & f(x_{2,2}) & \cdots & f(x_{2,n}) \\ \vdots & \vdots & \ddots & \vdots \\ f(x_{m,1}) & f(x_{m,2}) & \cdots & f(x_{m,n}) \end{bmatrix}$$

Open question (needs verification):
Some online sources also call the matrix form an elementwise vector function:

$$f(\mathbf{X}) = \begin{bmatrix} f(x_{1,1}) & f(x_{1,2}) & \cdots & f(x_{1,n}) \\ f(x_{2,1}) & f(x_{2,2}) & \cdots & f(x_{2,n}) \\ \vdots & \vdots & \ddots & \vdots \\ f(x_{m,1}) & f(x_{m,2}) & \cdots & f(x_{m,n}) \end{bmatrix}$$

2. Elementwise derivatives

The elementwise derivatives of the vector and matrix forms are, respectively:

$$f'(\vec{x}) = (f'(x_1),f'(x_2),\cdots,f'(x_n))^T$$

$$f'(\mathbf{X}) = \begin{bmatrix} f'(x_{1,1}) & f'(x_{1,2}) & \cdots & f'(x_{1,n}) \\ f'(x_{2,1}) & f'(x_{2,2}) & \cdots & f'(x_{2,n}) \\ \vdots & \vdots & \ddots & \vdots \\ f'(x_{m,1}) & f'(x_{m,2}) & \cdots & f'(x_{m,n}) \end{bmatrix}$$

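Summary

Elementwise function: apply the single-variable function $f$ to every element of the vector $\vec{x}$ or the matrix $\mathbf{X}$.
Elementwise derivative: apply the derivative $f'$ of the single-variable function to every element of the vector $\vec{x}$ or the matrix $\mathbf{X}$.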
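NumPy's universal functions already behave this way, so an elementwise function and its elementwise derivative can be sketched in a few lines. The choice of $f=\tanh$ (with $f'(z)=1-\tanh^2 z$) and the sample inputs are assumptions made for illustration only.

```python
import numpy as np

def f(z):
    """Single-variable function, applied to every entry of z."""
    return np.tanh(z)

def f_prime(z):
    """Derivative of f, applied to every entry of z."""
    return 1.0 - np.tanh(z) ** 2

x = np.array([-1.0, 0.0, 2.0])            # a vector
X = np.array([[0.0, 0.5],
              [-0.5, 1.0],
              [2.0, -2.0]])               # a 3 x 2 matrix

# The outputs keep the shape of the input in both cases.
print(f(x).shape, f(X).shape)
print(f_prime(x).shape, f_prime(X).shape)

# Entry-by-entry finite differences agree with f_prime.
eps = 1e-6
numeric = (f(X + eps) - f(X - eps)) / (2 * eps)
print(np.allclose(f_prime(X), numeric, atol=1e-6))
```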

Partial derivatives

1. Partial derivative of a scalar with respect to a scalar

$$\frac{\partial u}{\partial v}$$

2. Partial derivative of a scalar with respect to a vector

$$\frac{\partial u}{\partial \mathbf{v}} = \left( \frac{\partial u}{\partial v_1}, \frac{\partial u}{\partial v_2}, \cdots, \frac{\partial u}{\partial v_n} \right)^T$$

3. Partial derivative of a scalar with respect to a matrix ($m\times n$)

$$\frac{\partial u}{\partial \mathbf{V}} = \begin{bmatrix} \frac{\partial u}{\partial V_{1,1}} & \frac{\partial u}{\partial V_{1,2}} & \cdots & \frac{\partial u}{\partial V_{1,n}} \\ \frac{\partial u}{\partial V_{2,1}} & \frac{\partial u}{\partial V_{2,2}} & \cdots & \frac{\partial u}{\partial V_{2,n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial u}{\partial V_{m,1}} & \frac{\partial u}{\partial V_{m,2}} & \cdots & \frac{\partial u}{\partial V_{m,n}} \end{bmatrix}$$

4. Partial derivative of a vector ($m$-dimensional) with respect to a scalar

$$\frac{\partial \mathbf{u}}{\partial v} = \left( \frac{\partial u_1}{\partial v}, \frac{\partial u_2}{\partial v}, \cdots, \frac{\partial u_m}{\partial v} \right)^T$$

5. Partial derivative of a vector ($m$-dimensional) with respect to a vector ($n$-dimensional) (the Jacobian matrix, row-first layout: row $i$ holds the partial derivatives of $u_i$)

$$\frac{\partial \mathbf{u}}{\partial \mathbf{v}} = \begin{bmatrix} \frac{\partial u_1}{\partial v_1} & \frac{\partial u_1}{\partial v_2} & \cdots & \frac{\partial u_1}{\partial v_n} \\ \frac{\partial u_2}{\partial v_1} & \frac{\partial u_2}{\partial v_2} & \cdots & \frac{\partial u_2}{\partial v_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial u_m}{\partial v_1} & \frac{\partial u_m}{\partial v_2} & \cdots & \frac{\partial u_m}{\partial v_n} \end{bmatrix}$$

For the column-first layout, take the transpose of the matrix above, which gives an $n\times m$ matrix.
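To make the two layouts concrete, the sketch below builds the $m\times n$ Jacobian of a small vector function by central differences and compares it with a closed form. The function $\mathbf{u}(\mathbf{v})=\tanh(W\mathbf{v})$, the matrix `W`, and the helper `numerical_jacobian` are hypothetical choices made for this example.

```python
import numpy as np

def numerical_jacobian(u, v, eps=1e-6):
    """Central-difference Jacobian with J[i, j] = d u_i / d v_j (shape m x n)."""
    m = u(v).size
    J = np.zeros((m, v.size))
    for j in range(v.size):
        step = np.zeros_like(v)
        step[j] = eps
        J[:, j] = (u(v + step) - u(v - step)) / (2 * eps)
    return J

W = np.array([[1.0, -2.0],
              [0.5, 0.0],
              [3.0, 1.0]])     # maps n = 2 inputs to m = 3 outputs
v = np.array([0.3, -0.7])

def u(v):
    return np.tanh(W @ v)

J = numerical_jacobian(u, v)                          # row-first layout, shape (3, 2)
J_exact = (1.0 - np.tanh(W @ v) ** 2)[:, None] * W    # diag(f'(W v)) W
J_column_first = J.T                                  # transposed layout, shape (2, 3)

print(J.shape, J_column_first.shape)
print(np.allclose(J, J_exact, atol=1e-6))
```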

6. Partial derivative of a matrix ($m\times n$) with respect to a scalar

$$\frac{\partial \mathbf{U}}{\partial v} = \begin{bmatrix} \frac{\partial U_{1,1}}{\partial v} & \frac{\partial U_{1,2}}{\partial v} & \cdots & \frac{\partial U_{1,n}}{\partial v} \\ \frac{\partial U_{2,1}}{\partial v} & \frac{\partial U_{2,2}}{\partial v} & \cdots & \frac{\partial U_{2,n}}{\partial v} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial U_{m,1}}{\partial v} & \frac{\partial U_{m,2}}{\partial v} & \cdots & \frac{\partial U_{m,n}}{\partial v} \end{bmatrix}$$

For the trace of a matrix, the following partial derivatives hold:

$$\frac{\partial \text{tr}(AXB)}{\partial X} = A^T B^T$$

$$\frac{\partial \text{tr}(AX^T B)}{\partial X} = B A$$

$$\frac{\partial \text{tr}(A \otimes X)}{\partial X} = \text{tr}(A)\, I$$

$$\frac{\partial \text{tr}(AXBX)}{\partial X} = A^T X^T B^T + B^T X^T A^T$$

$$\frac{\partial \text{tr}(X^T B X C)}{\partial X} = B X C + B^T X C^T$$

$$\frac{\partial \text{tr}(C^T X^T B X C)}{\partial X} = (B + B^T) X C C^T$$

$$\frac{\partial \text{tr}(AXBX^T C)}{\partial X} = A^T C^T X B^T + C A X B$$

$$\frac{\partial \text{tr}((AXB + C)(AXB + C)^T)}{\partial X} = 2 A^T (AXB + C) B^T$$

$$\frac{\partial \text{tr}(f(X))}{\partial X} = (f'(X))^T$$
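Trace identities are easy to get wrong by one transpose, so a numerical check is worthwhile. The sketch below compares central-difference gradients with $\partial\,\text{tr}(AXB)/\partial X = A^T B^T$ and $\partial\,\text{tr}(X^T B X C)/\partial X = BXC + B^T X C^T$. The helper, the random matrices, and the tolerance are assumptions introduced for this example.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference gradient of a scalar function f; same shape as X."""
    grad = np.zeros_like(X)
    for i in range(X.size):
        step = np.zeros_like(X)
        step.flat[i] = eps
        grad.flat[i] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(3)
n = 3
A, B, C, X = (rng.standard_normal((n, n)) for _ in range(4))

# d tr(A X B) / dX, expected to equal A^T B^T
grad_linear = numerical_gradient(lambda M: np.trace(A @ M @ B), X)
expected_linear = A.T @ B.T

# d tr(X^T B X C) / dX, expected to equal B X C + B^T X C^T
grad_quadratic = numerical_gradient(lambda M: np.trace(M.T @ B @ M @ C), X)
expected_quadratic = B @ X @ C + B.T @ X @ C.T

print(np.allclose(grad_linear, expected_linear, atol=1e-5))
print(np.allclose(grad_quadratic, expected_quadratic, atol=1e-5))
```

The same helper can be pointed at any of the other trace formulas in this list by swapping the lambda and the expected expression.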