from __future__ import print_function
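The exported cell kept only its first line; here is a minimal sketch of the full cell, inferred from the two output lines below (the exact print strings are assumptions):

```python
from __future__ import print_function
import torch
import torch.nn as nn
from torch.autograd import Variable

print('Import success.')
print('Torch Version:' + torch.__version__)
```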
Import success.
Torch Version:0.2.1+a4fc05a
LSTM - Parameters
- input_size – The number of expected features in the input x
- hidden_size – The number of features in the hidden state h
- num_layers – Number of recurrent layers.
- bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
- batch_first – If True, then the input and output tensors are provided as (batch, seq, feature)
- dropout – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer
- bidirectional – If True, becomes a bidirectional RNN. Default: False
input = Variable(torch.randn(4,3,5)) # (seq_len, batch, input_size)
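The rest of the cell was lost in export; here is a sketch consistent with the shapes printed below. `hidden_size=7` and `num_layers=2` are inferred from the `4x3x7` output and the `2x3x7` state tensors:

```python
rnn = nn.LSTM(input_size=5, hidden_size=7, num_layers=2)
input = Variable(torch.randn(4, 3, 5))  # (seq_len, batch, input_size)
h0 = Variable(torch.randn(2, 3, 7))     # (num_layers * num_directions, batch, hidden_size)
c0 = Variable(torch.randn(2, 3, 7))     # same shape as h0
output, (h_n, c_n) = rnn(input, (h0, c0))
print('Output', output)
print('h_n', h_n)
print('c_n', c_n)
```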
Output Variable containing:
(0 ,.,.) =
0.0798 -0.3524 -0.0318 -0.2497 0.1932 -0.0912 0.0585
0.0715 0.1636 0.2120 0.2547 -0.3115 0.4829 0.0978
0.1510 -0.4919 -0.0118 0.1087 -0.2944 -0.0191 0.4525
(1 ,.,.) =
0.0460 -0.1349 0.1419 -0.1101 0.0451 -0.1520 -0.0424
0.0307 0.0099 0.1939 0.1564 -0.2541 0.1459 -0.0038
0.1022 -0.2297 0.1253 0.0450 -0.1736 -0.0761 0.1345
(2 ,.,.) =
0.0382 -0.1091 0.1898 -0.0535 -0.0493 -0.1513 -0.1140
0.0181 -0.0784 0.1801 0.0833 -0.2366 0.0152 -0.1006
0.0564 -0.1635 0.1952 -0.0127 -0.1812 -0.0981 -0.0383
(3 ,.,.) =
0.0269 -0.1075 0.2085 -0.0134 -0.1187 -0.1357 -0.1500
0.0225 -0.1104 0.2012 0.0484 -0.2287 -0.0512 -0.1547
0.0267 -0.1547 0.1928 -0.0397 -0.1941 -0.0921 -0.1526
[torch.FloatTensor of size 4x3x7]
h_n Variable containing:
(0 ,.,.) =
0.1074 -0.0985 0.0460 0.0892 0.0026 -0.0203 0.0786
0.0937 -0.0193 -0.0566 0.1505 0.0223 0.1503 0.0570
-0.0040 0.1010 -0.2472 -0.0768 -0.0267 0.3387 0.0787
(1 ,.,.) =
0.0269 -0.1075 0.2085 -0.0134 -0.1187 -0.1357 -0.1500
0.0225 -0.1104 0.2012 0.0484 -0.2287 -0.0512 -0.1547
0.0267 -0.1547 0.1928 -0.0397 -0.1941 -0.0921 -0.1526
[torch.FloatTensor of size 2x3x7]
c_n Variable containing:
(0 ,.,.) =
0.2615 -0.3211 0.0973 0.2190 0.0056 -0.0338 0.1595
0.3154 -0.0683 -0.1427 0.2748 0.0584 0.3398 0.1123
-0.0157 0.3064 -0.3791 -0.1399 -0.0585 0.8162 0.1604
(1 ,.,.) =
0.0552 -0.2677 0.4853 -0.0205 -0.2206 -0.2210 -0.2661
0.0474 -0.2559 0.5212 0.0737 -0.4361 -0.0798 -0.2621
0.0541 -0.3830 0.5064 -0.0609 -0.3632 -0.1520 -0.2786
[torch.FloatTensor of size 2x3x7]
Bi-LSTM
Add the parameter `bidirectional=True`, and note that `num_directions` doubles, so `h_0` and `c_0` need shape `(num_layers * num_directions, batch, hidden_size)`; the other parameters are the same as for `LSTM`.
h0 = Variable(torch.randn(2*2,3,7)) # (num_layers * num_directions, batch, hidden_size)
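Again only one line of the cell survives; here is a sketch that matches the printed shapes (`4x3x14` output because the two directions are concatenated, `4x3x7` states because `num_layers * num_directions = 4`):

```python
rnn = nn.LSTM(input_size=5, hidden_size=7, num_layers=2, bidirectional=True)
input = Variable(torch.randn(4, 3, 5))     # (seq_len, batch, input_size)
h0 = Variable(torch.randn(2*2, 3, 7))      # (num_layers * num_directions, batch, hidden_size)
c0 = Variable(torch.randn(2*2, 3, 7))
output, (h_n, c_n) = rnn(input, (h0, c0))  # output is (4, 3, 7*2)
print('Output', output)
print('h_n', h_n)
print('c_n', c_n)
```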
Output Variable containing:
(0 ,.,.) =
Columns 0 to 8
-0.0328 0.1329 -0.1598 0.0962 -0.2130 0.2678 -0.0482 0.0515 -0.0290
-0.4693 -0.1502 -0.0117 0.0687 0.6282 -0.1565 0.7240 0.0644 -0.0035
-0.1117 0.3106 -0.0701 -0.0226 -0.5075 0.1602 -0.1437 0.0071 0.1246
Columns 9 to 13
0.0238 0.0820 -0.1239 -0.0197 0.0165
-0.2156 0.0808 0.0332 0.1923 -0.0358
-0.1882 -0.1621 0.0670 0.0959 0.0796
(1 ,.,.) =
Columns 0 to 8
-0.0134 0.0914 -0.1416 0.1550 -0.0034 0.1058 -0.0382 0.0279 -0.0681
-0.3385 -0.0606 -0.1338 -0.0166 0.3892 -0.1016 0.3074 0.1098 -0.0233
-0.1066 0.1902 -0.1043 0.0515 -0.2649 0.0779 -0.1027 0.0279 0.0465
Columns 9 to 13
0.0358 0.1217 -0.1425 -0.0235 0.0493
-0.2052 0.0850 0.0339 0.1756 -0.0640
-0.1521 -0.2092 0.0313 0.0631 0.0642
(2 ,.,.) =
Columns 0 to 8
0.0486 0.0285 -0.1354 0.1621 0.1405 0.0095 -0.0334 0.0275 -0.1737
-0.2761 -0.0289 -0.1539 -0.0745 0.3068 -0.1381 0.1749 0.1334 0.0352
-0.0790 0.1042 -0.1282 0.1465 -0.0645 -0.0118 -0.0119 0.0132 -0.0647
Columns 9 to 13
0.0710 0.2234 -0.1106 -0.0286 0.1901
-0.1466 0.1494 0.0337 0.2215 -0.1346
-0.1716 -0.2517 0.0648 0.0666 0.0637
(3 ,.,.) =
Columns 0 to 8
0.1408 -0.0332 -0.1137 0.1926 0.2436 -0.0803 -0.0155 0.0165 -0.3842
-0.2130 0.0105 -0.1439 -0.1528 0.2133 -0.1947 0.0306 0.3092 0.1739
-0.0390 0.0827 -0.1251 0.2153 0.0422 -0.0405 -0.0032 -0.0257 -0.0834
Columns 9 to 13
0.2120 0.5565 0.0042 -0.0354 0.4640
-0.0739 0.2532 -0.0068 0.3069 -0.2176
-0.1889 -0.3534 -0.0101 0.1968 -0.0465
[torch.FloatTensor of size 4x3x14]
h_n Variable containing:
(0 ,.,.) =
-0.0627 0.2550 0.3279 0.0315 -0.1899 -0.0517 -0.1582
-0.1523 0.0896 0.2832 -0.0463 -0.0302 0.0724 -0.0680
0.1372 0.2027 0.2818 0.0424 -0.2472 -0.0850 -0.2549
(1 ,.,.) =
-0.1606 -0.1595 -0.1187 0.0032 -0.0036 0.0083 -0.1326
0.1111 -0.1617 0.1805 -0.0115 -0.1544 0.0465 -0.1717
0.0162 0.1585 0.2094 -0.1939 -0.0532 0.0590 -0.1576
(2 ,.,.) =
0.1408 -0.0332 -0.1137 0.1926 0.2436 -0.0803 -0.0155
-0.2130 0.0105 -0.1439 -0.1528 0.2133 -0.1947 0.0306
-0.0390 0.0827 -0.1251 0.2153 0.0422 -0.0405 -0.0032
(3 ,.,.) =
0.0515 -0.0290 0.0238 0.0820 -0.1239 -0.0197 0.0165
0.0644 -0.0035 -0.2156 0.0808 0.0332 0.1923 -0.0358
0.0071 0.1246 -0.1882 -0.1621 0.0670 0.0959 0.0796
[torch.FloatTensor of size 4x3x7]
c_n Variable containing:
(0 ,.,.) =
-0.1302 0.6549 0.6786 0.3079 -0.5095 -0.1389 -0.4543
-0.2786 0.1790 0.5202 -0.1162 -0.0607 0.1905 -0.1051
0.3220 0.5443 0.7510 0.1576 -0.5975 -0.2758 -0.3876
(1 ,.,.) =
-0.3360 -0.2363 -0.2426 0.0050 -0.0094 0.0226 -0.4114
0.2075 -0.2975 0.3651 -0.0201 -0.4975 0.0950 -0.2769
0.0260 0.2856 0.3388 -0.3938 -0.2133 0.1107 -0.1942
(2 ,.,.) =
0.2412 -0.0912 -0.2701 0.4474 0.3928 -0.1122 -0.0346
-0.5209 0.0326 -0.2324 -0.2867 0.4848 -0.3340 0.0575
-0.0617 0.2354 -0.2384 0.4614 0.0843 -0.0706 -0.0074
(3 ,.,.) =
0.1120 -0.0555 0.0582 0.1377 -0.2276 -0.0279 0.0292
0.1428 -0.0056 -0.5134 0.1684 0.0564 0.3211 -0.0665
0.0190 0.2116 -0.4337 -0.3472 0.1136 0.1530 0.1594
[torch.FloatTensor of size 4x3x7]
LSTMCell
Show how a single cell works, step by step.
rnn = nn.LSTMCell(5, 8)
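A sketch of the step-by-step loop that would produce the four `3x8` hidden states below; the input shape `(4, 3, 5)` (four time steps, batch of 3) is inferred from the printouts:

```python
rnn = nn.LSTMCell(5, 8)                 # input_size=5, hidden_size=8
input = Variable(torch.randn(4, 3, 5))  # four time steps, batch of 3
hx = Variable(torch.randn(3, 8))        # (batch, hidden_size)
cx = Variable(torch.randn(3, 8))
for i in range(4):
    hx, cx = rnn(input[i], (hx, cx))    # feed one time step at a time
    print("H_%d's" % i, hx)
```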
H_0's Variable containing:
0.0161 0.0820 0.1264 0.5698 0.1555 0.1019 -0.6331 0.4666
-0.0466 -0.3108 -0.3889 -0.0909 0.0535 -0.0389 -0.0661 -0.1533
-0.0957 0.4769 -0.5832 0.2228 0.0177 -0.1251 -0.1822 -0.5942
[torch.FloatTensor of size 3x8]
H_1's Variable containing:
-0.0217 0.0698 0.1196 0.3380 0.1997 0.0103 -0.3124 0.1821
-0.0285 -0.1948 -0.2075 0.0533 -0.1547 0.2410 -0.1686 -0.1814
-0.0989 0.1679 -0.5986 0.3037 -0.0145 0.1371 -0.1285 -0.3058
[torch.FloatTensor of size 3x8]
H_2's Variable containing:
-0.1365 0.1022 0.0917 0.3858 0.1318 0.0406 -0.2220 0.1262
0.0806 -0.0453 -0.1894 0.1336 -0.0463 0.2549 -0.1588 -0.4024
0.0784 0.0149 -0.3712 0.2240 0.0573 -0.0053 -0.2268 -0.2647
[torch.FloatTensor of size 3x8]
H_3's Variable containing:
0.0085 0.0597 0.0405 0.2158 0.0499 0.0835 -0.3906 -0.0203
0.0866 0.0170 -0.1895 0.1637 -0.1015 0.3329 -0.2589 -0.3394
0.0964 -0.0598 -0.2832 0.2294 0.0313 0.1325 -0.1107 -0.3527
[torch.FloatTensor of size 3x8]
Dropout Layers
- Input: Any. Input can be of any shape
- Output: Same. Output is of the same shape as input
m = nn.Dropout(p=0.3)
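Only the constructor line survives. Here is a sketch consistent with the two tensors printed below; the second printout, where dropped units become a repeated constant rather than zero, looks like `nn.AlphaDropout` (that identification, and the `p=0.2`, are my assumptions):

```python
m = nn.Dropout(p=0.3)
input = Variable(torch.randn(5, 8))
print(m(input))   # dropped entries appear as 0.0000

# Assumed: AlphaDropout replaces dropped units with a fixed
# saturation value instead of zero (the repeated constant below)
a = nn.AlphaDropout(p=0.2)
print(a(Variable(torch.randn(5, 8))))
```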
Variable containing:
1.3842 -0.1793 1.5022 -1.2981 1.9774 -0.0000 2.1460 -0.0000
-2.2350 0.9031 -0.2413 0.0366 -3.1928 1.2902 -0.0000 0.0522
-0.0988 0.6401 0.8262 -0.4700 -0.0000 0.9144 0.0000 -0.6714
0.4723 0.0987 -0.1887 1.6187 0.6748 0.0000 -0.2696 2.3124
1.6831 0.1688 0.8709 1.0857 2.4044 0.0000 0.0000 1.5510
[torch.FloatTensor of size 5x8]
Variable containing:
0.7550 0.9882 -0.9701 -0.1685 1.1041 1.3049 -1.0595 0.3090
-1.9965 0.8511 -0.0932 -1.7224 -1.2648 -1.0595 0.3738 -1.0288
-0.3788 -2.1000 1.0952 -0.9518 0.1280 -1.3539 -1.0595 -0.3653
-1.4031 0.5019 -0.6636 -0.0361 -1.0595 0.8862 -0.1172 0.4230
-1.3436 -1.0563 1.4147 0.5900 -0.7027 -0.4553 1.6721 0.9620
[torch.FloatTensor of size 5x8]
Padding Layers
N_Batches x Channels x Height x Width
- $(N, C, H, W) \rightarrow (N, C, H_{out}, W_{out})$
- $H_{out} = H_{in} + \text{paddingTop} + \text{paddingBottom}$
- $W_{out} = W_{in} + \text{paddingLeft} + \text{paddingRight}$
# Only 4D and 5D padding is supported for now
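The cell body is lost; here is a sketch that reproduces the two printouts below, assuming `nn.ConstantPad2d` (padding 1 with value 2.3333, giving `5x6`) and `nn.ZeroPad2d` (padding `(1, 1, 2, 2)`, giving `7x6`):

```python
input = Variable(torch.randn(1, 2, 3, 4))  # (N, C, H, W)

pad_const = nn.ConstantPad2d(1, 2.3333)    # pad every side by 1 with 2.3333
print(pad_const(input))                    # -> (1, 2, 5, 6)
print('=' * 23)
pad_zero = nn.ZeroPad2d((1, 1, 2, 2))      # (left, right, top, bottom)
print(pad_zero(input))                     # -> (1, 2, 7, 6)
```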
Variable containing:
(0 ,0 ,.,.) =
2.3333 2.3333 2.3333 2.3333 2.3333 2.3333
2.3333 2.0598 0.5779 0.7410 -0.2043 2.3333
2.3333 2.0359 -1.6858 0.4359 0.3211 2.3333
2.3333 0.3481 0.5727 0.5786 -0.7968 2.3333
2.3333 2.3333 2.3333 2.3333 2.3333 2.3333
(0 ,1 ,.,.) =
2.3333 2.3333 2.3333 2.3333 2.3333 2.3333
2.3333 1.1789 -0.4450 0.4749 0.2136 2.3333
2.3333 1.2923 0.8678 1.6216 -0.1105 2.3333
2.3333 2.1250 0.8989 -0.2381 1.7026 2.3333
2.3333 2.3333 2.3333 2.3333 2.3333 2.3333
[torch.FloatTensor of size 1x2x5x6]
=======================
Variable containing:
(0 ,0 ,.,.) =
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 2.0598 0.5779 0.7410 -0.2043 0.0000
0.0000 2.0359 -1.6858 0.4359 0.3211 0.0000
0.0000 0.3481 0.5727 0.5786 -0.7968 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
(0 ,1 ,.,.) =
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 1.1789 -0.4450 0.4749 0.2136 0.0000
0.0000 1.2923 0.8678 1.6216 -0.1105 0.0000
0.0000 2.1250 0.8989 -0.2381 1.7026 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
[torch.FloatTensor of size 1x2x7x6]
Non-linear Activations
- nn.ReLU(inplace=False)
  - $\mathrm{ReLU}(x) = \max(0, x)$
- nn.Softmax()
  - $f_i(x) = \exp(x_i) / \sum_j \exp(x_j)$
- nn.Sigmoid()
  - $f(x) = 1 / (1 + \exp(-x))$
- nn.Tanh()
  - $f(x) = (\exp(x) - \exp(-x)) / (\exp(x) + \exp(-x))$
- nn.Threshold(threshold, value, inplace=False)
input = autograd.Variable(torch.randn(2, 3))
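The rest of the cell is lost; here is a sketch that matches the labelled printouts below:

```python
import torch
import torch.nn as nn
from torch import autograd

input = autograd.Variable(torch.randn(2, 3))
print(input)
relu = nn.ReLU()
print('ReLU', relu(input))        # negatives are clamped to zero
softmax = nn.Softmax()
print('SoftMax', softmax(input))  # each row sums to 1
```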
Variable containing:
0.2629 -0.5756 -0.4757
-0.2046 -0.1826 0.5311
[torch.FloatTensor of size 2x3]
ReLU Variable containing:
0.2629 0.0000 0.0000
0.0000 0.0000 0.5311
[torch.FloatTensor of size 2x3]
SoftMax Variable containing:
0.5235 0.2264 0.2501
0.2434 0.2488 0.5079
[torch.FloatTensor of size 2x3]
import torch
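The cell body after the import is lost. Judging from the output (a `FloatTensor` type, a `None` gradient, a random `20x5` Variable, and an all-zero `20x5` tensor), a possible reconstruction is below; the use of `nn.Threshold(2, 0)` to produce the zero tensor is purely a guess, motivated by the `nn.Threshold` entry in the list above:

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

x = Variable(torch.randn(20, 5), requires_grad=True)
print(type(x.data))  # <class 'torch.FloatTensor'>
print(x.grad)        # None: no backward pass has run yet
print(x)

# Guess: a threshold above every sample value zeroes the whole tensor;
# .data prints the raw tensor, without the "Variable containing:" header
t = nn.Threshold(2, 0)
print(t(x).data)
```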
<class 'torch.FloatTensor'>
None
Variable containing:
-1.3514 -0.1052 1.0056 0.2811 -0.3309
0.5037 0.9949 -1.6392 -0.4351 1.2254
0.2477 -0.0502 0.4510 0.7238 0.1114
0.4799 0.3167 -0.6135 -0.4998 0.2620
1.0254 0.7146 -0.5500 0.3868 0.1841
0.1149 -0.0351 -0.3343 -0.4571 0.3408
-0.2435 -0.3256 -0.8101 -1.4030 0.4093
0.8297 -0.1577 -1.8171 -0.7431 1.0062
0.0229 0.1829 -0.4641 -0.4319 0.2729
-1.2153 -1.2480 0.6714 -0.4719 -0.4976
-0.7302 -0.0150 0.6535 0.0073 -0.0176
0.1842 -0.8359 -0.1110 -0.3290 -0.2575
1.0419 1.0069 -2.1212 -1.4792 1.2291
1.1946 1.1317 0.0296 1.1031 0.2735
0.4553 0.2371 0.4601 0.9679 0.0660
-1.1472 -0.2064 1.0872 -0.3853 -0.5404
0.7875 0.4278 -0.1380 0.7322 0.0687
1.9909 1.2813 -1.6926 -0.0396 1.0175
0.8311 0.6657 -0.8842 -0.3210 0.5822
-0.1637 -0.3244 0.3780 0.1150 -0.2662
[torch.FloatTensor of size 20x5]
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
[torch.FloatTensor of size 20x5]
Combine layers with Sequential
# Example of using Sequential
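The rest of the cell is lost; here is a minimal sketch of `nn.Sequential` (the layer sizes are made up for illustration):

```python
# Example of using Sequential
model = nn.Sequential(
    nn.Linear(5, 10),   # layers run in the order they are listed
    nn.ReLU(),
    nn.Linear(10, 2),
    nn.Softmax(),
)
output = model(Variable(torch.randn(3, 5)))
```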
How to train a model with a GPU
An example of training an LR model on the GPU:
batch_cpu = Variable(torch.from_numpy(x[idx])).float()
batch = batch_cpu.cuda() # important: move the batch to the GPU
target_cpu = Variable(torch.from_numpy(y[idx])).float()
target = target_cpu.cuda() # important: move the target to the GPU
import matplotlib.pyplot as plt
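The training cell is lost apart from its first line. Here is a sketch of a GPU training loop that would produce output like the log below, built around the four lines quoted above; the data, model, and optimizer settings are assumptions (and a CUDA device is required):

```python
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable

# Hypothetical data: a noisy linear relation
x = np.random.rand(100, 1).astype('float32')
y = (3 * x + 2 + 0.2 * np.random.randn(100, 1)).astype('float32')

model = nn.Linear(1, 1).cuda()    # move the model itself to the GPU
print(model.parameters())         # prints a generator object, as below
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(101):
    idx = np.random.choice(100, 32)  # random mini-batch
    batch = Variable(torch.from_numpy(x[idx])).float().cuda()    # important: move to GPU
    target = Variable(torch.from_numpy(y[idx])).float().cuda()   # important: move to GPU

    optimizer.zero_grad()
    loss = criterion(model(batch), target)
    loss.backward()
    optimizer.step()
    if epoch % 10 == 0:
        print('Loss at epoch[%d]: %.3f' % (epoch, loss.data[0]))
```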
<generator object Module.parameters at 0x000000A21BAEB830>
Loss at epoch[0]: 3.325
Loss at epoch[10]: 2.538
Loss at epoch[20]: 2.621
Loss at epoch[30]: 2.538
Loss at epoch[40]: 2.584
Loss at epoch[50]: 1.451
Loss at epoch[60]: 0.588
Loss at epoch[70]: 3.073
Loss at epoch[80]: 2.032
Loss at epoch[90]: 0.864
Loss at epoch[100]: 0.594