#1. Chain Rule and Backpropagation

Suppose the data are given as follows.

  • ${\bf X} = \begin{bmatrix} 1 & 2.1 \\ 1 & 3.0 \end{bmatrix}$

  • ${\bf y} = \begin{bmatrix} 3.0 \\ 5.0 \end{bmatrix}$

Suppose the loss function is defined as follows.

$$loss={\bf v}^\top {\bf v}$$

Here ${\bf v}= {\bf y}-{\bf u}$ and ${\bf u}= {\bf X}{\bf W}$. Using backpropagation, find $\frac{\partial}{\partial {\bf W}}loss$ at ${\bf W} =\begin{bmatrix} 0.5 \\ 0.6 \end{bmatrix}$, and verify the result with PyTorch's backward(). That is, work through (1)-(6) below.

(1) Compute the forward pass with PyTorch, i.e., compute ${\bf u}$.

import torch
ones = torch.ones(2)
x = torch.tensor([2.1,3.0])
X = torch.vstack([ones,x]).T
y = torch.tensor([3.0,5.0])
W = torch.tensor([0.5,0.6]) 
u = X@W 
u
tensor([1.7600, 2.3000])
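
Since each row of ${\bf X}$ is $(1, x_i)$, the forward pass can also be checked by hand: each entry of ${\bf u}$ is $0.5 + 0.6\,x_i$. A quick plain-Python sanity check:

```python
# u = X @ W with rows (1, x_i) and W = (0.5, 0.6):
# each entry is 0.5 + 0.6 * x_i
u = [0.5 + 0.6 * x for x in (2.1, 3.0)]
# matches the tensor above: [1.76, 2.30]
```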

(2) Compute the error with PyTorch, i.e., compute ${\bf v}$.

v= y-u
v
tensor([1.2400, 2.7000])

(3) Compute the sum of squared errors with PyTorch, i.e., compute $loss={\bf v}^\top {\bf v}$.

loss=v.T@v
loss
tensor(8.8276)
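
The loss is just the sum of squared errors, so it can also be verified by hand from ${\bf v} = (1.24, 2.70)$:

```python
# loss = v'v = 1.24^2 + 2.70^2
loss = 1.24**2 + 2.70**2
# ≈ 8.8276, matching the tensor above
```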

(4) Compute $\frac{\partial}{\partial {\bf v}} loss$ analytically (i.e., derive the theoretical value), and verify it with PyTorch.

v
tensor([1.2400, 2.7000])
_v= torch.tensor([1.2400, 2.7000],requires_grad=True)
_loss = _v.T @ _v 
_loss.backward() 
_v.grad.data, 2*v 
(tensor([2.4800, 5.4000]), tensor([2.4800, 5.4000]))

(5) Compute the values of $\frac{\partial }{\partial {\bf u}}{\bf v}^\top$ and $\frac{\partial }{\partial {\bf W}}{\bf u}^\top$ analytically. (PyTorch verification is not required.)

A = torch.zeros((2,2))
for i in range(2): 
    _u = torch.tensor([2.1,3.0],requires_grad=True)
    _v = (y-_u)[i]
    _v.backward()
    A[:,i]= _u.grad.data
A
tensor([[-1., -0.],
        [-0., -1.]])
B = torch.zeros((2,2))
for i in range(2): 
    _W = torch.tensor([0.5,0.6],requires_grad=True)
    _u = (X@_W)[i]
    _u.backward()
    B[:,i]= _W.grad.data
B
tensor([[1.0000, 1.0000],
        [2.1000, 3.0000]])
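
As an aside, the column-by-column loops above can be collapsed with `torch.autograd.functional.jacobian`. Note that it returns $J_{ij} = \partial \text{out}_i / \partial \text{in}_j$, i.e., the transpose of the matrices A and B built above (A = -I is symmetric, so in practice only B comes out transposed). A minimal sketch:

```python
import torch
from torch.autograd.functional import jacobian

X = torch.vstack([torch.ones(2), torch.tensor([2.1, 3.0])]).T
y = torch.tensor([3.0, 5.0])

# Jacobian of v = y - u with respect to u: equals -I
Jv = jacobian(lambda u: y - u, torch.tensor([2.1, 3.0]))

# Jacobian of u = X @ W with respect to W: equals X (the transpose of B above)
Ju = jacobian(lambda W: X @ W, torch.tensor([0.5, 0.6]))
```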

(6) Using the results of (4)-(5) and the chain rule, compute $\frac{\partial}{\partial {\bf W}}loss$. Then verify it with the code below.

A @ B @ _v.grad.data  # chain rule; since A = -I, A@B equals B@A, so the order does not matter here
tensor([ -7.8800, -21.4080])
X.T@-torch.eye(2)@(2*v)
tensor([ -7.8800, -21.4080])
import torch
ones= torch.ones(2)
x = torch.tensor([2.1,3.0])
X = torch.vstack([ones,x]).T
y = torch.tensor([3.0,5.0])
W = torch.tensor([0.5,0.6],requires_grad=True) 
loss = (y-X@W).T @ (y-X@W)
loss.backward()
W.grad.data
tensor([ -7.8800, -21.4080])
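
Putting (4)-(5) together analytically: $\frac{\partial}{\partial {\bf W}}loss = \frac{\partial {\bf u}^\top}{\partial {\bf W}}\frac{\partial {\bf v}^\top}{\partial {\bf u}}\frac{\partial loss}{\partial {\bf v}} = {\bf X}^\top(-{\bf I})(2{\bf v}) = -2{\bf X}^\top({\bf y}-{\bf X}{\bf W})$. A direct check of this closed form:

```python
import torch

X = torch.vstack([torch.ones(2), torch.tensor([2.1, 3.0])]).T
y = torch.tensor([3.0, 5.0])
W = torch.tensor([0.5, 0.6])

# closed-form gradient: -2 X'(y - XW)
grad = -2 * X.T @ (y - X @ W)
# matches W.grad.data above: [-7.8800, -21.4080]
```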

#2. Beverage Recommendation

Below are data in which 200 users each tasted 10 kinds of iced coffee, hot coffee, iced black tea, and hot black tea, and entered a rating for each.

import pandas as pd 
import torch 
from fastai.collab import * 
from fastai.tabular.all import * 
df = pd.read_csv("https://raw.githubusercontent.com/guebin/2021BDA/master/_notebooks/2021-12-04-recommend.csv")
df
user item rating item_name
0 1 27 2.677878 차가운홍차7
1 1 28 2.382410 차가운홍차8
2 1 38 0.952034 따뜻한홍차8
3 1 21 2.359307 차가운홍차1
4 1 24 2.447412 차가운홍차4
... ... ... ... ...
3995 200 28 2.401077 차가운홍차8
3996 200 31 3.798483 따뜻한홍차1
3997 200 22 2.104705 차가운홍차2
3998 200 26 2.248165 차가운홍차6
3999 200 39 4.007320 따뜻한홍차9

4000 rows × 4 columns

(1) Create the user-item matrix.

An example of the expected output is shown below.

차가운커피1 차가운커피2 차가운커피3 차가운커피4 차가운커피5 차가운커피6 차가운커피7 차가운커피8 차가운커피9 차가운커피10 따듯한커피1 따듯한커피2 따듯한커피3 따듯한커피4 따듯한커피5 따듯한커피6 따듯한커피7 따듯한커피8 따듯한커피9 따듯한커피10 차가운홍차1 차가운홍차2 차가운홍차3 차가운홍차4 차가운홍차5 차가운홍차6 차가운홍차7 차가운홍차8 차가운홍차9 차가운홍차10 따뜻한홍차1 따뜻한홍차2 따뜻한홍차3 따뜻한홍차4 따뜻한홍차5 따뜻한홍차6 따뜻한홍차7 따뜻한홍차8 따뜻한홍차9 따뜻한홍차10
user1 None 3.937672 None 3.989888 4.133222 None None None None 4.015579 2.103387 2.361724 None 2.273406 2.295347 None None None 2.791477 None 2.359307 2.565654 None 2.447412 None None 2.677878 2.38241 2.194201 None None None None None 0.887225 1.014088 None 0.952034 0.658081 1.235058
user2 4.098147 4.094224 None 3.765555 None None 3.988153 None 4.349755 3.640496 None None 2.707521 2.765143 2.310812 2.458836 None None None 2.22282 2.621137 None 2.510424 None None None 2.788081 None 2.404252 2.908625 None 1.400812 None 0.654011 None 1.129268 None None 0.703928 None
user3 3.819119 None 4.228748 3.79414 None 4.08909 3.776395 None 4.583121 None None 2.7361 None 2.219188 None None None None 2.791662 None 2.729578 None None None None None None None 2.494008 2.440778 0.695669 None 0.840201 0.960158 None 1.019722 1.287193 1.354343 1.237186 0.985125
user4 4.243031 3.985556 4.3557 4.200771 None 4.068798 None None None 4.149567 None None 2.466804 None None 2.104525 2.341672 2.463411 2.56218 None None None 2.37737 2.37356 None 2.317104 2.5877 None None None 1.014652 None None None None None 1.09685 0.664659 1.148056 1.302336
user5 3.855109 None None None None 3.772252 4.18115 4.077935 None 3.905809 2.566041 2.412227 None None None 2.715758 None None 2.651073 None 2.454781 2.654822 2.382804 None None None 2.599824 None None None 0.851721 1.313315 None 1.093123 None 0.759305 1.336896 None 0.742396 1.064772
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
user196 0.788662 0.704273 0.776555 0.8481 None None None 0.686273 None None 2.164656 2.549222 2.614974 None None None None None 2.51912 2.355786 2.509917 2.382942 2.494133 None None None None None 2.457732 None 4.014754 4.184846 None 4.126758 None None 4.364885 None 3.767153 4.405117
user197 1.303235 1.43626 1.00433 None None None 1.486788 1.295232 None 0.920782 2.511827 None 2.361798 None 2.354619 None None None None 2.21937 2.401316 None None None None None 2.793289 None 2.464333 2.426258 4.253895 None None 4.369466 None 3.996908 3.853673 None 3.917286 4.57724
user198 1.251698 None 1.017147 None None None None None None 0.806444 None 2.520115 2.646957 None 2.952988 None None 2.190244 None None 2.282611 None 2.480411 2.663661 2.402259 None None 2.708267 2.109672 2.824608 4.380199 4.022162 None 3.895619 None 3.887536 None 3.862879 None 4.261574
user199 1.007993 None 0.955789 None 0.846838 None 0.58893 1.046728 None 1.139212 2.739859 2.459454 None None None 2.430707 None 2.413188 2.608065 None None 2.764538 2.389897 2.29379 None 2.428555 2.406729 2.507149 None None None 4.039527 None None 3.837071 4.103043 None None None None
user200 0.717826 None 1.23011 None 0.994098 None None None 1.14695 None None None None None None 2.487716 2.56307 None None 2.300041 2.552453 2.104705 2.862709 2.416833 None 2.248165 2.401267 2.401077 None 2.21877 3.798483 None 4.224537 None None 4.117838 None 3.920277 4.00732 None

df2 = pd.DataFrame([[None]*40]*200,columns=['차가운커피'+str(i) for i in range(1,11)]+['따뜻한커피'+str(i) for i in range(1,11)]+['차가운홍차'+str(i) for i in range(1,11)]+['따뜻한홍차'+str(i) for i in range(1,11)]) 
df2.index = pd.Index(['user'+str(i) for i in range(1,201)])
df2
차가운커피1 차가운커피2 차가운커피3 차가운커피4 차가운커피5 차가운커피6 차가운커피7 차가운커피8 차가운커피9 차가운커피10 ... 따뜻한홍차1 따뜻한홍차2 따뜻한홍차3 따뜻한홍차4 따뜻한홍차5 따뜻한홍차6 따뜻한홍차7 따뜻한홍차8 따뜻한홍차9 따뜻한홍차10
user1 None None None None None None None None None None ... None None None None None None None None None None
user2 None None None None None None None None None None ... None None None None None None None None None None
user3 None None None None None None None None None None ... None None None None None None None None None None
user4 None None None None None None None None None None ... None None None None None None None None None None
user5 None None None None None None None None None None ... None None None None None None None None None None
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
user196 None None None None None None None None None None ... None None None None None None None None None None
user197 None None None None None None None None None None ... None None None None None None None None None None
user198 None None None None None None None None None None ... None None None None None None None None None None
user199 None None None None None None None None None None ... None None None None None None None None None None
user200 None None None None None None None None None None ... None None None None None None None None None None

200 rows × 40 columns

for (i, j, r) in zip(df.user, df.item, df.rating):
    df2.iloc[i-1, j-1] = r
df2
차가운커피1 차가운커피2 차가운커피3 차가운커피4 차가운커피5 차가운커피6 차가운커피7 차가운커피8 차가운커피9 차가운커피10 ... 따뜻한홍차1 따뜻한홍차2 따뜻한홍차3 따뜻한홍차4 따뜻한홍차5 따뜻한홍차6 따뜻한홍차7 따뜻한홍차8 따뜻한홍차9 따뜻한홍차10
user1 None 3.937672 None 3.989888 4.133222 None None None None 4.015579 ... None None None None 0.887225 1.014088 None 0.952034 0.658081 1.235058
user2 4.098147 4.094224 None 3.765555 None None 3.988153 None 4.349755 3.640496 ... None 1.400812 None 0.654011 None 1.129268 None None 0.703928 None
user3 3.819119 None 4.228748 3.79414 None 4.08909 3.776395 None 4.583121 None ... 0.695669 None 0.840201 0.960158 None 1.019722 1.287193 1.354343 1.237186 0.985125
user4 4.243031 3.985556 4.3557 4.200771 None 4.068798 None None None 4.149567 ... 1.014652 None None None None None 1.09685 0.664659 1.148056 1.302336
user5 3.855109 None None None None 3.772252 4.18115 4.077935 None 3.905809 ... 0.851721 1.313315 None 1.093123 None 0.759305 1.336896 None 0.742396 1.064772
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
user196 0.788662 0.704273 0.776555 0.8481 None None None 0.686273 None None ... 4.014754 4.184846 None 4.126758 None None 4.364885 None 3.767153 4.405117
user197 1.303235 1.43626 1.00433 None None None 1.486788 1.295232 None 0.920782 ... 4.253895 None None 4.369466 None 3.996908 3.853673 None 3.917286 4.57724
user198 1.251698 None 1.017147 None None None None None None 0.806444 ... 4.380199 4.022162 None 3.895619 None 3.887536 None 3.862879 None 4.261574
user199 1.007993 None 0.955789 None 0.846838 None 0.58893 1.046728 None 1.139212 ... None 4.039527 None None 3.837071 4.103043 None None None None
user200 0.717826 None 1.23011 None 0.994098 None None None 1.14695 None ... 3.798483 None 4.224537 None None 4.117838 None 3.920277 4.00732 None

200 rows × 40 columns
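
As an aside, the fill loop above is what pandas `pivot` does in one call (the columns then come out in sorted order rather than the coffee/tea block order used here, and missing entries become NaN instead of None). A sketch on a toy frame with the same columns as df:

```python
import pandas as pd

# toy frame shaped like df: (user, item, rating, item_name)
toy = pd.DataFrame({
    'user':      [1, 1, 2],
    'item':      [1, 11, 1],
    'rating':    [4.0, 2.1, 3.5],
    'item_name': ['차가운커피1', '따뜻한커피1', '차가운커피1'],
})
# one row per user, one column per item; unrated cells become NaN
um = toy.pivot(index='user', columns='item_name', values='rating')
```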


(2) Look up the first user's ratings and describe this user's tastes. Does the user prefer coffee or black tea? Hot drinks or cold drinks?

df2.iloc[0,:]
차가운커피1         None
차가운커피2     3.937672
차가운커피3         None
차가운커피4     3.989888
차가운커피5     4.133222
차가운커피6         None
차가운커피7         None
차가운커피8         None
차가운커피9         None
차가운커피10    4.015579
따뜻한커피1     2.103387
따뜻한커피2     2.361724
따뜻한커피3         None
따뜻한커피4     2.273406
따뜻한커피5     2.295347
따뜻한커피6         None
따뜻한커피7         None
따뜻한커피8         None
따뜻한커피9     2.791477
따뜻한커피10        None
차가운홍차1     2.359307
차가운홍차2     2.565654
차가운홍차3         None
차가운홍차4     2.447412
차가운홍차5         None
차가운홍차6         None
차가운홍차7     2.677878
차가운홍차8      2.38241
차가운홍차9     2.194201
차가운홍차10        None
따뜻한홍차1         None
따뜻한홍차2         None
따뜻한홍차3         None
따뜻한홍차4         None
따뜻한홍차5     0.887225
따뜻한홍차6     1.014088
따뜻한홍차7         None
따뜻한홍차8     0.952034
따뜻한홍차9     0.658081
따뜻한홍차10    1.235058
Name: user1, dtype: object

User 1 rates the iced coffees highest (around 4), the hot coffees and iced teas in the middle (around 2-2.8), and the hot teas lowest (around 1): this user prefers coffee over black tea, and cold drinks over hot drinks.
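
The impression can be quantified by averaging user 1's observed ratings per beverage group (values copied from the output of (2)):

```python
# user 1's observed ratings from (2), grouped by beverage type
u1 = {
    'iced coffee': [3.937672, 3.989888, 4.133222, 4.015579],
    'hot coffee':  [2.103387, 2.361724, 2.273406, 2.295347, 2.791477],
    'iced tea':    [2.359307, 2.565654, 2.447412, 2.677878, 2.382410, 2.194201],
    'hot tea':     [0.887225, 1.014088, 0.952034, 0.658081, 1.235058],
}
means = {k: sum(v) / len(v) for k, v in u1.items()}
# iced coffee ≈ 4.02 > iced tea ≈ 2.44 > hot coffee ≈ 2.37 > hot tea ≈ 0.95
```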

(3) Train a recommendation model with fastai. (Using an nn is not required.)

dls = CollabDataLoaders.from_df(df,bs=200)
dls.items
user item rating item_name
2030 102 11 0.957028 따듯한커피1
2689 135 17 0.974055 따듯한커피7
1652 83 26 1.042701 차가운홍차6
1063 54 27 0.965986 차가운홍차7
1873 94 24 1.136094 차가운홍차4
... ... ... ... ...
3595 180 5 1.180172 차가운커피5
3965 199 22 2.764538 차가운홍차2
2215 111 31 2.377901 따뜻한홍차1
2920 147 24 3.935108 차가운홍차4
2593 130 1 2.314916 차가운커피1

3200 rows × 4 columns

lrnr = collab_learner(dls,n_factors=4,y_range=(0,5))
lrnr.fit(30,0.01)
epoch train_loss valid_loss time
0 1.187016 1.097175 00:00
1 1.132885 0.986906 00:00
2 0.994547 0.674259 00:00
3 0.773934 0.298967 00:00
4 0.552205 0.106625 00:00
5 0.390826 0.065410 00:00
6 0.283230 0.058494 00:00
7 0.209869 0.057090 00:00
8 0.158989 0.056235 00:00
9 0.123309 0.055232 00:00
10 0.098035 0.055365 00:00
11 0.079922 0.054929 00:00
12 0.067054 0.056076 00:00
13 0.057767 0.055572 00:00
14 0.051069 0.056253 00:00
15 0.046277 0.057279 00:00
16 0.042751 0.056172 00:00
17 0.040254 0.057400 00:00
18 0.038407 0.056913 00:00
19 0.037115 0.057196 00:00
20 0.036142 0.057445 00:00
21 0.035268 0.057364 00:00
22 0.034693 0.057553 00:00
23 0.034215 0.058029 00:00
24 0.033800 0.057356 00:00
25 0.033358 0.057974 00:00
26 0.032996 0.057489 00:00
27 0.032599 0.058639 00:00
28 0.032370 0.058777 00:00
29 0.032010 0.058919 00:00
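
For reference, the model `collab_learner` fits when `use_nn` is not set is (roughly) a dot product of user and item embeddings plus bias terms, squashed into `y_range` by a sigmoid. A minimal sketch of that scoring rule; the embedding values below are made up for illustration, not taken from the fitted model:

```python
import torch

def dot_bias_score(u_emb, i_emb, u_bias, i_bias, y_range=(0, 5)):
    """Predicted rating: (u . i + biases) squashed into y_range by a sigmoid."""
    raw = (u_emb * i_emb).sum() + u_bias + i_bias
    lo, hi = y_range
    return torch.sigmoid(raw) * (hi - lo) + lo

# illustrative 4-factor embeddings (n_factors=4, as in the learner above)
u = torch.tensor([0.5, -0.2, 0.1, 0.3])
i = torch.tensor([0.4, 0.1, -0.3, 0.2])
score = dot_bias_score(u, i, torch.tensor(0.1), torch.tensor(0.2))
# the sigmoid guarantees the prediction lands inside y_range
```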

(4) Using the recommender from (3), compute the fitted ratings of user 144 for all 40 beverages. What are user 144's tastes?

X,y = dls.one_batch()
x144 = torch.tensor([[144,j] for j in range(1,41) ])
lrnr.model(x144.to("cuda:0"))
tensor([2.5086, 2.4434, 2.3787, 2.3671, 2.3752, 2.3470, 2.5660, 2.4663, 2.3122,
        2.5410, 1.0356, 1.1298, 1.1237, 1.0032, 1.0872, 1.0394, 0.9993, 1.0948,
        0.9525, 1.0109, 3.9269, 3.7818, 3.8785, 3.8533, 3.9777, 3.8690, 3.8268,
        3.9350, 3.8300, 3.9017, 2.3461, 2.4903, 2.4714, 2.4518, 2.4859, 2.4408,
        2.4442, 2.4606, 2.4151, 2.5207], device='cuda:0',
       grad_fn=<AddBackward0>)

User 144 prefers cold drinks to hot ones, and black tea to coffee (the iced-tea block, items 21-30, scores highest at around 3.9).

(5) Compute the fitted ratings of all 200 users for 차가운커피1. Users in which range of numbers prefer iced coffee?

y200 = torch.tensor([[i,1] for i in range(1,201) ])
lrnr.model(y200.to("cuda:0"))
tensor([4.0307, 3.9342, 3.9372, 4.0616, 3.9890, 4.1917, 4.0492, 3.9227, 4.0169,
        4.0119, 3.8937, 4.0132, 3.9489, 3.9400, 4.0705, 3.8897, 3.9464, 4.0079,
        4.0021, 3.9821, 3.8712, 3.9630, 4.0100, 3.9272, 3.9372, 3.9122, 4.0236,
        3.9831, 3.9159, 4.0026, 3.8776, 4.1160, 4.0420, 3.9028, 3.8675, 3.9697,
        4.0159, 3.9916, 4.0962, 3.9461, 4.0571, 4.0299, 4.0056, 3.8416, 3.9038,
        4.0665, 4.0428, 3.9875, 3.8992, 3.9494, 2.3476, 2.3713, 2.4418, 2.4077,
        2.5568, 2.4489, 2.4081, 2.4093, 2.5145, 2.6321, 2.6104, 2.3875, 2.5014,
        2.5568, 2.3744, 2.5192, 2.5545, 2.4501, 2.5179, 2.6041, 2.5477, 2.5135,
        2.6937, 2.6739, 2.4849, 2.6132, 2.5985, 2.3677, 2.6616, 2.5425, 2.4946,
        2.3571, 2.4904, 2.5340, 2.3980, 2.5443, 2.5990, 2.7054, 2.6373, 2.8059,
        2.5896, 2.3616, 2.3839, 2.5014, 2.4420, 2.3765, 2.6235, 2.3284, 2.5282,
        2.4882, 2.5021, 2.6593, 2.4707, 2.5397, 2.5260, 2.5570, 2.7532, 2.8283,
        2.5673, 2.5643, 2.4518, 2.5750, 2.3708, 2.4444, 2.4195, 2.3766, 2.7532,
        2.7490, 2.3548, 2.4608, 2.1720, 2.6812, 2.4438, 2.6737, 2.5101, 2.6943,
        2.4171, 2.5449, 2.4094, 2.3245, 2.3912, 2.3495, 2.6789, 2.6022, 2.5185,
        2.4704, 2.4714, 2.6119, 2.6751, 2.4093, 2.6280, 2.5218, 2.4337, 2.5086,
        2.6370, 2.4458, 2.4652, 2.5439, 2.6842, 2.3492, 1.1037, 0.9217, 1.0008,
        1.0509, 1.1303, 1.0044, 0.9332, 1.0595, 1.0925, 0.9502, 1.0263, 0.9593,
        0.9552, 0.8511, 1.2146, 1.0233, 1.0399, 1.0169, 1.0579, 1.1242, 0.9761,
        0.9685, 1.1077, 0.8608, 1.0976, 1.0279, 1.0787, 1.0907, 1.0640, 0.9867,
        0.8604, 1.1144, 1.0012, 1.0196, 0.8719, 1.0122, 1.0339, 1.0094, 0.9931,
        1.0014, 1.0294, 0.9996, 1.0137, 0.9285, 1.0147, 0.8326, 1.1660, 1.0500,
        0.8367, 0.9081], device='cuda:0', grad_fn=<AddBackward0>)

The fitted ratings are around 4 for the first 50 entries and drop to around 2.3 from the 51st, so users 1 through 50 prefer iced coffee.
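
The cutoff can also be read off programmatically rather than by eye; a sketch on a toy tensor standing in for the fitted ratings above:

```python
import torch

# toy stand-in: the first three 'users' rate around 4, the rest lower
preds = torch.tensor([4.03, 3.93, 4.06, 2.35, 2.37, 1.10])

n_high = int((preds > 3).sum())                   # how many predictions exceed 3
last_high = int((preds > 3).nonzero().max()) + 1  # 1-based index of the last one
```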


#3. Movie Recommendation

Download the data and build df with the code below, then answer the questions.

path = untar_data(URLs.ML_100k) 
ratings=pd.read_csv(path/'u.data', delimiter='\t', header=None, names=['user','movie','rating','timestamp'])
movies = pd.read_csv(path/'u.item', delimiter='|', encoding='latin-1', usecols=(0,1), names=('movie','title'), header=None)
df = ratings.merge(movies)
df
user movie rating timestamp title
0 196 242 3 881250949 Kolya (1996)
1 63 242 3 875747190 Kolya (1996)
2 226 242 5 883888671 Kolya (1996)
3 154 242 3 879138235 Kolya (1996)
4 306 242 5 876503793 Kolya (1996)
... ... ... ... ... ...
99995 840 1674 4 891211682 Mamma Roma (1962)
99996 655 1640 3 888474646 Eighth Day, The (1996)
99997 655 1637 3 888984255 Girls Town (1996)
99998 655 1630 3 887428735 Silence of the Palace, The (Saimt el Qusur) (1994)
99999 655 1641 3 887427810 Dadetown (1995)

100000 rows × 5 columns

(1) Train a recommendation model with fastai. (Using an nn is not required.)

dls = CollabDataLoaders.from_df(df,bs=64,item_name='title') 
dls.show_batch()
user title rating
0 465 Big Sleep, The (1946) 3
1 535 Some Like It Hot (1959) 4
2 618 Birdcage, The (1996) 2
3 731 Miracle on 34th Street (1994) 1
4 894 Mystery Science Theater 3000: The Movie (1996) 1
5 30 Anaconda (1997) 3
6 551 While You Were Sleeping (1995) 2
7 197 Butch Cassidy and the Sundance Kid (1969) 5
8 816 Mimic (1997) 4
9 216 Shine (1996) 4
lrnr = collab_learner(dls, use_nn=True, y_range=(0,5), layers=[20,10]) 
lrnr.fit(8)
epoch train_loss valid_loss time
0 0.911590 0.926560 00:07
1 0.892155 0.892765 00:07
2 0.829315 0.878876 00:07
3 0.821508 0.872383 00:06
4 0.820626 0.873852 00:07
5 0.772432 0.879065 00:07
6 0.770707 0.882653 00:07
7 0.723726 0.900044 00:07
lrnr.show_results()
user title rating rating_pred
0 663 412 4 2.837234
1 943 1265 2 3.180738
2 883 1300 2 4.118921
3 125 864 3 2.721634
4 437 114 5 3.010027
5 404 333 4 3.980665
6 42 617 5 3.818271
7 89 1621 3 3.644624
8 648 842 1 3.735309

(2) Compute user 30's fitted ratings for the movies below.

1461    Terminator 2: Judgment Day (1991)
1462               Terminator, The (1984)
x,y = dls.one_batch()
lrnr.model(torch.tensor([[30,1461]]).to("cuda:0"))
tensor([[4.6221]], device='cuda:0', grad_fn=<AddBackward0>)
lrnr.model(torch.tensor([[30,1462]]).to("cuda:0"))
tensor([[4.3273]], device='cuda:0', grad_fn=<AddBackward0>)

#4. Read the following and answer each item with O/X.

(1) The phenomenon in which training loss keeps decreasing while validation loss increases as training proceeds is called the vanishing-gradient problem.

X

(2) Batch normalization is one way to mitigate the vanishing-gradient problem.

O

(3) Vanishing gradients occur more often in deep neural networks than in shallow ones.

O

(4) Backpropagation is one of the techniques for preventing overfitting.

X (backpropagation is an algorithm for computing gradients efficiently via the chain rule; it is not an overfitting-prevention technique such as dropout or regularization)

(5) If one only wants to compute the forward pass, there is no need to keep each layer's intermediate results in GPU memory.

O (intermediate activations are retained only so that backward() can reuse them; in forward-only inference, e.g. under torch.no_grad(), each layer's output can be discarded once the next layer has been computed)