GPUs are fast because they have many cores.
ref: https://ang-love-chang.tistory.com/33

A GPU has limits too: if the data exceed GPU memory, an error is raised and the code does not run.

X_gpu=X.to("cuda:0")
y_gpu=y.to("cuda:0")
# Last lecture's approach: all of X and y are placed on the GPU, which uses GPU memory inefficiently

Method 1. Draw only part of the data, fit the model on that part, and take one step; repeat.
```
net.to("cuda:0")
```
$\hat{y}=f(X\hat{w})$
Dimensions: $(n\times 1)=f\big((n\times p)\,(p\times 1)\big)$

We cannot change $p$, but we can change $n$ $\to$ so let's try drawing only some of the rows.

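for epoc in range(200):
  ## 1
  yhat_gpu=net(X_gpu) # idea: feed only a subset of the rows of X_gpu here
  ## 2
  loss= loss_fn(yhat_gpu,y_gpu) # the loss is then computed from the predictions for that subset
  ## 3
  loss.backward() # backpropagation works the same way
  ## 4
  optimizer.step()
  net.zero_grad() # repeat until every subset has been used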

Trying this approach in PyTorch is the topic of this lecture.

import

import torch
from fastai.vision.all import *

Dataset

X=torch.tensor([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y=torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0])

X,y

(tensor([3., 4., 5., 6., 7., 8., 9.]), tensor([1., 0., 1., 0., 1., 1., 0.]))

?torch.utils.data.TensorDataset

Init signature: torch.utils.data.TensorDataset(*args, **kwds)
Docstring:     
Dataset wrapping tensors.

Each sample will be retrieved by indexing tensors along the first dimension.

Args:
    *tensors (Tensor): tensors that have the same size of the first dimension.
File:           ~/anaconda3/envs/csy/lib/python3.8/site-packages/torch/utils/data/dataset.py
Type:           type
Subclasses:

ds=torch.utils.data.TensorDataset(X,y)

torch.utils.data reference: https://pytorch.org/docs/stable/data.html

ds ## simply a pair of tensors: X and y tied together as a pair

<torch.utils.data.dataset.TensorDataset at 0x7fb6c45324c0>

dir(ds)

['__add__',
 '__annotations__',
 '__class__',
 '__class_getitem__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattr__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__orig_bases__',
 '__parameters__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_is_protocol',
 'functions',
 'register_datapipe_as_function',
 'register_function',
 'tensors']

ds.tensors # the tensors stored inside ds

(tensor([3., 4., 5., 6., 7., 8., 9.]), tensor([1., 0., 1., 0., 1., 1., 0.]))
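
The `__len__` and `__getitem__` entries in the `dir(ds)` listing suggest that `ds` can be measured and indexed like a sequence. A minimal sketch, re-creating the toy tensors so the block runs on its own:

```python
import torch

# Same toy data as above, re-created so the block is self-contained.
X = torch.tensor([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
ds = torch.utils.data.TensorDataset(X, y)

n_samples = len(ds)   # number of rows along the first dimension
x0, y0 = ds[0]        # the i-th sample is the pair (X[i], y[i])
print(n_samples, x0, y0)
```

Indexing returns one (x, y) pair at a time, which is what a loader needs in order to assemble batches.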

DataLoader

- batch_size=2,shuffle=True

dl=torch.utils.data.DataLoader(ds,batch_size=2,shuffle=True) # batch_size: number of samples per batch, shuffle: whether to shuffle the data

dl

<torch.utils.data.dataloader.DataLoader at 0x7fb6c4532520>

dir(dl)

['_DataLoader__initialized',
 '_DataLoader__multiprocessing_context',
 '_IterableDataset_len_called',
 '__annotations__',
 '__class__',
 '__class_getitem__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__orig_bases__',
 '__parameters__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_auto_collation',
 '_dataset_kind',
 '_get_iterator',
 '_index_sampler',
 '_is_protocol',
 '_iterator',
 'batch_sampler',
 'batch_size',
 'check_worker_number_rationality',
 'collate_fn',
 'dataset',
 'drop_last',
 'generator',
 'multiprocessing_context',
 'num_workers',
 'persistent_workers',
 'pin_memory',
 'prefetch_factor',
 'sampler',
 'timeout',
 'worker_init_fn']

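dl appears to be able to produce batches.

Taking last week's 12396 observations as an example, dl would split them into batch 1, 2, 3, ....
Because dl is built from ds, X and y are still available inside it, as dl.dataset.tensors.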

for xx,yy in dl: 
    print(xx,yy)

tensor([5., 4.]) tensor([1., 0.])
tensor([3., 6.]) tensor([1., 0.])
tensor([9., 7.]) tensor([0., 1.])
tensor([8.]) tensor([1.])

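xx holds 2 values drawn at random from X (because batch_size is 2).
yy holds the 2 matching values from y (because batch_size is 2).
Drawing continues until nothing is left, so the last batch may be smaller than batch_size.
One thing to note: the batches change on every run $\to$ shuffle=True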
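
The behaviour seen above (every sample appears exactly once per pass, and the last batch may be short) can be mimicked with plain index arithmetic. This is a conceptual sketch only, not how `DataLoader` is actually implemented; `batch_indices` is a hypothetical helper.

```python
import random

def batch_indices(n, batch_size, shuffle, seed=None):
    """Split indices 0..n-1 into consecutive chunks of at most batch_size."""
    idx = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(idx)   # a permutation: no index repeats
    return [idx[i:i + batch_size] for i in range(0, n, batch_size)]

X = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
batches = batch_indices(len(X), batch_size=2, shuffle=True, seed=0)
for b in batches:
    print([X[i] for i in b])
```

With `shuffle=False` the same helper returns the indices in their original order, so the batches are identical on every run.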

- batch_size=2,shuffle=False

dl=torch.utils.data.DataLoader(ds,batch_size=2,shuffle=False)

for xx,yy in dl: 
    print(xx,yy)

tensor([3., 4.]) tensor([1., 0.])
tensor([5., 6.]) tensor([1., 0.])
tensor([7., 8.]) tensor([1., 1.])
tensor([9.]) tensor([0.])

The batches do not change between runs $\to$ shuffle=False
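
If shuffling is wanted but the order should still be repeatable, one option is to hand the loader a seeded `torch.Generator` through its `generator` argument (the attribute also shows up in the `dir(dl)` listing). A minimal sketch under that assumption; `one_pass` is a hypothetical helper:

```python
import torch

X = torch.tensor([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
ds = torch.utils.data.TensorDataset(X, y)

def one_pass(seed):
    # A dedicated, seeded generator fixes the shuffle order for this loader.
    g = torch.Generator().manual_seed(seed)
    dl = torch.utils.data.DataLoader(ds, batch_size=2, shuffle=True, generator=g)
    return [xx.tolist() for xx, yy in dl]

run_a = one_pass(seed=0)
run_b = one_pass(seed=0)
print(run_a == run_b)   # same seed, same shuffled batches
```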

- batch_size=3,shuffle=True

dl=torch.utils.data.DataLoader(ds,batch_size=3,shuffle=True)

for xx,yy in dl: 
    print(xx,yy)

tensor([4., 6., 9.]) tensor([0., 0., 0.])
tensor([8., 3., 7.]) tensor([1., 1., 1.])
tensor([5.]) tensor([1.])

Now 3 samples are drawn at a time.
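
The short final batch can be discarded with `drop_last=True` when equal-sized batches are preferred. A small sketch on the same toy data; `dl_keep` and `dl_drop` are illustrative names:

```python
import torch

X = torch.tensor([3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
y = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
ds = torch.utils.data.TensorDataset(X, y)

dl_keep = torch.utils.data.DataLoader(ds, batch_size=3, shuffle=True)
dl_drop = torch.utils.data.DataLoader(ds, batch_size=3, shuffle=True, drop_last=True)

print(len(dl_keep), len(dl_drop))         # batches per pass: 3 2
sizes = [len(xx) for xx, yy in dl_drop]   # every remaining batch has 3 samples
print(sizes)
```

The trade-off is that the dropped samples (here one of the seven) are not used in that pass.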

MNIST 3/7 example

- First, build X and y as tensors.

path = untar_data(URLs.MNIST_SAMPLE)

threes=(path/'train'/'3').ls()
sevens=(path/'train'/'7').ls()

seven_tensor = torch.stack([tensor(Image.open(i)) for i in sevens]).float()/255
three_tensor = torch.stack([tensor(Image.open(i)) for i in threes]).float()/255

seven_tensor.shape, three_tensor.shape

(torch.Size([6265, 28, 28]), torch.Size([6131, 28, 28]))

X=torch.vstack([seven_tensor,three_tensor]).reshape(12396,-1) 
y=torch.tensor([0.0]*6265 + [1.0]*6131).reshape(12396,1)

X.shape

torch.Size([12396, 784])

- Build the dataset=(X,y).

ds=torch.utils.data.TensorDataset(X,y)

ds.tensors

(tensor([[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]),
 tensor([[0.],
         [0.],
         [0.],
         ...,
         [1.],
         [1.],
         [1.]]))

- Build the dataloader.

dl=torch.utils.data.DataLoader(ds,batch_size=2048,shuffle=True) # note: the default is shuffle=False

dl.dataset.tensors

(tensor([[0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         ...,
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.],
         [0., 0., 0.,  ..., 0., 0., 0.]]),
 tensor([[0.],
         [0.],
         [0.],
         ...,
         [1.],
         [1.],
         [1.]]))

- Network (architecture), loss function, optimizer

torch.manual_seed(1)
net = torch.nn.Sequential(
    torch.nn.Linear(in_features=784,out_features=30),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=30,out_features=1)
    #torch.nn.Sigmoid()
)
loss_fn=torch.nn.BCEWithLogitsLoss()
optimizer=torch.optim.Adam(net.parameters())

- Review of last lecture

for epoc in range(200):
    ## 1
    yhat=net(X) 
    ## 2 
    loss= loss_fn(yhat,y) 
    ## 3 
    loss.backward()
    ## 4 
    optimizer.step()
    net.zero_grad()

plt.plot(yhat.data,'.')

[<matplotlib.lines.Line2D at 0x7fb6cc0ff610>]

f=torch.nn.Sigmoid() # the network outputs have not been passed through a sigmoid yet, so apply it here
plt.plot(f(yhat.data),'.')

[<matplotlib.lines.Line2D at 0x7fb6c57f0b50>]
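
The `Sigmoid` layer can stay commented out in the network because `BCEWithLogitsLoss` applies the sigmoid itself before computing the binary cross-entropy. A quick check on made-up numbers (the `logits` and `target` tensors are illustrative, not from the MNIST data):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(8, 1)                     # made-up raw network outputs
target = torch.randint(0, 2, (8, 1)).float()   # made-up 0/1 labels

loss_a = torch.nn.BCEWithLogitsLoss()(logits, target)
loss_b = torch.nn.BCELoss()(torch.sigmoid(logits), target)
print(loss_a.item(), loss_b.item())            # the two values should agree
```

This is also why the raw `yhat` must be passed through a sigmoid before it can be read as a probability.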

- Using mini-batches

torch.manual_seed(1)
net = torch.nn.Sequential(
    torch.nn.Linear(in_features=784,out_features=30),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=30,out_features=1)
    #torch.nn.Sigmoid()
)
loss_fn=torch.nn.BCEWithLogitsLoss()
optimizer=torch.optim.Adam(net.parameters())

Re-initialize the network parameters.

12396 / 2048

6.052734375

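There are 7 mini-batches in total $\to$ so the number of parameter updates is 7 $\times$ epoc (effectively about 6 $\times$ epoc, because the last mini-batch is much smaller than the others)

200/6 # to match the 200 updates of the full-batch run at about 6 updates per epoc, roughly 33 epocs are enough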

33.333333333333336
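
The batch count and the update count above can be checked with a few lines of arithmetic. A small sketch using only the numbers from this example:

```python
import math

n = 12396          # number of training images in this example
batch_size = 2048

n_batches = math.ceil(n / batch_size)          # mini-batches per epoc
last_size = n - (n_batches - 1) * batch_size   # size of the final mini-batch
updates_33 = 33 * n_batches                    # optimizer steps in 33 epocs

print(n_batches, last_size, updates_33)  # 7 108 231
```

Counting the small last batch as a full update, 33 epocs give 231 steps, slightly more than the 200 steps of the full-batch run.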

for epoc in range(33): 
    for xx,yy in dl:  ### this inner loop ends after 7 iterations
        ## 1 
        yyhat=net(xx)
        ## 2 
        loss= loss_fn(yyhat,yy) 
        ## 3 
        loss.backward()
        ## 4 
        optimizer.step()
        net.zero_grad()

plt.plot(yyhat.data,'.')

[<matplotlib.lines.Line2D at 0x7fb6c467e190>]

Why does the plot look like this?

- Let's check the batch sizes again.

for xx,yy in dl: 
    print(xx.shape,yy.shape)

torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([2048, 784]) torch.Size([2048, 1])
torch.Size([108, 784]) torch.Size([108, 1])

- The last mini-batch has 108 samples, so only 108 predicted values are plotted.
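
Only the last mini-batch shows up because `yyhat` is overwritten on every iteration. One way to get predictions for every row without a single full-size forward pass is to loop over an unshuffled loader and concatenate the pieces. A minimal sketch on stand-in data; `X_demo`, `net_demo`, `ds_eval` and `dl_eval` are hypothetical names, not objects from this notebook:

```python
import torch

torch.manual_seed(0)
X_demo = torch.rand(10, 4)          # stand-in data, not the MNIST tensors
net_demo = torch.nn.Linear(4, 1)    # stand-in for the trained net

ds_eval = torch.utils.data.TensorDataset(X_demo)
dl_eval = torch.utils.data.DataLoader(ds_eval, batch_size=4, shuffle=False)

with torch.no_grad():               # no gradients are needed for prediction
    yhat_all = torch.cat([net_demo(xx) for (xx,) in dl_eval])

print(yhat_all.shape)               # one prediction per row of X_demo
```

Keeping `shuffle=False` here matters: it keeps the concatenated predictions in the same row order as the data.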

list(net.parameters())

[Parameter containing:
 tensor([[ 0.0184, -0.0158, -0.0069,  ...,  0.0068, -0.0041,  0.0025],
         [-0.0274, -0.0224, -0.0309,  ..., -0.0029,  0.0013, -0.0167],
         [ 0.0282, -0.0095, -0.0340,  ..., -0.0141,  0.0056, -0.0335],
         ...,
         [ 0.0267,  0.0186, -0.0326,  ...,  0.0047, -0.0072, -0.0301],
         [-0.0190,  0.0291,  0.0221,  ...,  0.0067,  0.0206,  0.0151],
         [ 0.0226,  0.0331,  0.0182,  ...,  0.0150,  0.0278, -0.0073]],
        requires_grad=True),
 Parameter containing:
 tensor([-0.0326,  0.0119,  0.0150, -0.0099,  0.1880,  0.0172,  0.0290, -0.0646,
          0.0443,  0.0508, -0.0529,  0.0595, -0.0219,  0.1083,  0.0488, -0.0247,
          0.0175,  0.0209,  0.0698, -0.0100,  0.1001, -0.0113,  0.0594,  0.0775,
         -0.0458, -0.0667,  0.1188,  0.0233,  0.0782,  0.0732],
        requires_grad=True),
 Parameter containing:
 tensor([[ 0.2373,  0.2086,  0.2235,  0.1671, -0.3771, -0.0725, -0.1732,  0.1348,
          -0.2618, -0.2588,  0.1584, -0.1948,  0.1277, -0.1204, -0.5405,  0.1574,
           0.2048,  0.2636, -0.1551,  0.2251, -0.3237,  0.2152, -0.1976, -0.1840,
           0.1584,  0.2033, -0.1255,  0.1504, -0.1768, -0.1693]],
        requires_grad=True),
 Parameter containing:
 tensor([-0.1088], requires_grad=True)]

plt.plot(net(X).data,'.')

[<matplotlib.lines.Line2D at 0x7fb6c463c160>]

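- Training on mini-batches of about 2048 samples gives nearly the same result as full-batch training for the same number of updates.

- We do not have to send all 12396 observations to GPU memory at once $\to$ how much GPU memory to buy does not depend on the size of the data.

- The values stored in net.parameters() must all go to the GPU as they are $\to$ how much GPU memory to buy depends on the complexity of the model.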
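
The notebook trains on the CPU, so the point about GPU memory is not shown in code. One way it could look: keep the data on the CPU and move only the current mini-batch to the device. This is a sketch on stand-in data with a CPU fallback; `X_demo`, `y_demo`, `dl_demo` and `net_demo` are hypothetical names.

```python
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"   # CPU fallback

torch.manual_seed(1)
X_demo = torch.rand(500, 784)                       # stand-in data, kept on the CPU
y_demo = (X_demo.mean(dim=1, keepdim=True) > 0.5).float()
dl_demo = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X_demo, y_demo), batch_size=128, shuffle=True)

net_demo = torch.nn.Sequential(
    torch.nn.Linear(784, 30), torch.nn.ReLU(), torch.nn.Linear(30, 1)).to(device)
loss_fn = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(net_demo.parameters())

for epoc in range(2):
    for xx, yy in dl_demo:
        xx, yy = xx.to(device), yy.to(device)       # only this mini-batch moves
        loss = loss_fn(net_demo(xx), yy)
        loss.backward()
        optimizer.step()
        net_demo.zero_grad()

print(X_demo.device, next(net_demo.parameters()).device)
```

The parameters live on the device for the whole run, while at most one mini-batch of data is there at any time.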

How to buy a computer

Memory (RAM): the larger $n$ is, the more memory you need.
GPU memory: the more complex the model is, the more GPU memory you need.
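
A rough back-of-the-envelope check of these two rules, assuming float32 storage (4 bytes per value) and the 784-30-1 network used in this notebook. It deliberately ignores gradients, optimizer state and intermediate activations, which also take GPU memory in practice.

```python
BYTES_PER_FLOAT32 = 4

n, p = 12396, 784      # rows and columns of X in the MNIST 3/7 example
batch_size = 2048

full_data_mb = n * p * BYTES_PER_FLOAT32 / 1e6
one_batch_mb = batch_size * p * BYTES_PER_FLOAT32 / 1e6

# Parameters of Linear(784, 30) -> ReLU -> Linear(30, 1): weights plus biases.
n_params = (784 * 30 + 30) + (30 * 1 + 1)
params_mb = n_params * BYTES_PER_FLOAT32 / 1e6

print(round(full_data_mb, 1), round(one_batch_mb, 1), round(params_mb, 3))
```

The data term grows with $n$ only if everything is sent at once; with mini-batches it is capped by batch_size, while the parameter term depends only on the model.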

Homework

- Change to batch_size=1024, train again, and observe the result.

ds=torch.utils.data.TensorDataset(X,y)

dl=torch.utils.data.DataLoader(ds,batch_size=1024,shuffle=True) # note: the default is shuffle=False

torch.manual_seed(1)
net = torch.nn.Sequential(
    torch.nn.Linear(in_features=784,out_features=30),
    torch.nn.ReLU(),
    torch.nn.Linear(in_features=30,out_features=1)
    #torch.nn.Sigmoid()
)
loss_fn=torch.nn.BCEWithLogitsLoss()
optimizer=torch.optim.Adam(net.parameters())

Re-initialize the network parameters.

12396 / 1024

12.10546875

200*12

2400

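There are 13 mini-batches in total (12 full ones plus a smaller final one) $\to$ so the number of parameter updates is 13 $\times$ epoc (effectively about 12 $\times$ epoc)

200/11 # rough epoc count for about 200 updates (with 12 full mini-batches per epoc, 200/12 is about 16.7)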

18.181818181818183

for epoc in range(18): 
    for xx,yy in dl:
        ## 1 
        yyhat=net(xx)
        ## 2 
        loss= loss_fn(yyhat,yy) 
        ## 3 
        loss.backward()
        ## 4 
        optimizer.step()
        net.zero_grad()

plt.plot(yyhat.data,'.')

[<matplotlib.lines.Line2D at 0x7fb6c45ca730>]

- Let's check the batch sizes again.

for xx,yy in dl: 
    print(xx.shape,yy.shape)

torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([1024, 784]) torch.Size([1024, 1])
torch.Size([108, 784]) torch.Size([108, 1])

- The last mini-batch has 108 samples, so only 108 predicted values are plotted.

list(net.parameters())

[Parameter containing:
 tensor([[ 0.0184, -0.0158, -0.0069,  ...,  0.0068, -0.0041,  0.0025],
         [-0.0274, -0.0224, -0.0309,  ..., -0.0029,  0.0013, -0.0167],
         [ 0.0282, -0.0095, -0.0340,  ..., -0.0141,  0.0056, -0.0335],
         ...,
         [ 0.0267,  0.0186, -0.0326,  ...,  0.0047, -0.0072, -0.0301],
         [-0.0190,  0.0291,  0.0221,  ...,  0.0067,  0.0206,  0.0151],
         [ 0.0226,  0.0331,  0.0182,  ...,  0.0150,  0.0278, -0.0073]],
        requires_grad=True),
 Parameter containing:
 tensor([-0.0137,  0.0255,  0.0345, -0.0102,  0.1220,  0.0174,  0.0154, -0.0269,
          0.0345,  0.0388, -0.0185,  0.0470,  0.0204,  0.0745, -0.0060, -0.0004,
          0.0337,  0.0344,  0.0543,  0.0134,  0.0929,  0.0122,  0.0472,  0.0674,
         -0.0052, -0.0322,  0.0998,  0.0415,  0.0636,  0.0550],
        requires_grad=True),
 Parameter containing:
 tensor([[ 0.2195,  0.1956,  0.2072,  0.1674, -0.2437, -0.0725, -0.1636,  0.1094,
          -0.2539, -0.2455,  0.1312, -0.1835,  0.1062, -0.0972, -0.1502,  0.1302,
           0.1897,  0.2523, -0.1412,  0.2096, -0.3102,  0.1957, -0.1872, -0.1739,
           0.1200,  0.1664, -0.1148,  0.1343, -0.1647, -0.1526]],
        requires_grad=True),
 Parameter containing:
 tensor([-0.0983], requires_grad=True)]

plt.plot(net(X).data,'.')

[<matplotlib.lines.Line2D at 0x7fb6c4518ac0>]