Specifying a GPU in PyTorch

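While writing a CNN in PyTorch, I found that the program appeared to hang as soon as it started: CPU usage sat at 100%, and nvidia-smi showed that the GPU was not being used at all. The fix was to move the model, the input data, and the labels onto the GPU.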

In PyTorch, a Tensor or a Module can be placed on a GPU, and you can choose which GPU it goes on. The method is simple: call the .cuda() method of the Tensor class or the Module class.
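The session below uses .cuda() on an older PyTorch release. As a minimal sketch for PyTorch 0.4 and later, the same placement can be written with torch.device and .to(), which also allows a CPU fallback. The variable names here are illustrative only. Note that for a Tensor, both .cuda() and .to() return a new tensor and leave the original where it was, so the result has to be assigned.

```python
import torch

# Use the first GPU when CUDA is available, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.zeros(3, 5)  # created on the CPU by default
b = a.to(device)       # a copy on `device` (or `a` itself if it is already there)

print(a.device, b.device)
```

Writing the device once and passing it around keeps the rest of the script identical on machines with and without a GPU.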

import torch
from PIL import Image
import torch.nn as nn
import numpy as np
from torch.autograd import Variable

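# First, check whether a GPU is available
torch.cuda.is_available()
Out[16]: True
# A GPU is available, so we can target it. Start by creating a Tensor
# (torch.Tensor(3,5) is uninitialized, so the values shown are arbitrary)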
a = torch.Tensor(3,5)
a
Out[13]: 
.00000e-05 *
 0.0000 0.0000 2.0419 0.0000 2.0420
 0.0000 0.0000 0.0000 0.0000 0.0000
 0.0132 0.0000 0.0131 0.0000 0.0000
[torch.FloatTensor of size 3x5]
a.cuda()
Out[14]: 
.00000e-05 *
 0.0000 0.0000 2.0419 0.0000 2.0420
 0.0000 0.0000 0.0000 0.0000 0.0000
 0.0132 0.0000 0.0131 0.0000 0.0000
[torch.cuda.FloatTensor of size 3x5 (GPU 0)]
# The output shows (GPU 0), which means this Tensor is on the first GPU
a.cuda(1)
Traceback (most recent call last):

 File "<ipython-input-15-ef42531f63ca>", line 1, in <module>
  a.cuda(1)

 File "/home/chia/anaconda2/lib/python2.7/site-packages/torch/_utils.py", line 57, in _cuda
  with torch.cuda.device(device):

 File "/home/chia/anaconda2/lib/python2.7/site-packages/torch/cuda/__init__.py", line 127, in __enter__
  torch._C._cuda_setDevice(self.idx)

RuntimeError: cuda runtime error (10) : invalid device ordinal at torch/csrc/cuda/Module.cpp:84
# This call fails because the machine has only one GPU, so cuda(1) refers to a device that does not exist.
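One way to avoid the "invalid device ordinal" error is to check how many GPUs are visible before choosing an index. The following is only a sketch: pick_device is a hypothetical helper, not part of PyTorch, and falling back to GPU 0 or the CPU is an assumption about what the caller wants.

```python
import torch

def pick_device(index=0):
    """Return a torch.device for GPU `index`, falling back when it does not exist."""
    n_gpus = torch.cuda.device_count()  # 0 on a machine without usable CUDA
    if n_gpus == 0:
        return torch.device("cpu")
    if index >= n_gpus:
        index = 0  # the requested GPU is missing, so fall back to the first one
    return torch.device("cuda", index)

# On a single-GPU machine this selects cuda:0 instead of raising an error.
print(pick_device(1))
```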

In the same way, a Variable or a Module (a model) can be placed on a specific GPU.

v = Variable(a)

v
Out[18]: 
Variable containing:
.00000e-05 *
 0.0000 0.0000 2.0419 0.0000 2.0420
 0.0000 0.0000 0.0000 0.0000 0.0000
 0.0132 0.0000 0.0131 0.0000 0.0000
[torch.FloatTensor of size 3x5]

v.cuda(0)
Out[19]: 
Variable containing:
.00000e-05 *
 0.0000 0.0000 2.0419 0.0000 2.0420
 0.0000 0.0000 0.0000 0.0000 0.0000
 0.0132 0.0000 0.0131 0.0000 0.0000
[torch.cuda.FloatTensor of size 3x5 (GPU 0)]

model = DenoiseCNN()

model
Out[22]: 
DenoiseCNN (
 (hid_layer): Sequential (
  (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
  (2): LeakyReLU (0.2)
 )
 (first_layer): Sequential (
  (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): LeakyReLU (0.2)
 )
 (last_layer): Sequential (
  (0): Conv2d(32, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
 )
)

model.cuda(0)
Out[23]: 
DenoiseCNN (
 (hid_layer): Sequential (
  (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
  (2): LeakyReLU (0.2)
 )
 (first_layer): Sequential (
  (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): LeakyReLU (0.2)
 )
 (last_layer): Sequential (
  (0): Conv2d(32, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
 )
)
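The definition of DenoiseCNN is not shown in this post. The sketch below is a hypothetical reconstruction based only on the printed layer summary; in particular, the order in which forward() applies the layers is an assumption.

```python
import torch
import torch.nn as nn

class DenoiseCNN(nn.Module):
    # Hypothetical reconstruction from the printed summary above.
    def __init__(self):
        super(DenoiseCNN, self).__init__()
        self.hid_layer = nn.Sequential(
            nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(32),
            nn.LeakyReLU(0.2),
        )
        self.first_layer = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1),
            nn.LeakyReLU(0.2),
        )
        self.last_layer = nn.Sequential(
            nn.Conv2d(32, 1, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, x):
        # Assumed ordering; the original forward pass is not shown.
        x = self.first_layer(x)
        x = self.hid_layer(x)
        return self.last_layer(x)

model = DenoiseCNN()
out = model(torch.zeros(2, 1, 8, 8))  # batch of two 8x8 single-channel images
print(out.shape)
```

Because every convolution uses a 3x3 kernel with padding 1, the output keeps the spatial size of the input.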

The printed output does not show any change in the Module, so let's check where the Module's parameters live.

# Print only the first two parameters (a weight and its bias)
for i, para in enumerate(model.parameters()):
  if i < 2:
    print(para)

Parameter containing:
(0 ,0 ,.,.) = 
 -3.1792e-02 -4.6396e-02 -4.3472e-02
 3.4903e-02 1.8558e-02 5.3955e-03
 2.4986e-02 3.8061e-02 -1.6658e-02

(0 ,1 ,.,.) = 
 -3.5041e-02 -3.6286e-02 -3.0819e-02
 1.0683e-02 9.0773e-03 -2.5379e-02
 2.9508e-03 2.8774e-02 7.4632e-04

(0 ,2 ,.,.) = 
 -4.6986e-02 -5.1183e-02 8.4346e-04
 -6.6864e-03 -2.8816e-02 1.2566e-02
 2.1682e-02 2.5485e-02 -7.2600e-03
  ...

(0 ,29,.,.) = 
 -5.5289e-03 -2.6012e-02 -2.7771e-02
 2.7528e-02 3.0460e-02 -1.2829e-02
 7.3387e-03 5.2633e-02 5.0601e-02

(0 ,30,.,.) = 
 -3.5881e-02 9.7000e-03 -3.3692e-02
 1.6257e-03 -4.0113e-02 3.5300e-02
 -2.1399e-03 3.0934e-02 -2.7513e-02

(0 ,31,.,.) = 
 -2.7492e-02 2.5803e-02 5.2171e-02
 -2.4082e-02 3.1887e-02 1.1292e-02
 5.8893e-02 -3.5452e-02 -1.2115e-02
   ⋮ 

(1 ,0 ,.,.) = 
 5.0664e-02 -4.1085e-02 2.9089e-02
 2.1555e-02 5.7176e-02 -7.5013e-03
 3.5075e-02 -1.6610e-02 3.4904e-02

(1 ,1 ,.,.) = 
 4.6716e-02 -1.2552e-02 -3.8132e-02
 -2.9573e-02 -3.5008e-02 -4.2891e-02
 9.5230e-03 -4.8599e-02 2.5357e-02

(1 ,2 ,.,.) = 
 -1.7859e-02 1.3442e-02 1.9493e-02
 1.8434e-02 1.4884e-03 8.6479e-03
 -7.1610e-03 3.5724e-02 6.2249e-03
  ...

(1 ,29,.,.) = 
 -3.3194e-02 1.6803e-05 2.3405e-02
 -5.2223e-02 6.5680e-03 -1.8427e-02
 -1.4476e-02 -1.5434e-02 -2.3108e-02

(1 ,30,.,.) = 
 2.3479e-02 1.2840e-02 4.5949e-02
 4.4833e-02 4.9272e-02 -3.7634e-02
 4.2787e-02 8.5841e-04 1.2332e-02

(1 ,31,.,.) = 
 4.1723e-02 -2.5295e-02 1.1326e-02
 -5.1707e-02 5.3201e-02 4.8928e-02
 -1.6735e-02 -8.7450e-03 -4.9530e-02
   ⋮ 

(2 ,0 ,.,.) = 
 -3.1728e-02 -3.9757e-02 6.5561e-03
 -1.7731e-02 2.8615e-02 2.7457e-02
 -2.1817e-03 -4.2405e-02 -3.6126e-03

(2 ,1 ,.,.) = 
 3.2434e-02 -1.1574e-03 1.3353e-02
 -2.3069e-02 4.9532e-02 1.6768e-02
 -3.5563e-02 -4.4264e-02 -2.0571e-02

(2 ,2 ,.,.) = 
 7.4980e-03 -5.7412e-03 -3.0638e-03
 1.1812e-02 -1.7851e-02 4.2195e-04
 3.9753e-02 3.8771e-02 4.3166e-03
  ...

(2 ,29,.,.) = 
 -5.0798e-02 4.3651e-02 -2.3798e-02
 -6.0957e-03 -5.6953e-02 1.2583e-02
 -2.3450e-02 -4.7136e-02 5.2458e-02

(2 ,30,.,.) = 
 1.5088e-02 2.6097e-02 4.9392e-03
 -9.0372e-03 -5.3276e-02 -1.7824e-02
 3.2060e-03 5.8820e-02 1.3459e-02

(2 ,31,.,.) = 
 -5.2557e-03 -4.9638e-02 -7.5522e-03
 2.8668e-02 -3.9617e-02 -1.8111e-02
 -4.0412e-02 1.1320e-02 -2.4005e-02

   ⋮ 

(29,0 ,.,.) = 
 -1.4393e-02 2.1343e-02 5.1940e-02
 5.7449e-02 3.1327e-02 -1.0721e-02
 -1.0184e-02 -6.2289e-03 3.9823e-02

(29,1 ,.,.) = 
 -4.2240e-03 5.8135e-02 5.2816e-02
 -4.9888e-02 3.3972e-02 4.3127e-02
 -2.3355e-02 -5.5401e-02 3.4952e-02

(29,2 ,.,.) = 
 4.0336e-02 7.6532e-03 5.4083e-02
 -2.7456e-02 3.9090e-02 4.4008e-02
 -2.0424e-02 -5.8922e-02 -4.4759e-03
  ...

(29,29,.,.) = 
 8.8037e-03 1.0347e-02 -2.2285e-02
 -1.0538e-02 -3.2981e-02 2.2300e-02
 -2.7337e-02 5.3113e-02 5.4608e-02

(29,30,.,.) = 
 3.1429e-02 5.2024e-03 -1.3882e-02
 -3.3123e-02 -2.7633e-03 1.9007e-02
 -2.9795e-02 3.7551e-02 5.6486e-02

(29,31,.,.) = 
 2.0140e-02 1.8530e-02 7.4208e-03
 2.7311e-02 5.3581e-02 -2.5553e-02
 -1.7285e-02 1.8722e-02 4.0104e-02
   ⋮ 

(30,0 ,.,.) = 
 5.2750e-02 4.5757e-03 -5.3894e-02
 -3.9297e-02 3.2918e-02 2.3571e-02
 -1.1806e-02 1.6091e-02 3.3755e-04

(30,1 ,.,.) = 
 4.2858e-02 -5.2211e-02 -3.5660e-02
 1.4807e-02 -5.8873e-02 5.5535e-02
 4.9854e-02 2.2946e-02 4.0968e-03

(30,2 ,.,.) = 
 3.0378e-02 2.1315e-02 9.1700e-03
 3.6277e-02 -4.0734e-02 4.8175e-02
 3.0748e-02 -2.7425e-02 -1.7741e-02
  ...

(30,29,.,.) = 
 3.1883e-02 2.5012e-02 2.8504e-02
 -1.3538e-02 3.5570e-02 -2.0261e-02
 -1.5959e-02 3.3373e-02 8.3261e-03

(30,30,.,.) = 
 2.7152e-02 -5.6752e-02 2.2697e-02
 1.2614e-02 -2.4174e-02 -2.5058e-02
 1.8737e-02 -1.3581e-03 -3.7116e-02

(30,31,.,.) = 
 -4.3278e-02 2.5873e-02 -1.6677e-02
 3.9483e-02 5.7898e-02 -4.1450e-02
 -5.8218e-02 -3.0660e-02 -4.2161e-02
   ⋮ 

(31,0 ,.,.) = 
 1.3370e-02 -1.4191e-02 -2.2524e-02
 2.1772e-02 -2.2440e-02 -3.0512e-03
 3.4139e-02 -1.9043e-02 1.1289e-02

(31,1 ,.,.) = 
 -5.1293e-02 -5.2802e-02 1.7022e-02
 5.1031e-02 -1.0345e-02 -4.4780e-02
 -4.9422e-02 4.7709e-02 -2.1215e-02

(31,2 ,.,.) = 
 2.2289e-02 -2.1746e-02 -5.3192e-02
 2.6651e-02 -1.6531e-02 2.2640e-02
 1.4012e-02 1.1405e-02 -1.4809e-02
  ...

(31,29,.,.) = 
 2.5505e-03 2.4052e-02 -4.7662e-02
 1.6068e-02 -4.2278e-02 -2.4670e-02
 -1.4684e-02 -3.8222e-02 -5.0006e-02

(31,30,.,.) = 
 -4.9350e-02 4.7564e-02 -7.3479e-03
 2.6490e-02 -1.1745e-02 3.4324e-02
 4.2650e-02 -5.4633e-02 9.4581e-03

(31,31,.,.) = 
 -3.2695e-02 -2.8899e-02 1.5543e-02
 -5.3662e-02 5.0727e-02 3.5950e-02
 4.6130e-02 -4.4754e-02 -4.5647e-02
[torch.cuda.FloatTensor of size 32x32x3x3 (GPU 0)]

Parameter containing:
.00000e-02 *
 -1.2723
 -5.2970
 -3.4638
 -1.5302
 0.7641
 5.7516
 -4.8427
 -0.7230
 4.5940
 -4.1709
 4.8093
 -4.7249
 -2.2756
 -5.5165
 5.1259
 -2.4693
 1.8527
 -0.4210
 -2.0518
 -3.8124
 -4.6195
 -4.3019
 -0.8631
 -0.4400
 5.4604
 -5.5597
 1.5557
 4.2336
 3.9482
 -1.4457
 2.6124
 -1.8218
[torch.cuda.FloatTensor of size 32 (GPU 0)]

As the output shows, the model's parameters are on the GPU.
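Printing whole parameters produces a lot of output just to read the last line. A shorter check, sketched here for PyTorch 0.4 and later, is to look at each parameter's device. The single Conv2d layer is a stand-in with the same shape as the first parameter printed above. Unlike a Tensor, a Module is moved in place by .cuda() or .to(), which is why model.cuda(0) above took effect without reassignment.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in layer; its weight has the same 32x32x3x3 shape as the first parameter above.
layer = nn.Conv2d(32, 32, kernel_size=3, padding=1).to(device)

# Show each parameter's name, shape, and device instead of its values.
for name, param in layer.named_parameters():
    print(name, tuple(param.shape), param.device)

# True only when every parameter lives on a CUDA device.
on_gpu = all(p.is_cuda for p in layer.parameters())
print(on_gpu)
```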

Once the model and data have been placed on the GPU, the model can be trained there. Nice!
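To tie this back to the original problem, here is a minimal sketch of one training step in which the model, the inputs, and the labels all sit on the same device. The tiny convolution, the random data, the MSE loss, and the SGD settings are placeholders chosen for illustration, not the setup used in this post.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Placeholder model and optimizer; substitute the real network here.
model = nn.Conv2d(1, 1, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

# Inputs and labels must be on the same device as the model.
inputs = torch.randn(4, 1, 8, 8).to(device)
labels = torch.randn(4, 1, 8, 8).to(device)

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
print(loss.item())
```

If the model is on the GPU but a batch is left on the CPU, the forward pass raises a device or type mismatch error, so moving each batch inside the training loop is the usual pattern.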

