Physics Station (物理の駅) by an active researcher

Technology thrives only when it is shared

Trying TensorFlow on Windows + Nvidia GPU (2019/08/21)

Versions change from day to day, so these notes reflect the state as of 2019/08/21.

CUDA Toolkit 10.0

Download and install

https://developer.nvidia.com/cuda-10.0-download-archive

The latest release at the time of writing was 10.1, but the tensorflow-gpu package installable via pip was built against 10.0, so 10.0 is used here.

Install cuDNN v7.6.2

Download cuDNN v7.6.2 (July 22, 2019), for CUDA 10.0

https://developer.nvidia.com/rdp/cudnn-download
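Once both packages are installed, it can help to confirm that their DLLs are reachable through PATH before involving TensorFlow at all. The following is a minimal sketch using only the standard library. The DLL names cudart64_100.dll and cudnn64_7.dll are assumptions for CUDA 10.0 and cuDNN 7.x on Windows and do not appear in the original notes; find_on_path is a hypothetical helper name.

```python
import os

# Assumed DLL names for CUDA 10.0 and cuDNN 7.x on Windows (not from the original notes).
EXPECTED_DLLS = ["cudart64_100.dll", "cudnn64_7.dll"]


def find_on_path(filename, path_value=None):
    """Return every directory on the search path that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries produced by stray separators in PATH.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    for dll in EXPECTED_DLLS:
        found = find_on_path(dll)
        print(dll, "->", found if found else "not found on PATH")
```

If a DLL is reported as not found, the directory that holds it probably still needs to be added to PATH.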

Install tensorflow into Python

pip install tensorflow
pip install tensorflow-gpu
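A quick way to see whether the installed package is a CUDA build is to ask TensorFlow itself. This is a minimal sketch: tensorflow_build_info is a hypothetical helper name, and it returns None instead of failing when tensorflow cannot be found in the current environment.

```python
import importlib.util


def tensorflow_build_info():
    """Return (version, built_with_cuda), or None if tensorflow is not installed."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf
    return tf.__version__, tf.test.is_built_with_cuda()


info = tensorflow_build_info()
if info is None:
    print("tensorflow is not installed in this environment")
else:
    print("version:", info[0], "built with CUDA:", info[1])
```

A result of False for the CUDA flag would suggest that the CPU-only package is the one being imported.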

Sample code

https://www.tensorflow.org/tutorials

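import os
# Enumerate GPUs in PCI bus order so that the indices are stable
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# Expose only GPU 0 to TensorFlow; set this before importing tensorflow
os.environ["CUDA_VISIBLE_DEVICES"] = "0"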

import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

If you do not want to use the GPU, set CUDA_VISIBLE_DEVICES to "-1".
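The CPU-only switch can be wrapped in a small helper so that the same script runs on either device. This is a minimal sketch; select_cuda_device is a hypothetical name, and the call is assumed to happen before tensorflow is imported, as in the sample code.

```python
import os


def select_cuda_device(index):
    """Set the CUDA environment variables; pass -1 to hide every GPU."""
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


select_cuda_device(-1)  # call this before `import tensorflow`
print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints -1
```

Passing 0 instead reproduces the settings used in the sample code.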

Results

WARNING: Logging before flag parsing goes to stderr.
W0821 20:29:07.159986  2124 deprecation.py:506] From C:\Users\Masahiro\Anaconda3\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
2019-08-21 20:29:07.664596: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-08-21 20:29:07.738388: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2019-08-21 20:29:08.677659: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:02:00.0
2019-08-21 20:29:08.727962: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-08-21 20:29:08.740897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-21 20:29:10.242017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-21 20:29:10.253687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2019-08-21 20:29:10.276918: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2019-08-21 20:29:10.285098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1347 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:02:00.0, compute capability: 6.1)
Epoch 1/5
60000/60000 [==============================] - 18s 299us/sample - loss: 0.2221 - acc: 0.9330
Epoch 2/5
60000/60000 [==============================] - 21s 355us/sample - loss: 0.0967 - acc: 0.9705
Epoch 3/5
60000/60000 [==============================] - 30s 493us/sample - loss: 0.0693 - acc: 0.9777
Epoch 4/5
60000/60000 [==============================] - 16s 264us/sample - loss: 0.0539 - acc: 0.9829
Epoch 5/5
60000/60000 [==============================] - 14s 235us/sample - loss: 0.0441 - acc: 0.9861
10000/10000 [==============================] - 2s 185us/sample - loss: 0.0631 - acc: 0.9801
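As a sanity check on the log, the per-sample times can be multiplied by the 60000 training samples and compared with the reported epoch durations. The numbers below are copied from the five training epochs above; the variable names are only illustrative.

```python
SAMPLES_PER_EPOCH = 60000

# (reported seconds, reported microseconds per sample) for epochs 1 to 5
epochs = [(18, 299), (21, 355), (30, 493), (16, 264), (14, 235)]

# Convert microseconds per sample into whole seconds per epoch.
estimated = [round(us * SAMPLES_PER_EPOCH / 1_000_000) for _, us in epochs]
reported = [seconds for seconds, _ in epochs]

print("estimated:", estimated)
print("reported: ", reported)
print("total training time (s):", sum(reported))
```

The estimates agree with the reported values, and the five epochs add up to 99 seconds on the GTX 1050.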

https://qiita.com/tilfin/items/24e9491eb8a4ce42eea6

https://qiita.com/resnant/items/80730ae63b26ce39c2e0