Learn how to easily set CUDA_VISIBLE_DEVICES within a Jupyter Notebook environment for optimized TensorFlow GPU usage.
When working with TensorFlow, especially on systems with multiple GPUs, you may want to control which GPUs TensorFlow can access and utilize. This is crucial for tasks like dedicating specific GPUs to certain processes or troubleshooting. You can achieve this control with the CUDA_VISIBLE_DEVICES environment variable.
Here's how:
1. Identify GPU IDs: Run `nvidia-smi` in your terminal to list the available GPUs and their IDs.
2. Set the environment variable:

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # Use GPU 0
```

Replace `"0"` with the desired GPU ID. Use a comma-separated list such as `"0,1,2"` to expose multiple GPUs, or `"-1"` to disable GPU usage entirely.
3. Place the code first: This code must be executed before you import TensorFlow. If TensorFlow was already imported (for example, earlier in a Jupyter session), restart the kernel first.
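The steps above can be wrapped in a small helper that also guards against setting the variable too late. This is a sketch; the function name `select_gpus` is just an illustration, not a TensorFlow API:

```python
import os
import sys

def select_gpus(ids):
    """Set CUDA_VISIBLE_DEVICES from a list of GPU IDs ([] means CPU-only)."""
    if "tensorflow" in sys.modules:
        # Too late: TensorFlow reads the variable once, at import time.
        raise RuntimeError("TensorFlow is already imported; restart the kernel.")
    value = ",".join(str(i) for i in ids) if ids else "-1"
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

print(select_gpus([0, 1]))  # 0,1
print(select_gpus([]))      # -1
```

Calling it with an empty list maps to `"-1"`, the CPU-only setting described above.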
Important Notes:
- Logical device numbering follows the order of IDs in CUDA_VISIBLE_DEVICES. If you set `"1,0"`, then GPU 1 will be logical device 0 and GPU 0 will be logical device 1.
- For distributed training, set CUDA_VISIBLE_DEVICES differently for each node to specify which GPU(s) that node should use.

The Python code below demonstrates how to control which GPUs TensorFlow can access. It uses the CUDA_VISIBLE_DEVICES environment variable to specify GPU IDs before importing TensorFlow. You can choose a single GPU, multiple GPUs, or disable GPU usage entirely. The code also includes a check to verify the visible GPUs within TensorFlow.
```python
# Step 1: Identify GPU IDs (run `nvidia-smi` in your terminal, not in the code)

# Steps 2 & 3: Set the environment variable BEFORE importing TensorFlow
import os

# Use only GPU 0
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Use multiple GPUs (0 and 1)
# os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Disable GPU usage
# os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

# Now import TensorFlow and your other libraries
import tensorflow as tf

# Verify which GPUs TensorFlow sees
physical_devices = tf.config.list_physical_devices('GPU')
print("Num GPUs Available:", len(physical_devices))

# Rest of your TensorFlow code...
```
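If you'd rather gather the IDs from Step 1 programmatically instead of reading `nvidia-smi` output by hand, a small sketch (assuming `nvidia-smi` is on your `PATH`; `list_gpu_ids` is an illustrative helper, and the CSV query flags are standard `nvidia-smi` options):

```python
import csv
import io
import subprocess

def list_gpu_ids(smi_output=None):
    """Return (index, name) tuples parsed from nvidia-smi CSV output."""
    if smi_output is None:
        # Query the driver directly; requires an NVIDIA driver installation.
        smi_output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
            text=True,
        )
    rows = csv.reader(io.StringIO(smi_output))
    return [(int(index), name.strip()) for index, name in rows]

# Example with captured output, so the parser can be tried without a GPU:
sample = "0, NVIDIA GeForce RTX 3090\n1, NVIDIA GeForce RTX 3090\n"
print(list_gpu_ids(sample))  # [(0, 'NVIDIA GeForce RTX 3090'), (1, 'NVIDIA GeForce RTX 3090')]
```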
Explanation:

- `nvidia-smi` (terminal command): Provides information about your NVIDIA GPUs, including their IDs. You'll use these IDs in the next step.
- `os.environ["CUDA_VISIBLE_DEVICES"] = ...`: Sets the CUDA_VISIBLE_DEVICES environment variable.
  - `"0"`: Makes only GPU 0 visible to TensorFlow.
  - `"0,1"`: Makes GPUs 0 and 1 visible. TensorFlow will assign logical IDs 0 and 1 to them, respectively.
  - `"-1"`: Disables all GPUs, forcing TensorFlow to use the CPU.
- `tf.config.list_physical_devices('GPU')`: Lets you confirm which GPUs are visible and accessible within your code.

Remember:
- Distributed training: set CUDA_VISIBLE_DEVICES on each node separately to control GPU assignments across your cluster.
- CUDA_VISIBLE_DEVICES only affects the current process. Once you close your terminal or Jupyter Notebook, the setting reverts to its default.
- In containers, pass CUDA_VISIBLE_DEVICES as part of the container's environment configuration during its launch.
- The GPU IDs come from `nvidia-smi`.
- CUDA_VISIBLE_DEVICES must be set before you import TensorFlow.
- `tf.config.set_visible_devices`: Allows you to change visible devices within your TensorFlow code, before the GPUs are initialized.
- For fine-grained placement of individual operations, use `tf.device`.

Remember that understanding your specific hardware configuration and software environment is essential for effectively managing GPU resources with TensorFlow.
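The logical-ID remapping described above can be illustrated without TensorFlow: the Nth entry in CUDA_VISIBLE_DEVICES becomes logical device N. A plain-Python sketch of the rule (`logical_to_physical` is an illustrative helper, not TensorFlow's actual implementation):

```python
def logical_to_physical(cuda_visible_devices):
    """Map logical device index -> physical GPU ID, per the ordering rule."""
    if cuda_visible_devices == "-1":
        return {}  # all GPUs hidden; TensorFlow falls back to the CPU
    ids = [int(x) for x in cuda_visible_devices.split(",")]
    return dict(enumerate(ids))

print(logical_to_physical("1,0"))  # {0: 1, 1: 0} -- GPU 1 is logical device 0
print(logical_to_physical("-1"))   # {} -- CPU only
```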
| Feature | Description |
|---|---|
| Environment Variable | `CUDA_VISIBLE_DEVICES` |
| Purpose | Controls which GPUs TensorFlow can access and utilize. |
| Identifying GPU IDs | Use the command `nvidia-smi` in your terminal. |
| Setting the Variable | Use `os.environ["CUDA_VISIBLE_DEVICES"] = "GPU_ID"` in your Python code before importing TensorFlow. Replace `"GPU_ID"` with a single GPU ID (e.g., `"0"`), multiple GPU IDs separated by commas (e.g., `"0,1,2"`), or `"-1"` to disable GPU usage. |
| Logical GPU IDs | TensorFlow assigns logical IDs (0, 1, ...) to visible GPUs based on their order in `CUDA_VISIBLE_DEVICES`. |
| Jupyter Notebooks | If TensorFlow was already imported, restart the kernel and set the variable before importing it again. |
| Other Libraries | This method generally applies to other deep learning libraries, such as PyTorch. |
| Horovod | Distributed training with Horovod requires specific `CUDA_VISIBLE_DEVICES` settings for each node. |
By using the `CUDA_VISIBLE_DEVICES` environment variable, you can effectively manage which GPUs TensorFlow utilizes. This control is essential for optimizing GPU resources, especially on multi-GPU systems. Remember to set this variable before importing TensorFlow and to verify your configuration with `tf.config.list_physical_devices('GPU')`. Whether you need to dedicate GPUs, troubleshoot issues, or experiment with different setups, mastering this technique will streamline your TensorFlow development process.