
Table of Contents

Overview

Austin's own Advanced Micro Devices (AMD) has most generously donated a number of GPU-enabled servers to UT.

While it is still true that AMD GPUs do not support as many 3rd party applications as NVIDIA, they do support many popular Machine Learning (ML) applications such as TensorFlow, PyTorch, and AlphaFold, and Molecular Dynamics (MD) applications such as GROMACS, all of which are installed and ready for use.


The BRCF's AMD GPU pod is available for instructional use and for research use by qualifying UT-Austin affiliated PIs. Allocations are granted to groups who will only perform certain GPU-enabled workflows. To request an allocation, contact us at rctf-support@utexas.edu and provide the UT EIDs of those who should be granted access.

Two BRCF research pods also have AMD GPU servers available: the Hopefog and Livestrong PODs. Their use is restricted to the groups who own those pods. See Livestrong and Hopefog pod AMD servers for specific information.

GPU-enabled software

AlphaFold

The AlphaFold protein structure solving software is available on all AMD GPU servers. The /stor/scratch/AlphaFold directory has the large required database, under the data.4 sub-directory. There is also an AMD example script, /stor/scratch/AlphaFold/alphafold_example_amd.sh, and an alphafold_example_nvidia.sh script if the POD also has NVIDIA GPUs (e.g. the Hopefog pod). Interestingly, our timing tests indicate that AlphaFold performance is quite similar on all the AMD and NVIDIA GPU servers.

On AMD GPU servers, AlphaFold is implemented by a run_alphafold.py Python script inside a Docker image. See the run_alphafold_rocm.sh and run_multimer_rocm.sh scripts under /stor/scratch/AlphaFold for a complete list of options to that script.

PyTorch and TensorFlow

All pod compute servers have 3 main Python environments, which are all managed separately (see About Python and JupyterHub server for more information about these environments):

  • command-line Python 2.7 (python2.7, pip2.7)
  • command-line Python 3.12 (python, python3, python3.12, pip, pip3, pip3.12)
  • web-based JupyterHub which uses the Python 3.12 kernel

We are working hard to get AMD-GPU-enabled versions of TensorFlow and PyTorch working in all three environments. Current status is as follows:

GPU-enabled PyTorch is available on all AMD-GPU servers. GPU-enabled TensorFlow availability by POD:

  • AMD GPU
  • Hopefog: command-line python3, python3.8 (upgrade coming soon)
  • Livestrong: command-line python3, python3.12

PyTorch/TensorFlow example scripts

Two Python scripts are located in /stor/scratch/GPU_info that can be used to ensure you have access to the server's GPUs from TensorFlow or PyTorch. You can run them from the command line using time to compare the run times.

  • TensorFlow – AMD GPU pod servers (amdgcomp01/02/03)
    • time (python3 /stor/scratch/GPU_info/tensorflow_example.py)
    • should take ~30s or less with GPU (on an unloaded system), > 1 minute with CPUs only
    • this is a simple test; the CPU-only run uses multiple cores while the GPU run uses only 1 GPU, which is one reason the times are not more different
  • TensorFlow – Livestrong and Hopefog pod servers
    • time (python3 /stor/scratch/GPU_info/tensorflow_example.py)
  • PyTorch
    • time (python3 /stor/scratch/GPU_info/pytorch_example.py)
    • Model time should be ~30-45s with GPU on an unloaded system
    • You'll see this warning, which can be ignored:
      MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx90878.kdb Performance
      may degrade. Please follow instructions to install:
      https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package
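Before timing the full example scripts, a quick probe like the following can confirm that each framework actually sees the GPUs. This is a sketch, not one of the provided scripts; it only assumes that PyTorch and/or TensorFlow are importable in the active environment (on ROCm builds of PyTorch, torch.cuda.is_available() reports AMD GPUs as well):

```python
# Sketch: report, for each framework that is installed, whether it can see a GPU.
def probe_gpus():
    report = {}
    try:
        import torch
        # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API
        report["torch_gpu"] = torch.cuda.is_available()
    except ImportError:
        report["torch_gpu"] = None  # PyTorch not installed in this environment
    try:
        import tensorflow as tf
        report["tf_gpus"] = len(tf.config.list_physical_devices("GPU"))
    except ImportError:
        report["tf_gpus"] = None  # TensorFlow not installed in this environment
    return report

if __name__ == "__main__":
    print(probe_gpus())
```

If either value is 0 or False on a GPU server, the environment is likely using a CPU-only build of the framework.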

...


...


If you need a different combination of Python and TensorFlow/PyTorch versions, you'll need to construct an appropriate custom Conda environment (e.g. miniconda3 or anaconda), as well as your own Jupyter Notebook environment if needed.
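Such a setup might look like the following. This is only a sketch: the environment name and version numbers are placeholders to adapt, and it assumes Miniconda (or Anaconda) is already installed and initialized in your shell:

```shell
# Sketch: create an isolated Conda environment with a specific Python version,
# then pip-install the framework build you need into it.
conda create -n my-tf-env python=3.10     # "my-tf-env" is a placeholder name
conda activate my-tf-env
pip install tensorflow-rocm==2.9.1        # choose the version combination you need

# Optional: register the environment as a Jupyter kernel
pip install ipykernel
python -m ipykernel install --user --name my-tf-env
```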

...

About TensorFlow versions

The AMD-GPU-specific version of TensorFlow, tensorflow-rocm 2.9.1, is installed on all AMD GPU servers. This version works with ROCm 5.1.3+.

If you need to install your own version with pip, specify the version explicitly, e.g.:

Code Block
pip3 install tensorflow-rocm==2.9.1
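Afterwards you can check which TensorFlow distribution is installed without importing the (large) package, via package metadata. A sketch, assuming the package names tensorflow-rocm for the AMD build and tensorflow for the stock build:

```python
# Sketch: report which TensorFlow distribution (if any) is installed,
# using package metadata so we avoid importing the heavyweight package.
from importlib.metadata import version, PackageNotFoundError

def tf_build():
    for pkg in ("tensorflow-rocm", "tensorflow"):
        try:
            return pkg, version(pkg)   # e.g. ("tensorflow-rocm", "2.9.1")
        except PackageNotFoundError:
            continue
    return None, None                  # no TensorFlow build found

if __name__ == "__main__":
    print(tf_build())
```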

...

  • benchmarks/ - a set of MD benchmark files from https://www.mpinat.mpg.de/grubmueller/bench
  • gromacs_amd_example.sh - a simple GROMACS example script taking advantage of the GPU, running the benchMEM.tpr benchmark by default.
  • gromacs_cpu_example.sh - a GROMACS example script using the CPUs only.

You'll see warnings like these when you run the GPU-enabled example script; they can be ignored:

...


Resources

ROCm environment

...

...