...
Stampede2 is a cluster of nearly 6,000 computers (nodes): 4,200 KNL nodes and 1,736 SKX nodes, connected to three file systems, each with unique characteristics.
You need to understand the characteristics of each file system to know how to use it effectively.
| | $HOME | $WORK / $WORK2 | $SCRATCH |
|---|---|---|---|
| Purged? | No | No | Files can be purged if not accessed for 10 days |
| Backed up? | Yes | No | No |
| Capacity | 10 GB | 1 TB | Essentially unlimited (8.5 PB total) |
| Command to access | cdh | cdw / cdw2 | cds |
| Purpose | Store executables | Store files | Run jobs |
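For example, assuming you are on a Stampede2 login node where these standard TACC aliases are defined, you can jump between the three file systems like this:

```
cdh                      # cd to $HOME    - small, backed up; keep scripts and executables here
cdw                      # cd to $WORK    - larger, not backed up; keep input and result files here
cds                      # cd to $SCRATCH - huge, but purged; run jobs here

# The aliases are shortcuts to the corresponding environment variables:
echo $HOME $WORK $SCRATCH
pwd                      # confirm where the last alias landed you
```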
...
If you're going to run a job, it's a good idea to keep your input files in a directory in $WORK (or $WORK2) and copy them to a directory in $SCRATCH where you plan to run your job.
This example command might help a bit:
```
cp $WORK2/my_fastq_data/*fastq $SCRATCH/my_project/
```
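Note that cp won't create the destination directory for you, so make it first. A minimal sketch, where my_project and my_fastq_data are hypothetical names:

```
# Create the job directory on $SCRATCH if it doesn't exist yet, then stage the inputs
mkdir -p $SCRATCH/my_project
cp $WORK2/my_fastq_data/*fastq $SCRATCH/my_project/
```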
Stampede2's /home and /scratch file systems are mounted and visible only on Stampede2, but the work file system mounted on Stampede2 is part of the global file system hosted on Stockyard. This means /work and /work2 are common to, and visible on, TACC clusters such as Lonestar5, Stampede2, and Frontera.
General Guidelines to reduce File I/O load on TACC:
TACC staff now recommend that you run your jobs out of the $SCRATCH file system instead of the global $WORK file system: copy input files to $SCRATCH, run your analyses and write output to $SCRATCH, then copy results back to $WORK when done.
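A minimal sketch of that workflow, assuming a hypothetical project directory my_project, input FASTQ files under $WORK2, and a hypothetical analysis script run_analysis.sh:

```
# 1. Stage input data from the work file system to $SCRATCH
mkdir -p $SCRATCH/my_project
cp $WORK2/my_fastq_data/*fastq $SCRATCH/my_project/

# 2. Run the analysis from $SCRATCH so all heavy I/O happens there
cd $SCRATCH/my_project
./run_analysis.sh > analysis.log 2>&1

# 3. Copy the results you want to keep back to the work file system when done
mkdir -p $WORK2/my_project_results
cp analysis.log results/* $WORK2/my_project_results/
```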
General Guidelines when transferring files to/from TACC:
- Don't run too many (more than 3) simultaneous file transfers.
- If you need to transfer a recursive directory structure (directories within directories, lots of small files), create a tar archive before transferring; see the example after this list.
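For example, to move a whole directory of results off Stampede2, you might bundle it into a single archive and transfer just that one file. The directory, archive, and user names below are hypothetical:

```
# On Stampede2: bundle the directory tree into one compressed archive
cd $WORK2
tar -czf my_project_results.tar.gz my_project_results/

# On your local machine: pull the single archive instead of many small files
# (replace the remote path with the archive's actual location under your work directory)
scp my_username@stampede2.tacc.utexas.edu:/path/to/my_project_results.tar.gz .

# Then unpack it locally
tar -xzf my_project_results.tar.gz
```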
Now let's go on to look at how jobs are run on Stampede2.