...
Code Block |
---|
# Make a new "wget" directory in your student Home directory and change into it
mkdir -p ~/wget; cd ~/wget
# download a Gencode statistics file using default output file naming
wget "https://ftp.ebi.ac.uk/pub/databases/gencode/_README_stats.txt"
wc -l _README_stats.txt
# if you execute the same wget again, and the output file already exists
# wget will create a new one with a numeric extension
wget "https://ftp.ebi.ac.uk/pub/databases/gencode/_README_stats.txt"
wc -l _README_stats.txt.1
# download the same Gencode statistics file to a different local filename
wget -O gencode_stats.txt "https://ftp.ebi.ac.uk/pub/databases/gencode/_README_stats.txt"
wc -l gencode_stats.txt |
The find command
The find command is a powerful – and of course complex! – way of looking for files in a nested directory hierarchy. The general form I use is:
- find <in_directory> [ operators ] -name <expression> [ tests ]
- looks for files matching <expression> in <in_directory> and its sub-directories
- <expression> can be a double-quoted string including pathname wildcards (e.g. "[a-g]*.txt")
- there are tons of operators and tests:
- -type f (file) and -type d (directory) are useful tests
- -maxdepth NN is a useful operator to limit the depth of recursion.
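- returns a list of matching pathnames, one per output line. Each pathname starts with <in_directory> as you typed it, so the results are relative pathnames when <in_directory> is relative (e.g. ".") and absolute pathnames when it is absolute.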
Examples:
Code Block | ||
---|---|---|
| ||
cd
find . -name "*.txt" -type f # find all .txt files in the Home directory and its sub-directories
find . -name "*docs*" -type d # find all directories with "docs" in the directory name |
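The examples above do not show -maxdepth, so here is a minimal sketch of how it changes the results. It builds a small throwaway directory tree first so that the output is predictable; the directory and file names (docs, old_docs, notes.txt and so on) are made up for illustration and are not part of the class data.

```shell
# Build a small throwaway directory tree to search (all names are hypothetical)
demo=$(mktemp -d)
mkdir -p "$demo/docs/old_docs" "$demo/data"
touch "$demo/notes.txt" "$demo/docs/readme.txt" "$demo/docs/old_docs/draft.txt" "$demo/data/a.csv"
cd "$demo"

# -maxdepth 1 looks only at entries directly inside the starting directory
top_txt=$(find . -maxdepth 1 -name "*.txt" -type f)
echo "$top_txt"

# with no depth limit, find descends into every sub-directory
all_txt_count=$(find . -name "*.txt" -type f | wc -l)
echo "$all_txt_count"

# directories with "docs" in the name, at any depth
docs_dirs=$(find . -name "*docs*" -type d | sort)
echo "$docs_dirs"

# an absolute starting directory produces absolute pathnames
abs=$(find "$demo" -maxdepth 1 -name "notes.txt" -type f)
echo "$abs"
```

With the depth limit only ./notes.txt is reported, while the unrestricted search counts all three .txt files. Note that -maxdepth is placed before -name here; GNU find warns if it appears after a test.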
Exercise 2-1
The /stor/work/CBRS_unix/fastq/ directory contains sequencing data from a GSAF Job. Its structure, as shown by tree, is:
Use find to find all fastq.gz files in /stor/work/CBRS_unix/fastq/.
Expand | ||
---|---|---|
| ||
find /stor/work/CBRS_unix/fastq/ -name "*.fastq.gz" -type f |
How many fastq.gz files in /stor/work/CBRS_unix/fastq/ were run in sequencer lane L001?
Expand | ||
---|---|---|
| ||
find /stor/work/CBRS_unix/fastq/ -name "*L001*fastq.gz" -type f | wc -l |
How many sample directories in /stor/work/CBRS_unix/fastq/ were run on July 10, 2020?
Expand | ||
---|---|---|
| ||
find /stor/work/CBRS_unix/fastq/ -name "*2020-07-10*" -type d | wc -l |
Working with symbolic links
...