seeker is an R package for fetching and processing sequencing data, especially RNA-seq data. Hopefully it helps you get what you’re after, before the day you die.
Install BiocManager.
if (!requireNamespace('BiocManager', quietly = TRUE))
  install.packages('BiocManager')
If you use RStudio, go to Tools → Global Options… → Packages → Add… (under Secondary repositories), then enter:
You only have to do this once. Then you can install or update the package by entering:
BiocManager::install('seeker')
Alternatively, you can install or update the package by entering:
BiocManager::install('seeker', site_repository = 'https://hugheylab.github.io/drat/')
These instructions are for Unix-based systems. If you’re using Windows, you’re doing it wrong.
Download and install Aspera Connect. On Linux, you will likely have to download a tar.gz file (using wget or curl), untar it (using tar -zxvf), then run the resulting shell script. On macOS, you may have to install a browser extension first, then install Connect from a dmg file.
Install Miniconda. On Linux:
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
On macOS:
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
sh Miniconda3-latest-MacOSX-x86_64.sh
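If you would rather not pick the installer by hand, the two cases above can be folded into one snippet that chooses the file from the output of uname. This is only a sketch: the function name miniconda_installer is made up for illustration, and it covers only the two x86_64 installers named above.

```shell
# Hypothetical helper: map an OS name (as printed by `uname -s`) to the
# installer file used in the commands above. Only Linux and macOS on
# x86_64 are covered here.
miniconda_installer() {
  case "$1" in
    Linux)  echo "Miniconda3-latest-Linux-x86_64.sh" ;;
    Darwin) echo "Miniconda3-latest-MacOSX-x86_64.sh" ;;
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Usage (commented out so that pasting the snippet does not start a download):
# installer=$(miniconda_installer "$(uname -s)") &&
#   curl -O "https://repo.anaconda.com/miniconda/${installer}" &&
#   sh "${installer}"
```

The helper only prints a file name, so you can check what it would fetch before uncommenting the download lines.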
Set up conda channels, including bioconda.
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
Install the mamba package manager.
conda install mamba -c conda-forge
Install the other command-line tools. The code below will install the packages into the base conda environment; use -n to specify a different environment. When using seeker, you will need to ensure that R is running with the given conda environment.
mamba install refgenie trim-galore fastqc fastq-screen salmon multiqc
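One rough way to check that the tools are visible is to look for their executables on the PATH of the shell from which you start R. The sketch below assumes R inherits that shell's PATH (which may not hold if you launch RStudio from a desktop icon); the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print the name of each given command that is not
# on the PATH, and return non-zero if any is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage, with the executable names provided by the packages above
# (note the underscores in trim_galore and fastq_screen):
# missing_tools refgenie trim_galore fastqc fastq_screen salmon multiqc
```

If the call prints nothing, every tool was found. Any name it prints is missing from the current environment, which usually means the conda environment is not activated.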
Optionally, configure refgenie. For example:
mkdir -p ${HOME}/genomes
refgenie init -c ${HOME}/genomes/genome_config.yaml
Then you can add the following line to ~/.bashrc, ~/.bash_profile, or ~/.zshrc, depending on your OS and shell.
export REFGENIE="${HOME}/genomes/genome_config.yaml"
Then source the file and run refgenie init.
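After sourcing the file, you may want to confirm that the variable actually reaches child processes, since refgenie reads REFGENIE from its environment. The sketch below uses printenv, which only sees exported variables; the function name check_refgenie_config is hypothetical.

```shell
# Hypothetical helper: succeed only if REFGENIE is exported and points to
# an existing file; on success, print the path of the config file.
check_refgenie_config() {
  # printenv reports exported variables only, so a plain shell variable
  # that was never exported shows up here as empty.
  config=$(printenv REFGENIE || true)
  if [ -z "$config" ]; then
    echo "REFGENIE is not set in the environment" >&2
    return 1
  fi
  if [ ! -f "$config" ]; then
    echo "REFGENIE points to a missing file: $config" >&2
    return 1
  fi
  echo "$config"
}

# Usage:
# check_refgenie_config
```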
Optionally, use refgenie to fetch the salmon index files for the mouse and human transcriptomes.
refgenie pull hg38/salmon_sa_index
refgenie pull mm10/salmon_sa_index
Optionally, fetch the genomes for fastq-screen. This takes a long time, so don’t bother unless you actually plan to run fastq-screen.
fastq_screen --get_genomes