seeker is an R package for fetching and processing sequencing data, especially RNA-seq data. Hopefully it helps you get what you’re after, before the day you die.

Installation

R package

  1. Install BiocManager.

    if (!requireNamespace('BiocManager', quietly = TRUE))
      install.packages('BiocManager')
  2. If you use RStudio, go to Tools → Global Options… → Packages → Add… (under Secondary repositories), then enter the repository URL used in the alternative command below (https://hugheylab.github.io/drat/).

    You only have to do this once. Then you can install or update the package by entering:

    BiocManager::install('seeker')

    Alternatively, you can install or update the package by entering:

    BiocManager::install('seeker', site_repository = 'https://hugheylab.github.io/drat/')
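On a machine without RStudio (a server, for example), the two R steps above could also be run from the shell. This is a minimal sketch under stated assumptions, not part of the package: the function name install_seeker is hypothetical, and the CRAN mirror https://cloud.r-project.org is an assumption, passed explicitly because a non-interactive R session may have no default mirror configured.

```shell
# Hypothetical helper: install BiocManager if it is missing, then install
# seeker from the repository URL given in the instructions above.
# The CRAN mirror URL is an assumption; substitute your preferred mirror.
install_seeker() {
  Rscript -e '
    if (!requireNamespace("BiocManager", quietly = TRUE))
      install.packages("BiocManager", repos = "https://cloud.r-project.org")
    BiocManager::install("seeker", site_repository = "https://hugheylab.github.io/drat/")
  '
}
```

Once R is installed, running install_seeker performs the same installation as the R commands above.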

System dependencies

These instructions are for Unix-based systems. If you’re using Windows, you’re doing it wrong.

  1. Download and install Aspera Connect. On Linux, you will likely have to download a tar.gz file (using wget or curl), untar it (using tar -zxvf), then run the resulting shell script. On macOS, you may have to install a browser extension first, then install Connect from a dmg file.

  2. Install Miniconda. On Linux:

    curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    sh Miniconda3-latest-Linux-x86_64.sh

    On macOS:

    curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
    sh Miniconda3-latest-MacOSX-x86_64.sh
  3. Set up conda channels, including bioconda.

    conda config --add channels defaults
    conda config --add channels bioconda
    conda config --add channels conda-forge
  4. Install the mamba package manager.

    conda install mamba -c conda-forge
  5. Install the other command-line tools. The command below installs the packages into the base conda environment; use -n to specify a different environment. When you use seeker, make sure R is running with that conda environment activated, so that the tools are on the PATH.

    mamba install refgenie trim-galore fastqc fastq-screen salmon multiqc
  6. Optionally, configure refgenie. For example:

    mkdir -p ${HOME}/genomes
    refgenie init -c ${HOME}/genomes/genome_config.yaml

    Then you can add the following line to ~/.bashrc, ~/.bash_profile, or ~/.zshrc, depending on your OS and shell.

    export REFGENIE="${HOME}/genomes/genome_config.yaml"

    Then source the file (or open a new shell) so that refgenie can find the configuration file created by refgenie init.

  7. Optionally, use refgenie to fetch the salmon index files for the mouse and human transcriptomes.

    refgenie pull hg38/salmon_sa_index
    refgenie pull mm10/salmon_sa_index
  8. Optionally, fetch the genomes for fastq-screen. This takes a long time, so don’t bother unless you actually plan to run fastq-screen.

    fastq_screen --get_genomes
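After working through the steps above, it can help to confirm that the tools are visible from the shell in which R will run. The following is a minimal sketch, not something seeker provides: the function name missing_tools is hypothetical, and the executable names (for example trim_galore for the trim-galore package) are assumed to match what the conda packages install.

```shell
# Print the name of each given command that is not found on the PATH,
# one per line. Prints nothing if all commands are found.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
  return 0
}

# Executable names are assumptions based on the packages installed in step 5.
missing="$(missing_tools refgenie trim_galore fastqc fastq_screen salmon multiqc)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All tools found."
fi

# Step 6 relies on the REFGENIE environment variable pointing to the config file.
if [ -f "${REFGENIE:-}" ]; then
  echo "refgenie config: ${REFGENIE}"
else
  echo "REFGENIE is unset or does not point to an existing file."
fi
```

If any tool is reported as missing, check that the conda environment used in step 5 is activated in the current shell.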