The main function in the seeker package is, well, seeker(). Currently seeker() is targeted at processing RNA-seq data. The main input is a list of parameters specifying which steps of RNA-seq data processing to perform and how to perform them. Depending on the parameters, seeker() will call other functions in the package.

A convenient way to construct the list of parameters is to make a yaml file and read it into R using yaml::read_yaml(). A template yaml file is reproduced below and available at system.file('extdata', 'params_template.yml', package = 'seeker').

study: '' # [string]
metadata:
  run: TRUE # [logical]
  bioproject: '' # [string]
  # include # [named list or NULL]
    # colname # [string]
    # values # [vector]
  # exclude # [named list or NULL]
    # colname # [string]
    # values # [vector]
fetch:
  run: TRUE # [logical]
  # overwrite # [logical or NULL]
  # ascpCmd # [string or NULL]
  # ascpArgs # [character vector or NULL]
  # ascpPrefix # [string or NULL]
trimgalore:
  run: TRUE # [logical]
  # cmd # [string or NULL]
  # args # [character vector or NULL]
fastqc:
  run: TRUE # [logical]
  # cmd # [string or NULL]
  # args # [character vector or NULL]
salmon:
  run: TRUE # [logical]
  indexDir: '' # [string]
  # cmd # [string or NULL]
  # args # [character vector or NULL]
multiqc:
  run: TRUE # [logical]
  # cmd # [string or NULL]
  # args # [character vector or NULL]
tximport:
  run: TRUE # [logical]
  tx2gene:
    # [named list or NULL]
    dataset: 'mmusculus_gene_ensembl' # [string]
    version: 104 # [number; latest version is 104 as of Oct 2021]
  countsFromAbundance: '' # [string]
  # ignoreTxVersion # [logical or NULL]
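
Because each step in the template carries its own run flag, it can be handy to check which steps a given yaml file enables before starting a long job. The following is a rough sketch, not part of seeker: list_steps is a hypothetical helper that scans the file line by line with awk and assumes the two-space indentation used in the template above. It is not a real YAML parser.

```shell
# Hypothetical helper (not part of seeker): print each step and its run flag.
# Assumes top-level keys start in column 1 and `run:` is indented two spaces,
# as in the template.
list_steps() {
  awk '
    # Remember the most recent top-level key, e.g. "metadata" or "fetch".
    /^[A-Za-z0-9_]+:/ { step = $1; sub(/:$/, "", step) }
    # Report the run flag belonging to that key.
    /^  run:/ { print step ": " $2 }
  ' "$1"
}

# Usage: list_steps "<path/to/study>.yml"
```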

A convenient way to run seeker() is then to use a script such as the one reproduced below and available at system.file('extdata', 'run_seeker.R', package = 'seeker').

doParallel::registerDoParallel()

cArgs = commandArgs(TRUE)
yamlPath = cArgs[1L]
parentDir = cArgs[2L]

params = yaml::read_yaml(yamlPath)
seeker::seeker(params, parentDir)

If you copy the script to your current working directory, you can run it using something like

Rscript run_seeker.R <path/to/study>.yml <path/to/parent/directory>

A fancier option, which saves stdout and stderr to a log file, would be something like

study="<study>" && \
parentDir="<path/to/parent/directory>" && \
mkdir -p "${parentDir}/${study}" && \
Rscript run_seeker.R "<path/to>/${study}.yml" "${parentDir}" &> \
  "${parentDir}/${study}/progress.log"

This option assumes that the name of the yaml file (minus the file extension) is identical to the value of study within the yaml file, which we highly recommend.
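
Since the naming convention is easy to break by accident, one could check it before launching the job. The sketch below is illustrative only: check_study_name is a hypothetical helper, not something seeker provides. It assumes study appears as a top-level key on a single line, as in the template, and that the value itself contains no # character.

```shell
# Hypothetical helper (not part of seeker): succeed only if the yaml file name,
# minus its extension, equals the value of the top-level `study` field.
check_study_name() {
  local yamlPath="$1"
  local fileStudy yamlStudy

  # File name without directory or extension, e.g. "GSE123" for "dir/GSE123.yml".
  fileStudy="$(basename "${yamlPath}")"
  fileStudy="${fileStudy%.*}"

  # First `study:` line, minus the key, any trailing comment, trailing
  # whitespace, and one pair of surrounding quotes.
  yamlStudy="$(sed -n 's/^study:[[:space:]]*//p' "${yamlPath}" | head -n 1 \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
          -e "s/^['\"]//" -e "s/['\"]\$//")"

  if [ "${fileStudy}" != "${yamlStudy}" ]; then
    echo "Mismatch: file is '${fileStudy}' but study is '${yamlStudy}'" >&2
    return 1
  fi
}

# Usage: check_study_name "<path/to>/${study}.yml" && Rscript run_seeker.R ...
```

Chaining the check with && means the Rscript call only starts when the two names agree.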