2 changes: 2 additions & 0 deletions .bashrc
@@ -0,0 +1,2 @@
source /usr/share/modules/init/bash
module use /modules/gsi/modulator/modulefiles/Ubuntu18.04
23 changes: 23 additions & 0 deletions Dockerfile
@@ -0,0 +1,23 @@
FROM modulator:latest

# MAINTAINER is deprecated; a LABEL carries the same information
LABEL maintainer="Fenglin Chen <f73chen@uwaterloo.ca>"

# packages should already be set up in modulator:latest
USER root

# move in the yaml to build modulefiles from
COPY recipes/variant_effect_predictor_recipe.yaml /modulator/code/gsi/recipe.yaml

# build the modules and set folder & file permissions
RUN ./build-local-code /modulator/code/gsi/recipe.yaml --initsh /usr/share/modules/init/sh --output /modules && \
    find /modules -type d -exec chmod 777 {} \; && \
    find /modules -type f -exec chmod 777 {} \;

# add the user
RUN groupadd -r -g 1000 ubuntu && useradd -r -g ubuntu -u 1000 ubuntu
USER ubuntu

# copy the setup file to load the modules at startup
COPY .bashrc /home/ubuntu/.bashrc

CMD ["/bin/bash"]
82 changes: 82 additions & 0 deletions Dockstore_README.md
@@ -0,0 +1,82 @@
# dockstore_variantEffectPredictor

This workflow is designed to run in Docker and is published on [Dockstore](https://docs.dockstore.org/en/develop/getting-started/getting-started.html).
You can find OICR's Dockstore page [here](https://dockstore.org/organizations/OICR).
The Docker container is based on [Modulator](https://gitlab.oicr.on.ca/ResearchIT/modulator), which builds environment modules to set up the Docker runtime environment.

### Set Up and Run
Currently, this WDL must be run with Cromwell.
It relies on Cromwell configuration files to mount a local directory into the Docker container.
That directory holds the data modules built with Modulator, which the WDL tasks need to access.
Before running, you must build the data modules into a local directory and obtain the input files locally.

#### 1. Build Data Modules
- Create a local `data_modules/` directory to store the data modules
  - Make sure you have enough disk space
  - Each data module can be 5-30 GB in size
  - In future iterations of the workflow, this process will be simplified
- Enter the container and build the modules:
```
# Mount this repository as /pipeline/; mount the data module destination directory as /data_modules/
docker run -it --rm -v [PWD]:/pipeline -v [data_modules]:/data_modules [CONTAINER ID (find in options.json)]

# Copy prerequisite code module YAMLs into the Modulator code directory (code/gsi/)
cp /pipeline/recipes/variant_effect_predictor_data_modules_prep.yaml code/gsi/data_modules_recipe_prep.yaml

# Build the prerequisite code modules
./build-local-code code/gsi/data_modules_recipe_prep.yaml --output /data_modules --initsh /usr/share/modules/init/sh

# Copy data module YAMLs into the Modulator data directory (data/gsi/)
cp /pipeline/recipes/variant_effect_predictor_data_modules.yaml data/gsi/data_modules_recipe.yaml

# Build the data modules
./build-local-data data/gsi/data_modules_recipe.yaml --output /data_modules --initsh /usr/share/modules/init/sh

# Change resulting file permissions
find /data_modules/ -type d -exec chmod 777 {} \; && \
find /data_modules/ -type f -exec chmod 777 {} \;

# /data_modules/ should now contain gsi/modulator/modulefiles/Ubuntu18.04/ and gsi/modulator/modulefiles/data/
```
Run directories that are not part of any module must be copied from UGE's archive at `/.mounts/labs/gsi/src/`.
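
Because each data module can take 5-30 GB, it may be worth checking free space before starting the build. The following is a minimal sketch, not part of the workflow: the function names and the worst-case figure of 30 GiB per module are assumptions chosen for illustration, and the number of modules depends on your recipe.

```python
import shutil

GIB = 1024 ** 3


def required_bytes(module_count, gib_per_module=30):
    """Worst-case space estimate, assuming every module hits the upper bound."""
    return module_count * gib_per_module * GIB


def has_enough_space(path, module_count, gib_per_module=30):
    """Return True if the filesystem holding `path` has room for the estimate."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes(module_count, gib_per_module)


if __name__ == "__main__":
    # "." stands in for the local data_modules/ directory (hypothetical choice).
    print(has_enough_space(".", module_count=5))
```

Run it from the directory you intend to mount as `/data_modules` before entering the container.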

#### 2. Obtain Files Locally
In the test JSON, change file paths as follows:
- `File`-type inputs should be copied to the local machine
  - E.g. use `scp` to copy them from UGE
  - In the JSON, change the file path from the UGE path to the local path
- `String`-type inputs that hold file paths should be copied or moved to the mounted directory, unless the file is already part of a module
  - In the JSON, change the file path to the path at which the file is accessed from inside the Docker container
- `$MODULE_ROOT` paths can stay the same
```
# File type files
# File is copied to local machine
UGE: "/.mounts/labs/gsi/testdata/wgsPipeline/input_data/wgsPipeline_test_pcsi/hg19_random.genome.sizes.bed"
Dockstore: "/home/ubuntu/data/sample_data/callability/hg19_random.genome.sizes.bed"

# String type files
# /data_modules/ is a directory mounted to the docker container
UGE: "/.mounts/labs/gsi/modulator/sw/data/hg19-p13/hg19_random.fa"
Dockstore: "/data_modules/gsi/modulator/sw/data/hg19-p13/hg19_random.fa"

# Root type paths
# The value of $MODULE_ROOT changes, but the path stays the same
UGE: "$HG19_BWA_INDEX_ROOT/hg19_random.fa"
Dockstore: "$HG19_BWA_INDEX_ROOT/hg19_random.fa"
```
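
The path changes above can be scripted rather than edited by hand. The sketch below is one possible approach, not something shipped with this workflow: `PREFIX_MAP`, `rewrite_path`, and `rewrite_inputs` are hypothetical names, and the map only covers the two example paths shown above. Values that start with `$` are left untouched, matching the rule for `$MODULE_ROOT` paths.

```python
import json

# Hypothetical mapping from UGE path prefixes to their replacements.
# The entries mirror the examples above; extend it for your own inputs.
PREFIX_MAP = {
    "/.mounts/labs/gsi/modulator/sw/data/": "/data_modules/gsi/modulator/sw/data/",
    "/.mounts/labs/gsi/testdata/wgsPipeline/input_data/wgsPipeline_test_pcsi/":
        "/home/ubuntu/data/sample_data/callability/",
}


def rewrite_path(value, prefix_map):
    """Swap a known UGE prefix for its replacement; leave everything else alone."""
    if not isinstance(value, str) or value.startswith("$"):
        return value
    for old, new in prefix_map.items():
        if value.startswith(old):
            return new + value[len(old):]
    return value


def rewrite_inputs(inputs, prefix_map):
    """Recursively apply rewrite_path to every value in a parsed inputs JSON."""
    if isinstance(inputs, dict):
        return {key: rewrite_inputs(val, prefix_map) for key, val in inputs.items()}
    if isinstance(inputs, list):
        return [rewrite_inputs(val, prefix_map) for val in inputs]
    return rewrite_path(inputs, prefix_map)


if __name__ == "__main__":
    example = {"ref": "/.mounts/labs/gsi/modulator/sw/data/hg19-p13/hg19_random.fa"}
    print(json.dumps(rewrite_inputs(example, PREFIX_MAP), indent=2))
```

A prefix map keeps the rewrite explicit: any path that matches no entry is passed through unchanged, so unexpected inputs are easy to spot in the output.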

#### 3. Run with Cromwell
Submit the preprocessed subworkflow and the modified JSON to Cromwell, with the config and options files attached:
```
# Validate the wrapper workflow and json
java -jar $womtool validate [WDL] --inputs [TEST JSON]

# For example:
java -jar $womtool validate wgsPipeline.wdl --inputs tests/wgsPipeline_test_cre_uge.json

# Submit to Cromwell
java -Dconfig.file=[CONFIG] -jar $cromwell run [WRAPPER WDL] --inputs [JSON] --options [OPTIONS]

# For example:
java -Dconfig.file=local.config -jar $cromwell run wgsPipeline.wdl --inputs tests/wgsPipeline_test_cre.json --options options.json
```
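
If you run these two steps often, the command lines can be assembled programmatically. This is only a sketch under assumptions: `cromwell_commands` is a hypothetical helper, and the JAR file names passed to it are placeholders for wherever `$womtool` and `$cromwell` point on your machine. The flags themselves are the ones used in the commands above.

```python
import shlex


def cromwell_commands(womtool_jar, cromwell_jar, wdl, inputs_json, config, options):
    """Build the validate and run argument lists shown in the README commands."""
    validate = ["java", "-jar", womtool_jar, "validate", wdl, "--inputs", inputs_json]
    run = [
        "java", f"-Dconfig.file={config}", "-jar", cromwell_jar,
        "run", wdl, "--inputs", inputs_json, "--options", options,
    ]
    return validate, run


if __name__ == "__main__":
    # Placeholder JAR paths; substitute your own.
    validate, run = cromwell_commands(
        "womtool.jar", "cromwell.jar", "wgsPipeline.wdl",
        "tests/wgsPipeline_test_cre.json", "local.config", "options.json",
    )
    print(shlex.join(validate))
    print(shlex.join(run))
```

Each list can be passed to `subprocess.run(..., check=True)` so that a failed validation stops the script before anything is submitted.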
49 changes: 49 additions & 0 deletions local.config
@@ -0,0 +1,49 @@
backend {
  default = "Local"
  providers {
    Local {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        concurrent-job-limit = 10
        #run-in-background = true
        runtime-attributes = """
        String? docker
        String? docker_volume
        String? modules
        """
        submit = "/usr/bin/env bash ${script}"
        submit-docker = """
        docker run \
          --rm -i \
          -v ${cwd}:${docker_cwd} \
          ${"-v " + docker_volume} \
          ${docker} /bin/bash -c 'source /home/ubuntu/.bashrc; ${"module load " + modules + " || exit 1; "} /bin/bash ${docker_script}'
        """
        root = "cromwell-executions"
        dockerRoot = "/cromwell-executions"
      }
    }
  }
}
call-caching {
  enabled = true
  invalidate-bad-cache-results = true
}
database {
  profile = "slick.jdbc.HsqldbProfile$"
  db {
    driver = "org.hsqldb.jdbcDriver"
    url = """
    jdbc:hsqldb:file:/tmp/cromwell-executions/cromwell-db/cromwell-db;
    shutdown=false;
    hsqldb.default_table_type=cached;hsqldb.tx=mvcc;
    hsqldb.result_max_memory_rows=10000;
    hsqldb.large_data=true;
    hsqldb.applog=1;
    hsqldb.lob_compressed=true;
    hsqldb.script_format=3
    """
    connectionTimeout = 120000
    numThreads = 2
  }
}
6 changes: 6 additions & 0 deletions options.json
@@ -0,0 +1,6 @@
{
  "default_runtime_attributes": {
    "docker": "g3chen/varianteffectpredictor:2.0",
    "docker_volume": "/home/ubuntu/Downloads/sample_data:/data"
  }
}
224 changes: 224 additions & 0 deletions recipes/variant_effect_predictor_data_modules.yaml

Large diffs are not rendered by default.

112 changes: 112 additions & 0 deletions recipes/variant_effect_predictor_data_modules_prep.yaml
@@ -0,0 +1,112 @@
# htslib/1.9
- name: htslib
  version: 1.9
  build_type: autotools
  build_args:
    prereq_args:
      prereq_args:
        sha256: e04b877057e8b3b8425d957f057b42f0e8509173621d3eccaedd0da607d9929a
        url: https://github.com/samtools/htslib/releases/download/1.9/htslib-1.9.tar.bz2
      prereq_type: download
    prereq_type: extract
  system_depends:
    - name: libbz2-dev
    - name: liblzma-dev

# samtools/1.9
- name: samtools
  version: 1.9
  build_type: autotools
  build_args:
    prereq_args:
      prereq_args:
        sha256: 083f688d7070082411c72c27372104ed472ed7a620591d06f928e653ebc23482
        url: https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
      prereq_type: download
    prereq_type: extract
    configure:
      - --with-htslib={htslib_root}
      - --enable-configure-htslib=false
  depends:
    - name: htslib
      version: 1.9
  system_depends:
    - name: libncurses5-dev

# fasplit/1.0
- name: fasplit
  version: 1.0
  hidden: true
  build_type: copy
  build_args:
    from: /pipeline/build_files/faSplit-20200114T16_09
    to: "bin/faSplit"

# java/8
- name: java
  version: 8
  build_type: extract
  build_args:
    prereq_args:
      sha256: 4ee3b37cb70fe1dbfad0ef449fe2c5fec43d81bd37ef0a65ec9f65afac190b4f
      url: https://github.com/AdoptOpenJDK/openjdk8-upstream-binaries/releases/download/jdk8u222-b10/OpenJDK8U-jdk_x64_linux_8u222b10.tar.gz
    prereq_type: download
  system_depends:
    - name: libfontconfig1-dev

# picard/2.19.2
- name: picard
  version: 2.19.2
  build_type: copy
  build_args:
    prereq_args:
      sha256: 2b27f3c19529bfa9b1120b9a149b7b2a5ddf0832b1a9011dc803a80779b8ca35
      url: https://github.com/broadinstitute/picard/releases/download/2.19.2/picard.jar
    prereq_type: download
    to: 'picard.jar'
  depends:
    - name: java
      version: 8

# bcftools/1.9
- name: bcftools
  version: 1.9
  permitted_os: ["Ubuntu18.04"]
  build_type: autotools
  build_args:
    prereq_args:
      prereq_args:
        sha256: 6f36d0e6f16ec4acf88649fb1565d443acf0ba40f25a9afd87f14d14d13070c8
        url: https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2
      prereq_type: download
    prereq_type: extract
    configure: ["--enable-libgsl"]
  depends:
    - name: htslib
      version: 1.9
  system_depends:
    - name: libgsl-dev
    - name: zlib1g-dev
    - name: libbz2-dev
    - name: liblzma-dev

# tabix/1.9
- name: tabix
  version: 1.9
  build_type: aggregate
  depends:
    - name: htslib
      version: 1.9

# vep-hg19-filter-somaticsites/0
- name: vep-hg19-filter-somaticsites
  version: 0
  build_type: copy
  build_args:
    from: /pipeline/build_files/vep_hg19_filter_somaticsites.sh
    to: bin/vep_hg19_filter_somaticsites
  depends:
    - name: bcftools
      version: 1.9
    - name: tabix
      version: 1.9