-
Trimmomatic 0.40
download java jar from Trimmomatic releases, place in /Applications
-
fastqc 0.12.1
brew install fastqc
-
bowtie2 2.5.5
brew install bowtie2
-
GNU parallel 20260522
brew install parallel
-
samtools 1.23
brew install samtools
-
bowtie2 indexes for GRCr8
download indexes from https://benlangmead.github.io/aws-indexes/bowtie (bowtie2 maintainer site), put into sequences/bowtie2_indexes/GRCr8/
curl -L -o sequences/bowtie2_indexes/GRCr8.zip \
https://genome-idx.s3.amazonaws.com/bt/GRCr8.zip
-
GRCr8_TSS_1kb.bed
see build_TSS_1kb.md in this repository. The 1 kb flanking TSS regions file is made by make_tss_bed.zsh from GRCr8 annotation files.
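As a rough illustration of what a TSS-flank BED file contains, here is a minimal sketch. It is not the repository's make_tss_bed.zsh (see build_TSS_1kb.md for the actual procedure). It assumes a GTF annotation with `gene` features, and it reads "1kb flanking" as 1000 bp on each side of the TSS, which is an assumption. The function name `tss_bed` is hypothetical.

```shell
# Read a GTF on stdin and print one BED line per gene feature,
# spanning `flank` bp on each side of the transcription start site.
# Usage: tss_bed [flank]   (flank defaults to 1000)
tss_bed() {
    awk -v flank="${1:-1000}" 'BEGIN { FS = OFS = "\t" }
    $0 !~ /^#/ && $3 == "gene" {
        # TSS is the feature start on the + strand, the feature end otherwise
        tss = ($7 == "+") ? $4 : $5
        # GTF is 1-based inclusive, BED is 0-based half-open
        start = tss - 1 - flank
        if (start < 0) start = 0
        # The end is not clipped to chromosome length in this sketch
        print $1, start, tss + flank, ".", ".", $7
    }'
}
```

It could be used as `tss_bed 1000 < annotation.gtf > tss_flank.bed`, where both file names are placeholders.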
ssh thoupt@hpc-login.rcc.fsu.edu
cd /gpfs/research/medicine/sequencer/NovaSeqXPlus/Outputs_XP/2025_Outputs_XP/
rsync -avP *.fastq.gz USERNAME@pauper.bio.fsu.edu:~/FOLDERNAMEOFCHOICE
or, copy to local directory
rsync -avP thoupt@hpc-login.rcc.fsu.edu:/gpfs/research/medicine/sequencer/NovaSeqXPlus/Outputs_XP/2025_Outputs_XP/Thomas_Houpt_11-19-2025_SN_Medull ./
view first 10 lines
gzcat SN_Medulla_10U_S1_L008_R1_001.fastq.gz | head -n 10
mkdir fastqc_raw
for i in *fastq*; do fastqc "$i" -t 15 -o fastqc_raw/; done &> fastqc_raw.log
To download and view the fastqc.html files, use rsync:
rsync -avc sn23h@pauper.bio.fsu.edu:~/medulla_analysis2/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla/fastqc_raw/*.html .
- install: brew install fastqc
- install: brew install parallel
The script runs fastqc on all fastq.gz files in the source directory, uses parallel for speed, and logs fastqc messages to fastqc_raw.log.
./do_fastqc.sh <source_directory> <fastqc_output_directory>
On a Mac Studio, 2 samples with R1 and R2 (so 4 fastq files) take about 1 hour.
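The pattern described above (one fastqc job per file, fanned out with GNU parallel) can be sketched as follows. This is not the contents of do_fastqc.sh. The helper name `fastqc_commands` is hypothetical, and the sketch assumes paths without whitespace. It only prints the commands, so they can be reviewed before anything runs.

```shell
# Print one fastqc command per *.fastq.gz file found in a source directory.
# Arguments: source directory, fastqc output directory (must already exist
# when the commands are actually run).
fastqc_commands() {
    src=$1
    out=$2
    for f in "$src"/*.fastq.gz; do
        [ -e "$f" ] || continue    # skip when the glob matches nothing
        printf 'fastqc -o %s %s\n' "$out" "$f"
    done
}
```

GNU parallel runs each line read from stdin as a command, so the printed list can be piped to it, for example `fastqc_commands ./raw ./fastqc_raw | parallel -j 4 > fastqc_raw.log 2>&1`. The job count of 4 is an arbitrary choice here.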
https://pmc.ncbi.nlm.nih.gov/articles/PMC4103590/
http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf
http://www.usadellab.org/cms/index.php?page=trimmomatic
run in same directory as fastq.gz files
cd ~/medulla_analysis2/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla
nohup bash -c 'for i in *_R1*; do java -jar ~/Trimmomatic-0.39/trimmomatic-0.39.jar PE -threads 20 -phred33 "$i" "${i/R1/R2}" "${i/R1/R1_paired}" "${i/R1/R1_unpaired}" "${i/R1/R2_paired}" "${i/R1/R2_unpaired}" ILLUMINACLIP:/home/sn23h/Trimmomatic-0.39/adapters/TruSeq3-PE.fa:2:30:10:1:TRUE MINLEN:25 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 < /dev/null; done' > trimming_B.log 2>&1 &
You can monitor progress with tail -f trimming_B.log.
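The loop above builds the R2 input and the four output names with `${i/R1/...}`, which replaces the first "R1" anywhere in the file name. A sample name that itself contains "R1" would be renamed in the wrong place. The sketch below shows a stricter variant that anchors on the `_R1_` read tag. The helper name `trim_names` is hypothetical, and the sketch assumes Illumina-style names in which `_R1_` occurs exactly once.

```shell
# Print the R2 input name and the four Trimmomatic PE output names
# that correspond to one R1 file name.
trim_names() {
    r1=$1
    prefix=${r1%_R1_*}     # text before the "_R1_" tag
    suffix=${r1#*_R1_}     # text after the "_R1_" tag
    if [ "$prefix" = "$r1" ]; then
        echo "no _R1_ tag in: $r1" >&2
        return 1
    fi
    printf '%s\n' \
        "${prefix}_R2_${suffix}" \
        "${prefix}_R1_paired_${suffix}" \
        "${prefix}_R1_unpaired_${suffix}" \
        "${prefix}_R2_paired_${suffix}" \
        "${prefix}_R2_unpaired_${suffix}"
}

trim_names SN_Medulla_10U_S1_L008_R1_001.fastq.gz
```

The output names follow the same pattern as the loop (for example `..._R1_paired_001.fastq.gz`), so downstream steps would not need to change.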
Download jar from Trimmomatic releases: version 0.40 has parallel unzipping.
adapter files are in Trimmomatic-0.40/adapters, and Trimmomatic looks there automatically.
Query: which are appropriate adapters? NEBNext_PE from the library kit?
./trim_pe.zsh ./sequences/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla
You can monitor progress with tail -f trimming.log.
Put the original sequencing files in /raw, and the paired/unpaired files in /trimmed.
./do_fastqc.sh ./sequences/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla/trimmed ./fastqc_trimmed
We align to the current (2024) rat reference genome assembly, GRCr8:
- paper
- NCBI site
- assembly report (txt download) -- gives accession numbers for each chromosome and the mitochondrion
install: brew install bowtie2
- need bowtie2 indexes for the rat genome (GRCr8)
download indexes from https://benlangmead.github.io/aws-indexes/bowtie (bowtie2 maintainer site), put into sequences/bowtie2_indexes/GRCr8/
to copy bowtie2 indexes to pauper, use curl:
curl -L -o sequences/bowtie2_indexes/GRCr8.zip https://genome-idx.s3.amazonaws.com/bt/GRCr8.zip
and place in ./bowtie2_indexes/GRCr8
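After unzipping, it can help to confirm that the index is complete before starting a long alignment run. A bowtie2 index is six files sharing one prefix: `.1`, `.2`, `.3`, `.4`, `.rev.1`, and `.rev.2`, ending in `.bt2` (or `.bt2l` for large indexes). The sketch below checks for them. The helper name `bt2_index_ok` is hypothetical, and the file prefix inside the downloaded archive is assumed rather than verified.

```shell
# Return 0 if all six bowtie2 index files exist for the given prefix.
# Tries the small-index suffix (.bt2) first, then the large-index one (.bt2l).
bt2_index_ok() {
    prefix=$1
    for ext in bt2 bt2l; do
        missing=0
        for part in 1 2 3 4 rev.1 rev.2; do
            [ -f "$prefix.$part.$ext" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
    done
    return 1
}
```

For example, `bt2_index_ok ./bowtie2_indexes/GRCr8/GRCr8 || echo "index incomplete"`, assuming the files are named `GRCr8.*.bt2`.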
run bowtie2 and pipe through samtools to get BAM files:
nohup ./run_bowtie2.zsh <source_directory> <destination_directory> &
e.g.
nohup ./run_bowtie2.zsh ./sequences/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla/trimmed ./sequences/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla/aligned &
Outputs BAM files to the destination directory. Logs to bowtie.log (bowtie2 also writes a per-sample bowtie2.log).
- specify --no-discordant and --no-mixed (report only concordant pair alignments; no discordant pairs and no alignments of single mates)
- -x: Specifies the index (use the prefix). -- currently hardcoded to "$SCRIPT_DIR/bowtie2_indexes/GRCr8"
- -1, -2: Your forward and reverse read files (can be gzipped).
- -p 8: Uses 8 threads for faster alignment (adjust as needed).
The bowtie2 alignment results are piped to samtools to directly produce sorted BAM files. For each generated BAM file, samtools index is called to generate bam.bai index files, and samtools flagstat is called to provide summary statistics.
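The per-sample steps described above (align, sort, index, flagstat) can be sketched as a command list. This is not the contents of run_bowtie2.zsh. The helper name `align_commands` and the `.flagstat.txt` output name are hypothetical, and paths are assumed to contain no whitespace. The function prints the commands rather than running them, so they can be checked first.

```shell
# Print the alignment commands for one sample.
# Arguments: index prefix, trimmed R1, trimmed R2, output BAM path.
align_commands() {
    index=$1
    r1=$2
    r2=$3
    bam=$4
    # bowtie2 writes SAM to stdout; samtools sort reads it from "-"
    printf 'bowtie2 --no-discordant --no-mixed -p 8 -x %s -1 %s -2 %s | samtools sort -o %s -\n' \
        "$index" "$r1" "$r2" "$bam"
    # index the sorted BAM (creates the .bam.bai file)
    printf 'samtools index %s\n' "$bam"
    # summary statistics, saved next to the BAM
    printf 'samtools flagstat %s > %s\n' "$bam" "${bam%.bam}.flagstat.txt"
}
```

Piping the output to `sh` would run the three commands in order.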
To view BAM file contents:
samtools view input.bam | head -10 # first 10 alignment records
samtools view -h input.bam | head -10 # include header lines (@HD, @SQ, etc.)
samtools head input.bam # header only
to copy to pauper:
rsync -avP -c houpt@bio-k2067c-mac.bio.fsu.edu:/Users/houpt/Programming_Github/MNase-Seq_Analysis/sequences/Thomas_Houpt_05-29-2026_Houpt_SN_Medulla/Houpt_SN_Medulla/aligned/ ./aligned 2>&1 | grep -i -E 'error|denied|failed|permission'
(samtools flagstat reports no duplicates, so do we need to do this step?)
samtools fixmate -m Sorted_names.bam Fixmate.bam
samtools sort -o Sorted.bam Fixmate.bam
samtools markdup -r -s Sorted.bam Final_File.bam
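On the question above: samtools flagstat counts reads that carry the duplicate flag (0x400), and bowtie2 does not set that flag. A zero count before markdup is therefore expected and does not by itself show that the library has no duplicates. Because `markdup -r` removes duplicates, flagstat on Final_File.bam will also report zero. The numbers appear either in the statistics printed by `-s` or in flagstat on a BAM produced by markdup without `-r`. The sketch below pulls the duplicate count out of flagstat text. The helper name `flagstat_duplicates` is hypothetical.

```shell
# Read `samtools flagstat` text on stdin and print the QC-passed
# duplicate count. Lines look like: "N + M duplicates".
# Matching on the fourth field skips the "primary duplicates" line.
flagstat_duplicates() {
    awk '$2 == "+" && $4 == "duplicates" { print $1; exit }'
}
```

For example, `samtools flagstat Marked.bam | flagstat_duplicates`, where Marked.bam is a placeholder for a file written by `samtools markdup` without `-r`.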