2. Quickstart

Note

The CustardPy commands below are included in the CustardPy Docker image, so you need to prefix each command with the docker or singularity invocation shown below.

# This example command mounts the /work directory of the host machine
# For docker
docker run --rm -it [--gpus all] -v /work:/work rnakato/custardpy <command>
# For singularity
singularity exec [--nv] --bind /work custardpy.sif <command>

# Example of custardpy_juicer
# For docker
docker run --rm -it --gpus all -v /work:/work rnakato/custardpy \
    custardpy_juicer -p $ncore -a $gene -b $build -g $gt \
    -i $bwaindex -e $enzyme -z $fastq_post $fqdir $cell

# For singularity
singularity exec --nv --bind /work custardpy.sif \
    custardpy_juicer -p $ncore -a $gene -b $build -g $gt \
    -i $bwaindex -e $enzyme -z $fastq_post $fqdir $cell

See also the sample scripts in the tutorial on GitHub.
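Because every command needs the same container prefix, it can be convenient to build that prefix once and reuse it. The following is a minimal sketch, not part of CustardPy: the function name container_prefix is hypothetical, the image name, custardpy.sif, and the /work mount are taken from the examples above, and the optional GPU flags are left out for brevity.

```shell
# Hypothetical helper (not part of CustardPy): print the container prefix
# for the chosen runtime so it can be prepended to any CustardPy command.
container_prefix() {
    case "$1" in
        docker)
            echo "docker run --rm -it -v /work:/work rnakato/custardpy" ;;
        singularity)
            echo "singularity exec --bind /work custardpy.sif" ;;
        *)
            echo "unknown runtime: $1" >&2
            return 1 ;;
    esac
}

# Dry run: print the full command line instead of executing it.
echo "$(container_prefix singularity) <command>"
```

Printing the command first makes it easy to check the mount and image before running a long job; add --gpus all or --nv to the prefix if GPU support is needed.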

2.1. Hi-C analysis using Juicer

2.1.1. Hi-C analysis from FASTQ files

You can run the entire Juicer analysis from FASTQ files with the custardpy_juicer command.

build=hg38   # genome build
gt=genometable.$build.txt # genome_table file
gene=refFlat.$build.txt   # gene annotation (refFlat format)
bwaindex=bwa-indexes/$build  # BWA index file
ncore=64  # number of CPUs

cell=Control
fastq_post="_"  # "_" or "_R"
enzyme=MboI

fqdir=fastq/$cell
custardpy_juicer -p $ncore -a $gene -b $build -g $gt \
    -i $bwaindex -e $enzyme -z $fastq_post $fqdir $cell
  • custardpy_juicer assumes that the fastq files are stored in fastq/$cell (here fastq/Control). The outputs are stored in CustardPyResults_Hi-C/Juicer_$build/$cell.

  • $fastq_post specifies the suffix pattern of the input FASTQ filenames: "_" for *_[1|2].fastq.gz, or "_R" for *_[R1|R2].fastq.gz.

  • Available genome builds: hg19, hg38, mm10, mm39, rn7, galGal5, galGal6, ce10, ce11, danRer11, dm6, xenLae2, sacCer3

  • Available Enzymes: HindIII, DpnII, MboI, Sau3AI, Arima, AluI
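The constraints listed above can be checked before launching a long run. This is a minimal sketch under stated assumptions: the helper names is_supported and fastq_suffixes are hypothetical and not part of CustardPy, the build and enzyme lists are copied from this page, and the suffix mapping follows the *_[1|2].fastq.gz and *_[R1|R2].fastq.gz patterns described above.

```shell
# Hypothetical pre-flight helpers (not part of CustardPy).

# Values copied from the lists on this page.
ENZYMES="HindIII DpnII MboI Sau3AI Arima AluI"
BUILDS="hg19 hg38 mm10 mm39 rn7 galGal5 galGal6 ce10 ce11 danRer11 dm6 xenLae2 sacCer3"

# is_supported VALUE LIST: succeed if VALUE is one of the space-separated items.
# The comparison is case-sensitive.
is_supported() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# fastq_suffixes POST: print the two filename endings expected for a read pair.
fastq_suffixes() {
    echo "${1}1.fastq.gz ${1}2.fastq.gz"
}

enzyme=MboI
build=hg38
fastq_post="_"

is_supported "$enzyme" "$ENZYMES" || echo "unsupported enzyme: $enzyme" >&2
is_supported "$build" "$BUILDS" || echo "unsupported build: $build" >&2
echo "expected FASTQ endings: $(fastq_suffixes "$fastq_post")"
```

The lists are hard-coded here, so they would need updating if a later CustardPy release supports additional builds or enzymes.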

2.1.2. Hi-C analysis from a .hic file

To start the Hi-C analysis from a .hic file, use the custardpy_process_hic command.

build=hg38   # genome build
gt=genometable.$build.txt # genome_table file
gene=refFlat.$build.txt   # gene annotation (refFlat format)
ncore=64  # number of CPUs
cell=Control
hic=sample.hic
norm=SCALE  # normalization method passed to -n

custardpy_process_hic -p $ncore -n $norm -g $gt -a $gene $hic $cell
  • The outputs are stored in $cell.

Note

Because Juicertools is not backward compatible, custardpy_process_hic fails with an error when processing .hic files created by older versions of Juicertools. In that case, add the -o option, which makes CustardPy use an older version of Juicertools.
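In a script, one way to handle this is to retry with -o when the first attempt fails. The sketch below is only an illustration: the wrapper name process_hic_with_fallback is hypothetical, it assumes that -o takes no argument and may be placed before the other options, and it retries on any non-zero exit status, so it cannot tell a version mismatch apart from other errors.

```shell
# Hypothetical wrapper (not part of CustardPy): run custardpy_process_hic and,
# if it fails, retry once with -o to use the older Juicertools.
# Assumption: -o takes no argument and can precede the other options.
process_hic_with_fallback() {
    if custardpy_process_hic "$@"; then
        return 0
    fi
    echo "custardpy_process_hic failed; retrying with -o" >&2
    custardpy_process_hic -o "$@"
}

# Usage (same arguments as custardpy_process_hic):
# process_hic_with_fallback -p $ncore -n $norm -g $gt -a $gene $hic $cell
```

Check the error message of the first attempt before relying on the retry, since an unrelated failure (for example a missing input file) will simply fail twice.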

2.2. Hi-C analysis using Cooler

CustardPy supports Hi-C analysis with Cooler and cooltools. custardpy_cooler_HiC generates a .cool file and converts it to a .hic file, to which you can then apply the custardpy_process_hic command. The outputs are stored in CustardPyResults_MicroC/Cooler_$build//$cell.

build=hg38
gt=genometable.hg38.txt
index_bwa=bwa-indexes/hg38
gene=refFlat.$build.txt
genome=genome.$build.fa
ncore=64

cell=Control
enzyme=MboI

# Generate .cool and .hic files from FASTQ
custardpy_cooler_HiC -g $gt -b $build -f $genome -i $index_bwa -p $ncore fastq/$cell $cell

# Downstream analysis using .hic
odir=CustardPyResults_cooler/$build/$cell
hic=$odir/hic/contact_map.q30.hic
norm=SCALE
custardpy_process_hic -p $ncore -n $norm -g $gt -a $gene $hic $odir
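Because the downstream step depends on the .hic file produced by the first command, a script may want to confirm that the file exists before continuing. A minimal sketch: require_file is a hypothetical helper, and the path simply reuses the odir and hic values from the example above.

```shell
# Hypothetical guard (not part of CustardPy): fail early if an expected
# output file is missing or empty.
require_file() {
    if [ ! -s "$1" ]; then
        echo "missing or empty file: $1" >&2
        return 1
    fi
}

build=hg38
cell=Control
odir=CustardPyResults_cooler/$build/$cell
hic=$odir/hic/contact_map.q30.hic

if require_file "$hic"; then
    echo "found $hic; ready for custardpy_process_hic"
else
    echo "run custardpy_cooler_HiC first" >&2
fi
```

The -s test also rejects zero-byte files, which can be left behind when a previous run was interrupted.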

2.3. Micro-C analysis using Cooler

Micro-C data can also be analyzed with Cooler and cooltools.

2.3.1. Micro-C using BWA

The custardpy_cooler_MicroC command maps Micro-C reads with BWA and generates .cool and .hic files. The .hic file is then processed with custardpy_process_hic.

build=mm39
ncore=64
gt=genome_table.$build.txt  # genome_table file
bwa_index=bwa-indexes/UCSC-$build
genome=genome.$build.fa
gene=refFlat.$build.txt   # gene annotation (refFlat format), used by custardpy_process_hic
cell=C36_rep1   # modify this for your FASTQ data

# Generate .hic file from FASTQ
custardpy_cooler_MicroC -t bwa -g $gt -f $genome -i $bwa_index -p $ncore fastq/$cell $cell

# Juicer analysis with the .hic file
odir=CustardPyResults_MicroC/Cooler_bwa/$cell
hic=$odir/hic/contact_map.q30.hic
norm=SCALE

custardpy_process_hic -p $ncore -n $norm -g $gt -a $gene $hic $odir
  • custardpy_cooler_MicroC assumes that the fastq files are stored in fastq/$cell (here fastq/C36_rep1). The outputs are stored in CustardPyResults_MicroC/Cooler_bwa/$cell.
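To process several samples, the two commands can be wrapped in a loop. The sketch below only prints the commands (a dry run) so they can be inspected before anything is executed. The function name microc_commands and the second sample name C36_rep2 are hypothetical, and gene is assumed to follow the refFlat naming used in the Hi-C examples.

```shell
# Dry-run sketch: print the Micro-C commands for each sample.
build=mm39
ncore=64
gt=genome_table.$build.txt
gene=refFlat.$build.txt            # assumption: same naming as the Hi-C examples
bwa_index=bwa-indexes/UCSC-$build
genome=genome.$build.fa
norm=SCALE

# microc_commands CELL: print the mapping and downstream commands for one sample.
microc_commands() {
    cell=$1
    odir=CustardPyResults_MicroC/Cooler_bwa/$cell
    hic=$odir/hic/contact_map.q30.hic
    echo "custardpy_cooler_MicroC -t bwa -g $gt -f $genome -i $bwa_index -p $ncore fastq/$cell $cell"
    echo "custardpy_process_hic -p $ncore -n $norm -g $gt -a $gene $hic $odir"
}

# C36_rep2 is a hypothetical second sample name.
for cell in C36_rep1 C36_rep2; do
    microc_commands "$cell"
done
```

Once the printed commands look right, the output can be piped to a shell or the echo calls replaced with the commands themselves.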