Visualization =============================== This page introduces the commands in **CustardPy** to plot the 3D visualization. These commands use the output data generated by ``custardpy_juicer`` or ``custardpy_process_hic``. Here we use the example data used in :doc:`StepbyStep`. .. contents:: :depth: 3 plotHiCMatrix ---------------------------------------------------------------- ``plotHiCMatrix`` visualizes a contact map as a simple square heatmap. The input data is a dense matrix output from ``makeMatrix_intra.sh``. The contact level is normalized by the total number of mapped reads for the chromosome. .. code-block:: bash plotHiCMatrix Example: .. code-block:: bash chr=chr20 start=8000000 end=16000000 resolution=25000 norm=SCALE matrix=CustardPyResults_Hi-C/Juicer_hg38/Control/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz plotHiCMatrix \ $matrix \ ContactMap.Control.$chr.$start-$end.pdf \ $start $end Control .. figure:: img/plotHiCMatrix.jpg :width: 500px :align: center :alt: Alternate plotHiCMatrix drawSquareMulti ------------------------------------------------------ ``drawSquareMulti`` visualizes the contact map of multiple Hi-C samples as triangle heatmaps. The input data is a dense matrix output from ``makeMatrix_intra.sh``. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 # linear scale drawSquareMulti \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o SquareMulti.$chr \ -c $chr --start $start --end $end --type $norm -r $resolution .. figure:: img/SquareMulti.jpg :width: 500px :align: center :alt: Alternate SquareMulti The dashed squares indicate TADs. Add ``--log`` option to visualize a log-scale heatmap. .. code-block:: bash # log scale drawSquareMulti \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o SquareMulti.$chr \ -c $chr --start $start --end $end \ --type $norm -r $resolution --log .. figure:: img/SquareMulti.log.jpg :width: 500px :align: center :alt: Alternate SquareMulti (log scale) drawSquareRatioMulti ------------------------------------------------------ ``drawSquareRatioMulti`` visualizes a relative contact frequency (log scale) of 2nd to the last samples against the first sample. The input data is a dense matrix output from ``makeMatrix_intra.sh``. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 drawSquareRatioMulti \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o SquareRatioMulti.$chr \ -c $chr --start $start --end $end --type $norm -r $resolution .. figure:: img/drawSquareRatioMulti.jpg :width: 500px :align: center :alt: Alternate drawSquareRatioMulti This figure shows the relative contact frequency of 2nd (siCTCF) and 3rd (siRad21) against 1st (Control). drawSquarePair ------------------------------------------------------ ``drawSquarePair`` shows two samples in a single square heatmap. The first and second samples are visualzed in the upper and bottom triagles, respectively. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 drawSquarePair \ $Resdir/Control/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:Control \ $Resdir/siRad21/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:siRad21 \ -o SquarePair.$chr --start $start --end $end -r $resolution .. figure:: img/drawSquarePair.jpg :width: 400px :align: center :alt: Alternate drawSquarePair The command visualizes Control and siRad21 in the upper and bottom triagles, respectively. drawSquareRatioPair ------------------------------------------------------ Similar to ``drawSquarePair``, ``drawSquareRatioPair`` shows the relative contact map of two sample pairs in a single square heatmap. This command visualize the log-scale frequency of ``sample2/sample1`` and ``sample4/sample3`` in the upper and bottom triagles, respectively. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 drawSquareRatioPair \ $Resdir/Control/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:Control \ $Resdir/siRad21/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:siRad21 \ $Resdir/Control/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:Control \ $Resdir/siCTCF/Matrix/intrachromosomal/$resolution/observed.$norm.$chr.matrix.gz:siCTCF \ -o drawSquareRatioPair.$chr --start $start --end $end -r $resolution .. figure:: img/drawSquareRatioPair.jpg :width: 400px :align: center :alt: Alternate drawSquareRatioPair In this case, siRad21/Control and siCTCF/Control are visualized in the upper and bottom triagles, respectively. drawTriangleMulti ------------------------------------------------------ ``drawTriangleMulti`` visualizes the contact map of multiple Hi-C samples as triangle heatmaps. The input data is a dense matrix output from ``makeMatrix_intra.sh``. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 drawTriangleMulti \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o TriangleMulti.$chr \ -c $chr --start $start --end $end --type $norm -d 5000000 -r $resolution .. figure:: img/drawTriangleMulti.jpg :width: 600px :align: center :alt: Alternate drawTriangleMulti The black dashed lines and blue circles indicate TADs and loops, respectively. drawTrianglePair ------------------------------------------------------ ``drawTrianglePair`` visualizes a contact map of the first and second sample in upper and lower triangles, respectively. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 drawTrianglePair \ $Resdir/Control:Control \ $Resdir/siRad21:siRad21 \ -o TrianglePair.$chr \ -c $chr --start $start --end $end --type $norm -d 5000000 -r $resolution .. figure:: img/drawTrianglePair.jpg :width: 500px :align: center :alt: Alternate drawTrianglePair The black dashed lines and blue circles indicate TADs and loops, respectively. plotHiCfeature ------------------------------------------------------ ``plotHiCfeature`` can visualize various features values of chromatin folding for multiple samples. .. code-block:: bash plotHiCfeature [-h] [-o OUTPUT] [-c CHR] [--type TYPE] [--distance DISTANCE] [-r RESOLUTION] [-s START] [-e END] [--multi] [--multidiff] [--compartment] [--di] [--drf] [--drf_right] [--drf_left] [--triangle_ratio_multi] [-d VIZDISTANCEMAX] [--v4c] [--vmax VMAX] [--vmin VMIN] [--vmax_ratio VMAX_RATIO] [--vmin_ratio VMIN_RATIO] [--anchor ANCHOR] [--xsize XSIZE] [input [input ...]] positional arguments: input <Input directory>:<label> optional arguments: -h, --help show this help message and exit -o OUTPUT, --output OUTPUT Output prefix -c CHR, --chr CHR chromosome --type TYPE normalize type (default: SCALE) --distance DISTANCE distance for DI (default: 500000) -r RESOLUTION, --resolution RESOLUTION resolution (default: 25000) -s START, --start START start bp (default: 0) -e END, --end END end bp (default: 1000000) --multi plot MultiInsulation Score --multidiff plot differential MultiInsulation Score --compartment plot Compartment (eigen) --di plot Directionaly Index --drf plot Directional Relative Frequency --drf_right (with --drf) plot DirectionalRelativeFreq (Right) --drf_left (with --drf) plot DirectionalRelativeFreq (Left) --triangle_ratio_multi plot Triangle ratio multi -d VIZDISTANCEMAX, --vizdistancemax VIZDISTANCEMAX max distance in heatmap --v4c plot virtual 4C from Hi-C data --vmax VMAX max value of color bar (default: 50) --vmin VMIN min value of color bar (default: 0) --vmax_ratio VMAX_RATIO max value of color bar for logratio (default: 1) --vmin_ratio VMIN_RATIO min value of color bar for logratio (default: -1) --anchor ANCHOR (for --v4c) anchor point --xsize XSIZE xsize for figure (default: max(length/2M, 10)) - ``Input`` should be "<sample directory>:<label>". - ``<sample directory>`` is the output directory by ``custardpy_juicer``. - ``<label>`` is the label used in the figure. - In default, ``plotHiCfeature`` uses a 25-kbp bin matrix. Supply ``-r`` option to change the resolution. - ``type`` is the normalization type defined by Juicer (SCALE/KR/VC_SQRT/NONE). Insulation score +++++++++++++++++++++++++++ By default, ``plotHiCfeature`` outputs a single insulation score (500 kbp distance). .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --type $norm -d 5000000 \ -o IS.$chr.$start-$end .. figure:: img/plotHiCfeature_IS.jpg :width: 500px :align: center :alt: Alternate Text Insulation score ``plotHiCfeature`` always draws compartment PC1 (line plot in the second row) to roughly estimate compartments A and B. The third and fourth rows are the heatmap and line plot for the feature value specified (insulation score in this case). Multi-insulation score +++++++++++++++++++++++++++ ``plotHiCfeature`` can also calculate a "multi-scale insulation score" `[Crane et al., Nature, 2015] <https://www.nature.com/articles/nature14450>`_ by supplying ``--multi`` option. .. figure:: img/multi-insulationscore.jpg :width: 400px :align: center :alt: Alternate Text Schematic illustration of multi-insulation score Red regions in the heatmap indicate the insulated regions (TAD boundary-like). The lower and upper sides of the heatmap are 100 kbp to 1 Mbp distances, respectively. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --multi --type $norm -d 5000000 \ -o MultiIS.$chr.$start-$end .. figure:: img/plotHiCfeature_multiIS.jpg :width: 500px :align: center :alt: Alternate Text Multi-insulation score Differential multi-insulation score ++++++++++++++++++++++++++++++++++++++++++++++++ To directory investigate the difference of multi-insulation score, we provide **differential multi-insulation score** by ``--multidiff`` option. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --multidiff --type $norm -d 5000000 \ -o MultiISdiff.$chr.$start-$end .. figure:: img/plotHiCfeature_multiISdiff.jpg :width: 500px :align: center :alt: Alternate Text Differential multi-insulation score The heatmap shows the difference against the first sample (``siCTCF - Control`` and ``siRad21 - Control`` in this case). Compartment PC1 +++++++++++++++++++++++++++ While the line plot in the second row shows the PC1 value of the first sample, ``plotHiCfeature --compartment`` visualizes the PC1 values for multiple samples. This plot can be used to explore compartment switching. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --compartment --type $norm -d 5000000 \ -o Compartment.$chr.$start-$end .. figure:: img/plotHiCfeature_compartment.jpg :width: 500px :align: center :alt: Alternate Text Compartment PC1 Directionality index +++++++++++++++++++++++++++ The directionality index identifies TAD boundaries by capturing the bias in contact frequency up- and downstream of a TAD `[Dixon et al., Nature, 2012] <https://www.nature.com/articles/nature11082>`_. The "left side" and "right side" of a TAD are likely to have positve and negative values, respectively. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --di --type $norm -d 5000000 \ -o DI.$chr.$start-$end .. figure:: img/plotHiCfeature_DI.jpg :width: 500px :align: center :alt: Alternate Text Directionality index Directional relative frequency ++++++++++++++++++++++++++++++++++++++ **Directional relative frequency (DRF)** is a score that our group previously proposed `[Nakato et al, Nature Communications, 2023] <https://doi.org/10.1038/s41467-023-41316-4>`_. This score estimates the inconsistency of relative contact frequence (log scale) between the left side (3′) and right side (5′). .. figure:: img/Directional_relative_frequency.jpg :width: 400px :align: center :alt: Alternate Text Schematic illustration of directional relative frequency .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -c $chr --start $start --end $end -r $resolution \ --drf --type $norm -d 5000000 \ -o DRF.$chr.$start-$end .. figure:: img/plotHiCfeature_DRF.jpg :width: 500px :align: center :alt: Alternate Text Directional relative frequency drawTriangleRatioMulti ++++++++++++++++++++++++++++++++++++++ ``drawTriangleRatioMulti`` visualizes a relative contact frequency (log scale) of 2nd to the last samples against the first sample. Directional relative frequency is also shown. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o TriangleRatioMulti.$chr \ -c $chr --start $start --end $end -r $resolution \ --triangle_ratio_multi --type $norm -d 5000000 .. figure:: img/plotHiCfeature_TriangleRatioMulti.jpg :width: 500px :align: center :alt: Alternate Text TriangleRatioMulti "Right" and "left" shown as blue and orange line plots in the second row indicate the "B" and "A" in Fig. 4.15. Virtual 4C ++++++++++++++++++++++++++++++++++++++ ``Virtual 4C`` visualizes a 4C-like plot (interation from the anchor site) using Hi-C data. Use ``--anchor`` option to specify the anchor site. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotHiCfeature \ $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o virtual4C.$chr \ -c $chr --start $start --end $end -r $resolution \ --v4c --anchor 10400000 --vmax 100 --type $norm .. figure:: img/plotHiCfeature_v4c.jpg :width: 500px :align: center :alt: Alternate Text Virtual 4C plotCompartmentGenome ------------------------------------------------------ Plot a PC1 value of multiple samples for the whole genome. .. code-block:: bash Resdir=CustardPyResults_Hi-C/Juicer_hg38 norm=SCALE resolution=25000 chr=chr20 start=8000000 end=16000000 plotCompartmentGenome $Resdir/Control:Control \ $Resdir/siCTCF:siCTCF \ $Resdir/siRad21:siRad21 \ -o CompartmentGenome -r 25000 --type VC_SQRT .. figure:: img/plotCompartmentGenome.jpg :width: 700px :align: center :alt: Alternate plotCompartmentGenome plotInsulationScore ------------------------------------------------------ Plot a line graph of insulation score. The input data is a dense matrix output from ``makeMatrix_intra.sh``. .. code-block:: bash plotInsulationScore [-h] [--num4norm NUM4NORM] [--distance DISTANCE] [--sizex SIZEX] [--sizey SIZEY] matrix output resolution Example: plotInsulationScore WT/intrachromosomal/25000/observed.KR.chr7.matrix.gz InsulationScore_WT.chr7.png 25000 .. figure:: img/InsulationScore.png :width: 700px :align: center :alt: Alternate InsulationScore plotMultiScaleInsulationScore ------------------------------------------------------ Plot multi-scale insulation scores from Juicer matrix. .. code-block:: bash plotMultiScaleInsulationScore [-h] [--num4norm NUM4NORM] [--sizex SIZEX] [--sizey SIZEY] matrix output resolution Example: plotInsulationScore WT/intrachromosomal/25000/observed.KR.chr7.matrix.gz MultiInsulationScore_WT.chr7.png 25000 .. figure:: img/MultiInsulationScore.chr7.jpg :width: 700px :align: center :alt: Alternate Multi-Insulation Score (chr7) (Jupyter notebook) Plot Contact frequency distribution ---------------------------------------------------------------- .. toctree:: :glob: :maxdepth: 1 ContactDistancePlot.ipynb (Jupyter notebook) Plot Loop length distribution ---------------------------------------------------------------- .. toctree:: :glob: :maxdepth: 1 LoopLengthPlot.ipynb