.. _home-page-mopmod:

*******************
MOP_MOD
*******************

.. autosummary::
   :toctree: generated

This module takes as input the output of MOP_PREPROCESS: basecalled fast5 reads, together with their respective fastq files and unspliced alignments to the transcriptome. It runs four different RNA modification detection algorithms (Epinano, Nanopolish, Tombo and Nanocompore) and outputs the predictions generated by each of them as individual tab-delimited files.
   

Input Parameters
======================

.. list-table:: 
   :widths: 25 75
   :header-rows: 1

   * - Parameter name
     - Description
   * - **input_path**
     - Output folder generated by mop_preprocess 
   * - **comparison**
     - TSV file with two fields per line. Each field holds the ID of a sample, and the two samples on a line are compared against each other (1 vs 1)
   * - **reference**
     - Reference sequences
   * - **output**
     - Output folder
   * - **pars_tools**
     - TSV file with optional extra command line parameters for the tool indicated in the first field.
   * - **epinano**
     - Activates or deactivates the corresponding flow. It can be YES or NO
   * - **nanocompore**
     - Activates or deactivates the corresponding flow. It can be YES or NO
   * - **tombo_lsc**
     - Activates or deactivates the corresponding flow. It can be YES or NO
   * - **tombo_msc**
     - Activates or deactivates the corresponding flow. It can be YES or NO
   * - **epinano_plots**
     - If YES, a plot is produced for each sample and each transcript.
   * - **email**
     - Email for pipeline reporting.
 

How to run the pipeline
=============================

Before launching the pipeline, the user should:

1. Decide which containers to use - either docker or singularity **[-with-docker / -with-singularity]**.
2. Fill in both **params.config** and **tools_opt.tsv** files.
3. Fill in **comparison.tsv** file - please see example below:

.. code-block:: console

   wt_1 ko_1
   wt_2 ko_2
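
The two-column layout can be checked before launching the pipeline. The following is a minimal sketch and not part of the pipeline itself: the function name ``read_comparisons`` is hypothetical, and it assumes one comparison per line with the two sample IDs separated by whitespace (tabs or spaces).

.. code-block:: python

   import io


   def read_comparisons(handle):
       """Return a list of (sample_a, sample_b) pairs from a two-column file."""
       pairs = []
       for line_number, line in enumerate(handle, start=1):
           fields = line.split()  # splits on tabs or spaces
           if not fields:
               continue  # skip empty lines
           if len(fields) != 2:
               raise ValueError(
                   f"line {line_number}: expected 2 fields, got {len(fields)}"
               )
           pairs.append((fields[0], fields[1]))
       return pairs


   # The example content shown above, held in memory instead of a file.
   example = io.StringIO("wt_1\tko_1\nwt_2\tko_2\n")
   pairs = read_comparisons(example)
   print(pairs)

With a real file, the same function can be called on ``open("comparison.tsv")``.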


To launch the pipeline, please use the following command:

.. code-block:: console

   nextflow run mop_mod.nf -with-singularity > log.txt


You can run the pipeline in the background by adding the Nextflow parameter **-bg**:

.. code-block:: console

   nextflow run mop_mod.nf -with-singularity -bg > log.txt

You can change the parameters either by editing the **params.config** file or by passing them on the command line:

.. code-block:: console

   nextflow run mop_mod.nf -with-singularity -bg --output test2 > log.txt


You can specify a different working directory for temporary files:

.. code-block:: console

   nextflow run mop_mod.nf -with-singularity -bg -w /path/working_directory > log.txt


.. note::
 
   * In case of errors, check the log file (log.txt) for more details. If more information is needed, the log file also reports the working directory of the failed process. Go to the directory indicated in the error output and check both the **.command.log** and **.command.err** files.


.. tip::

   Once the error has been solved, or if you change a specific parameter, you can resume the execution with the **Nextflow** parameter **-resume** (only one dash!). If there was an error, the pipeline will resume from the process that failed and proceed with the rest. If a parameter was changed, only the processes affected by this parameter will be re-run.


.. code-block:: console

   nextflow run mop_mod.nf -with-singularity -bg -resume > log_resumed.txt

To check whether the pipeline has been resumed properly, please check the log file. If the processes that previously ran correctly are reported as *Cached*, the resume worked!
   

Results
====================

Several folders are created by the pipeline within the output directory specified by the **output** parameter:

1. **Epinano** results are stored in the **epinano_flow** directory. It contains two files per sample: one with data at position level and the other at 5-mer level. The frequencies of the different features, as well as quality data, are included in the results. See the example below:

.. code-block:: console

   #Ref,pos,base,cov,q_mean,q_median,q_std,mis,ins,del
   gene_A,2515,C,45497.0,5.36995,4.00000,3.97797,0.0822032221904741,0.18715519704595907,0.2058377475437941
   gene_A,2516,A,45504.0,5.38207,4.00000,4.71619,0.17128164556962025,0.20497099156118143,0.07733386075949367
   gene_A,2517,C,45529.0,6.92130,5.00000,5.04250,0.06165301236574491,0.1505633771881658,0.13540820136616222
   gene_A,2518,A,45545.0,6.49821,5.00000,5.47485,0.10802503018992206,0.10855198155670216,0.2082775277198375
   gene_A,2519,T,45557.0,6.51247,5.00000,4.81853,0.09386043857145993,0.14792457800118533,0.2033057488421099
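
A per-position table like the one above can be loaded with Python's standard ``csv`` module. The sketch below is only an illustration of how the file might be parsed: the function name ``positions_above`` and the threshold of 0.1 are arbitrary assumptions, not values used or recommended by Epinano.

.. code-block:: python

   import csv
   import io

   # The example rows shown above, held in memory instead of a file.
   EXAMPLE = """#Ref,pos,base,cov,q_mean,q_median,q_std,mis,ins,del
   gene_A,2515,C,45497.0,5.36995,4.00000,3.97797,0.0822032221904741,0.18715519704595907,0.2058377475437941
   gene_A,2516,A,45504.0,5.38207,4.00000,4.71619,0.17128164556962025,0.20497099156118143,0.07733386075949367
   gene_A,2517,C,45529.0,6.92130,5.00000,5.04250,0.06165301236574491,0.1505633771881658,0.13540820136616222
   gene_A,2518,A,45545.0,6.49821,5.00000,5.47485,0.10802503018992206,0.10855198155670216,0.2082775277198375
   gene_A,2519,T,45557.0,6.51247,5.00000,4.81853,0.09386043857145993,0.14792457800118533,0.2033057488421099
   """


   def positions_above(handle, column, threshold):
       """Return (reference, position, value) for rows where column > threshold."""
       hits = []
       # The header line starts with '#', so the first column is named '#Ref'.
       for row in csv.DictReader(handle):
           value = float(row[column])
           if value > threshold:
               hits.append((row["#Ref"], int(row["pos"]), value))
       return hits


   # Arbitrary illustrative threshold on the mismatch frequency column.
   hits = positions_above(io.StringIO(EXAMPLE), "mis", 0.1)
   for reference, position, value in hits:
       print(reference, position, value)

The same function works on the ``ins`` and ``del`` columns by changing the ``column`` argument.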
   
Here is an example of a plot from Epinano:

.. image:: ../img/epinano.png
  :width: 600  
 

2. **Tombo** results are stored in the **tombo_flow** directory. It contains one file per comparison. It reports the p-value per position, the sum of p-values per 5-mer and the coverage in both samples of the comparison (WT and KO). See the example below:

.. code-block:: console

   "Ref_Position"	"Chr"	"Position"	"Tombo_SiteScore"	"Coverage_Sample"	"Coverage_IVT"	"Tombo_KmerScore"
   "gene_A_3"	"gene_A"	"3"	"0.0000"	"92"	"87"	NA
   "gene_A_4"	"gene_A"	"4"	"0.0000"	"92"	"87"	NA
   "gene_A_5"	"gene_A"	"5"	"0.0000"	"92"	"87"	0
   "gene_A_6"	"gene_A"	"6"	"0.0000"	"93"	"88"	0.0014
   "gene_A_7"	"gene_A"	"7"	"0.0000"	"95"	"89"	0.0027
   "gene_A_8"	"gene_A"	"8"	"0.0014"	"95"	"89"	0.004
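
The Tombo table is tab-delimited, with ``NA`` marking positions that have no 5-mer score. A minimal sketch of how such a file might be filtered is given below. The function name ``filter_sites`` and the coverage cutoff are hypothetical choices for illustration, and the example rows are written without the surrounding quotes for brevity (``csv.DictReader`` strips such quotes when they are present).

.. code-block:: python

   import csv
   import io

   HEADER = ["Ref_Position", "Chr", "Position", "Tombo_SiteScore",
             "Coverage_Sample", "Coverage_IVT", "Tombo_KmerScore"]
   ROWS = [
       ["gene_A_3", "gene_A", "3", "0.0000", "92", "87", "NA"],
       ["gene_A_4", "gene_A", "4", "0.0000", "92", "87", "NA"],
       ["gene_A_5", "gene_A", "5", "0.0000", "92", "87", "0"],
       ["gene_A_6", "gene_A", "6", "0.0000", "93", "88", "0.0014"],
       ["gene_A_7", "gene_A", "7", "0.0000", "95", "89", "0.0027"],
       ["gene_A_8", "gene_A", "8", "0.0014", "95", "89", "0.004"],
   ]
   # Rebuild the tab-delimited text of the example shown above.
   EXAMPLE = "\n".join("\t".join(fields) for fields in [HEADER] + ROWS) + "\n"


   def filter_sites(handle, min_coverage):
       """Keep rows with a 5-mer score and enough coverage in both samples."""
       kept = []
       for row in csv.DictReader(handle, delimiter="\t"):
           if row["Tombo_KmerScore"] == "NA":
               continue  # no 5-mer score reported for this position
           coverage = min(int(row["Coverage_Sample"]), int(row["Coverage_IVT"]))
           if coverage >= min_coverage:
               kept.append({
                   "Chr": row["Chr"],
                   "Position": int(row["Position"]),
                   "Tombo_KmerScore": float(row["Tombo_KmerScore"]),
               })
       return kept


   # Arbitrary illustrative cutoff: at least 88 reads in both samples.
   kept = filter_sites(io.StringIO(EXAMPLE), min_coverage=88)
   for site in kept:
       print(site)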
   

3. **Nanopolish** results are stored in the **nanopolish-compore_flow** directory. It contains two files per sample: the raw eventalign output (gzipped) and a file with the median raw current per position and transcript (**sample_processed_perpos_median.tsv.gz**). See the example below:

.. code-block:: console

   contig	position	reference_kmer	read_name	median	coverage
   gene_A	0	AAATT	1	113.35	433
   gene_A	1	AATTG	1	97.24	506
   gene_A	2	ATTGA	1	70.35	2034
   gene_A	3	TTGAA	1	102.03	416
   gene_A	4	TGAAG	1	115.315	422
   gene_A	5	GAAGA	1	104.25	471
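
Because the per-position file is gzipped, it can be read directly with ``gzip.open`` in text mode, without decompressing it first. The sketch below is only an illustration: the function name ``read_perpos_median`` is hypothetical, and a small temporary file built from the first rows of the example above stands in for a real pipeline output.

.. code-block:: python

   import csv
   import gzip
   import os
   import tempfile

   EXAMPLE_ROWS = [
       ["contig", "position", "reference_kmer", "read_name", "median", "coverage"],
       ["gene_A", "0", "AAATT", "1", "113.35", "433"],
       ["gene_A", "1", "AATTG", "1", "97.24", "506"],
       ["gene_A", "2", "ATTGA", "1", "70.35", "2034"],
   ]


   def read_perpos_median(path):
       """Read a gzipped, tab-delimited per-position table into typed records."""
       records = []
       # Mode "rt" makes gzip.open return decoded text lines.
       with gzip.open(path, "rt", newline="") as handle:
           for row in csv.DictReader(handle, delimiter="\t"):
               records.append({
                   "contig": row["contig"],
                   "position": int(row["position"]),
                   "reference_kmer": row["reference_kmer"],
                   "median": float(row["median"]),
                   "coverage": int(row["coverage"]),
               })
       return records


   # Write the example rows to a temporary gzipped file, then read them back.
   with tempfile.TemporaryDirectory() as tmp_dir:
       example_path = os.path.join(tmp_dir, "sample_processed_perpos_median.tsv.gz")
       with gzip.open(example_path, "wt", newline="") as handle:
           writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
           writer.writerows(EXAMPLE_ROWS)
       records = read_perpos_median(example_path)

   print(records[2])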

4. **Nanocompore** results are stored in the **nanopolish-compore_flow** directory. It contains one file per comparison (**wt_1_vs_ko_1_nanocompore_results.tsv**). This is the default output from Nanocompore (see Nanocompore's repository for a more detailed explanation).