.. _home-page-moptail:

*******************
MOP_TAIL
*******************

.. autosummary::
   :toctree: generated
   
This module takes as input the output from MOP_PREPROCESS: basecalled fast5 reads, together with their respective fastq files, alignment and assignment read ID to gene/transcript. It outputs the estimation of poly(A) tail length at read level provided by **Tailfindr**, **Nanopolish** or both. 

Input Parameters
======================

.. list-table:: 
   :widths: 25 75
   :header-rows: 1
   
   * - Parameter name
     - Description
   * - **input_path**
     - Output folder generated by mop_preprocess 
   * - **reference**
     - reference sequences
   * - **output**
     - Output folder
   * - **pars_tools**
     - TSV file with optional extra command line parameters for the tool indicated in the first field.
   * - **tailfindr**
     - It (in)activate the corresponding flow. It can be YES or NO
   * - **nanopolish**
     - It (in)activate the corresponding flow. It can be YES or NO
   * - **email**
     - Email for pipeline reporting.
   
  
How to run the pipeline
=============================
Before launching the pipeline,user should:

1. Decide which containers to use - either docker or singularity **[-with-docker / -with-singularity]**.
2. Fill in both **params.config** and **tools_opt.tsv** files.

To launch the pipeline, please use the following command:

.. code-block:: console

   nextflow run mop_tail.nf -with-singularity > log.txt


You can run the pipeline in the background adding the nextflow parameter **-bg**:

.. code-block:: console

   nextflow run mop_tail.nf -with-singularity -bg > log.txt

You can change the parameters either by changing **params.config** file or by feeding the parameters via command line:

.. code-block:: console

   nextflow run mop_tail.nf -with-singularity -bg --output test2 > log.txt


You can specify a different working directory with temporary files:

.. code-block:: console

   nextflow run mop_tail.nf -with-singularity -bg -w /path/working_directory > log.txt


.. note::
 
   * In case of errors you can troubleshoot seeing the log file (log.txt) for more details. Furthermore, if more information is needed, you can also find the working directory of the process in the file. Then, access that directory indicated by the error output and check both the **.command.log** and **.command.err** files. 


.. tip::

   Once the error has been solved or if you change a specific parameter, you can resume the execution with the **Netxtlow** parameter **- resume** (only one dash!). If there was an error, the pipeline will resume from the process that had the error and proceed with the rest. If a parameter was changed, only processes affected by this parameter will be re-run. 


.. code-block:: console

   nextflow run mop_tail.nf -with-singularity -bg -resume > log_resumed.txt

To check whether the pipeline has been resumed properly, please check the log file. If previous correctly executed process are found as *Cached*, resume worked!
   
Results
====================

Several folders are created by the pipeline within the output directory specified by the **output** parameter:

1. NanoPolish: contains the output of *nanopolish* tool.
2. Tailfindr: contains the output of *tailfindr* tool.
3. PolyA_final: contains the txt files with the combined results (i.e. predicted polyA sizes). Here an example of a test:

.. code-block:: console

   "Read name"	"Tailfindr"	"Nanopolish"	"Gene Name"
   "013a5dde-9c52-4de1-83eb-db70fb2cd130"	52.16	49.39	"YKR072C"
   "01119f62-ca68-458d-aa1f-cf8c8c04cd3b"	231.64	274.28	"YDR133C"
   "0154ce9c-fe6b-4ebc-bbb1-517fdc524207"	24.05	24.24	"YFL044C"
   "020cde28-970d-4710-90a5-977e4b4bbc46"	41.27	56.79	"YGL238W" 

If both softwares are run, an additional plot which shows the correlation of their results is generated. 


Here an example of a result:

.. image:: ../img/moptail.png
  :width: 800