<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE pise PUBLIC "pise2.dtd" "pise2.dtd" >
<pise>
    <head>
      <title>Genetree on XSEDE</title>
      <version>8.3</version>
      <description>Estimation of mutation, migration and growth rates, and ancestral inference</description>
      <authors>Bob Griffiths</authors>
      <reference>R. C. Griffiths, S. Tavaré. Simulating probability distributions in the coalescent. Theor. Popn. Biol., 46, 131-159, 1994.</reference>
      <category>Phylogeny / Alignment</category>
      <doclink></doclink>
      <doclink></doclink>
  </head>
	
	<command>genetree_xsede</command>
	
<!-- ***********************************  created 3/17/2019 or thereabouts by mamiller ****************************************************************
Ancestral inference for gene trees         
Version 8.3, Bob Griffiths, 11/05/98   

Usage: genetree tree_file theta runs seed {options}      
Example: genetree tree 1.0 100000 771   (tree = input file, 1.0 = theta, 100000 = runs, 771 = seed)
Options on command line or in a file (or both) {file containing options: @file_name}         
-j input sequences [sequence file name],         
-J input sequences [sequence file name] [ancestor file name],  (tree will be output in tree_file)         
-z generate all possible trees with different roots        
-e exponential growth rate file (array of rates)                   
-s number of subpopulations (default - number in tree_file)         
-m migration rate file (matrix of rates, diagonals zero)         
-p subpopulation relative size file (array of rates)                    
-f surface outfile_name (g m theta likelihood sd_like)         
-g theta surface domain [theta0 theta1 surface_points]         
-i growth magnification surface domain [g0 g1 surface_points]         
-h migration surface domain [m0 m1 surface_points]        
-H migration matrix surface [infile] [outfile]         
-k tmrca distribution [n_cells top_value outfile]         
-K tmrca and age distributions [n_cells top_value outfile]       
-o output level [value 0-7] (default 3)                
	1 likelihood               
	2 show tree               
	3 TMRCA               
	4 age of mutations               
	5 MRCA distribution in subpopulations               
	6 mutation distribution in subpopulations               
	7 subpopulation TMRCA          
-c tree picture file name (default: tree.ps)         
-C [gene population size] [generation size] (for tree picture)         
-2 calculate exact likelihood from 2 ancestors         
-l multiple file summary [0 or 1] (default 1)         
-b generate batch commands for multiple trees         
-P calculate average pairwise differences         
-x maximum events in one simulation run (default - 500)         
-y maximum types in the ancestry of the sample (default - 30)  
-->
<!--
Here are typical commands.

(1)  To generate all possible rooted trees: genetree infile.dat -z
(2)  To run results for all rooted trees: genetree infile_dat.#  1.4 100 72189 -m infile.mig > infile.out_100

To allow for different population sizes:
genetree infile_dat.#  9.7 100000 72189 -p infile.size -m infile.mig>infile.out0.1_100000

To allow for different population sizes and 'extra' populations:
genetree infile_dat.#  9.7 100000 72189 -p pop.dat -s 3 -m infile.mig > infile.out_100000

This also generates output file (tree.age) with age of mutations for initial tree specified as the root.

(3)  To generate ages of mutations for tree with the highest root probability:
genetree infile_dat.3  7.5 100 72189 -x 100000 -m infile.mig -o 4 >infile_age.3

Most comprehensive output:
genetree infile_dat.51 9.7 1000000 72189 -m infile.mig -o 7 >infile_age.51_1000000

(4)  To simulate theta:
genetree infile_dat.50 7.5 1000 72189 -x 100000 -m infile.mig -f surf.31_1000 -g 7.0 20 150 -2 >surfout.31_1000

 -->
 <!-- 
 We want to run two commands:
 
 (1) genetree haplotype_dat.txt -z
 
(2)  genetree haplotype_dat_txt.# 2.0 10000 2399662 -o 7 -l -x 5000 -y 5000 -A mutation_age.# -M genetree_temp.# -Z 49
 
genetree haplotype_dat_txt.# 2 1000 12345 -A mutation_age.# -M genetree_temp.# -l -Z 49 -x 5000 -y 5000 -o 7  

  -->
<parameters> 

        <parameter isinput="1" type="InFile">
			<name>infile</name>
			<attributes>
				<prompt>Input sequence file</prompt>
				<filenames>haplotype_dat</filenames>
			</attributes>
		</parameter>

<!-- Usage: genetree tree_file theta runs seed {options}    -->
<!-- 
                           /projects/ps-ngbt/opt/comet/genetree/gtree/genetree haplotype_dat -z 
 -->
		<parameter ishidden="1" type="String">
			<name>genetree_invocation</name>
			<attributes>
				<format>
					<language>perl</language>
					<code>"/expanse/projects/ngbt/opt/comet/genetree/gtree/genetree haplotype_dat -z"</code> 
				</format>
				<group>0</group>
			</attributes>
		</parameter>
<!-- 
                                         && /projects/ps-ngbt/opt/comet/genetree/gtree/genetree haplotype_dat.# 2.0 10000 7843 -A mutation_age.# -M genetree_temp.# -x 5000 -o 7 > output.txt
 -->		
 <parameter ishidden="1" type="String">
		<name>genetree_invocation2</name>
			<attributes>
				<format>
					<language>perl</language>
					<code>"&amp;&amp; /expanse/projects/ngbt/opt/comet/genetree/gtree/genetree haplotype_dat.# $theta_val $run_val $seed_val -A mutation_age.# -M genetree_temp.#"</code> 
				</format> 
				<group>2</group>
			</attributes>
		</parameter> 
		
		<parameter ishidden="1" type="String">
		<name>genetree_invocation3</name>
			<attributes>
<!--  				<precond>
					<language>perl</language>
					<code>$run_shellscript</code>
				</precond>-->
				<format>
					<language>perl</language>
					<code>"&#59; wait &#59; chmod 770 ./output.sh &#59; wait &#59; sleep 2 &#59; convert_parallel.sh output.sh"</code> 
				</format> 
				<group>95</group>
			</attributes>
		</parameter>
		
	<parameter ishidden="1" type="String">
		<name>genetree_invocation4</name>
			<attributes>
<!--  				<precond>
					<language>perl</language>
					<code>$run_shellscript</code>
				</precond>-->				
				<format>
					<language>perl</language>
					<code>"&#59; wait &#59; sleep 2 &#59; ./output.sh"</code> 
				</format> 
				<group>99</group>
			</attributes>
		</parameter>
	
		
		<parameter ishidden="1" type="String">
			<name>genetree_scheduler</name>
				<attributes>
					<paramfile>scheduler.conf</paramfile>
					<precond>
						<language>perl</language>
						<code>$run_val &lt; 3000</code>
					</precond>
					<format>
						<language>perl</language>
							<code>
									"threads_per_process=1\\n" .
									"mem=5G\\n" .
									"node_exclusive=0\\n" .
									"nodes=1\\n"
								</code>
					</format>
				</attributes>
		</parameter>

		<parameter ishidden="1" type="String">
			<name>genetree_scheduler2</name>
				<attributes>
					<paramfile>scheduler.conf</paramfile>
					<precond>
						<language>perl</language>
						<code>$run_val &gt; 2999</code>
					</precond>
					<format>
						<language>perl</language>
							<code>
									"threads_per_process=18\\n" .
									"mem=36G\\n" .
									"node_exclusive=0\\n" .
									"nodes=1\\n"
								</code>
					</format>
				</attributes>
		</parameter>

<!-- return absolutely everything -->
		<parameter type="Results">
			<name>all_output</name>
			<attributes>
				<prompt>All output</prompt>
				<filenames>*</filenames>
			</attributes>
		</parameter>
	
<!-- Visible parameters -->
<!-- Parameters with visible controls start here -->
		<parameter type="Float" issimple="1" ismandatory="1">
			<name>runtime</name>
			<attributes>
				<group>1</group>
				<paramfile>scheduler.conf</paramfile>
				<prompt>Maximum Hours to Run (click here for help setting this correctly)</prompt>
				<format>
					<language>perl</language>
					<code>"runhours=$value\\n"</code>
				</format>
				<vdef>
					<value>0.25</value>
				</vdef>
				<ctrls>
					<ctrl>
						<message>Maximum Hours to Run must not exceed 168</message>
						<language>perl</language>
						<code>$runtime &gt; 168.0</code>
					</ctrl>
					<ctrl>
						<message>Maximum Hours to Run must be at least 0.1</message>
						<language>perl</language>
						<code>$runtime &lt; 0.1</code>
					</ctrl>
				</ctrls>
				<warns>
					<warn>
						<message>The job will run on 1 processor if the number of runs is below 3000, or on 18 threads otherwise. If it runs for the entire configured time, it will consume up to 18 x $runtime cpu hours</message>
						<language>perl</language>
						<code>$runtime ne 0 </code>
					</warn>
				</warns>
								<comment>
<value>Estimate the maximum time your job will need to run. We recommend testing initially with a run of 0.5 h or less, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
Once you are sure the configuration is correct, you can then increase the time. Jobs set for more than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
run sooner than jobs configured for the full 168 hours.
</value>
				</comment>
			</attributes>
		</parameter>
		
<!-- run the shell script 
		<parameter issimple="1"  type="Switch">
			<name>run_shellscript</name>
			<attributes>
				<prompt>Execute the shell script file</prompt>
				<vdef>
					<value>1</value>
				</vdef>
			</attributes>
		</parameter>		-->
		
<!-- add the ancestor file 
		<parameter issimple="1"  type="InFile">
			<name>ancestor_file</name>
			<attributes>
				<prompt>Select ancestor file</prompt>
				<filenames>ancestor.txt</filenames>
			</attributes>
		</parameter>	 -->	
		
<!-- $theta_val -->
		<parameter issimple="1"  type="Float">
			<name>theta_val</name>
			<attributes>
				<prompt>Specify a value for theta</prompt>
				<group>3</group>
<!-- 				<format>
					<language>perl</language>
					<code>(defined $value) ? "$value":"" </code>
				</format>  -->
			</attributes>
		</parameter>
		
<!--  $run_val -->
		<parameter issimple="1"  type="Integer">
			<name>run_val</name>
			<attributes>
				<prompt>Specify the number of runs</prompt>
				<group>3</group>
<!--  				<format>
					<language>perl</language>
					<code>(defined $value) ? "$value":"" </code>
				</format> -->
			</attributes>
		</parameter>
		
<!--  $seed_val -->
		<parameter issimple="1"  type="Integer">
			<name>seed_val</name>
			<attributes>
				<prompt>Specify a seed value</prompt>
				<group>3</group>
<!-- 				<format>
					<language>perl</language>
					<code>(defined $value) ? "$value":"" </code>
				</format>  -->
			</attributes>
		</parameter>
		
<!-- -z generate all possible trees with different roots 
		<parameter issimple="1"  type="Switch">
			<name>generate_alltrees</name>
			<attributes>
				<prompt>Generate all possible trees (-z)</prompt>
				<group>4</group>
				<format>
					<language>perl</language>
					<code>($value) ? "-z":"" </code>
				</format>
			</attributes>
		</parameter> -->
		
<!-- -b generate batch commands for multiple trees   -->
		<parameter issimple="1"  type="Switch">
			<name>make_batchfile</name>
			<attributes>
				<prompt>Make a batch file (-b)</prompt>
				<group>4</group>
				<format>
					<language>perl</language>
					<code>($value) ? "-b":"" </code>
				</format>
			</attributes>
		</parameter>
		
<!-- -l multiple file summary [0 or 1] (default 1)  -->
		<parameter issimple="1"  type="Switch">
			<name>multifile_sum</name>
			<attributes>
				<prompt>Multiple file summary (-l)</prompt>
				<group>4</group>
				<format>
					<language>perl</language>
					<code>($value) ? "-l":"" </code>
				</format>
			</attributes>
		</parameter>
		
<!-- (2)  To run results for all rooted trees: genetree infile_dat.#  1.4 100 72189 -m infile.mig > infile.out_100 -->
		<parameter issimple="1"  type="InFile">
			<name>run_allrootedtrees</name>
			<attributes>
				<prompt>Provide a migration rate file (matrix of rates, diagonals zero) (-m)</prompt>
				<group>5</group>
				<filenames>infile.mig</filenames>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-m infile.mig":"" </code>
				</format>
			</attributes>
		</parameter>  

<!-- To allow for different population sizes and 'extra' populations:
genetree infile_dat.#  9.7 100000 72189 -p pop.dat -s 3 -m infile.mig >infile.out_100000
This also generates output file (tree.age) with age of mutations for initial tree specified as the root. -->
<!--  Can you also provide the options for allowing different population sizes -p infile.size and extra populations -s 3 

Allow for different population sizes: genetree infile_dat.#  9.7 100000 72189 -p infile.size -m infile.mig>infile.out0.1_100000
Allow for different population sizes and 'extra' populations: genetree infile_dat.#  9.7 100000 72189 -p pop.dat -s 3 -m infile.mig > infile.out_100000

 -->
 		<parameter issimple="1"  type="InFile">
			<name>allow_popsizes</name>
			<attributes>
				<prompt>Provide subpopulation relative size file (array of rates) (-p)</prompt>
				<group>6</group>
				<filenames>pop.dat</filenames>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-p pop.dat":"" </code>
				</format>
			</attributes>
		</parameter> 

<!-- -s number of subpopulations (default - number in tree_file)  	-->
		<parameter issimple="1"  type="Integer">
			<name>num_subpops</name>
			<attributes>
				<prompt>Number of subpopulations (default - number in tree_file) (-s)</prompt>
				<group>7</group>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-s $value":"" </code>
				</format>
			</attributes>
		</parameter> 	
		
<!-- number of segregating sites  -->
		<parameter issimple="1"  type="Integer">
			<name>num_segsites</name>
			<attributes>
				<prompt>Number of segregating sites + 1 (-Z)</prompt>
				<group>7</group>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-Z $value":"" </code>
				</format>
			</attributes>
		</parameter>
		
<!-- (3)  To generate ages of mutations for tree with the highest root probability:
genetree infile_dat.3  7.5 100 72189 -x 100000 -m infile.mig -o 4 >infile_age.3 -->		

		<parameter issimple="1"  type="Integer">
			<name>max_eventspersim</name>
			<attributes>
				<prompt>Maximum events in one simulation run (default - 500) (-x)</prompt>
				<group>8</group>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-x $value":"" </code>
				</format>
			</attributes>
		</parameter>
		
		<parameter issimple="1"  type="Integer">
			<name>max_typesanc</name>
			<attributes>
				<prompt>Maximum types in the ancestry of the sample (default - 30) (-y)</prompt>
				<group>8</group>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-y $value":"" </code>
				</format>
			</attributes>
		</parameter>

		<parameter issimple="1"  type="Excl">
			<name>output_level</name>
			<attributes>
				<prompt>Output Level (-o)</prompt>
				<group>8</group>
				<vlist>
<!-- -o output level [value 0-7] (default 3)                
	1 likelihood               
	2 show tree               
	3 TMRCA               
	4 age of mutations               
	5 MRCA distribution in subpopulations               
	6 mutation distribution in subpopulations               
	7 subpopulation TMRCA -->
					<value>1</value>
					<label>likelihood</label>
					<value>2</value>
					<label>show tree</label>
					<value>3</value>
					<label>TMRCA</label>
					<value>4</value>
					<label>age of mutations</label>
					<value>5</value>
					<label>MRCA distribution in subpopulations</label>
					<value>6</value>
					<label>Mutation distribution in subpopulations</label>
					<value>7</value>
					<label>Subpopulation TMRCA</label>
				</vlist>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-o $value":"" </code>
				</format>
			</attributes>
		</parameter>
		
<!-- -f surface outfile_name (g m theta likelihood sd_like) -->
		<parameter issimple="1"  type="InFile">
			<name>surface_outfile</name>
			<attributes>
				<prompt>Surface outfile_name (g m theta likelihood sd_like) (-f)</prompt>
				<group>9</group>
				<filenames>surf.file</filenames>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-f surf.file":"" </code>
				</format>
			</attributes>
		</parameter>  

<!-- (4)  To simulate theta:
genetree infile_dat.50 7.5 1000 72189 -x 100000 -m infile.mig -f surf.31_1000 -g 7.0 20 150 -2 >surfout.31_1000 -->
		
 		<parameter issimple="1"  type="Float">
			<name>theta_1</name>
			<attributes>
				<prompt>Theta surface domain, first value theta0 (-g)</prompt>
				<group>9</group>
				<format>
					<language>perl</language>
					<code>(defined $value) ? "-g $theta_1 $theta_2 $surface_1 $surface_2":""</code>
				</format>
			</attributes>
		</parameter>

		<parameter issimple="1"  type="Float">
			<name>theta_2</name>
			<attributes>
				<prompt>Theta surface domain, second value theta1</prompt>
			</attributes>
		</parameter>
		
		<parameter issimple="1"  type="Integer">
			<name>surface_1</name>
			<attributes>
				<prompt>Surface point 1 </prompt>
			</attributes>
		</parameter>
		
		<parameter issimple="1"  type="Integer">
			<name>surface_2</name>
			<attributes>
				<prompt>Surface point 2 </prompt>
			</attributes>
		</parameter> 
			
<!--  $output_file  -->
		<parameter issimple="1"  ishidden="1" type="String">
			<name>outfile_name_val</name>
			<attributes>
				<prompt>Specify the output file name</prompt>
				<format>
					<language>perl</language>
					<code>"&gt; output.sh"</code>
				</format>
				<group>10</group>
			</attributes>
		</parameter> 

<!--  <parameter issimple="1" type="InFile">
			<name>secondary_ctrlfile</name>
			<attributes>
				<prompt>Select secondary CTRL file (optional)</prompt>
				<format>
					<language>perl</language>
					<code>defined $value ? "ctrlfile2.ctrl" : ""</code>
				</format>
				<filenames>ctrlfile2.ctrl</filenames>
			</attributes>
		</parameter>
		
		<parameter issimple="1" ismandatory="0" type="InFile">
			<name>add_tracefile</name>
			<attributes>
				<prompt>Add a tracefile file (optional)</prompt>
				<filenames>tracefile.out</filenames>
			</attributes>
		</parameter> -->      		
        		
 </parameters> 
</pise>


