RELAX Method Documentation

RELAX is a hypothesis testing framework that asks whether the strength of natural selection has been relaxed or intensified along a specified set of test branches. RELAX is therefore not a suitable method for explicitly testing for positive selection. Instead, RELAX is most useful for identifying trends and/or shifts in the stringency of natural selection on a given gene.

RELAX requires a specified set of "test" branches to compare with a second set of "reference" branches (note that not all branches have to be assigned, but at least one branch is required in each of the test and reference sets). RELAX begins by fitting a codon model with three $\omega$ classes to the entire phylogeny (null model). RELAX then tests for relaxed/intensified selection by introducing the selection intensity parameter k (where $k \geq 0$) as an exponent for the inferred $\omega$ values: $\omega^k$. Specifically, RELAX fixes the inferred $\omega$ values (all $\omega_{<1,2,3>}$) and infers, for the test branches, a value for k which modifies the rates to $\omega_{<1,2,3>}^k$ (alternative model). RELAX then conducts a likelihood ratio test (LRT) to compare the alternative and null models.
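The comparison of the null and alternative models can be sketched as follows. This is a minimal illustration rather than HyPhy's implementation: the log-likelihood values are hypothetical, and the sketch assumes the test statistic is compared against a chi-square distribution with one degree of freedom (the single extra parameter k).

```python
import math

def lrt_p_value(lnl_null, lnl_alt):
    """Return (LRT statistic, p-value) assuming a chi-square with 1 df."""
    # The alternative model nests the null, so the statistic is non-negative;
    # clamp tiny negative values that can arise from optimizer noise.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical log-likelihoods, for illustration only
stat, p = lrt_p_value(lnl_null=-3450.20, lnl_alt=-3445.10)
print(f"LRT = {stat:.2f}, p = {p:.4f}")
```

A small p-value here would suggest that allowing k to differ from 1 on the test branches fits the data better than the null model.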

A significant result of k>1 indicates that selection strength has been intensified along the test branches, and a significant result of k<1 indicates that selection strength has been relaxed along the test branches.
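To see why k<1 corresponds to relaxation and k>1 to intensification, it may help to apply the exponent to a set of $\omega$ values. The sketch below uses hypothetical $\omega$ classes and an assumed significance threshold of 0.05; the function names are illustrative only.

```python
def apply_intensity(omegas, k):
    """Raise each omega to the power k (k must be non-negative)."""
    if k < 0:
        raise ValueError("k must be >= 0")
    return [w ** k for w in omegas]

def interpret(k, p_value, alpha=0.05):
    """Classify a RELAX result; alpha is an assumed threshold."""
    if p_value >= alpha:
        return "no significant change"
    return "intensified" if k > 1 else "relaxed"

reference = [0.1, 0.8, 5.0]                    # hypothetical omega classes
relaxed = apply_intensity(reference, 0.5)      # every class moves toward 1
intensified = apply_intensity(reference, 2.0)  # every class moves away from 1

for label, values in [("k=0.5", relaxed), ("k=2.0", intensified)]:
    print(label, [round(v, 3) for v in values])
```

With k<1 both purifying ($\omega<1$) and positive ($\omega>1$) classes shrink toward neutrality ($\omega=1$), while k>1 pushes them further from it.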

In addition to this pair of null/alternative models, RELAX fits three other models, which are meant as complementary descriptors of the data but are not suitable for hypothesis testing. These additional models include the following:

Partitioned MG94xREV - This model fits a single $\omega$ value, i.e. shared for all sites, to each branch partition (reference and test). Here, a total of two $\omega$ rates are inferred.
Partitioned Descriptive - This model, like a more standard branch-site model, fits three $\omega$ classes separately to each branch partition (reference and test), producing a total of six estimated $\omega$ rates. The selection intensity parameter k is not included.
General Descriptive - This model fits three $\omega$ classes to the full data set, ignoring the specified test and reference partition division (three total $\omega$ rates estimated). It subsequently fits a k parameter at each branch, ultimately tailoring the three $\omega$ class values to this branch. This model may serve as a useful description of how selection intensity fluctuates over the whole tree.
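The idea behind the General Descriptive model, a shared set of $\omega$ classes tailored to each branch by its own k, can be illustrated with a short sketch. The branch names, $\omega$ values, and k values below are all hypothetical.

```python
# Shared omega classes fitted to the whole tree (hypothetical values)
shared_omegas = [0.05, 0.6, 3.0]

# One k per branch (hypothetical branch names and values)
branch_k = {"Human": 0.4, "Chimp": 1.0, "Node1": 1.8}

# Each branch sees the shared classes raised to its own k
per_branch = {
    branch: [w ** k for w in shared_omegas]
    for branch, k in branch_k.items()
}

# List branches from most relaxed (smallest k) to most intensified
for branch in sorted(branch_k, key=branch_k.get):
    rounded = [round(w, 3) for w in per_branch[branch]]
    print(f"{branch}: k={branch_k[branch]} omegas={rounded}")
```

Sorting branches by k in this way gives a rough picture of where along the tree selection appears weakest or strongest.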

Citation

If you use the RELAX method in your analysis, please cite:

Wertheim, JO et al. "RELAX: detecting relaxed selection in a phylogenetic framework." Mol. Biol. Evol. 32, 820–832 (2015).

Input Parameters

Required Inputs

Genetic Code: Specify the genetic code to use for the analysis (default: "Universal").
Alignment File: Provide an in-frame codon alignment file (supported formats: .fasta, .phy, etc.).
Phylogenetic Tree: A phylogenetic tree (with optional branch length annotations) appended to the FASTA file or embedded within the NEXUS file.
Test Branches: Designate branches to be considered as 'Test'.
Reference Branches: Specify the branches to be treated as 'Reference'.
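Before running an analysis, it can be useful to confirm that the tree actually contains at least one branch in each set. The sketch below assumes branches are annotated in the Newick string with HyPhy-style `{LABEL}` tags; the tree and the helper name are hypothetical.

```python
import re
from collections import Counter

def count_branch_labels(newick):
    """Count {LABEL} annotations found in a Newick string."""
    return Counter(re.findall(r"\{([^{}]+)\}", newick))

def check_partitions(newick, test="TEST", reference="REFERENCE"):
    """Raise if either branch set is empty; return the label counts."""
    counts = count_branch_labels(newick)
    for label in (test, reference):
        if counts[label] == 0:
            raise ValueError(f"no branches labeled {{{label}}}")
    return counts

# Hypothetical tree: taxon E is left unassigned, which is allowed
tree = ("((A{TEST}:0.1,B{TEST}:0.2){TEST}:0.05,"
        "(C{REFERENCE}:0.1,D{REFERENCE}:0.3){REFERENCE}:0.05,E:0.4);")
print(check_partitions(tree))
```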

Optional Inputs

Model Selection: Choose the analysis type: "All" for descriptive models and RELAX test or "Minimal" for only the RELAX test (default: "All").

Outputs

Summary

The RELAX method generates a JSON file that contains:

Metadata about the analysis, including the input parameters and methodology.
Results illustrating the estimates of selection pressures for each branch.
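A results file of this kind can be inspected with a few lines of Python. The key names used below ("test results", "LRT", "p-value", "relaxation or intensification parameter") are assumptions about the JSON layout and should be checked against an actual output file; the numbers are made up for illustration.

```python
import json

# Stand-in for the contents of a results file (hypothetical values)
sample = """
{
  "test results": {
    "LRT": 10.2,
    "p-value": 0.0014,
    "relaxation or intensification parameter": 0.45
  }
}
"""

def summarize(results):
    """Extract k, the LRT statistic, and the p-value if present."""
    test = results.get("test results", {})
    return {
        "k": test.get("relaxation or intensification parameter"),
        "LRT": test.get("LRT"),
        "p": test.get("p-value"),
    }

summary = summarize(json.loads(sample))
print(summary)
```

For a real run, `json.loads(sample)` would be replaced by `json.load(open("results.json"))`, where the file name matches whatever was saved from the analysis.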

Visualization

Tree Visualization: Visual representation of the phylogenetic tree with highlighted branches indicating different selection pressures.
Omega Distributions: Graphical representation of the omega rates across examined branches.
Statistical Results: Display of significance levels (p-values) for individual branches.

Example Workflow

Upload Data:
- Begin by providing your alignment file and phylogenetic tree file.
- Select Test and Reference branches.
Run Analysis:
- Initiate the RELAX analysis by clicking the "Run Analysis" button on the interface.
Review Results:
- When the analysis completes, access a summary interface that includes graphical and numerical representations of your data.
Export Results:
- Download the detailed JSON results for further examination or archiving.
- Options for exporting visualizations (SVG/PNG) of the tree are available.

Example CLI Usage of RELAX in HyPhy

To run the RELAX analysis using HyPhy, you would typically structure your command as follows:

Complete Command Example

Command:
```bash
/path/to/hyphy/hyphy relax
```
Parameters:
- --code - Specify the genetic code to use (e.g., "Universal")
- --alignment - Provide the path to the in-frame codon alignment file (e.g., "alignment.phy")
- --tree - Path to the phylogenetic tree file (e.g., "tree.nwk")
- --mode - Run mode (e.g., "Classic mode")
- --test - Designate which branches to consider as the test set (e.g., "TEST")
- --reference - Specify the reference branches (e.g., "REFERENCE")
- --models - Type of analysis (e.g., "All")
- --rates - Number of omega rate classes (e.g., "3")
- --kill-zero-lengths - Specify whether to handle zero-length branches (e.g., "No")
- --output - File name for saving the output results (e.g., "results.json")

Example Command

```bash
/path/to/hyphy/hyphy relax --code Universal --alignment alignment.phy --tree tree.nwk --mode "Classic mode" --test TEST --reference REFERENCE --models "All" --rates 3 --kill-zero-lengths No --output results.json
```
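When the same command has to be issued for several alignments, it may be convenient to assemble it programmatically. The sketch below only builds the argument list from the flags listed above and does not execute anything; the helper name `build_relax_command` is hypothetical, and the HyPhy path is the same placeholder used in this document.

```python
import shlex

def build_relax_command(alignment, tree, test="TEST", reference="REFERENCE",
                        hyphy="/path/to/hyphy/hyphy", **options):
    """Assemble the argument list for a RELAX run (does not execute it)."""
    cmd = [hyphy, "relax",
           "--alignment", alignment,
           "--tree", tree,
           "--test", test,
           "--reference", reference]
    # Optional flags: keyword names use underscores, e.g. kill_zero_lengths="No"
    for name, value in options.items():
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd

cmd = build_relax_command("alignment.phy", "tree.nwk",
                          mode="Classic mode", models="All", rates=3,
                          output="results.json")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the analysis; keeping the arguments as a list avoids quoting problems with values such as "Classic mode".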

Minimal Example Command

If you prefer to rely on the default values for the optional parameters, you can limit your input as follows:

```bash
/path/to/hyphy/hyphy relax --alignment alignment.phy --tree tree.nwk --test TEST --reference REFERENCE
```

FAQ

1. How are k values interpreted in the RELAX model?

The k value is the selection intensity parameter inferred for the test branches relative to the reference branches. A significant k of less than 1 indicates relaxation of selection, whereas a significant k greater than 1 indicates intensification.

2. Can I run RELAX on a concatenated dataset with many genes?

It is generally advisable to run RELAX on individual genes rather than concatenated data, as pooled datasets can complicate the inference of selective pressures.

3. How do I increase the chances of detecting relaxed/strengthened selection?

Use larger datasets with more branches in your test set, ensure good sequence quality, and consider multiple testing corrections to improve statistical robustness.

4. How can I ensure accurate branch and omega estimations in RELAX?

Make sure to utilize a robust phylogenetic tree, clean up your alignment, and properly specify branch groups for testing. Additionally, monitor for any warnings in the output regarding convergence.

5. What steps should be taken if I wish to validate the outcomes of multiple hypotheses from BUSTED and RELAX?

Apply a false discovery rate (FDR) correction to the p-values across all tests. This controls the expected proportion of false positives among the significant results, which limits type I errors when many hypotheses are tested.
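One common FDR procedure is the Benjamini-Hochberg adjustment, sketched below. The choice of this particular procedure is an assumption (the text above does not name one), and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from four separate tests
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
for p_raw, p_adj in zip(raw, adj):
    print(f"raw={p_raw:.3f} adjusted={p_adj:.4f}")
```

Tests whose adjusted p-value falls below the chosen FDR level (for example 0.05) would be reported as significant.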
