Improve filtering steps in anchor clustering

Cecilia SENSALARI requested to merge anchors into master
  • The filtering steps are meant to remove anchor clusters that are likely to be artefacts or whose signal is too unclear to interpret.
  • New filtering criterion: a cluster is discarded if it is lower in height than the successive (older) cluster. The motivation is that a WGD peak is generally taller than the next, older WGD peak in the distribution, because of gene loss and Ks stochasticity (increased variability in substitution accumulation) among paralogs.
  • Modified existing filtering criterion: widely spread clusters (IQR >= 1.1) are now discarded only if their median Ks is also old (> 2.6). Previously, some clusters with an acceptable median (e.g. 2) were discarded only because they covered a wide range, with a tail reaching up to Ks 5. This affected Poaceae species such as rice and sorghum, which show a broad sigma-tau signal between Ks 2 and 3.
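
The two criteria above can be sketched as follows. This is a minimal illustration, not the code in this merge request: the `Cluster` class, the `filter_clusters` function and the constant names are hypothetical, "height" is assumed to be a single precomputed number per cluster, and the order in which the two filters run (spread first, then height, in a single pass) is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the description above.
IQR_THRESHOLD = 1.1   # clusters with IQR >= this value count as widely spread
OLD_MEDIAN_KS = 2.6   # a median Ks above this value counts as old


@dataclass
class Cluster:
    """Hypothetical summary of one anchor cluster."""
    median_ks: float
    iqr: float
    height: float  # assumed: a precomputed peak height for the cluster


def filter_clusters(clusters: List[Cluster]) -> List[Cluster]:
    # Spread criterion: discard only clusters that are both wide AND old.
    kept = [
        c for c in clusters
        if not (c.iqr >= IQR_THRESHOLD and c.median_ks > OLD_MEDIAN_KS)
    ]

    # Order from youngest to oldest so that "successive" means "next older".
    kept.sort(key=lambda c: c.median_ks)

    # Height criterion: discard a cluster that is lower than the next older one.
    # Assumption: each cluster is compared once against its neighbour in the
    # list that survived the spread filter; the check is not repeated.
    result = []
    for i, cluster in enumerate(kept):
        older = kept[i + 1] if i + 1 < len(kept) else None
        if older is not None and cluster.height < older.height:
            continue
        result.append(cluster)
    return result
```

With this sketch, a cluster with median Ks 2 and IQR 1.2 survives the spread filter, whereas the same IQR with median Ks 3 does not.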
Edited by Cecilia SENSALARI