• Get anchor points of the most recent WGD (Ks 0.1-0.6)
• Segments for those anchor points
o ad_segments (species=ptr, exp_id=1) -> first and last gene of the segments
• Annotation of the segments
o coordinates of first and last gene -> coordinates of segments
o get all genes on the segment
o mark anchor points and tandem duplicates
• Create pseudoalignment between the segments
o based on anchor points
• Define potential regions
o Genes that have no counterpart on the aligned segment
• Extract the potential regions from the genome
o Region between the preceding and following anchor point
• tblastn of the opposing gene against the potential region
• Filter BLAST results
o Percent identity > 30%
o E-value < 0.5
o Length > 50 amino acids
• Collapse and link fragments:
o Different fragments of the same gene are put together
o Fragments are treated like exons
• Extra annotations
The output contains regions with anchor points, regions with retained duplicates, and regions without genes. These last regions have the potential to contain remnants of duplicates.
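The filtering and fragment-collapsing steps listed above can be sketched roughly as follows. This is a minimal sketch, not the pipeline's actual code: it assumes the tblastn results are in standard tabular format (-outfmt 6), that the subject ID is the name of the potential region, and that fragments of one gene are simply grouped per query gene, region and strand and ordered by start coordinate. The names Hit, parse_hits, passes_filter and collapse_fragments are hypothetical, and the example rows are invented toy data.

```python
import csv
import io
from collections import defaultdict
from typing import NamedTuple

# Standard column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


class Hit(NamedTuple):
    query: str     # query gene (the "opposing" gene)
    region: str    # potential region the hit falls in (assumed = sseqid)
    pident: float  # percent identity
    length: int    # alignment length (amino acids for tblastn)
    start: int     # leftmost subject coordinate
    end: int       # rightmost subject coordinate
    strand: str    # "+" or "-", derived from the order of sstart/send
    evalue: float


def parse_hits(handle):
    """Yield Hit records from a tab-separated BLAST result handle."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        rec = dict(zip(COLUMNS, row))
        sstart, send = int(rec["sstart"]), int(rec["send"])
        yield Hit(rec["qseqid"], rec["sseqid"], float(rec["pident"]),
                  int(rec["length"]), min(sstart, send), max(sstart, send),
                  "+" if sstart <= send else "-", float(rec["evalue"]))


def passes_filter(hit, min_pident=30.0, max_evalue=0.5, min_length=50):
    """Thresholds from the notes: identity > 30%, E-value < 0.5, length > 50 aa."""
    return (hit.pident > min_pident
            and hit.evalue < max_evalue
            and hit.length > min_length)


def collapse_fragments(hits):
    """Group hits per (query, region, strand) and order them like exons."""
    groups = defaultdict(list)
    for hit in hits:
        groups[(hit.query, hit.region, hit.strand)].append(hit)
    return {key: sorted(frags, key=lambda h: h.start)
            for key, frags in groups.items()}


# Invented toy rows, only to show the expected input shape.
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    ["geneA", "region1", "38.5", "65", "40", "0", "90", "154", "900", "1094", "0.001", "40.0"],
    ["geneA", "region1", "45.0", "80", "44", "0", "1", "80", "100", "339", "1e-10", "60.0"],
    ["geneA", "region1", "28.0", "70", "50", "0", "1", "70", "1500", "1709", "0.01", "30.0"],
    ["geneB", "region1", "55.0", "40", "18", "0", "1", "40", "2000", "2119", "0.2", "25.0"],
])

hits = [h for h in parse_hits(io.StringIO(EXAMPLE)) if passes_filter(h)]
models = collapse_fragments(hits)
for key, fragments in models.items():
    print(key, [(f.start, f.end) for f in fragments])
```

Grouping by strand as well as by gene and region is a design choice of this sketch; whether fragments on opposite strands should be linked is not stated in the notes.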
2. Analysis of BLAST results => Remnant regions?
• Annotate the features of the regions
o RNA seq coverage
o Repeat elements between the different fragments of a gene
Can still be further annotated as TEs
o Retention group of the query gene: Single/Intermediate/Multi/Non-core (Plant Cell paper)
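One of the annotations above, repeat elements between the different fragments of a gene, amounts to an interval-overlap check. The sketch below is only an illustration under assumptions: fragments and repeats are given as 1-based inclusive coordinates on the same region, and the function names gaps_between and repeats_in_gaps, as well as the example coordinates, are made up for this example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]          # (start, end), 1-based inclusive
Repeat = Tuple[str, int, int]       # (name, start, end)


def gaps_between(fragments: List[Interval]) -> List[Interval]:
    """Return the stretches between consecutive fragments of one gene."""
    ordered = sorted(fragments)
    gaps: List[Interval] = []
    if not ordered:
        return gaps
    current_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > current_end + 1:
            gaps.append((current_end + 1, start - 1))
        # Track the furthest end seen so overlapping fragments add no gap.
        current_end = max(current_end, end)
    return gaps


def repeats_in_gaps(fragments: List[Interval],
                    repeats: List[Repeat]) -> List[Repeat]:
    """Return repeats that overlap any gap between fragments."""
    gaps = gaps_between(fragments)
    found = []
    for name, r_start, r_end in repeats:
        if any(r_start <= g_end and r_end >= g_start
               for g_start, g_end in gaps):
            found.append((name, r_start, r_end))
    return found


# Invented coordinates, for illustration only.
fragments = [(900, 1094), (100, 339)]
repeats = [("rep1", 400, 700), ("rep2", 50, 90), ("rep3", 1000, 1200)]
between = repeats_in_gaps(fragments, repeats)
print(between)
```

Repeats that only partly overlap a gap are also reported here; requiring full containment in the gap would be an equally reasonable choice.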