Fig. 1
From: REPrise: de novo interspersed repeat detection using inexact seeding

Schematic illustration of the REPrise and RepeatScout algorithms. Both algorithms first construct a seed table from the input genome sequences. REPrise builds the table from inexact seeds, that is, frequently occurring k-mers permitting d substitutions. Both algorithms then perform seed-and-extension alignment at both ends of each seed; REPrise uses affine gap scoring in this step. Next, they mask the seed regions within the detected repeat regions, with REPrise masking seeds more loosely than RepeatScout. This cycle of alignment and masking is repeated until the seed table is depleted. In REPrise, representative sequences are then selected from the consensus sequences of the repeat families using CD-HIT, and these representative sequences are output.
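
To make the notion of an inexact seed concrete, the following is a minimal sketch of how the frequency of a k-mer could be counted when substitutions are permitted. It is not REPrise's implementation: the function names `neighbors` and `inexact_seed_counts` are hypothetical, the brute-force neighbourhood enumeration is chosen only for readability, and "permitting d substitutions" is assumed here to mean a Hamming distance of at most d.

```python
from collections import Counter
from itertools import combinations, product

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def neighbors(kmer, d):
    """Yield each string within Hamming distance d of kmer exactly once."""
    yield kmer
    for r in range(1, d + 1):
        # Choose r distinct positions and substitute a different base at each,
        # so every yielded string differs from kmer at exactly r positions.
        for positions in combinations(range(len(kmer)), r):
            alternatives = [[b for b in ALPHABET if b != kmer[p]] for p in positions]
            for subs in product(*alternatives):
                chars = list(kmer)
                for p, b in zip(positions, subs):
                    chars[p] = b
                yield "".join(chars)


def inexact_seed_counts(sequence, k, d):
    """Map each k-mer observed in sequence to the number of k-mer occurrences
    in sequence that lie within Hamming distance d of it."""
    exact = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {
        kmer: sum(exact.get(n, 0) for n in neighbors(kmer, d))
        for kmer in exact
    }


if __name__ == "__main__":
    counts = inexact_seed_counts("ACGTACGAACGT", k=4, d=1)
    # Rank candidate seeds by inexact frequency, most frequent first.
    for kmer, freq in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(kmer, freq)
```

With d = 0 the counts reduce to exact k-mer frequencies, which is a convenient sanity check for the sketch. The enumeration grows quickly with k and d, so it is suitable only for small illustrative inputs.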