ShortStack 0.4.0 has been released. This is a major update that is substantially faster than previous versions. The speedup comes primarily from a new filtering method that reduces the number of RNALfold calls. Previously, all small RNA clusters were folded with RNALfold (except those that were too long; see the documentation). In this release, highly repetitive clusters, and clusters that are not dominated by small RNAs in the acceptable size range (in other words, those with a DicerCall of "N"), are not eligible for secondary structure analysis.
The filtering by repetitiveness uses a metric I am calling the "Uniqueness Index" ("UI" for short). The UI is the ratio of repeat-normalized abundance to total mappings at a locus. For highly repetitive loci, where most mappings come from reads that are highly multi-mapped, the UI approaches 0. In contrast, for very unique loci, where most or all of the mappings come from reads that map uniquely to the locus, the UI approaches or reaches 1. By default, ShortStack will not attempt RNA folding for any locus with a UI below 0.1. This threshold is adjustable with the new parameter --minUI.
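To make the UI concrete, here is a minimal Python sketch of the calculation. It is not ShortStack's own code. It assumes that "repeat-normalized abundance" means each mapping contributes 1/N, where N is the total number of genome-wide mappings of its read; that reading matches the limits described above but is an assumption. The function names and the `min_ui` argument are hypothetical and only mirror the --minUI parameter. Other eligibility filters (DicerCall, cluster length) are left out.

```python
def uniqueness_index(hit_counts):
    """Return the UI for one locus.

    hit_counts: one entry per mapping at the locus, giving the total number
    of genome-wide mappings of that read (1 = uniquely mapped).
    ASSUMPTION: each mapping contributes 1/N to the repeat-normalized
    abundance; this weighting is illustrative, not confirmed.
    """
    if not hit_counts:
        raise ValueError("locus has no mappings")
    if any(n < 1 for n in hit_counts):
        raise ValueError("hit counts must be >= 1")
    repeat_normalized = sum(1.0 / n for n in hit_counts)
    return repeat_normalized / len(hit_counts)


def eligible_for_folding(hit_counts, min_ui=0.1):
    """Hypothetical helper: True if the locus passes the UI filter.

    Loci with a UI below min_ui are skipped, as with the default 0.1 cutoff.
    """
    return uniqueness_index(hit_counts) >= min_ui
```

Under this assumption, a locus with two uniquely mapped reads and two reads that each map to two places has a UI of 0.75 and would be folded, while a locus made up entirely of reads that each map to 100 places has a UI of about 0.01 and would be skipped at the default threshold.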
This should offer substantial speed improvements, especially for large highly repetitive plant genomes which spawn