After running fastp on my paired-end metagenomics data, the insert size estimation appears to be cut off at a peak of 271, and the report says that >98% of reads did not overlap.
This happens with all my samples. The rest of the parameters seem fine (except in a few samples where the run failed). Is it normal for a sample to have 98% non-overlapping reads, or is this a cause for concern? I have seen other people with similar questions, but I have not found clear answers or guidelines.
Why is there a fixed threshold around 270? Is there a way to plot the full distribution?
Would it be possible to add documentation on how to interpret the insert size estimation?
Also, before fastp filtering, all my reads were exactly 151 bp; now I get a few sequences ranging from 100 to 151 bp, which triggers a warning in FastQC. Is this because some reads get trimmed (e.g. polyG or other reasons) and still meet the quality criteria for being kept?
Thanks in advance for your help!
Sort of... I followed the suggestions I found online and checked the estimated insert size in the SAM output after aligning the sequences.
My data seems to have larger insert sizes, ranging from 300 to 600, so it makes sense that the mode of the distribution falls outside the imposed 30-272 range of the fastp output.
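One possible explanation for the 30-272 range, offered as an assumption rather than something confirmed in this thread: if the insert size is estimated purely from the overlap between the two mates, the longest detectable insert is the sum of the two read lengths minus the minimum overlap required. With 151 bp reads and a 30 bp minimum overlap (assumed here; I believe this matches the default of fastp's `--overlap_len_require`, but I have not verified it), that works out to 272. The function name below is hypothetical and only illustrates the arithmetic.

```python
def max_detectable_insert(read1_len: int, read2_len: int, min_overlap: int) -> int:
    """Largest insert size at which the two mates still overlap by min_overlap bases."""
    return read1_len + read2_len - min_overlap


# 151 bp paired-end reads with an ASSUMED minimum overlap of 30 bp
print(max_detectable_insert(151, 151, 30))  # 272
```

If that assumption holds, any pair with an insert longer than this would be counted as non-overlapping, which would fit a library with inserts of 300 to 600.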
It would be nice to get some more information about this though.
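For anyone who wants to repeat the check on their own alignments, here is a minimal sketch of how the insert size distribution could be tallied from a SAM file. It is not what fastp does internally; it simply reads the TLEN column (field 9 of the SAM format) for properly paired primary alignments. The function name and the example records are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def insert_size_counts(sam_lines: Iterable[str]) -> Counter:
    """Count observed template lengths (TLEN) from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # Keep properly paired reads (0x2), skip secondary (0x100) and
        # supplementary (0x800) alignments, and keep only the mate with
        # positive TLEN so that each pair is counted once.
        if flag & 0x2 and not flag & 0x900 and tlen > 0:
            counts[tlen] += 1
    return counts


# Hypothetical records: one pair with a template length of 401
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tctg1\t100\t60\t151M\t=\t350\t401\t*\t*",
    "r1\t147\tctg1\t350\t60\t151M\t=\t100\t-401\t*\t*",
]
counts = insert_size_counts(example)
print(counts.most_common(1))  # [(401, 1)]
```

On real data, the same function can be given an open file handle for the SAM output of the aligner, and the resulting counts can be plotted to see the full distribution.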