Introduction

Many sequencing platforms use chemistries that require specific adapter sequences to be added to the ends of the fragments to be sequenced.  These adapters fulfil roles such as allowing for specific amplification and priming to support the underlying sequencing chemistry. Ideally a sequencing library should contain only valid adapter+insert constructs, and manufacturers use a variety of techniques and modifications to try to ensure that these are highly enriched in the library.

However, these precautions sometimes fail, and a set of adapter dimers (pairs of ligated adapters with no insert sequence) ends up in the library.  These can still be sequenced because they contain all of the relevant parts of the sequencing template, but they produce no useful sequence.  If these constructs make up a high proportion of the library they can soak up a significant amount of the sequencing capacity in a lane and cause a number of QC metrics to flag a problem.

The Symptoms

There are a couple of different ways to spot that this type of contamination has occurred.  Because adapter dimers always produce exactly the same sequence, they superimpose that sequence on the per-cycle base content plot.  The example below is obviously extreme, but a reduced version of this pattern would appear in more modestly contaminated libraries.

[Figure: per_base_dimers]
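
To make the per-cycle effect concrete, here is a minimal sketch of how base composition per cycle could be tallied from a set of reads. The function name, the toy reads and the `DIMER` sequence are all illustrative assumptions, not the adapter from any particular kit or the way any specific QC tool computes its plot.

```python
from collections import Counter

def per_cycle_base_fractions(reads):
    """For each cycle (read position), return the fraction of each base."""
    if not reads:
        return []
    length = max(len(r) for r in reads)
    counts = [Counter() for _ in range(length)]
    for read in reads:
        for cycle, base in enumerate(read.upper()):
            counts[cycle][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: c[b] / total for b in "ACGTN"})
    return fractions

# Toy data: DIMER is a placeholder standing in for an adapter dimer read.
DIMER = "AGATCGGAAG"
reads = [DIMER] * 8 + ["ACGTACGTAC", "TTGACCATGA"]

fractions = per_cycle_base_fractions(reads)
# The most frequent base at each cycle spells out the contaminating sequence.
dominant = "".join(max("ACGT", key=lambda b: f[b]) for f in fractions)
print(dominant)
```

With a heavily contaminated toy set like this, the dominant base at every cycle simply reads out the dimer sequence; in a lightly contaminated library the same bases would only be modestly enriched at each position.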

The introduction of a number of identical sequences will also show up as a sharp spike in the overall GC profile for the run.

[Figure: adapter_dimer_gc_profile]
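
The GC spike follows from the same fact: every copy of the dimer has an identical GC content, so all of them fall into one bin. A rough sketch, assuming reads are plain strings and using hypothetical helper names and made-up toy reads:

```python
from collections import Counter

def gc_percent(read):
    """GC content of one read, rounded to the nearest whole percent."""
    read = read.upper()
    gc = sum(1 for b in read if b in "GC")
    return round(100 * gc / len(read)) if read else 0

def gc_histogram(reads):
    """Count how many reads fall at each whole-percent GC value."""
    return Counter(gc_percent(r) for r in reads)

# Toy data: 50 copies of a placeholder dimer read plus a few varied reads.
reads = ["AGATCGGAAG"] * 50 + ["ACTTACGTAT", "TTGACCATGA", "GGGCCCATAT", "ATATATATAT"]

hist = gc_histogram(reads)
peak_gc, peak_count = hist.most_common(1)[0]
print(peak_gc, peak_count)
```

In a real library the background reads would form a broad distribution, with the dimers adding a narrow spike on top of it at their own GC value.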

If you are monitoring overrepresented sequences then this screen will also turn up this type of contamination, since the sequences generated will be exactly the same each time and will be present many thousands of times in the library.

Diagnosis

The easiest way to spot this is in the overrepresented sequences screen, where the exact sequence will be shown and should match the adapter sequence used.  If the contamination is too low to trigger this module then you can normally see it as a fixed sequence superimposed on the per-base sequence content plot.
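
As an illustration of that check, the sketch below counts identical reads, keeps those above a chosen fraction of the library, and tests whether each one contains the start of the adapter. The function names, the `min_fraction` and `min_overlap` values, and the `ADAPTER` sequence are assumptions made for the example; substitute the adapter from your own library preparation kit.

```python
from collections import Counter

def overrepresented(reads, min_fraction=0.001):
    """Return (sequence, fraction) pairs for reads at or above min_fraction."""
    total = len(reads)
    counts = Counter(reads)
    return [(seq, n / total) for seq, n in counts.most_common()
            if n / total >= min_fraction]

def looks_like_adapter(seq, adapter, min_overlap=12):
    """True if the first min_overlap bases of the adapter occur in seq."""
    probe = adapter[:min_overlap]
    return len(probe) == min_overlap and probe in seq

# Placeholder adapter sequence, used only as toy data.
ADAPTER = "AGATCGGAAGAGCACACGTC"
dimer_read = ADAPTER + "TGAACTCCAG"
reads = [dimer_read] * 5 + [
    "ACGTTGCAACGTTGCAACGTTGCAACGTTG",
    "TTGACCATGATTGACCATGATTGACCATGA",
    "GGCATTACGGCATTACGGCATTACGGCATT",
    "CCATAGGTCCATAGGTCCATAGGTCCATAG",
    "TACGGATCTACGGATCTACGGATCTACGGA",
]

# A high threshold suits this tiny toy set; real data would use a far lower one.
hits = overrepresented(reads, min_fraction=0.2)
flagged = [seq for seq, frac in hits if looks_like_adapter(seq, ADAPTER)]
print(flagged)
```

Exact substring matching is the simplest possible comparison; it would miss adapters containing sequencing errors, so a real screen would more likely allow mismatches.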

Mitigation

Adapter trimmers will generally remove this sort of data, but even without trimming the dimers do not normally map to a reference genome, so they don’t cause any further downstream disruption.

Prevention

Preventing this type of contamination must happen at the library preparation stage.  Being careful about the amount of adapter added to the ligation mix, and being stringent in the size selection step for your library, are the most important factors in avoiding the formation and selection of dimers.

February 1, 2016
