Libraries can contain technical duplication
The usual assumption when analysing sequence datasets is that every sequence comes from a different biological fragment in the original sample. Many library preparation techniques, though, include one or more PCR steps, which introduce the possibility that the same original fragment is observed multiple times, biasing the results. In some cases this type of duplication can be extreme and can seriously affect the ability to analyse the data correctly.
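As a rough illustration of what "observed multiple times" means in practice, the sketch below counts how many reads in a set are extra copies of a sequence already seen. It assumes that identical read sequences are a reasonable proxy for duplicates, which is a simplification: it cannot tell PCR duplicates from genuinely independent fragments that happen to share a sequence, and sequencing errors will hide some true duplicates. The function name `duplication_summary` and the example reads are hypothetical.

```python
from collections import Counter


def duplication_summary(sequences):
    """Return (total, distinct, duplicate_fraction) for an iterable of read sequences.

    Assumption: two reads with exactly the same sequence are treated as
    copies of the same original fragment.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    distinct = len(counts)
    # Fraction of reads that are extra copies of a sequence already seen.
    duplicate_fraction = (total - distinct) / total if total else 0.0
    return total, distinct, duplicate_fraction


# Made-up reads for illustration: one sequence appears three times.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "TTGACCAA", "GGCATTAC"]
total, distinct, dup = duplication_summary(reads)
print(f"{total} reads, {distinct} distinct, {dup:.0%} duplicated")
```

A high duplicate fraction on its own does not prove a PCR problem, since some library types are expected to contain many identical sequences, but it is a useful first signal that the assumption of one read per fragment may not hold.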