Illumina Sequencing
Content of this page is subject to change in the upcoming weeks as we update the information
Next generation sequencing has become a relevant technology in modern biology. GeneCore is providing 'Massively Parallel Sequencing' (MPS) as a core part of its service portfolio according to EMBL's mission to provide a state-of-the-art infrastructure to the scientific community.
The fast-growing suite of instruments currently consists of:
Sequencer: | 1x Illumina MiSeq, 3x Illumina NextSeq 500, 3x Illumina NextSeq 2000 |
---|---|
Fragmentation: | Covaris S2 |
Lab on Chip: | Agilent BioAnalyzer, Agilent Fragment Analyser, Agilent FemtoPulse |
Fluorometer: | Invitrogen Qubit 2 |
Sequencer | Lanes | Sequencing Mode* | Read Length [bases] | Clusters/Lane | Run time** |
---|---|---|---|---|---|
MiSeq | 1 | SE, PE | SE: 72, 300; PE: 36, 75, 150, 250 | ~15 million | 36 PE: 8 hours; 75 PE: 21 hours; 150 PE: 24 hours; 250 PE: 39 hours |
NextSeq 500 High | 1 | SE, PE | SE: 75, 150; PE: 75, 150 | ~400 million | 75 SE: 11 hours; 75 PE: 18 hours; 150 PE: 34 hours |
NextSeq 500 Medium | 1 | SE, PE | SE: 150; PE: 75, 150 | ~130 million | 150 SE: 18 hours; 75 PE: 18 hours; 150 PE: 32 hours |
GeneCore can prepare sequencing libraries for the following applications:
- RNA-Seq (strand-specific or strand non-specific)
- Chip-Seq
- gDNA-Seq (de novo, resequencing), including genome capture; insert size: 150-600 bp or 2.5-5.0 kbp (mate pair)
- methyl-Seq
GeneCore can only partially support its users with processing of samples for these applications:
- CLIP-Seq (CrossLinking-and-ImmunoPrecipitation, this method and its variants are used for identification of RNA-Protein binding sites)
- GRO-Seq (Global-Run-On method, used for identification of nascent transcripts)
- CAGE-Seq (Cap-Analysis-of-Gene-Expression, the method is used for analysis of 5' end of mRNA transcripts)
- Hi-C (a variant of the chromosome conformation capture method, enabling identification of points of contact between distant chromosomal domains)
- Chip-exo (includes an immunoprecipitation step with a specific antibody) and ATAC-Seq (assay for transposase-accessible chromatin using sequencing)
Target Capture
- Agilent SureSelect Human Exome Version 4 and higher
- HaloPlex
- Customized target capture
Multiplexing / Barcoding
- Indexing is possible with Illumina-type barcodes/indices (dedicated read) or with barcodes integrated into the sequencing reads (inline barcodes). When registering your samples, please indicate the correct length of the barcode you used.
- Dual barcoding.
- Please note that the pool of barcoded samples registered for sequencing must contain at least 4 libraries with balanced base composition. You can check whether your barcodes have an optimal combination of bases with this tool: http://www.ebi.ac.uk/~markus/basedist (paste individual barcodes in one lane each and click analyse; you will receive a colour-coded graph and a numerical value showing the contribution of each position of the barcode).
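The balance requirement can also be inspected locally. The following is a minimal Python sketch and not the tool linked above: it tabulates, for each barcode position, the fraction of A, C, G and T across a pool. The function name `base_fractions` and the example barcodes are assumptions made for illustration; what counts as "balanced enough" is still defined by GeneCore's tool and QC criteria.

```python
from collections import Counter


def base_fractions(barcodes):
    """Return, per barcode position, the fraction of each base across the pool.

    Hypothetical helper: assumes all barcodes in the pool have the same length.
    """
    if not barcodes:
        raise ValueError("need at least one barcode")
    length = len(barcodes[0])
    if any(len(b) != length for b in barcodes):
        raise ValueError("all barcodes must have the same length")

    fractions = []
    for position in range(length):
        # Count which base each barcode carries at this position.
        counts = Counter(b[position].upper() for b in barcodes)
        fractions.append(
            {base: counts.get(base, 0) / len(barcodes) for base in "ACGT"}
        )
    return fractions


if __name__ == "__main__":
    # Made-up 6-base barcodes, for illustration only.
    pool = ["ATCACG", "CGATGT", "TTAGGC", "GACCAA"]
    for i, row in enumerate(base_fractions(pool), start=1):
        print(i, " ".join(f"{base}={frac:.2f}" for base, frac in row.items()))
```

A position where one base dominates (or where a base is missing entirely) is a candidate for swapping in a different barcode before registration.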
User made libraries
- Users can submit libraries they prepared themselves, which have to meet GeneCore's QC criteria before sequencing.
Which type of sequencing to use?
"If you don't get the coverage at the start you'll regret it." (Jonathon Blake, Bioinformation)
Points to consider:
- Experimental design and coverage are the key for meaningful results
- Required coverage can be inferred from the size of the assayed space (genome, transcriptome, etc.) and the abundance of the recorded event. Generally, the more detailed the picture should be, the deeper you need to sequence. It is usually necessary to sequence deeper for any de novo analysis.
- Don't underestimate the importance of biological replicates
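As a rough illustration of how coverage relates to the size of the assayed space, a common estimate is coverage = (number of reads × bases per read) / target size. The sketch below is a minimal Python example under that assumption; the function names and the example numbers are illustrative and are not GeneCore recommendations.

```python
import math


def expected_coverage(n_reads, read_length, target_size, paired=False):
    """Average depth: sequenced bases divided by the size of the assayed space.

    For paired-end data, n_reads counts fragments (clusters), each read twice.
    """
    bases_per_fragment = read_length * (2 if paired else 1)
    return n_reads * bases_per_fragment / target_size


def reads_needed(coverage, read_length, target_size, paired=False):
    """Number of reads (fragments, if paired-end) needed to reach a target depth."""
    bases_per_fragment = read_length * (2 if paired else 1)
    return math.ceil(coverage * target_size / bases_per_fragment)


if __name__ == "__main__":
    # Assumed example: ~400 million clusters, 150 PE, on a 3.1 Gbp genome.
    print(expected_coverage(400_000_000, 150, 3_100_000_000, paired=True))
    # Assumed example: fragments needed for 30-fold depth of a 5 Mbp genome, 75 PE.
    print(reads_needed(30, 75, 5_000_000, paired=True))
```

With the assumed numbers, the first call gives roughly 39-fold average depth. This is an average only: it says nothing about evenness of coverage, duplicates, or unmapped reads, so real designs need a margin on top.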
Application-specific Recommendations
Application | Recommended Reading Length and Mode |
---|---|
gDNA-Seq (WGS) | 100 PE, regardless of whether genome assembly is based on a reference or carried out de novo |
Exome Sequencing (WES) | 100 PE |
Methylome Sequencing (BS-Seq) | 100 PE |
ChipSeq, Faire-Seq, DNaseI-Seq, MN-Seq, ... | 50 SE; paired-end sequencing or longer reads normally do not improve results. The sequencing depth required is lower for analysis of the binding sites of a transcription factor (not such a frequent event), whereas it is higher for mapping histone modifications (a very frequent event). |
RNA-Seq, mRNA-Seq | 50 SE, for detection of expressed genes in eukaryotic samples |
RNA-Seq, Splice Variant detection | 50 PE and deeper coverage |
miRNA-Seq | 50 SE; confirm that your samples contain the full complement of short RNAs (mind the purification columns!) |
What do we need?
First Time
Please contact Vladimir Benes if you would like to use GeneCore for MPS.
Ordering
Please use our registration system to place your order. Only samples which have been registered electronically can
be processed. In-house users may leave their samples with us at V106; please attach the corresponding barcode label,
available from the barcode printer in GeneCore. External users should send their samples clearly marked with the ID
assigned to each sample during registration (this information is also included in the confirmation email they
receive upon completion of registration), so that we can attach the proper label for processing and storage as soon
as the samples arrive.
Sample Requirements
Container: Please use only 1.5 ml low binding tubes (e.g. Eppendorf). Everything else causes substantial
additional work and costs for all of us.
Quality of the source material for library generation by GeneCore:
RNA-Seq | High quality total RNA (BioAnalyzer RIN > 7), absorbance ratio 260/280 ~2 |
---|---|
Chip-Seq | Purified IP DNA with majority of fragments within size range 200 - 500 bp |
gDNA-Seq | High quality DNA, single band on agarose gel, absorbance ratio 260/280 ~2 |
methyl-Seq | Majority of fragments within size range 200 - 400 bp |
User made libraries (all types)
- The samples should be accompanied by a picture showing a narrow fragment size distribution, with the majority of fragments not longer than 700 bp and without any primer-dimer peak.
- If you are in doubt, contact GeneCore.
Type of Library | Amount | Concentration [ng/µl] |
---|---|---|
RNA-Seq (mRNA, strand-specific RNA) | 1 µg | 100 |
miRNA-Seq | 1 µg | 100 |
Chip-Seq | 5 ng | at least 1 |
gDNA-Seq | 1 µg | 100 |
Mate-pair DNA libraries | 5 µg | 100 |
Methyl-Seq (BS-Seq) | 5 µg | 100 |
Exome Capture | 3 µg | 100 |
PCR amplicons | 20 ng | --- |
User made/ready-to-run libraries | 20 ng (10 nM solution) | 2 |
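For user-made libraries, the relation between mass concentration (ng/µl) and molarity (nM) depends on the mean fragment length. The following is a minimal Python sketch of that conversion, assuming double-stranded DNA with an average mass of 660 g/mol per base pair (a common approximation that is not stated on this page). The function names and the 300 bp example length are hypothetical.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM).

    Assumes an average mass of 660 g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def nM_to_ng_per_ul(conc_nM, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Inverse conversion: molarity (nM) to mass concentration (ng/µl)."""
    return conc_nM * g_per_mol_per_bp * mean_fragment_bp / 1e6


if __name__ == "__main__":
    # Assumed example: a library at 2 ng/µl with a mean fragment length of 300 bp.
    print(round(ng_per_ul_to_nM(2.0, 300), 2))
```

Under these assumptions, 2 ng/µl at a mean length of 300 bp corresponds to about 10.1 nM; longer fragments give a lower molarity at the same mass concentration.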
Important Points to consider during registration
- if you register several samples, each of them needs to be registered individually
- before submitting your request, all fields in the form need to be checked/filled in, except the Mate Pair library box, which applies only if you register samples for this application
- your session times out after 15 minutes (to avoid locking up); if you think you cannot finish in that time, divide your registration into several batches
- your registration is truly completed only when you click “Complete Order” and you receive an email confirming your registration
- review your entries before clicking “Submit Request” so that you avoid repeated return to the form
- in the case of an error, send a screenshot showing the error message to zimmerma@embl.de
What can you expect?
Number of reads
One HiSeq 2000 flow-cell can sequence seven ‘samples’ in individual lanes and requires a control in a
separate lane. These ‘samples’ can also be pools of barcoded samples. Under optimal conditions
(cluster density, etc.) it is feasible to obtain up to 200 million sequencing reads per lane from this instrument.
Depending upon the application, usually ~75% of them can be mapped to the reference genome. Please take into account that
these numbers are after quality filtering.
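To translate these lane-level figures into per-sample expectations for a barcoded pool, a minimal Python sketch follows. It assumes perfectly even pooling, which real pools rarely achieve, and uses the figures quoted above (up to 200 million reads per lane, ~75% mappable) as defaults. The function name `reads_per_sample` is hypothetical.

```python
def reads_per_sample(n_samples, reads_per_lane=200_000_000, mapped_fraction=0.75):
    """Estimate total and mappable reads per sample for an evenly pooled lane.

    Returns a (reads, mapped_reads) tuple; both are upper-bound estimates.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    per_sample = reads_per_lane // n_samples
    mapped = int(per_sample * mapped_fraction)
    return per_sample, mapped


if __name__ == "__main__":
    # Assumed example: eight barcoded libraries pooled in one lane.
    total, mapped = reads_per_sample(8)
    print(total, mapped)
```

Because the lane yield quoted is a best case, it is safer to plan with a lower `reads_per_lane` value than the default.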
Output
The result package contains sequences in fastq format and, if the alignment option was selected during registration,
also alignments (bam files, http://samtools.sourceforge.net/) as a tar archive. You will be notified by email when
the files are ready for download. Primary and intermediate results will be stored as long as necessary to generate the
result files. The result package will be available for 30 days after you receive the message informing you that the data
are ready. Longer storage is not possible due to the amount of data produced and the respective costs for data
storage. Data are archived and can be retrieved for a fee if needed. Please note that we generally do
not manipulate the data in any way (trimming, for example). However, we can do so upon request.
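After downloading, a quick sanity check is to count the reads in a result file. The following is a minimal Python sketch using only the standard library; it assumes a gzip-compressed FASTQ file with exactly four lines per record (no wrapped sequence lines). The function name and the file name in the example are hypothetical.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_reads.py sample_1.txt.gz
    for fastq_path in sys.argv[1:]:
        print(fastq_path, count_fastq_reads(fastq_path))
```

Comparing this count with the number of reads you expected per sample is a cheap way to spot truncated downloads.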
Turnaround time
Turnaround time is heavily dependent upon the sequencing regime. The running time of the sequencer for 50 single-end bases (50SE) is
~3 days; a run of 100 paired-end bases (100PE, each fragment is sequenced from both sides) takes longer, and the data
processing comes on top of the run time. The sequencers are running 24 hours a day, 7 days a week. Interruptions are
only made for maintenance. The general processing time per sample is around 6 weeks; exceptions are possible.
Result download
The files are available as fastq files (*.txt.gz) and bam files, and the names have the following notation:
Results are placed on our dedicated result distribution server. The files can be downloaded using the fasp protocol (by Aspera), which increases the download speed by up to a factor of 60 (compared to conventional methods) and provides enhanced error correction as well as on-the-fly encryption. For further information, see File Download with Aspera.pdf.
Sample storage policy
Your physical sample and generated sequencing library will be stored at GeneCore for a period of 30 days after
completion of your request, if no explicit agreement with GeneCore has been made.
Data retention policy
As soon as the result files for your samples are produced (seq and, if applicable, also bam files) you will receive a
notification email. We will keep your data online for download for 4 weeks; after that period the data files will
be deleted.
Pricing
MPS on Illumina HiSeq consists of two parts: library preparation and sequencing. Each of them has its own
price, which depends upon the type of the library and its sequencing mode.