-
- 素材大。
- 4.37 MB
- 素材授權(quán):
- 免費(fèi)下載
- 素材格式:
- .ppt
- 素材上傳:
- ppt
- 上傳時(shí)間:
- 2018-05-11
- 素材編號(hào):
- 186567
- 素材類別:
- 公司管理PPT
-
素材預(yù)覽
這是一個(gè)關(guān)于二代測(cè)序數(shù)據(jù)分析介紹PPT課件,包括了重測(cè)序的原理及流程,數(shù)據(jù)結(jié)構(gòu)與質(zhì)量評(píng)估,SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取,Bowtie2、BWA和SAMtools軟件使用等內(nèi)容。二代測(cè)序數(shù)據(jù)分析簡(jiǎn)介童春發(fā) 2013.12.23 主要內(nèi)容重測(cè)序的原理及流程數(shù)據(jù)結(jié)構(gòu)與質(zhì)量評(píng)估 SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取 Bowtie2、BWA和SAMtools軟件使用 重測(cè)序的原理及流程數(shù)據(jù)結(jié)構(gòu)與質(zhì)量評(píng)估 Fastq格式 FastQC FASTQ format http://en.wikipedia.org A FASTQ file containing a single sequence might look like this Illumina sequence identifiers With Casava 1.8 the format of the '@' line has changed Quality A quality value Q is an integer mapping of p (i.e., the probability that the corresponding base call is incorrect),歡迎點(diǎn)擊下載二代測(cè)序數(shù)據(jù)分析介紹PPT課件哦。
二代測(cè)序數(shù)據(jù)分析介紹PPT課件是由紅軟PPT免費(fèi)下載網(wǎng)推薦的一款公司管理PPT類型的PowerPoint.
二代測(cè)序數(shù)據(jù)分析簡(jiǎn)介童春發(fā) 2013.12.23 主要內(nèi)容重測(cè)序的原理及流程數(shù)據(jù)結(jié)構(gòu)與質(zhì)量評(píng)估 SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取 Bowtie2、BWA和SAMtools軟件使用 重測(cè)序的原理及流程數(shù)據(jù)結(jié)構(gòu)與質(zhì)量評(píng)估 Fastq格式 FastQC FASTQ format http://en.wikipedia.org A FASTQ file containing a single sequence might look like this Illumina sequence identifiers With Casava 1.8 the format of the '@' line has changed Quality A quality value Q is an integer mapping of p (i.e., the probability that the corresponding base call is incorrect). Phred quality score: The Solexa pipeline (i.e., the software delivered with the Illumina Genome Analyzer) earlier used Quality Encoding Sanger format can encode a Phred quality score from 0 to 93 using ASCII 33 to 126 Illumina's newest version (1.8) of their pipeline CASAVA will directly produce fastq in Sanger format Solexa/Illumina 1.0 format can encode a Solexa/Illumina quality score from -5 to 62 using ASCII 59 to 126 Starting with Illumina 1.3 and before Illumina 1.8, the format encoded a Phred quality score from 0 to 62 using ASCII 64 to 126 Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores 0 to 2 have a slightly different meaning American Standard Code for Information Interchange (ASCII) FastQC http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ Double click “run_fastqc.bat” to run FastQC The analysis results for 11 modules Green tick for normal Orange triangle for slightly abnormal Red cross for very unusual Basic Statistics Per Base Sequence Quality Per Sequence Quality Scores Per Base Sequence Content Per Base GC Content Per Sequence GC Content Per Base N Content Sequence Length Distribution Duplicate Sequences Overrepresented Sequences Overrepresented Kmers Saving a Report SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取 SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取 SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取 SRA數(shù)據(jù)庫(kù)及數(shù)據(jù)獲取查看和下載SRR576183 Fastq-dum將SRA文件轉(zhuǎn)化成FASTQ格式 fastq-dump --split-files -DQ “+” ./SRR576183.sra fastq-dump --split-files -DQ “+” --gzip ./SRR576183.sra 直接下載FASTQ格式數(shù)據(jù) ftp://ftp.era.ebi.ac.uk/vol1/fastq/SRR576/SRR576183 將Reads比對(duì)到參考序列 BWA Bowtie2 Soap Samtools BWA http://bio-bwa.sourceforge.net/ https://github.com/lh3/bwa wget http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.5a.tar.bz2 tar -xjvf bwa-0.7.5a.tar.bz2 cd bwa-0.7.5a make Dowload test.tar.gz from ftp://202.119.214.193 BWA ../bwa-0.7.5a/bwa index ref.fa ../bwa-0.7.5a/bwa mem ref.fa test_PE1.fa > aln-se.sam ../bwa-0.7.5a/bwa mem ref.fa test_PE1.fa test_PE2.fa > aln-se.sam Bowtie2 http://bowtie-bio.sourceforge.net/bowtie2/index.shtml 下載 bowtie2-2.1.0-linux-x86_64.zip unzip bowtie2-2.1.0-linux-x86_64.zip mv bowtie2-2.1.0 bowtie2 cd bowtie2/example mkdir work cd work Bowtie2 Index a reference genome ../../bowtie2-build ../reference/lambda_virus.fa lambda_virus Aligning single-end reads ../../bowtie2 -x lambda_virus -U ../reads/reads_1.fq -S eg1.sam Aligning paired-end reads ../../bowtie2 -x lambda_virus -1 ../reads/reads_1.fq -2 ../reads/reads_2.fq -S eg2.sam -U: unpaired reads -S: sam format SAM output Name of read that aligned Sum of all applicable flags. Flags relevant to Bowtie are: 1: The read is one of a pair 2: The alignment is one end of a proper paired-end alignment 4: The read has no reported alignments 8: The read is one of a pair and has no reported alignments 16: The alignment is to the reverse reference strand SAM output 32: The other mate in the paired-end alignment is aligned to the reverse reference strand 64: The read is mate 1 in a pair 128: The read is mate 2 in a pair Name of reference sequence where alignment occurs 1-based offset into the forward reference strand where leftmost character of the alignment occurs SAM output Mapping quality CIGAR string representation of alignment Name of reference sequence where mate's alignment occurs. Set to = if the mate's reference sequence is the same as this alignment's, or * if there is no mate. 1-based offset into the forward reference strand where leftmost character of the mate's alignment occurs. Offset is 0 if there is no mate SAM output Inferred fragment size. Size is negative if the mate's alignment occurs upstream of this alignment. Size is 0 if there is no mate. Read sequence (reverse-complemented if aligned to the reverse strand) ASCII-encoded read qualities (reverse-complemented if the read aligned to the reverse strand). The encoded quality values are on the Phred quality scale and the encoding is ASCII-offset by 33 (ASCII char !), similarly to a FASTQ file. Optional fields. Fields are tab-separated. bowtie2 outputs zero or more of these optional fields for each alignment, depending on the type of the alignment: SAM output Optional fields: AS:i:
Alignment score. Only present if SAM record is for an aligned read XS:i: Alignment score for second-best alignment. Only present if the SAM record is for an aligned read and more than one alignment was found for the read YS:i: Alignment score for opposite mate in the paired-end alignment. Only present if the SAM record is for a read that aligned as part of a paired-end alignment. SAM output Optional fields: XN:i: The number of ambiguous bases in the reference covering this alignment. Only present if SAM record is for an aligned read XM:i: The number of mismatches in the alignment. Only present if SAM record is for an aligned read XO:i: The number of gap opens, for both read and reference gaps, in the aligment. Only present if SAM record is for an aligned read SAM output Optional fields: XG:i: The number of gap extensions, for both read and reference gaps, in the aligment. Only present if SAM record is for an aligned read NM:i: The edit distance; that is, the minimal number of one-necleotide edits (substitutions, insertions and deletions) needed to transform the read string into the reference string. Only present if SAM record is for an aligned read SAM output Optional fields: YP:i: Equals 1 if the read is part of a pair that has at least N concordant alignments, where N is the argument specified to –M plus one. Equals 0 if the read is part of pair that has fewer than N alignments. E.g. if –M 2 is specified and 3 distinct, concordant paired-end alignments are found, YP:i:1 will be printed. If fewer than 3 are found, YP:i:0 is printed. Only present if SAM record is for a read that aligned as part of a paired-end alignment. SAM output Optional fields: YM:i: Equals 1 if the read aligned with at least N unpaired alignments, where N is the argument specified to –M plus one. Equals 0 if the read aligned with fewer than N unpaired alignments. E.g. if –M 2 is specified and 3 distinct, valid, unpaired alignments are found, YM:i:1 is printed. If fewer than 3 are found, YM:i:0 is printed. Only present if SAM record is for a read that Bowtie 2 attempted to align in an unpaired fashion. SAM output Optional fields: YF:Z: String indicating reason why the read was filtered out. Only appears for reads that were filtered out. MD:Z: A string representation of the mismatched reference bases in the alignment. Only present if SAM record is for an aligned read. SAMtools http://samtools.sourceforge.net/ Install SAMtools: Dowload samtools-0.1.19.tar.bz2 tar –xjvf samtools-0.1.19.tar.bz2 Or: git clone git://github.com/samtools/samtools.git cd samtools-0.1.19 make SAMtools: Primer Tutorial http://biobits.org/samtools_primer.html Sample Data Files Aligning Reads Using Bowtie2 Converting SAM to BAM Sorting and Indexing Identifying Genomic Variants Understanding the VCF Format Visualizing Reads SAMtools: Primer Tutorial Sample Data Files unzip samtools_primer-master.zip Aligning Reads Using Bowtie2 cd samtools_primer-master ~/bowtie2/bowtie2 -x indexes/e_coli -U simulated_reads/sim_reads.fq -S sim_reads_aligned.sam SAMtools: Primer Tutorial Converting SAM to BAM ~/samtools-0.1.19/samtools view -b -S -o sim_reads_aligned.bam sim_reads_aligned.sam Sorting and Indexing ~/samtools-0.1.19/samtools sort sim_reads_aligned.bam sim_reads_aligned.sorted ~/samtools-0.1.19/samtools index sim_reads_aligned.sorted.bam SAMtools: Primer Tutorial Identifying Genomic Variants ~/samtools-0.1.19/samtools mpileup -g -f genomes/NC_008253.fna sim_reads_aligned.sorted.bam > sim_variants.bcf ~/samtools-0.1.19/bcftools/bcftools view -c -v sim_variants.bcf > sim_variants.vcf SAMtools: Primer Tutorial Understanding the VCF Format http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-41 SAMtools: Primer Tutorial Visualizing Reads ~/samtools-0.1.19/samtools tview sim_reads_aligned.sorted.bam genomes/NC_008253.fna