ecSeq script collection: a website that regularly posts small, handy command-line scripts

http://www.ecseq.com/support/


There is a lot of solid material here; it would be a waste to bookmark it and never read it.

1、Convert BAM to BED file format using command line perl

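# BED intervals are 0-based and half-open, so the 1-based SAM position is shifted down by one; -F 4 skips unmapped reads
samtools view -F 4 file.bam | perl -F'\t' -ane '$strand=($F[1]&16)?"-":"+";$length=0;$tmp=$F[5];$tmp =~ s/(\d+)[MDN=X]/$length+=$1/eg;$start=$F[3]-1;print "$F[2]\t$start\t".($start+$length)."\t$F[0]\t0\t$strand\n";'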

2、Convert an interleaved fasta file to a single line fasta using only the Linux command line

awk '/^>/ {if (NR > 1) print ""; print; next} {printf "%s", $0} END {if (NR) print ""}' interleaved.fasta > singleline.fasta
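
To see what the conversion does, the following sketch runs an awk variant on a tiny made-up FASTA file in a temporary directory. The sequence names and the `workdir` variable are illustrative only. This variant passes the sequence through `printf "%s"` rather than using it as a format string, and ends the last sequence with a newline.

```shell
workdir=$(mktemp -d)

# Two hypothetical records, each wrapped over two lines.
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\nAA\n' > "$workdir/interleaved.fasta"

# Header lines are printed as-is (preceded by a newline that closes the
# previous sequence); sequence lines are concatenated without newlines.
awk '/^>/ {if (NR > 1) print ""; print; next} {printf "%s", $0} END {if (NR) print ""}' \
    "$workdir/interleaved.fasta" > "$workdir/singleline.fasta"

cat "$workdir/singleline.fasta"
```

The result has exactly two lines per record, which makes the file easy to process with line-oriented tools such as `grep -A 1` or `paste - -`.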

3、How to get a coverage graph in WIG file format directly from an alignment (BAM)

samtools mpileup -BQ0 run.sorted.bam | perl -pe '($c, $start, undef, $depth) = split;if ($c ne $lastC || $start != $lastStart+1) {print "fixedStep chrom=$c start=$start step=1 span=1\n";}$_ = $depth."\n";($lastC, $lastStart) = ($c, $start);' | gzip -c > run.wig.gz

4、Convert FASTQ to FASTA on the command line

paste - - - - < file.fq | cut -f 1,2 | sed 's/^@/>/' | tr "\t" "\n" > file.fa
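
A small sketch of the same pipeline on a made-up two-read FASTQ file, assuming the common layout of exactly four lines per record (no wrapped sequences). The second read deliberately has a quality string that starts with `@`, which is a legal quality character and the reason why grouping by four lines is safer than searching for lines that begin with `@`.

```shell
workdir=$(mktemp -d)

# Two hypothetical reads; the quality line of r2 starts with "@".
printf '@r1 desc\nACGT\n+\nIIII\n@r2\nGG\n+\n@I\n' > "$workdir/file.fq"

# paste joins every four lines into one tab-separated row, cut keeps header
# and sequence, sed swaps the leading "@" for ">", tr restores the line breaks.
paste - - - - < "$workdir/file.fq" | cut -f 1,2 | sed 's/^@/>/' | tr "\t" "\n" > "$workdir/file.fa"

cat "$workdir/file.fa"
```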

5、Add gzip/gunzip support to a program that doesn't have it (e.g. bowtie)

mkfifo mate1.fastq
mkfifo mate2.fastq
gunzip -c mate1.fastq.gz > mate1.fastq &
gunzip -c mate2.fastq.gz > mate2.fastq &
bowtie -S genome -1 mate1.fastq -2 mate2.fastq > sample.sam
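
The named-pipe pattern can be tested without an aligner. In the sketch below, `awk` stands in for any program that only accepts a plain file path; the file names and the `workdir` variable are illustrative. Note that a FIFO can only be read once, front to back, so this approach is unlikely to work for programs that seek in their input or read it twice.

```shell
workdir=$(mktemp -d)

# A tiny gzip-compressed FASTQ file with one hypothetical read.
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/mate1.fastq.gz"

# The FIFO looks like a regular file to the consumer.
mkfifo "$workdir/mate1.fastq"

# Writer runs in the background and blocks until a reader opens the pipe.
gunzip -c "$workdir/mate1.fastq.gz" > "$workdir/mate1.fastq" &

# Stand-in consumer: counts the lines it reads from the "file".
line_count=$(awk 'END {print NR}' "$workdir/mate1.fastq")

wait                              # let the background gunzip finish
rm -f "$workdir/mate1.fastq"      # FIFOs persist on disk until removed

echo "$line_count"
```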

6、Annotate mapping entries of a BAM file based on overlaps with BED files using BEDtools

tagBam -i run.bam -files RefSeq.bed -names -tag GA > run.tagged.bam

7、Extract a list of specific read IDs from a bam file

samtools view file.bam | grep -F -w -f IDs.txt
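
A fixed-string word match looks at the whole line, so an ID that also appears in another column (for example inside an optional tag) selects that record too. The sketch below contrasts this with an awk filter that compares only the first column. The SAM lines, the `XX` tag and all file names are made up for illustration.

```shell
workdir=$(mktemp -d)

printf 'read1\nread3\n' > "$workdir/IDs.txt"

# Three hypothetical alignments; read2 mentions "read1" in an optional tag.
printf 'read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >  "$workdir/reads.sam"
printf 'read2\t0\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\tXX:Z:read1\n' >> "$workdir/reads.sam"
printf 'read3\t16\tchr2\t50\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$workdir/reads.sam"

# Whole-line word match: also hits read2 because of its tag.
grep_hits=$(grep -F -w -c -f "$workdir/IDs.txt" "$workdir/reads.sam")

# First pass (NR == FNR) loads the IDs, second pass keeps rows whose
# first column is one of them.
selected=$(awk -F'\t' 'NR == FNR {ids[$1]; next} $1 in ids' \
    "$workdir/IDs.txt" "$workdir/reads.sam" | cut -f 1)

echo "grep matched $grep_hits lines"
printf '%s\n' "$selected"
```

With a real BAM file the same filter could read from a pipe, along the lines of `samtools view file.bam | awk -F'\t' 'NR == FNR {ids[$1]; next} $1 in ids' IDs.txt -`, provided IDs.txt is not empty.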

8、Run time-consuming processes in parallel on Unix systems

# Example usage of xargs (-P sets the number of parallel processes; do not use more than the number of cores you have available).
# Note: "bcftools view -vcg" is the samtools/bcftools 0.1.x syntax; newer releases use "bcftools call" for variant calling.
samtools view -H yourFile.bam | grep '^@SQ' | sed 's/^.*SN://g' | cut -f 1 | xargs -I {} -P 24 sh -c "samtools mpileup -BQ0 -d 100000 -uf yourGenome.fa -r {} yourFile.bam | bcftools view -vcg - > tmp.{}.vcf"

# To merge the per-chromosome tmp.*.vcf files afterwards, you might want to do something like this:
samtools view -H yourFile.bam | grep '^@SQ' | sed 's/^.*SN://g' | cut -f 1 | perl -ane 'system("cat tmp.$F[0].vcf >> yourFile.vcf");'
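
The split, process and merge pattern can be rehearsed with harmless commands before pointing it at real data. In this sketch `echo` stands in for the per-chromosome variant calling, and the region names, file names and the `workdir` variable are all hypothetical. It relies on `xargs -P`, which is available in GNU and BSD xargs but is not part of POSIX.

```shell
workdir=$(mktemp -d)

# Stand-in for the list of sequence names taken from the BAM header.
printf 'chr1\nchr2\nchr3\n' > "$workdir/regions.txt"

# Run up to two jobs at once. The region is handed to sh as a positional
# argument ($1) instead of being pasted into the command string.
xargs -I {} -P 2 sh -c 'echo "processed $1" > "$2/tmp.$1.txt"' _ {} "$workdir" \
    < "$workdir/regions.txt"

# Merge the per-region results in the original order, whatever order the
# parallel jobs happened to finish in.
merged=$(while read -r region; do cat "$workdir/tmp.$region.txt"; done < "$workdir/regions.txt")

printf '%s\n' "$merged"
```

Passing the region as `$1` keeps unusual sequence names (spaces, quotes, shell metacharacters) from being interpreted by the inner shell. Looping over the region list again for the merge keeps the output order deterministic.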
