Gene abundanceΒΆ
Mapping reads
Source the configuration file:
source conf_cluster.conf
Create a folder for mapping results:
mkdir -p $MAPP
Build the reference for the alignment with the predicted genes:
module load profile/bioinf
module load autoload bowtie2/2.2.9
bowtie2-build $GENE_PRED/prodigal.idba.$PRJ_NAME.fasta $MAPP/bwt.idba.$PRJ_NAME
Align sample reads against the reference:
for i in $SAMPLE_LIST; do
bowtie2 -x $MAPP/bwt.idba.$PRJ_NAME \
-1 $FILT_READS/filtered.$i.R1.fastq \
-2 $FILT_READS/filtered.$i.R2.fastq \
-U $FILT_READS/filtered.single.$i.fastq \
-S $MAPP/bwt.idba.$i.sam -p 9
done
Gene coverages
Create output folder for coverage calculation:
mkdir -p $COV
Calculate gene coverages:
for i in $SAMPLE_LIST; do
echo "Samtools: production of bam file $i"
samtools view -b -o $COV/bwt.idba.$i.bam $MAPP/bwt.idba.$i.sam -S -@ 9
echo "Samtools: bam sorting $i"
samtools sort $COV/bwt.idba.$i.bam -o $COV/bwt.idba.$i.sort.bam -@ 9
echo "Bedtools: calculating coverage of $i"
bedtools genomecov -ibam $COV/bwt.idba.$i.sort.bam -g $GENE_PRED/prodigal.idba.$PRJ_NAME.fasta > $COV/coverage.$i.prodigal.bt
echo "done"
done
Table parsing:
for i in $SAMPLE_LIST; do
awk 'BEGIN {pc=""}
{
c=$1;
if (c == pc) {
cov=cov+$2*$5;
} else {
print pc,cov;
cov=$2*$5;
pc=c}
} END {print pc,cov}' $COV/coverage.$i.prodigal.bt | tail -n +2 > $COV/coverage.$i
done
Have a look to the output files:
cd $COV
ll