Gene abundanceΒΆ

Mapping reads

Source the configuration file:

source conf_cluster.conf

Create a folder for mapping results:

mkdir -p $MAPP

Build the reference for the alignment with the predicted genes:

module load profile/bioinf
module load autoload bowtie2/2.2.9

bowtie2-build $GENE_PRED/prodigal.idba.$PRJ_NAME.fasta $MAPP/bwt.idba.$PRJ_NAME

Align sample reads against the reference:

for i in $SAMPLE_LIST; do
        bowtie2 -x $MAPP/bwt.idba.$PRJ_NAME \
        -1 $FILT_READS/filtered.$i.R1.fastq \
        -2 $FILT_READS/filtered.$i.R2.fastq \
        -U $FILT_READS/filtered.single.$i.fastq \
        -S $MAPP/bwt.idba.$i.sam -p 9
done

Gene coverages

Create output folder for coverage calculation:

mkdir -p $COV

Calculate gene coverages:

for i in $SAMPLE_LIST; do
        echo "Samtools: production of bam file $i"
        samtools view -b -o $COV/bwt.idba.$i.bam $MAPP/bwt.idba.$i.sam -S -@ 9
        echo "Samtools: bam sorting $i"
        samtools sort $COV/bwt.idba.$i.bam -o $COV/bwt.idba.$i.sort.bam -@ 9
        echo "Bedtools: calculating coverage of $i"
        bedtools genomecov -ibam $COV/bwt.idba.$i.sort.bam -g $GENE_PRED/prodigal.idba.$PRJ_NAME.fasta > $COV/coverage.$i.prodigal.bt
        echo "done"
done

Table parsing:

for i in $SAMPLE_LIST; do
        awk 'BEGIN {pc=""}
        {
                c=$1;
                if (c == pc) {
                        cov=cov+$2*$5;
                } else {
                        print pc,cov;
                        cov=$2*$5;
                pc=c}
        } END {print pc,cov}' $COV/coverage.$i.prodigal.bt | tail -n +2 > $COV/coverage.$i

done

Have a look to the output files:

cd $COV
ll