Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.)
Authors
Mladenović, Marko
Grčić, Nikola
Dudić, Dragana
Nikolić, Ana
Božić, Manja
Delić, Nenad
Prodanović, Slaven
Banović Đeri, Bojana
Other contributors
Matavulj, Milica
Conference contribution (Published version)
Abstract
Multidisciplinary research is now commonly used in plant breeding to improve important agronomic
traits. High-throughput genotyping technologies and genotype-phenotype association studies, widely
used for improving breeding programs, depend on bioinformatics analysis to extract information from
the gathered data. In this research, from among the plethora of widely used bioinformatics approaches, a
custom-made one was chosen, based on current recommendations in the field.
The material comprised a set of 46 maize inbred lines commonly used in maize breeding programs. Phenotyping
was done for thirteen important quantitative agronomic traits in 8 environments over two years (2018
and 2019). For genotyping, plants of all inbred lines were grown under optimal conditions
and sampled after completing the V4 growth stage. Total RNA was isolated from the third leaf of three plants
per inbred line and used for cDNA preparation with the Illumina TruSeq Stranded RNA LT kit. Paired-end RNA-Seq
based on Next Generation Sequencing methodology was performed on an Illumina MiSeq sequencer using the
MiSeq Reagent Kit v2 (2 x 150 bp). Raw sequencing data from the genome regions that were transcriptionally
active in maize leaves at the moment of sampling were used to identify single nucleotide polymorphisms (SNPs)
in each of the 46 inbred lines.
The bioinformatics pipeline for data manipulation and analysis was custom made and included FastQC (for
quality control (QC) of raw data), Trimmomatic v0.32 (for removal of adapters and contaminants, as well
as of regions with quality scores below 30), TopHat (insert size 130, standard deviation 50, maximum
intron size 100,000; for mapping filtered reads onto the B73 maize reference genome v3.0), Cufflinks
v2.2.1 (for read assembly), Cuffmerge (for the final transcriptome assembly) and the intersection of the
outputs of two independent SNP-calling tools, FreeBayes and BCFtools (to minimize false positive results).
To find SNP markers that show a strongly statistically supported relationship with favorable values of
the investigated quantitative traits, a genotype-phenotype association analysis was conducted. It was performed
using two approaches: one relying on the TASSEL software, widely used in agronomy, and the other based
on machine learning software such as WEKA, rarely used in agronomy. The results of the two approaches were
compared and discussed.
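The intersection step described above, keeping only SNPs reported by both callers, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`parse_vcf_sites`, `consensus_snps`) and the example records are hypothetical, and it assumes that two calls match when chromosome, position, reference allele and alternate allele are all identical. In practice a dedicated tool such as `bcftools isec` is often used for this purpose.

```python
from typing import Iterable, List, Set, Tuple

# A variant site is identified by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]

BASES = {"A", "C", "G", "T"}


def parse_vcf_sites(lines: Iterable[str]) -> Set[Variant]:
    """Collect SNP sites from VCF text lines, skipping headers and non-SNP records."""
    sites: Set[Variant] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # blank line or VCF header
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3].upper()
        # The ALT column may list several comma-separated alleles.
        for alt in fields[4].upper().split(","):
            if ref in BASES and alt in BASES:  # single-base ref and alt only
                sites.add((chrom, pos, ref, alt))
    return sites


def consensus_snps(calls_a: Iterable[str], calls_b: Iterable[str]) -> List[Variant]:
    """Return the sorted SNP sites reported in both call sets."""
    return sorted(parse_vcf_sites(calls_a) & parse_vcf_sites(calls_b))


# Hypothetical example records (not data from the study).
header = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
freebayes_lines = header + [
    "1\t100\t.\tA\tG\t50\t.\t.",
    "1\t200\t.\tC\tT\t40\t.\t.",   # reported only by the first caller
    "2\t300\t.\tG\tGA\t60\t.\t.",  # insertion, not a SNP
]
bcftools_lines = header + [
    "1\t100\t.\tA\tG\t55\t.\t.",
    "1\t250\t.\tT\tC\t45\t.\t.",   # reported only by the second caller
    "2\t300\t.\tG\tGA\t65\t.\t.",
]

shared = consensus_snps(freebayes_lines, bcftools_lines)
print(shared)
```

Requiring agreement between two callers trades sensitivity for specificity: a site missed by either tool is dropped, which is consistent with the stated aim of minimizing false positives.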
Keywords:
maize / bioinformatics / genotyping / RNA-Seq / genotype-phenotype association
Source:
Biologia Serbica, 2021, 43, 1 (Special Edition), 109-
Publisher:
- Novi Sad : Faculty of Sciences, Department of Biology and Ecology
Note:
- Book of Abstracts: Belgrade BioInformatics Conference 2021
Institution/group
Institut za molekularnu genetiku i genetičko inženjerstvo
TY - CONF
AU - Mladenović, Marko
AU - Grčić, Nikola
AU - Dudić, Dragana
AU - Nikolić, Ana
AU - Božić, Manja
AU - Delić, Nenad
AU - Prodanović, Slaven
AU - Banović Đeri, Bojana
PY - 2021
UR - https://imagine.imgge.bg.ac.rs/handle/123456789/1872
PB - Novi Sad : Faculty of Sciences, Department of Biology and Ecology
C3 - Biologia Serbica
T1 - Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.)
IS - 1 (Special Edition)
SP - 109
VL - 43
UR - https://hdl.handle.net/21.15107/rcub_imagine_1872
ER -
@conference{
  author = "Mladenović, Marko and Grčić, Nikola and Dudić, Dragana and Nikolić, Ana and Božić, Manja and Delić, Nenad and Prodanović, Slaven and Banović Đeri, Bojana",
  year = "2021",
  publisher = "Novi Sad : Faculty of Sciences, Department of Biology and Ecology",
  journal = "Biologia Serbica",
  title = "Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.)",
  number = "1 (Special Edition)",
  pages = "109",
  volume = "43",
  url = "https://hdl.handle.net/21.15107/rcub_imagine_1872"
}
Mladenović, M., Grčić, N., Dudić, D., Nikolić, A., Božić, M., Delić, N., Prodanović, S., & Banović Đeri, B. (2021). Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.). In Biologia Serbica, 43(1 (Special Edition)), 109. Novi Sad: Faculty of Sciences, Department of Biology and Ecology. https://hdl.handle.net/21.15107/rcub_imagine_1872
Mladenović M, Grčić N, Dudić D, Nikolić A, Božić M, Delić N, Prodanović S, Banović Đeri B. Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.). In Biologia Serbica. 2021;43(1 (Special Edition)):109. https://hdl.handle.net/21.15107/rcub_imagine_1872
Mladenović, Marko, Grčić, Nikola, Dudić, Dragana, Nikolić, Ana, Božić, Manja, Delić, Nenad, Prodanović, Slaven, Banović Đeri, Bojana. "Bioinformatics pipeline for genotyping and genotype - phenotype association study in maize (Zea mays L.)." In Biologia Serbica, 43, no. 1 (Special Edition) (2021): 109. https://hdl.handle.net/21.15107/rcub_imagine_1872