Extensive transcriptome analyses have been carried out in chickpea, the third most important legume, which is valued as a source of dietary protein and micronutrients. Over the last two decades, several laboratories have used a wide range of techniques, including expressed sequence tag (EST) analysis, serial analysis of gene expression (SAGE), microarrays and next-generation sequencing (NGS) technologies, to analyse chickpea transcriptomes. Chickpea transcriptome analysis progressed most significantly with the advent of NGS platforms. Various groups have carried out NGS-based gene expression analyses in vegetative and reproductive tissues such as shoot, root, mature leaf, flower bud, young pod, seed and nodule, which resulted in the identification of several tissue-specific transcripts. Some laboratories have used transcriptomics to explore the response of chickpea to abiotic and biotic stresses such as drought, salinity, heat, cold, Fusarium oxysporum and Ascochyta rabiei; these studies identified differentially expressed genes and also established crosstalk between biotic and abiotic stress responses. Transcriptome analysis has been used extensively to identify non-coding RNAs such as miRNAs and long intergenic non-coding (linc) RNAs. It has also facilitated the development of molecular markers such as simple sequence repeats (SSRs), single-nucleotide polymorphisms (SNPs) and potential intron polymorphisms (PIPs), which are being used to expedite chickpea breeding programmes. The available chickpea transcriptomes will continue to serve as the foundation for devising strategies for chickpea improvement.
- next-generation sequencing (NGS)
- gene expression
- molecular markers
2. Challenges in chickpea production
The world average chickpea productivity is 982.1 kg/ha (FAOSTAT 2014); however, a simulation study showed that the potential productivity of chickpea under rain-fed conditions ranges from 1390 to 4590 kg/ha. This leaves a large yield gap of 408–3608 kg/ha. A number of biotic and abiotic factors affect chickpea plant growth and are therefore responsible for poor productivity.
Chickpea is mostly raised on conserved soil moisture under rain-fed conditions. Therefore, drought stress generally affects the crop at the terminal stage and leads to productivity losses of up to 50%. Drought reduces overall biomass, reproductive growth and seed yield and increases flower abortion, pod abscission and the number of empty pods. Soil salinity affects productivity by delaying flowering, which decreases the reproductive success of chickpea. Since chickpea is a cool-season crop, high temperatures adversely affect the development of the plant. Chander reported a decline in chickpea yield of about 301 kg/ha per 1°C increase in mean seasonal temperature in India [12, 13]. Biotic factors also adversely affect the yield of the chickpea crop.
3. Legume genomics
With the advent of next-generation sequencing technologies, there has been a rapid increase in the efficiency of DNA and RNA sequencing and decrease in the cost involved.
The advances in DNA sequencing have led to whole genome sequencing of important legumes such as
Next-generation sequencing (NGS)-based plant genomics has also assisted in understanding genetic variation within and between species, mostly through the identification of single-nucleotide polymorphisms (SNPs). In chickpea, SNPs have been identified in a number of studies and utilized for various applications such as construction of linkage maps, synteny analysis, anchoring of whole genome sequences and quantitative trait loci (QTL) analysis [44–49]. A database, CicArVarDB, has also been developed, which includes SNP and InDel variations in chickpea.
A cell undergoing a functional or developmental process has a specific set of genes undergoing transcription at a particular time; these transcripts are collectively called the ‘transcriptome’. Thus, a transcriptome represents, to an extent, the physical, biochemical and developmental status of a cell. A transcriptome comprises a pool of protein-coding as well as non-protein-coding RNAs; moreover, variants of genes originating from alternative splicing and RNA editing may be present, making the transcriptome more complex than the genome. Study of the transcriptome may reveal the spatial and temporal expression patterns of genes, and it is therefore possible to generate global expression profiles of genes representing the developmental stages of an organism.
5. Methods for transcriptome analysis
Transcriptome analysis was initiated with the generation of expressed sequence tags (ESTs), which are cDNA sequences 200–800 nucleotides long, synthesised from mRNA through reverse transcription. ESTs represent the expressed part of an organism’s genome and hence are an excellent resource for the study of gene expression at a genome-wide level. Conventionally, EST resources have been developed through Sanger sequencing. Although this process can generate and sequence longer fragments of cDNA, it is tedious and labour intensive and offers poor coverage of the transcriptome. These limitations of EST-based transcriptome analysis inspired scientists to develop microarray and other tag-based methods for gene expression analysis, and tools such as microarrays and serial analysis of gene expression (SAGE) continued to be used for several years for analysis of global gene expression patterns. With the advent of NGS and the simultaneous development of in silico analytical tools, global genome and transcriptome analysis has become standard practice for deriving information that relates genotype to phenotype. However, it is not possible to sequence the transcripts to their full length owing to technological limitations. NGS-based transcriptome analysis rests on the principle that the depth of coverage of a sequence is proportional to the level of expression of the corresponding gene. Therefore, by mapping the sequenced reads onto a given transcript and counting them, expression can be measured, thereby translating sequence information into biologically meaningful information. A host of NGS technologies such as sequencing by synthesis (Illumina Inc., USA), SOLiD (ThermoFisher Scientific) and pyrosequencing (454 Life Sciences/Roche) have provided unprecedented opportunities for high-throughput functional genomic research [51–53].
Moreover, a number of technologies for transcriptome sequencing are emerging, such as Ion Torrent (ThermoFisher Scientific), single-molecule real-time (SMRT) sequencing (Pacific Biosciences, USA) and nanopore sequencing (Oxford Nanopore Technologies, UK).
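The counting principle described in this section can be illustrated with a minimal sketch. It assumes that reads have already been mapped and counted per transcript; the transcript names, lengths and counts are made-up values, and the transcripts-per-million (TPM) normalisation is one common choice, not necessarily the measure used in the studies cited here.

```python
def tpm(counts, lengths):
    """Convert raw read counts to transcripts per million (TPM).

    counts  -- dict: transcript id -> number of mapped reads
    lengths -- dict: transcript id -> transcript length in nucleotides
    """
    # Reads per kilobase: at equal expression, a longer transcript
    # collects proportionally more reads, so divide by length first.
    rate = {t: counts[t] / (lengths[t] / 1000.0) for t in counts}
    total = sum(rate.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    # Rescale so that the values of one sample sum to one million.
    return {t: r / total * 1e6 for t, r in rate.items()}


# Hypothetical example: tx_A and tx_B receive the same number of reads,
# but tx_B is twice as long, so its estimated expression is half as high.
counts = {"tx_A": 500, "tx_B": 500, "tx_C": 0}
lengths = {"tx_A": 1000, "tx_B": 2000, "tx_C": 1500}
expr = tpm(counts, lengths)
```

Dividing by transcript length before rescaling is what makes values comparable between transcripts of different sizes; raw counts alone would overstate the expression of long transcripts.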
6. Using transcriptome analysis for studying biological processes in chickpea
Extensive transcriptome analysis has been carried out in chickpea in order to gain insights into numerous biological processes. Techniques such as EST sequencing, SAGE and, most importantly, NGS have been used to analyse the transcriptomes of root, shoot, flower, seed and nodule tissues in order to understand tissue-specific development and function. Several groups undertook EST sequencing, and to date (March 2017) 53,333 chickpea ESTs are reported in the NCBI database. In another earlier study of the root transcriptome, an EST library was constructed by suppression subtractive hybridization (SSH) of two related chickpea varieties, ICC 4958 and Annigeri, as they show different root traits. Sequences of more than 2800 ESTs were reported and used to develop the ‘Chickpea Root Expressed Sequence Tag Database’.
A major advance in transcriptome analysis for understanding developmental and biological processes occurred with the advent of NGS platforms. Several large-scale NGS-based transcriptome analyses were carried out in chickpea [34–36]. In one of the first NGS-based studies, Illumina sequencing of the root and shoot transcriptome of chickpea genotype ICC 4958, followed by de novo assembly, generated 53,409 transcripts. Of these, 34,676 transcripts were annotated, and 6577 transcripts were identified as transcription factors (TFs) belonging to 57 families. Another study by Garg et al. reported the Roche/454-based transcriptomes of ‘shoot’, ‘root’, ‘mature leaf’, ‘flower bud’ and ‘young pod’ of chickpea genotype ICC 4958. The sequence reads generated by the Roche/454 platform were merged with the Illumina reads from the previous study, and a hybrid assembly was generated, which resulted in 34,760 tentative consensus (TC) transcripts. Of these, 1851 transcripts were annotated as transcription factors belonging to 84 families. This analysis also led to the identification of 1132, 695, 513, 408 and 126 TCs specifically expressed in flower bud, young pod, shoot, root and mature leaf, respectively. The complete data were integrated, leading to the development of the ‘Chickpea Transcriptome Data Base’ (CTDB), which provides a searchable interface to the chickpea transcriptome data. Further, transcriptome analysis of the wild progenitor of chickpea, i.e.
Flower development is an important and specialized process that takes place in angiosperms. Hence, in order to gain insights into the molecular mechanisms responsible for flower development in chickpea, transcriptome analysis was carried out using the Illumina sequencing platform. Transcriptome sequencing was carried out for eight successive developmental stages of the flower (flower buds at sizes 4, 6, 8 and 8–10 mm and flowers with closed petals, partially opened petals, opened and faded petals and senescing petals) along with young leaf, germinating seedling and shoot apical meristem. Differential expression analysis revealed 1572 genes to be differentially expressed in at least one stage of flower development. A total of 1118 genes (908 upregulated and 201 downregulated) and 966 genes (857 upregulated and 109 downregulated) were found to be differentially regulated in flower bud and flower developmental stages, respectively. The majority of the differentially expressed genes were found to be involved in various flower developmental pathways such as floral organ identity, development of the corolla, androecium and gynoecium, and gametophyte development. Moreover, genes related to cell wall development and transport were also found to be differentially expressed. In addition, 111 TF genes were found to be differentially expressed in floral bud and flower.
Chickpea is most valued for its seeds since they serve as a source of protein, especially for vegetarian populations. Therefore, a thorough understanding of the transcriptional flux during seed development is important in order to gain insights into the biological processes that define the seed. Towards this, an NGS-based deep transcriptome analysis of chickpea seed at four developmental stages, i.e. 10 days after anthesis (DAA), 20 DAA, 30 DAA and 40 DAA, was carried out. The transcriptome was sequenced using 454 pyrosequencing on the GS-FLX Titanium platform and assembled into 51,099 transcripts. A gene ontology enrichment of seed-specific genes revealed genes related to reproductive structure development, fruit development and embryonic and post-embryonic development to be highly represented. Many metabolic pathways such as proteolysis, lipid metabolic process, regulation of RNA metabolic process, regulation of transcription, terpenoid metabolic process and gibberellin metabolic processes were also found to be significantly represented. In another study, sequencing of ESTs from the chickpea embryo resulted in the identification of 1480 unigenes expressed during embryo development. The analysis also identified 12 genes encoding F-box proteins, of which 2 F-box genes (
Another important distinctive feature of chickpea is its ability to form symbiotic relationship with
7. Using transcriptome analysis for study of stress response in chickpea
Transcriptome analysis has been utilized extensively to study different abiotic and biotic stress responses in chickpea. Drought and salinity are the major factors that limit the growth and productivity of the plants. Terminal drought is thought to be a major constraint on chickpea productivity, as it can lower yield by about 50%. Cold stress also affects susceptible chickpea, mainly at the reproductive stage, where it leads to pollen sterility and flower abortion. Thus, it is important to study the response of chickpea under these stress conditions in order to devise strategies for the development of stress-tolerant chickpea. Earlier studies based on EST sequencing, SAGE and microarrays provided preliminary evidence for drought responses of chickpea at the transcriptome level [61–66]. An EST sequencing-based study of drought and salinity stress in chickpea generated 20,162 ESTs, of which 105 were found to have differential expression during one of the stresses. Another comparison of ESTs generated from the chickpea varieties ICC 4958 (drought tolerant) and ICC 1882 (drought sensitive) resulted in the identification of 5494 drought-responsive ESTs. A microarray-based transcriptome analysis of root and leaf of chickpea under drought stress resulted in the identification of 4815 differentially expressed genes. Approximately 2623 and 3969 genes were found to be differentially expressed, whereas 88 and 52 genes were found to be specifically expressed, during drought stress in root and leaf tissues, respectively. Another microarray analysis in chickpea revealed 109, 210 and 386 genes to be differentially expressed in drought, cold and high-salinity stresses, respectively. A SuperSAGE-based transcriptome analysis of drought-stressed and control chickpea tissues gave rise to 17,493 unique transcripts (UniTags), of which 7532 were differentially expressed in drought stress.
Another SuperSAGE analysis, followed by 454 sequencing, of the root and nodule transcriptome of the salt-tolerant variety INRAT-93 identified 363 and 106 genes to be upregulated and downregulated, respectively, in root and nodule tissues.
A more global view of the stress response in chickpea was provided by the study of Garg et al., in which the transcriptome of chickpea root and shoot under desiccation, salinity and cold stress was analysed. Illumina sequencing-based transcriptome comparison revealed 11,640 transcripts to be differentially expressed during at least one of the stresses. Seven hundred forty-five transcription factors (TFs) were also found to be differentially regulated in at least one stress condition. Moreover, 3536 unannotated genes from the chickpea transcriptome were also identified. A more detailed transcriptome analysis of drought-tolerant (ICC 4958), drought-sensitive (ICC 1882), salinity-tolerant (JG 62) and salinity-sensitive (ICCV2) chickpea varieties resulted in the identification of 18,462 transcripts representing 13,964 unique loci in at least one sample/stress condition. The study also revealed 4954 and 5545 genes exclusively regulated in drought-tolerant and salinity-tolerant varieties, respectively. A total of 775 TF-encoding genes belonging to 80 families were also found to be differentially regulated in stress conditions. Members of the bHLH, WRKY, NAC, AP2-EREBP and MYB families were among the top differentially expressed TFs in stress conditions. In order to understand the effect of cold stress, an AFLP-based transcript profiling (cDNA-AFLP) approach was used, which showed that in cold-tolerant chickpea, 102 transcript-derived fragments (TDFs) were differentially expressed during cold stress. Moreover, transcriptome analysis of cold-tolerant chickpea ICC 16349 using cDNA differential display (DDRT-PCR) resulted in the identification of 127 ESTs as differentially expressed in anthers during cold stress conditions.
In order to identify common genes between biotic and abiotic responses in chickpea, Mantri et al. performed microarray analysis of chickpea ICC 3996 under three abiotic stresses (drought, cold and high salinity) and biotic stress (infection with
8. Transcriptome analysis for non-coding RNA studies in chickpea
Non-coding RNAs usually act as regulatory elements that have a decisive role in the fine regulation of gene activity. Non-coding transcripts comprise small and long non-coding RNAs. Small non-coding RNAs regulate diverse developmental processes by controlling gene expression at the transcriptional and post-transcriptional levels [72, 73]. MicroRNAs (miRNAs) constitute the major class of small non-coding RNAs and are key regulatory elements 20–24 nucleotides long. They are highly conserved and play an important part in various developmental processes in plants such as leaf development, flowering, formation and maintenance of the shoot, floral and axillary meristems, establishment of organ polarity, root nodule symbiosis, vegetative to reproductive phase transition and response to biotic and abiotic stresses [73–77]. In chickpea, small RNA libraries were sequenced from normal tissues and those under different stress conditions [78–80]. Small RNA sequence data were filtered and processed for miRNA prediction using the miRDeep pipeline, resulting in the identification of distinct conserved miRNAs from shoot (302, including Cat-miR156b-5p, Cat-miR156j.1, Cat-miR159.1, Cat-miR169b-5p), root (280, including Cat-miR156c.1, Cat-miR169n, Cat-miR171k-3p), mature leaf (248, including Cat-miR156k, Cat-miR172d.2, Cat-NovmiR319b, Cat-miR167a, Cat-miR167d.2), stem (268, including Cat-miR172c-3p, Cat-miR159.3, Cat-NovmiR319d, Cat-miR171k-3p), flower bud (247, including Cat-miR319g.2, Cat-miR167c.2, Cat-miR167d.1, Cat-miR171b-3p.2), flower (293, including Cat-miR159.4, Cat-miR159e, Cat-miR171m) and young pod (274, including Cat-miR172d.1, Cat-NovmiR159a, Cat-miR167-5p). By ab initio prediction, a total of 109, 76, 123, 100, 106, 98 and 120 novel candidate miRNAs were identified from the above tissues, respectively. Overall, 618 miRNAs were identified from all the tissues, with the maximum being 373 miRNAs from the shoot and the minimum 303 from flower buds.
Of the 618 miRNAs predicted, 158 were present in all the tissues, and 29% of the miRNAs were found to be tissue specific. Of the 618 miRNAs, 421 were clustered into 73 miRNA families, and 197 showed no similarity to any miRNA family and were termed putative novel. Chickpea miRNAs targeted a wide range of transcripts involved in diverse cellular processes including protein turnover and modification, metabolism, transcriptional regulation and signal transduction. A similar study performed in leaf and flower tissue resulted in the prediction of 96 highly conserved miRNAs belonging to 38 miRNA families and 20 novel miRNAs belonging to 17 miRNA families in chickpea. In addition to the identification of miRNAs from different tissues, studies were also conducted to characterize miRNAs in response to different biotic and abiotic stresses. In one such study, three libraries were sequenced for small RNA identification. Libraries were constructed from fungal-infected (
Long intergenic non-coding (linc) RNAs belong to a class of non-coding transcripts that are at least 200 bp long, lack coding potential and are transcribed from the intergenic regions between protein-coding genes [81, 82]. Linc-RNAs control gene regulation at the transcriptional and post-transcriptional levels by mechanisms including chromatin modification, attachment to the promoter-binding complex and shielding mRNA from degradation by acting as sponges for miRNAs [83–85]. RNA-seq data from 11 different tissues of chickpea were used for mining linc-RNAs. The RNA-seq data were processed using the TopHat2 and Cufflinks programs with the chickpea genome as the reference. From the 32,984 transcripts obtained, 5782 putative intergenic transcripts were extracted and subjected to an optimized pipeline for the identification of linc-RNAs. After removing potential coding transcripts and transcripts having similarity to protein domains, a total of 2248 transcripts were retained as putative chickpea linc-RNAs. About 79%, i.e. 1790 linc-RNAs, could be assigned a putative function. Expression profiling showed that a large number of linc-RNAs have tissue-specific expression. In addition, several linc-RNAs were found to be targets of miRNAs and were involved in various developmental and reproductive processes.
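The filtering logic described above can be sketched in a few lines. This is an illustration only, not the optimized pipeline used in the cited study: the `Transcript` record, its field names and the 100-amino-acid ORF cutoff are assumptions made for the example, and only the 200 bp minimum length comes from the definition given in the text.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    tid: str          # transcript identifier (hypothetical)
    length: int       # transcript length in bp
    intergenic: bool  # True if it does not overlap a protein-coding gene
    max_orf_aa: int   # longest predicted ORF, in amino acids
    domain_hit: bool  # True if it resembles a known protein domain


MIN_LENGTH_BP = 200  # minimum length, from the definition in the text
MAX_ORF_AA = 100     # assumed cutoff for "lacking coding potential"


def is_putative_lincrna(t: Transcript) -> bool:
    """Apply the length, location and coding-potential filters in turn."""
    return (
        t.length >= MIN_LENGTH_BP
        and t.intergenic
        and t.max_orf_aa < MAX_ORF_AA
        and not t.domain_hit
    )


# Made-up records, one failing each filter and one passing all of them.
transcripts = [
    Transcript("t1", 850, True, 40, False),    # passes every filter
    Transcript("t2", 150, True, 20, False),    # too short
    Transcript("t3", 1200, False, 30, False),  # overlaps a coding gene
    Transcript("t4", 900, True, 310, False),   # long ORF, likely coding
    Transcript("t5", 700, True, 60, True),     # protein-domain similarity
]
lincrnas = [t.tid for t in transcripts if is_putative_lincrna(t)]
```

In practice the coding-potential and domain-similarity fields would come from dedicated prediction and database-search tools; here they are supplied directly so that the filtering order is easy to follow.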
9. Expanding transcriptome data to aid development of molecular markers
A DNA-based molecular marker is a DNA sequence with an identifiable location on the genome that can be transmitted from one generation to the next following the standard laws of inheritance. Recent years have witnessed immense interest in the generation and utilization of molecular markers, as they provide the essential tools for a variety of genomic applications such as QTL mapping, map-based cloning, marker-assisted breeding, association mapping and genetic diversity assessment. These approaches can be applied to understand the genomic architecture of the crop and can improve the efficiency of breeding programmes, thereby helping to expedite agricultural research. The advent of NGS has enabled the exploration of thousands of markers across entire genomes and transcriptomes. Although transcriptomics has mainly been used for gene expression analysis, it has also been utilized to identify molecular markers such as SSRs and SNPs, especially in genic regions. Such gene-based markers located in the coding regions of genes greatly enhance the opportunity for precise mapping of genes linked to important traits. Transcriptome sequencing offers another advantage for crops in which a reference genome is not available. To identify SSRs from transcriptome data, several bioinformatics tools have been developed, such as MISA (MIcroSAtellite identification tool) (pgrc.ipk-gatersleben.de/misa/), RISA (Rapid Identification of SSRs and Analysis of primers) (http://sol.kribb.re.kr/RISA/) and RepeatAnalyzer. In chickpea, a large number of molecular markers were initially derived from ESTs. Buhariwalla et al. reported 106 EST-based markers developed from an EST library of root tissue from chickpea. In another study by Choudhary et al., 2131 ESTs were utilized for the development of 246 EST-SSR markers. Apart from SSRs, several types of markers such as expressed sequence tag polymorphisms (ESTPs), potential intron polymorphisms (PIPs) and EST-SNPs were developed in chickpea using transcriptome data. For instance, Choudhary et al. reported 125 EST-SSRs, 109 ESTPs, 102 SNPs and 151 ITPs. Gupta et al. reported 367 novel EST-derived functional markers, which included 187 EST-SSRs, 130 PIPs and 50 ESTPs. In another study, 71 gene-based SNP markers were developed utilizing candidate chickpea transcripts.
Currently, however, transcriptomic resources can be easily generated by high-throughput NGS technologies and utilized to identify molecular markers rapidly and cost-effectively. Hiremath et al., utilizing the Roche platform, generated about 3000 gene-based markers from a large subset of transcripts derived from different chickpea tissues. SNPs are now the markers of choice and are preferred over SSRs and other markers because of their genome-wide presence and amenability to high-throughput genotyping. SNP calling may be defined as the process of identifying a single-nucleotide variation in the reads of an accession that differs from the existing reference genome or a de novo assembly at the same nucleotide position. Read alignment files generated by mapping programs such as BWA, Bowtie and SOAP are used to perform SNP calling. Bioinformatics tools such as HaploSNPer, SAMtools [94, 95], POLYBAYES, SNVer and SOAPsnp have been designed to detect variations in NGS data. Comparison of transcriptome datasets from contrasting genotypes can help derive SNPs. To date, several studies have used NGS-based transcriptome sequencing to generate large sets of molecular markers in various crop species including chickpea. For instance, a report by Garg et al. facilitated the identification of 4816 SSRs from the de novo assembly of the chickpea transcriptome. In another study, sequencing the transcriptome of
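The definition of SNP calling given in this section can be made concrete with a deliberately naive sketch. It assumes ungapped reads that are already aligned to a reference, represented as (start position, sequence) pairs with 0-based coordinates; the function name, the depth and allele-fraction thresholds and the toy sequences are all hypothetical. The tools named above additionally use base and mapping qualities and statistical models, which this example omits.

```python
from collections import Counter, defaultdict


def call_snps(reference, reads, min_depth=3, min_fraction=0.8):
    """Report positions where aligned reads consistently differ from the reference.

    reference -- reference sequence (string)
    reads     -- iterable of (start, sequence) pairs, ungapped, 0-based start
    Returns a list of (position, reference_base, alternative_base, depth).
    """
    # Pile up the bases observed at every reference position.
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos in sorted(pileup):
        depth = sum(pileup[pos].values())
        base, count = pileup[pos].most_common(1)[0]
        # Call a SNP only if the dominant base differs from the reference
        # and is supported by enough reads (assumed thresholds).
        if (base != reference[pos] and depth >= min_depth
                and count / depth >= min_fraction):
            snps.append((pos, reference[pos], base, depth))
    return snps


# Toy example: four reads all carry T where the reference has A (position 4).
reference = "ACGTACGTAC"
reads = [(0, "ACGTTCGT"), (2, "GTTCGTAC"), (3, "TTCGTAC"), (1, "CGTTCG")]
snps = call_snps(reference, reads)
```

The depth and fraction thresholds guard against calling a variant from a single erroneous read; raising them trades sensitivity for fewer false positives.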
10. Future perspectives
The last few years have witnessed legume genomics attaining new heights as genomes, and transcriptomes of many model legumes (