
Volume 22 Supplement 1

Genomics & Informatics

Genome-wide identification, characterization, and expression analysis of the small auxin-up RNA gene family during zygotic and somatic embryo maturation of the cacao tree (Theobroma cacao)

Abstract

Small auxin-up RNA (SAUR) proteins form a large family that is thought to participate in various biological processes in higher plants. However, the SAUR family has not yet been explored in cacao (Theobroma cacao L.), one of the most important industrial tree crops. The present in silico study examined the structure, phylogeny, and expression of the TcSAUR gene family in cacao. A total of 90 members of the TcSAUR gene family were identified and annotated in the cacao genome. Physicochemical analysis showed that the TcSAUR proteins shared broadly similar characteristics. Phylogenetic analysis showed that the TcSAUR proteins could be categorized into seven distinct groups comprising 10 sub-groups. Our results suggest that tandem duplication, segmental duplication, and whole genome duplication events may have contributed to the expansion of the TcSAUR gene family in cacao. By re-analyzing available transcriptome data, we found that a number of TcSAUR genes were exclusively expressed during zygotic embryogenesis and somatic embryogenesis. Taken together, our study provides a basis for further functional characterization of candidate TcSAUR genes for the genetic engineering of cacao.

1 Introduction

Cacao (Theobroma cacao L.), a member of the family Malvaceae, is one of the most important industrial crops worldwide. Originating from Central and South America [1, 2], cacao is now grown in at least fifty countries in the humid tropics. As an excellent source of essential nutrients, minerals, and antioxidants, cacao beans are used in chocolate production, confectionery, and cosmetics [3, 4]. However, climate change, and in particular the biotic and abiotic stresses associated with it, could threaten cocoa production [5, 6].

To address the problems caused by adverse environmental conditions, various studies have concentrated on the functions of gene families [7,8,9,10], because understanding the roles of these functional and regulatory genes may make it possible to develop new climate change-adapted lines through genetic engineering. Of particular interest, the small auxin-up RNA (SAUR) genes constitute the largest family of early auxin-responsive genes in higher plants. The expression of many SAUR genes is induced rapidly and transiently by auxin, and SAUR proteins play crucial roles in regulating plant growth, development, and responses to environmental stresses [11]. To gain insight into their functions, SAUR families have been analyzed in many important crop species, including potato (Solanum tuberosum) and tomato (Solanum lycopersicum) [12], watermelon (Citrullus lanatus) [13], cotton (Gossypium spp.) [14], moso bamboo (Phyllostachys edulis) [15], poplar (Populus trichocarpa) [16], grape (Vitis vinifera) [17], apple (Malus domestica) [18], coffee (Coffea canephora) [19], Chinese white pear (Pyrus bretschneideri) [20], melon (Cucumis melo) [21], loquat (Eriobotrya japonica) [22], wax gourd (Benincasa hispida) [23], peanut (Arachis hypogaea) [24], pineapple (Ananas comosus) [25], foxtail millet (Setaria italica) [26], cucumber (Cucumis sativus) [27], and longan (Dimocarpus longan) [28]. However, this important gene family has not been characterized in cacao, even though the cacao genome assembly has been released [29].

Thus, the aim of the present study was to systematically identify, annotate, and characterize the SAUR family in cacao. First, all putative members of the SAUR family were screened and validated in the cacao assembly. The general features of the proteins and genes were then explored using various web-based tools. Next, we constructed an unrooted phylogenetic tree of the SAUR proteins and predicted the duplication events in the SAUR gene family. Finally, we re-analyzed a previously published transcriptome dataset to investigate the expression levels of the SAUR genes in various tissues during zygotic embryogenesis and somatic embryogenesis.

2 Materials and methods

2.1 Identification of the SAUR genes in cacao

To identify the SAUR family members in the cacao genome, the whole genome and proteome data of T. cacao cultivar “B97-61/B2” (NCBI RefSeq assembly: GCF_000208745.1, date of release: Jul 9, 2016) were downloaded from the NCBI [29]. The hidden Markov model profile of the conserved functional domain of SAUR (Pfam accession: PF02519) [11] was obtained from the Pfam database [30]. The profile was then used to screen the cacao proteome [29] to obtain the potential members of the SAUR gene family. The full-length protein sequence, genomic DNA sequence, and coding DNA sequence of each member of the SAUR family in cacao were retrieved for subsequent analysis.
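
The screening step can be illustrated with a short sketch. The study does not name the search program, so the sketch assumes, purely for illustration, that the PF02519 profile was searched against the proteome with HMMER's hmmsearch and that the results were saved in its tabular (`--tblout`) format. The E-value cut-off of 1e-5 is an assumption rather than a value reported here, and the protein identifiers and scores in the sample are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-5) -> List[Tuple[str, float]]:
    """Return (protein_id, full-sequence E-value) for hits passing the cut-off.

    Assumes the HMMER --tblout layout: whitespace-separated columns in which
    column 1 is the target name and column 5 is the full-sequence E-value.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    # Most significant hits first
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical rows for illustration only; identifiers and scores are made up.
SAMPLE = """\
# target name  accession  query name       accession  E-value  score  bias
prot_A         -          Auxin_inducible  PF02519    3.1e-28  95.2   0.1
prot_B         -          Auxin_inducible  PF02519    0.5      8.4    0.0
prot_C         -          Auxin_inducible  PF02519    1.2e-35  118.9  0.3
"""

candidates = parse_tblout(SAMPLE.splitlines())
print(candidates)
```

Candidates retained in this way would still need to be checked for the presence of the SAUR domain before being counted as family members.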

2.2 Prediction of the SAUR protein characteristics in cacao

The full-length amino acid sequences of the SAUR proteins in cacao were submitted to the Expasy ProtParam tool [31, 32] as previously described [9, 33,34,35,36]. Specifically, the common features of each SAUR protein, including protein length, isoelectric point (pI), molecular weight (mW), aliphatic index (AI), and grand average of hydropathicity (GRAVY), were estimated.
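
Two of these descriptors are simple enough to compute directly, which helps when interpreting Table 1. The following is a minimal sketch, not the ProtParam implementation: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, AI = X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)), where X is the mole percentage of each residue. The example sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    # A non-standard residue raises KeyError rather than being silently ignored
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(residue: str) -> float:
        return 100.0 * seq.count(residue) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def is_hydrophilic(sequence: str) -> bool:
    """A negative GRAVY score is read as hydrophilic, as in the text."""
    return gravy(sequence) < 0


example = "MAKLRSVILKE"  # hypothetical peptide, not a TcSAUR sequence
print(round(gravy(example), 2), round(aliphatic_index(example), 2))
```

With this convention, a protein is called hydrophilic when its GRAVY score is negative and hydrophobic when it is positive.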

2.3 Construction of the phylogenetic tree of the SAUR proteins in cacao

The full-length amino acid sequences of the SAUR proteins in cacao were used to generate an unrooted phylogenetic tree as previously described [9, 33,34,35,36]. First, the ClustalW software [37, 38] was used for the multiple sequence alignment of the SAUR proteins in cacao. All members of the SAUR families from Arabidopsis thaliana [39, 40] and coffee [19] were also retrieved for comparative phylogenetic analysis. The alignments were then imported into the Molecular Evolutionary Genetics Analysis (MEGA) software [41] to construct an unrooted phylogenetic tree. The tree was inferred by the maximum likelihood method with default settings. Finally, the Adobe Illustrator software was used to edit and visualize the resulting tree.

2.4 Prediction of gene duplication of the SAUR genes in cacao

The duplication events in the SAUR gene family in cacao were predicted from the MEGA-based phylogenetic tree as previously described [9, 33,34,35,36]. Specifically, SAUR members in the same clade with high bootstrap values were assigned as a duplicated pair, provided that they shared more than 70% sequence identity. A duplicated pair was defined as a tandem duplication event if the genes were located next to each other on the same chromosome within a distance of 100 kb, whereas a segmental duplication event referred to duplications of DNA segments ranging in size from 1 to 200 kb that occur on the same chromosome [42]. A duplicated pair was attributed to a whole genome duplication event if the two genes were located on different chromosomes [42].
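
These rules can be written as a small decision procedure. The sketch below is a simplified reading of the criteria, not the authors' pipeline: it assumes that a same-chromosome pair separated by more than 100 kb is treated as segmental, and it ignores the bootstrap support and the 1 to 200 kb segment-size condition. The `Gene` record, the function name, and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chromosome: str
    start: int  # bp
    end: int    # bp


def classify_pair(a: Gene, b: Gene, identity: float,
                  min_identity: float = 70.0,
                  tandem_max_bp: int = 100_000) -> str:
    """Classify a candidate duplicated pair by identity and location."""
    if identity <= min_identity:
        return "not_duplicated"      # must share more than 70% identity
    if a.chromosome != b.chromosome:
        return "whole_genome"        # genes on different chromosomes
    # Distance between the two genes; overlapping genes count as 0 bp apart
    gap = max(max(a.start, b.start) - min(a.end, b.end), 0)
    if gap <= tandem_max_bp:
        return "tandem"              # same chromosome, within 100 kb
    return "segmental"               # same chromosome, farther apart (assumption)


# Hypothetical coordinates for illustration only
g1 = Gene("gene_1", "chr2", 1_000, 2_000)
g2 = Gene("gene_2", "chr2", 50_000, 51_000)
g3 = Gene("gene_3", "chr2", 1_000_000, 1_001_000)
g4 = Gene("gene_4", "chr9", 5_000, 6_000)

for partner in (g2, g3, g4):
    print(g1.name, partner.name, classify_pair(g1, partner, identity=85.0))
```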

2.5 Exon/intron structural analysis of the SAUR genes in cacao

The exon–intron structures of the genes encoding the SAUR proteins in cacao were analyzed as previously described [9, 33,34,35,36]. Specifically, the genomic DNA sequence and coding DNA sequence of each SAUR gene in cacao were submitted to the Gene Structure Display Server [43] for alignment. The gene structures were displayed in the order of the SAUR proteins in the phylogenetic tree. We then used the Adobe Illustrator software to edit the figure.

2.6 Transcriptome analysis of the SAUR genes in cacao

The expression profiles of the SAUR genes were analyzed using the published transcriptome atlas available in the NCBI Gene Expression Omnibus [44] as previously described [9, 33,34,35,36]. We used the GSE55476 dataset to assess the expression levels of the SAUR genes in six tissue types and stages of embryogenesis [45]. Specifically, the libraries were prepared from zygotic embryo tissues at 14 (T-ZE), 16 (EF-ZE), 18 (LF-ZE), and 20 weeks after pollination (M-ZE), from whole somatic embryos at the late torpedo stage (LT-SE), and from cotyledon tissues of mature somatic embryos (M-SE) [45]. The expression of the SAUR genes was visualized with an R script [46]. Expression levels are indicated by a color bar that changes from green (low) to red (high).
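
The color scale can be illustrated with a short sketch. The original visualization was produced in R [46]; the Python code below is only an illustration of one way to map expression values onto a green-to-red gradient, assuming a linear scale over the whole matrix. The gene names and expression values are hypothetical, and the function names are not taken from the study.

```python
from typing import Dict, List, Tuple

SAMPLES = ["T-ZE", "EF-ZE", "LF-ZE", "M-ZE", "LT-SE", "M-SE"]


def green_to_red(value: float, vmin: float, vmax: float) -> Tuple[int, int, int]:
    """Map a value onto an RGB color from green (low) to red (high)."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Position on the scale, clamped to the range [0, 1]
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    return (int(round(255 * t)), int(round(255 * (1.0 - t))), 0)


def color_matrix(expression: Dict[str, List[float]]) -> Dict[str, List[Tuple[int, int, int]]]:
    """Color every cell using the global minimum and maximum of the matrix."""
    values = [v for row in expression.values() for v in row]
    vmin, vmax = min(values), max(values)
    return {gene: [green_to_red(v, vmin, vmax) for v in row]
            for gene, row in expression.items()}


# Hypothetical expression values, one per sample in SAMPLES
EXPRESSION = {
    "gene_1": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "gene_2": [12.0, 12.0, 2.0, 2.0, 7.0, 7.0],
}

colors = color_matrix(EXPRESSION)
for gene, row in colors.items():
    print(gene, dict(zip(SAMPLES, row)))
```

Scaling over the whole matrix keeps colors comparable between genes; scaling each row separately would instead emphasize the pattern of each gene across the six samples.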

3 Results and discussion

3.1 Identification and annotation of TcSAUR genes in cacao

To identify all the putative SAUR genes in cacao, the profile of the SAUR domain [11] was used to search the proteome of cacao [29]. As a result, a total of 90 SAUR genes were identified and annotated in the cacao genome (Table 1). Based on their order of occurrence in the cacao genome, the putative members of the SAUR family in cacao were named TcSAUR01 to TcSAUR90, where “Tc” abbreviates the scientific name of cacao (Theobroma cacao) and “SAUR” abbreviates the name of the protein (small auxin-up RNA) (Table 1, Fig. 1). The putative TcSAUR genes were unevenly distributed across the cacao genome. Uneven chromosomal distributions of the SAUR gene family have also been reported in other higher plant species, such as coffee [19], melon [21], and wax gourd [23], which is consistent with our finding.

Table 1 Physical and chemical properties of the SAUR family in cacao

Fig. 1 Physical distribution of the SAUR gene family in the cacao genome

The SAUR family has previously been explored in various higher plant species, such as potato [12], tomato [12], watermelon [13], cotton [14], moso bamboo [15], poplar [16], grape [17], apple [18], coffee [19], Chinese white pear [20], melon [21], loquat [22], wax gourd [23], peanut [24], pineapple [25], foxtail millet [26], cucumber [27], and longan [28]. More specifically, 31 and 38 members of the SAUR families have been recorded in coffee and moso bamboo, respectively [15, 19]. Previous studies also revealed 52, 60, 62, and 68 SAUR proteins in pineapple, grape, cucumber, and wax gourd, respectively [17, 23, 25, 27]. Meanwhile, 98, 105, and 116 putative SAUR proteins were found in apple, poplar, and Chinese white pear, respectively [16, 20, 47]. These comparisons suggest that the SAUR families in higher plant species are large and that their sizes vary greatly between species.

3.2 Analysis of the general features of TcSAUR proteins in cacao

To better characterize the TcSAUR proteins, the physicochemical parameters of each member of the TcSAUR family, including protein length, pI, mW, AI, and GRAVY score, were analyzed as previously described [9, 33,34,35,36]. The general properties of the TcSAUR proteins are provided in Table 1. The TcSAUR proteins varied in length from 60 (TcSAUR36) to 180 residues (TcSAUR54) (Table 1). The estimated mW ranged from 6.56 to 20.35 kDa, with TcSAUR36 and TcSAUR54 having the lowest and highest mW values, respectively (Table 1). The predicted pI values of the TcSAUR proteins varied from 4.04 (TcSAUR36) to 10.60 (TcSAUR11) (Table 1). The majority of the TcSAUR proteins, 68 out of 90, had pI values greater than 7.00 (Table 1). Next, the AI scores of the TcSAUR proteins ranged between 65.75 (TcSAUR03) and 107.97 (TcSAUR66) (Table 1). Finally, 80 out of 90 TcSAUR proteins were predicted to be hydrophilic because their GRAVY scores were negative, ranging from −0.81 (TcSAUR02 and TcSAUR55) to −0.01 (TcSAUR37 and TcSAUR41) (Table 1). The ten remaining TcSAUR proteins, namely TcSAUR18, TcSAUR26, TcSAUR27, TcSAUR34, TcSAUR38, TcSAUR40, TcSAUR44, TcSAUR49, TcSAUR67, and TcSAUR89, had positive GRAVY scores (Table 1), suggesting that they were hydrophobic.

The general features of the SAUR proteins in other higher plant species have been discussed previously [20]. For example, the pI values of the SAUR proteins in Chinese white pear ranged from 5.10 to 10.28, and 63 of the 116 SAUR proteins had pI values greater than 7.00 [20]. The mW values of the SAUR proteins in Chinese white pear varied greatly, with a minimum of 7.47 kDa and a maximum of 122.22 kDa [20]. Similarly, the SAUR proteins in Chinese white pear ranged from 67 to 1090 residues in length, and all of them were hydrophilic (negative GRAVY scores) [20]. In foxtail millet, the SAUR proteins varied from 8.21 to 39.49 kDa in mass [26]. Most of the foxtail millet SAUR proteins were basic (pI greater than 7.00), whereas only 17 members of the family were acidic (pI less than 7.00) [26]. The AI scores of the SAUR proteins in foxtail millet varied from 53.19 to 104.15 [26]. In cucumber, the mW values of the SAUR proteins ranged from 9.47 to 86.25 kDa, while their pI values ranged from 4.77 to 10.38 [27]. The cucumber SAUR proteins were reported to be between 84 and 746 residues in length, with GRAVY scores ranging from −0.96 to 0.05 [27].

3.3 Analysis of gene structures and phylogenetic tree of TcSAUR proteins in cacao

To gain insight into the gene structures of the TcSAUR genes in cacao, we analyzed the exon/intron organization of all members. We found that 85 of the 90 TcSAUR genes were intronless (Fig. 2). The five remaining TcSAUR genes, namely TcSAUR22, TcSAUR24, TcSAUR62, TcSAUR67, and TcSAUR87, contained two exons (Fig. 2). Additionally, the coding DNA sequences of the TcSAUR genes varied from 183 (TcSAUR36) to 2209 nucleotides (TcSAUR22) (Fig. 2). The high proportion of intronless genes in the TcSAUR family is consistent with reports in other plant species. For example, most SAUR genes in pineapple did not have introns [25], and 85 of the 95 SAUR genes in loquat contained no intron [22]. In Chinese white pear, the majority of the SAUR genes were intronless, and only five SAUR genes contained at least one intron [20]. Similarly, 94 of the 105 SAUR genes in poplar contained no introns [16]. Taken together, these observations suggest that most SAUR genes in cacao, and perhaps in plants generally, lack introns.

Fig. 2 Exon/intron organization of the SAUR gene family in cacao

Next, to understand the relationships among the SAUR proteins, all members of the SAUR families from Arabidopsis thaliana [39, 40], coffee [19], and cacao were used to construct a maximum likelihood phylogenetic tree. As shown in Fig. 3, the SAUR proteins from Arabidopsis thaliana [39, 40], coffee [19], and cacao were classified into seven clades. According to the phylogenetic tree, members of the TcSAUR family were distributed across all seven clades (Fig. 3). The seven clades could be further divided into 10 sub-groups. Among them, sub-groups 4 and 6 each contained three members of the TcSAUR family (TcSAUR73, TcSAUR83, and TcSAUR85 in sub-group 4; TcSAUR01, TcSAUR51, and TcSAUR81 in sub-group 6), while sub-groups 7 and 10 each contained only two members (TcSAUR52 and TcSAUR80 in sub-group 7; TcSAUR53 and TcSAUR54 in sub-group 10) (Fig. 3). Sub-groups 9 and 5 contained the most members of the TcSAUR family, with 33 and 27 TcSAUR proteins, respectively, while sub-groups 8, 3, and 2 harbored five (TcSAUR02, TcSAUR04, TcSAUR05, TcSAUR55, and TcSAUR57), four (TcSAUR03, TcSAUR61, TcSAUR71, and TcSAUR89), and four (TcSAUR74, TcSAUR76, TcSAUR77, and TcSAUR79) members, respectively (Fig. 3). Sub-group 1 had seven members, namely TcSAUR56, TcSAUR58, TcSAUR59, TcSAUR60, TcSAUR75, TcSAUR78, and TcSAUR84 (Fig. 3). Previously, the SAUR families in cucumber and wax gourd were categorized into seven branches [23, 27]. Meanwhile, the 52 and 60 members of the SAUR families in pineapple and grape, respectively, were divided into 12 sub-families based on phylogenetic analysis [17, 25].

Fig. 3 Categorization of the SAUR families in Arabidopsis thaliana (At), coffee (Cc), and cacao (Tc)

As a major part of this study, the duplication events in the TcSAUR gene family in cacao were predicted. According to the classification of duplicated genes [42], four, three, and three duplication events resulting from tandem duplication, segmental duplication, and whole genome duplication, respectively, were found in the TcSAUR gene family in cacao (Table 2). In particular, two clusters on chromosome 2 were identified as tandem duplication events: a large cluster of 13 TcSAUR genes (TcSAUR06, TcSAUR07, TcSAUR08, TcSAUR09, TcSAUR10, TcSAUR11, TcSAUR12, TcSAUR13, TcSAUR14, TcSAUR15, TcSAUR16, TcSAUR17, and TcSAUR18) and a cluster of three TcSAUR genes (TcSAUR20, TcSAUR21, and TcSAUR22) (Fig. 1, Table 2). One tandem duplication event involving seven TcSAUR genes (TcSAUR63, TcSAUR64, TcSAUR65, TcSAUR66, TcSAUR67, TcSAUR68, and TcSAUR69) on chromosome 5 was also found (Fig. 1, Table 2). Next, three segmental duplication events were recorded, involving a group of five genes (TcSAUR26, TcSAUR29, TcSAUR31, TcSAUR33, and TcSAUR36), a group of 14 genes (TcSAUR34, TcSAUR35, TcSAUR38, TcSAUR39, TcSAUR41, TcSAUR42, TcSAUR43, TcSAUR44, TcSAUR45, TcSAUR46, TcSAUR47, TcSAUR48, TcSAUR49, and TcSAUR50), and the pair TcSAUR74 and TcSAUR76 (Fig. 1, Table 2). Additionally, three gene pairs, namely TcSAUR04 (chromosome 1) and TcSAUR55 (chromosome 3), TcSAUR24 (chromosome 2) and TcSAUR72 (chromosome 6), and TcSAUR25 (chromosome 2) and TcSAUR86 (chromosome 9), were attributed to whole genome duplication events (Fig. 1, Table 2). Tandem duplication, segmental duplication, and whole genome duplication events have previously been reported as the three main drivers of the expansion of the SAUR gene families in other crop species, such as cotton [14], peanut [24], wax gourd [23], Chinese white pear [20], and foxtail millet [26].

Table 2 Duplication events found in the SAUR family in cacao

3.4 Analysis of the TcSAUR genes expression profiles during the zygotic and somatic embryo maturation of cacao

We next investigated the expression patterns of the TcSAUR genes during zygotic embryogenesis and somatic embryogenesis by re-analyzing previously published microarray data [45]. The 90 members of the TcSAUR gene family were arranged into the 10 sub-groups (Fig. 4). As shown in Fig. 4, the TcSAUR genes were differentially expressed across the six samples during zygotic embryogenesis and somatic embryogenesis. In particular, TcSAUR85 was strongly expressed in all six samples, while TcSAUR83 tended to be highly expressed in T-ZE (Fig. 4A). In sub-group 2, only TcSAUR51 exhibited strong expression during zygotic embryogenesis (Fig. 4B). In sub-group 3, at least two genes, TcSAUR20 and TcSAUR21, were strongly expressed during both zygotic embryogenesis and somatic embryogenesis, while TcSAUR35 was highly expressed in LT-SE and M-SE (Fig. 4C). Two genes, TcSAUR22 and TcSAUR23, were exclusively expressed in T-ZE (Fig. 4C). Interestingly, four of the five members of the TcSAUR family belonging to sub-group 4, namely TcSAUR04, TcSAUR05, TcSAUR55, and TcSAUR57, exhibited strong expression in all six tissues, whereas TcSAUR02 was highly expressed in LT-SE and M-SE (Fig. 4D). We also found that the TcSAUR genes in sub-group 5 tended to be moderately expressed in all tissues during zygotic embryogenesis and somatic embryogenesis (Fig. 4E). Additionally, two (TcSAUR52 and TcSAUR80), one (TcSAUR79), and four (TcSAUR56, TcSAUR59, TcSAUR75, and TcSAUR84) genes in sub-groups 6, 7, and 8, respectively, were strongly expressed in all samples (Fig. 4F, G, H). In sub-group 9, TcSAUR90 was highly expressed in LT-SE and M-SE, while TcSAUR62 and TcSAUR63 were highly expressed in M-ZE (Fig. 4I). Finally, the two TcSAUR genes in sub-group 10, TcSAUR53 and TcSAUR54, were exclusively expressed in M-SE (Fig. 4J).

Fig. 4 Expression patterns of the SAUR gene family during zygotic embryogenesis and somatic embryogenesis in cacao

To date, the functions of SAUR genes have been investigated in several higher plant species. For example, a member of the SAUR family in tomato, SlSAUR69, was found to increase fruit sensitivity to ethylene by suppressing polar auxin transport, thereby altering the unripe-to-ripe transition [12]. The involvement of SAUR genes in embryogenesis has also been reported. Specifically, a number of the SAUR genes in coffee showed higher expression in at least one of the developing embryo stages or in plantlets [19]. Among them, the expression of the coffee SAUR12 gene increased in non-embryogenic calli and in the developing embryo stages [19]. In coconut, the expression levels of the SAUR genes in the embryogenic callus stage were reported to be significantly higher than those in the initial culture and somatic embryo stages [48]. Recently, a number of the SAUR genes in longan were found to be strongly expressed in globular embryos, suggesting that they might play an important role during early longan somatic embryogenesis [28]. In the future, point-mutation genetic experiments should be performed to confirm the biochemical functions of the candidate TcSAUR proteins in cacao.

4 Conclusion

In summary, this study provided new insight into the identification, annotation, characterization, and expression of the TcSAUR gene family in cacao. Our results indicated that the members of the TcSAUR family were broadly conserved in gene structure and phylogeny. Our results also suggested that tandem duplication, segmental duplication, and whole genome duplication events could explain the expansion of this important gene family. By re-analyzing previously published microarray data, we found that a number of TcSAUR genes were strongly expressed in various tissues during zygotic embryogenesis and somatic embryogenesis. Taken together, our study provides fundamental information on the TcSAUR genes that may be involved in cacao embryogenesis. Manipulation of TcSAUR expression may help to facilitate and accelerate zygotic embryogenesis and somatic embryogenesis during cacao tissue culture.

References

  1. Jaimez RE, Barragan L, Fernandez-Nino M, Wessjohann LA, Cedeno-Garcia G, Sotomayor Cantos I, et al. Theobroma cacao L. cultivar CCN 51: a comprehensive review on origin, genetics, sensory properties, production dynamics, and physiological aspects. PeerJ. 2022;10:e12676. https://doi.org/10.7717/peerj.12676.

  2. Diaz-Valderrama JR, Leiva-Espinoza ST, Aime MC. The history of cacao and its diseases in the Americas. Phytopathology. 2020;110(10):1604–19. https://doi.org/10.1094/PHYTO-05-20-0178-RVW.

  3. Tan TYC, Lim XY, Yeo JHH, Lee SWH, Lai NM. The health effects of chocolate and cocoa: a systematic review. Nutrients. 2021;13(9). https://doi.org/10.3390/nu13092909.

  4. Samanta S, Sarkar T, Chakraborty R, Rebezov M, Shariati MA, Thiruvengadam M, et al. Dark chocolate: an overview of its biological activity, processing, and fortification approaches. Curr Res Food Sci. 2022;5:1916–43. https://doi.org/10.1016/j.crfs.2022.10.017.

  5. Gateau-Rey L, Tanner EVJ, Rapidel B, Marelli JP, Royaert S. Climate change could threaten cocoa production: effects of 2015–16 El Niño-related drought on cocoa agroforests in Bahia, Brazil. PLoS One. 2018;13(7):e0200454. https://doi.org/10.1371/journal.pone.0200454.

  6. Schroth G, Läderach P, Martinez-Valle AI, Bunn C, Jassogne L. Vulnerability to climate change of cocoa in West Africa: patterns, opportunities and limits to adaptation. Sci Total Environ. 2016;556:231–41. https://doi.org/10.1016/j.scitotenv.2016.03.024.

  7. Hou S, Zhang Q, Chen J, Meng J, Wang C, Du J, et al. Genome-wide identification and analysis of the GRAS transcription factor gene family in Theobroma cacao. Genes (Basel). 2022;14(1). https://doi.org/10.3390/genes14010057.

  8. Du J, Zhang Q, Hou S, Chen J, Meng J, Wang C, et al. Genome-wide identification and analysis of the R2R3-MYB gene family in Theobroma cacao. Genes (Basel). 2022;13(9). https://doi.org/10.3390/genes13091572.

  9. Hong LV. Genome-wide identification and analysis of heat shock protein 70 family in Theobroma cacao. Pak J Biol Sci. 2022;25(7):608–18. https://doi.org/10.3923/pjbs.2022.608.618.

  10. Zhang Q, Hou S, Sun Z, Chen J, Meng J, Liang D, et al. Genome-wide identification and analysis of the MADS-box gene family in Theobroma cacao. Genes (Basel). 2021;12(11). https://doi.org/10.3390/genes12111799.

  11. Stortenbeker N, Bemer M. The SAUR gene family: the plant’s toolbox for adaptation of growth and development. J Exp Bot. 2019;70(1):17–27. https://doi.org/10.1093/jxb/ery332.

  12. Wu J, Liu S, He Y, Guan X, Zhu X, Cheng L, et al. Genome-wide analysis of SAUR gene family in Solanaceae species. Gene. 2012;509(1):38–50. https://doi.org/10.1016/j.gene.2012.08.002.

  13. Zhang N, Huang X, Bao Y, Wang B, Zeng H, Cheng W, et al. Genome-wide identification of SAUR genes in watermelon (Citrullus lanatus). Physiol Mol Biol Plants. 2017;23(3):619–28. https://doi.org/10.1007/s12298-017-0442-y.

  14. Li X, Liu G, Geng Y, Wu M, Pei W, Zhai H, et al. A genome-wide analysis of the small auxin-up RNA (SAUR) gene family in cotton. BMC Genomics. 2017;18(1):815. https://doi.org/10.1186/s12864-017-4224-2.

  15. Bai Q, Hou D, Li L, Cheng Z, Ge W, Liu J, et al. Genome-wide analysis and expression characteristics of small auxin-up RNA (SAUR) genes in moso bamboo (Phyllostachys edulis). Genome. 2017;60(4):325–36. https://doi.org/10.1139/gen-2016-0097.

  16. Hu W, Yan H, Luo S, Pan F, Wang Y, Xiang Y. Genome-wide analysis of poplar SAUR gene family and expression profiles under cold, polyethylene glycol and indole-3-acetic acid treatments. Plant Physiol Biochem. 2018;128:50–65. https://doi.org/10.1016/j.plaphy.2018.04.021.

  17. Li M, Chen R, Gu H, Cheng D, Guo X, Shi C, et al. Grape small auxin upregulated RNA (SAUR) 041 is a candidate regulator of berry size in grape. Int J Mol Sci. 2021;22(21). https://doi.org/10.3390/ijms222111818.

  18. Zhou Y, Lan Q, Yu W, Zhou Y, Ma S, Bao Z, et al. Analysis of the small auxin-up RNA (SAUR) genes regulating root growth angle (RGA) in apple. Genes (Basel). 2022;13(11). https://doi.org/10.3390/genes13112121.

  19. Zanin FC, Freitas NC, Pinto RT, Maximo WPF, Diniz LEC, Paiva LV. The SAUR gene family in coffee: genome-wide identification and gene expression analysis during somatic embryogenesis. Mol Biol Rep. 2022;49(3):1973–84. https://doi.org/10.1007/s11033-021-07011-7.

  20. Wang M, Manzoor MA, Wang X, Feng X, Zhao Y, He J, et al. Comparative genomic analysis of SAUR gene family, cloning and functional characterization of two genes (PbrSAUR13 and PbrSAUR52) in Pyrus bretschneideri. Int J Mol Sci. 2022;23(13). https://doi.org/10.3390/ijms23137054.

  21. Tian Z, Han J, Che G, Hasi A. Genome-wide characterization and expression analysis of SAUR gene family in Melon (Cucumis melo L.). Planta. 2022;255(6):123. https://doi.org/10.1007/s00425-022-03908-0.

  22. Peng Z, Li W, Gan X, Zhao C, Paudel D, Su W, et al. Genome-wide analysis of SAUR gene family identifies a candidate associated with fruit size in loquat (Eriobotrya japonica Lindl.). Int J Mol Sci. 2022;23(21). https://doi.org/10.3390/ijms232113271.

  23. Luo C, Yan J, He C, Liu W, Xie D, Jiang B. Genome-wide identification of the SAUR gene family in wax gourd (Benincasa hispida) and Functional characterization of BhSAUR60 during fruit development. Int J Mol Sci. 2022;23(22). https://doi.org/10.3390/ijms232214021.

  24. Liu Y, Xiao L, Chi J, Li R, Han Y, Cui F, et al. Genome-wide identification and expression of SAUR gene family in peanut (Arachis hypogaea L.) and functional identification of AhSAUR3 in drought tolerance. BMC Plant Biol. 2022;22(1):178. https://doi.org/10.1186/s12870-022-03564-2.

  25. Zhang Y, Ye T, She Z, Huang S, Wang L, Aslam M, et al. Small Auxin Up RNA (SAUR) gene family identification and functional genes exploration during the floral organ and fruit developmental stages in pineapple (Ananas comosus L.) and its response to salinity and drought stresses. Int J Biol Macromol. 2023;237:124061. https://doi.org/10.1016/j.ijbiomac.2023.124061.

  26. Ma X, Dai S, Qin N, Zhu C, Qin J, Li J. Genome-wide identification and expression analysis of the SAUR gene family in foxtail millet (Setaria italica L.). BMC Plant Biol. 2023;23(1):31. https://doi.org/10.1186/s12870-023-04055-8.

  27. Luan J, Xin M, Qin Z. Genome-wide identification and functional analysis of the roles of SAUR gene family members in the promotion of cucumber root expansion. Int J Mol Sci. 2023;24(6). https://doi.org/10.3390/ijms24065940.

  28. Chen Y, Ma X, Xue X, Liu M, Zhang X, Xiao X, et al. Genome-wide analysis of the SAUR gene family and function exploration of DlSAUR32 during early longan somatic embryogenesis. Plant Physiol Biochem. 2023;195:362–74. https://doi.org/10.1016/j.plaphy.2023.01.006.

  29. Argout X, Salse J, Aury JM, Guiltinan MJ, Droc G, Gouzy J, et al. The genome of Theobroma cacao. Nat Genet. 2011;43(2):101–8. https://doi.org/10.1038/ng.736.

  30. Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, et al. Pfam: The protein families database in 2021. Nucleic Acids Res. 2021;49(D1):D412–9. https://doi.org/10.1093/nar/gkaa913.

  31. Gasteiger E, Hoogland C, Gattiker A, Wilkins MR, Appel RD, Bairoch A. Protein identification and analysis tools on the ExPASy server. In: The proteomics protocols handbook. Springer; 2005. p. 571–607.

  32. Gasteiger E, Gattiker A, Hoogland C, Ivanyi I, Appel RD, Bairoch A. ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 2003;31(13):3784–8. https://doi.org/10.1093/nar/gkg563.

  33. La HV, Chu HD, Tran CD, Nguyen KH, Le QTN, Hoang CM, et al. Insights into the gene and protein structures of the CaSWEET family members in chickpea (Cicer arietinum), and their gene expression patterns in different organs under various stress and abscisic acid treatments. Gene. 2022;819:146210. https://doi.org/10.1016/j.gene.2022.146210.

  34. La HV, Chu HD, Ha QT, Tran TTH, Van Tong H, Van Tran T, et al. SWEET gene family in sugar beet (Beta vulgaris): genome-wide survey, phylogeny and expression analysis. Pak J Biol Sci. 2022;25(5):387–95.

  35. Le Man T, Huyen Tran TT, Quyen VuX, Duc Chu H, Chau Pham T, Le Thi H, et al. Genome-wide identification and analysis of genes encoding putative heat shock protein 70 in papaya (Carica papaya). Pak J Biol Sci. 2022;25(6):468–75. https://doi.org/10.3923/pjbs.2022.468.475.

  36. Tran CD, Chu HD, Nguyen KH, Watanabe Y, La HV, Tran KD, et al. Genome-wide identification of the TCP transcription factor family in chickpea (Cicer arietinum L.) and their transcriptional responses to dehydration and exogenous abscisic acid treatments. J Plant Growth Regul. 2018;37(4):1286–99. https://doi.org/10.1007/s00344-018-9859-y.

  37. Thompson JD, Gibson TJ, Higgins DG. Multiple sequence alignment using ClustalW and ClustalX. Curr Protoc Bioinformatics. 2002;Chapter 2:Unit 2.3. https://doi.org/10.1002/0471250953.bi0203s00.

  38. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics (Oxford, England). 2007;23(21):2947–8. https://doi.org/10.1093/bioinformatics/btm404.

  39. Spartz AK, Ren H, Park MY, Grandt KN, Lee SH, Murphy AS, et al. SAUR inhibition of PP2C-D phosphatases activates plasma membrane H+-ATPases to promote cell expansion in Arabidopsis. Plant Cell. 2014;26(5):2129–42. https://doi.org/10.1105/tpc.114.126037.

  40. Spartz AK, Lee SH, Wenger JP, Gonzalez N, Itoh H, Inze D, et al. The SAUR19 subfamily of SMALL AUXIN UP RNA genes promote cell expansion. Plant J Cell Mol Biol. 2012;70(6):978–90. https://doi.org/10.1111/j.1365-313X.2012.04946.x.

  41. Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021;38(7):3022–7. https://doi.org/10.1093/molbev/msab120.

  42. Panchy N, Lehti-Shiu M, Shiu S-H. Evolution of gene duplication in plants. Plant Physiol. 2016;171(4):2294–316. https://doi.org/10.1104/pp.16.00523.

  43. Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics (Oxford, England). 2015;31(8):1296–7. https://doi.org/10.1093/bioinformatics/btu817.

  44. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013;41(Database issue):D991–5. https://doi.org/10.1093/nar/gks1193.

  45. Maximova SN, Florez S, Shen X, Niemenak N, Zhang Y, Curtis W, et al. Genome-wide analysis reveals divergent patterns of gene expression during zygotic and somatic embryo maturation of Theobroma cacao L, the chocolate tree. BMC Plant Biol. 2014;14:185. https://doi.org/10.1186/1471-2229-14-185.

  46. Liao Y, Smyth GK, Shi W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 2019;47(8):e47. https://doi.org/10.1093/nar/gkz114.

  47. Wang P, Lu S, Xie M, Wu M, Ding S, Khaliq A, et al. Identification and expression analysis of the small auxin-up RNA (SAUR) gene family in apple by inducing of auxin. Gene. 2020;750:144725. https://doi.org/10.1016/j.gene.2020.144725.

  48. Rajesh MK, Fayas TP, Naganeeswaran S, Rachana KE, Bhavyashree U, Sajini KK, et al. De novo assembly and characterization of global transcriptome of coconut palm (Cocos nucifera L.) embryogenic calli using Illumina paired-end sequencing. Protoplasma. 2016;253(3):913–28. https://doi.org/10.1007/s00709-015-0856-8.

Acknowledgements

This work was funded by the fundamental research program of Hung Vuong University under the project grant No. 01/2023/KHCN (HV01.2023).

Author information

Contributions

Conceptualization: HDC, PBC. Data curation: NTBC, MTL, HVL, TDL, CTL. Formal analysis: HDC, PBC, NTBC, QTNL, LTMT. Methodology: HDC, PBC, NTBC, HTTT, CTL. Writing – original draft: HDC, PBC, NTBC. Writing – review and editing: HDC, PBC. All authors have reviewed, discussed, and agreed to their individual contributions.

Corresponding authors

Correspondence to Phi Bang Cao or Ha Duc Chu.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Chu, N.T.B., Le, M.T., La, H.V. et al. Genome-wide identification, characterization, and expression analysis of the small auxin-up RNA gene family during zygotic and somatic embryo maturation of the cacao tree (Theobroma cacao). Genom. Inform. 22, 2 (2024). https://doi.org/10.1186/s44342-024-00003-6

  • DOI: https://doi.org/10.1186/s44342-024-00003-6

Keywords