By Subhash C. Lakhotia
Much of the very exciting progress in biology during the past four decades has been propelled by the reductionist belief, commonly known as the "central dogma of molecular biology," that a functional "gene" must produce RNA, which must be translated into a protein (Crick, 1970). Studies using this paradigm have enabled us to move from molecular genetics to genetic engineering and genomics, and now to the post-genomic era.
The success of the "central dogma" led to the common belief that any DNA sequence or gene is relevant only if it has a protein-coding function. Paradoxically, however, almost all eukaryotes have much more DNA than is accounted for by the protein-coding "genes": for example, protein-coding DNA sequences account for only ~2% of the human genome (Venter et al., 2001). Consequently, the role of the bulk of the genomic DNA in eukaryotes has remained a persistent riddle, and because of the strong belief in the "central dogma," such "noncoding" sequences have often been brushed aside as "selfish" or "junk." In this context, it is still more paradoxical that a large proportion of the "noncoding" genomic DNA is indeed transcribed (Mattick, 2001). It therefore appears that the "noncoding genes" are not "junk" but are meaningful components of genomes (Lakhotia, 1996, 1999; Erdmann et al., 2001; Mattick, 2001; Barciszewski and Erdmann, 2003).
A noncoding gene of Drosophila melanogaster being actively studied in the Cytogenetics Laboratory of Banaras Hindu University is the 93D or hsr-omega (hsrω) gene. This gene is developmentally active, is induced by a variety of stresses and produces several transcripts, but does not code for any protein (for recent reviews, see Lakhotia, 2001, 2003). It attracted attention more than three decades ago because of its unique inducibility by a brief benzamide treatment. Subsequent studies revealed many unusual features of this gene, a homologue of which is present in all the Drosophila species examined. It is active in nearly all cell types of Drosophila and is induced by heat shock along with the other heat shock genes, but it alone is induced by a variety of amides, all of which also inhibit general chromosomal transcription. The hsrω gene in all species of Drosophila has a characteristic architecture, with two exons and an intron and a long stretch (>5 to ~15 kb) of tandem repeats at the 3' end of the gene. As with several other noncoding genes, the base sequence of the unique region as well as the tandem repeat region of the hsrω gene is not conserved between species. However, in all the Drosophila species examined, two primary nucleus-limited transcripts, of ~2 kb and >10 kb, respectively, are produced, neither of which carries any significant open reading frame. The ~2 kb transcript is spliced to generate a 1.2 kb cytoplasmic transcript, which has a short translatable ORF of 23-27 amino acids.
The large nucleus-limited >10 kb hsrω-n transcript is so far the only known large eukaryotic RNA that shows a speckled distribution in the nucleoplasm (Lakhotia et al., 1999; Prasanth et al., 2000). In addition to being present at the site of transcription, the hsrω-n transcripts are distributed in the nucleoplasm as many speckles close to the chromatin domains. The various hnRNPs (heterogeneous nuclear RNA-binding proteins) and some other proteins like Sxl are bound to the different transcriptionally active chromatin sites and to the nucleoplasmic speckles formed by the hsrω-n transcripts. These speckles, designated "omega speckles" (Prasanth et al., 2000), are distinct from the well-known interchromatin granule clusters (IGCs). The hsrω-n transcripts have an essential role in organizing the omega speckles, which serve to dynamically regulate the availability of hnRNPs and related proteins for RNA processing activities at any given time (Lakhotia et al., 1999; Prasanth et al., 2000; Lakhotia 2001, 2003). Mutants that mis-express the hsrω gene and thus affect the omega speckles have diverse phenotypic consequences (Rajendra et al., 2001), presumably because altered availability of hnRNPs and related proteins leads to aberrant processing of various nuclear pre-mRNAs.
Every cell needs an enormous variety of transcripts and proteins in variable quantities, and this requirement keeps changing with time. Regulatory strategies at the transcriptional level ensure the production of the different transcripts required by a cell at any given time. These nascent transcripts are subjected to intricate processing steps (splicing, capping, polyadenylation, etc.) that not only generate the functional mRNAs but also regulate their transport, translatability and half-lives in the cell. One of the most important steps in the post-transcriptional processing of precursor mRNAs is the splicing of exons (Hastings and Krainer 2001). Since many eukaryotic genes are multi-exonic, a versatile regulatory strategy has evolved for cell type-specific and/or developmental stage-specific alternative splicing of certain transcripts, which generates a significantly greater diversity of gene products (Maniatis and Tasic, 2002; Harrison et al., 2002; Stamm, 2002; Venables, 2002). In addition to meeting the normal developmental requirements, each cell must also be ready to adapt quickly to unexpected changes in its environment. All of this requires elaborate and precise regulatory circuits so that the highly integrated organization displayed by living cells can be maintained. Many classes of proteins have been identified and shown to be key players in these diverse regulatory circuits (Neubauer et al., 1998). It is also known that these proteins, belonging to two major families, viz., the hnRNPs (Dreyfuss et al., 2002) and the SR proteins (Graveley, 2000), interact in different combinations to fine-tune the post-transcriptional regulatory circuits (Smith and Valcarcel, 2000).
The unengaged hnRNPs, which are not productively associated with chromatin sites for RNA processing, remain localized at the omega speckles. Under conditions of cellular stress, which inhibit most nuclear transcription and RNA processing, the hnRNPs move away from chromatin and associate with the concomitantly increased levels of hsrω-n RNA (Prasanth et al., 2000). The omega speckles provide a storage site for the unengaged hnRNPs and, therefore, modulation of the levels of the hsrω-n transcripts has a pivotal role in regulating the availability of the various hnRNPs for post-transcriptional processing of pre-mRNAs. Since the ratio of hnRNPs to SR proteins at the splice sites has a significant role in regulating alternative splicing (Smith and Valcarcel, 2000), the hsrω-n transcripts assume a key role in integrating RNA processing activity by regulating the levels of hnRNPs in the active (chromatin-associated) and inactive (omega speckle) compartments (Lakhotia 2003).
The hsrω-c, the smaller (~1.2 kb) transcript of the hsrω gene, is cytoplasmic and generally short-lived, and codes for only a short peptide (23-27 amino acids long), which, together with the 1.2 kb RNA, is apparently degraded as soon as it is translated. The purpose of translating the short ORF in hsrω-c RNA seems to be to monitor the efficiency of the cellular translational machinery. Any perturbation in translational activity stabilizes the hsrω-c RNA, resulting in an increase in its level. The increase in hsrω-c RNA level perhaps signals other response(s) in the cell.
Thus the hsrω gene, although not coding for a typical protein product, contributes to the self-organization of cellular activities through its transcripts (Lakhotia 2003).
The novel function of this noncoding RNA, together with the increasing awareness of other classes of noncoding RNAs, like Xist in mammals, roX1 and roX2 in Drosophila, the large variety of microRNAs, etc., makes it clear that the different noncoding RNAs present in prokaryotes as well as eukaryotes have essential functions rather than merely being products of "junk" or "selfish" DNA sequences (Barciszewski and Erdmann, 2003). Further studies on such DNA sequences will be essential to integrate our understanding of the diverse ways in which the genome functions and maintains the self-organization of biological systems.
References