MicroRNAs (miRNAs) are short noncoding RNAs that play critical roles in regulating gene expression in normal physiology and disease. Although miRNA expression levels are tightly controlled, little is known about how miRNAs themselves are regulated because their genes are poorly defined.
Although mature miRNAs are only ~22 nucleotides long, the transcripts that produce them can span hundreds of kilobases. These primary miRNA transcripts, or pri-miRNAs, contain hairpin structures in their exons or introns that are quickly processed into mature miRNAs. Because processing occurs so quickly, standard methods such as RT-PCR or RNA sequencing detect full-length pri-miRNAs with poor sensitivity. Many miRNA genes therefore lack annotated features such as promoters or splice sites, hindering progress in understanding their transcriptional and post-transcriptional regulation.
To overcome this, researchers from the University of Texas Southwestern Medical Center and Johns Hopkins University stabilized pri-miRNAs by expressing a dominant negative form of DROSHA, the RNase III enzyme responsible for pri-miRNA cleavage, in a variety of human and mouse cell lines. By deeply sequencing nuclear RNAs and applying the computational tool StringTie to assemble transcripts, the researchers were able to annotate 69% of human miRNAs and 75% of mouse miRNAs. The newly annotated pri-miRNA gene structures can be visualized using standard genome browsers including the UCSC Genome Browser.
"One remarkable feature of primary miRNAs is their extreme length, even in cases where they function only to produce a single ~22 nucleotide miRNA," said Joshua Mendell, corresponding author of the study. "Although it seems wasteful to produce such long RNAs, most of which will be immediately degraded, this organization may have arisen to allow complex mechanisms of regulation of the encoded miRNA."
Although clustered miRNAs are generally thought to be co-transcribed, the researchers found evidence that several human intergenic conserved miRNA clusters have alternative promoters. "We were surprised to find cases where such miRNAs could be alternatively co-transcribed or separately transcribed, depending on which promoter is used," Mendell said.
The researchers classified pri-miRNAs into three broad categories based on gene structure: Class I, which are transcribed independently of other genes and likely represent independent transcription units; Class II, which are transcribed as an extension of a protein-coding gene; and Class III, which are transcribed as an extension of a noncoding RNA. One example of a Class III pri-miRNA is located upstream of the noncoding RNA SPACA6P and encodes miR-99b, let-7e, and miR-125a. The authors suggest that SPACA6P is a cleavage product of DROSHA processing because its 5' end is immediately adjacent to the 3' end of the pre-mir-125a hairpin.
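The three-class scheme can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `PriMirna` record, its `host_gene_biotype` field, and the `classify_pri_mirna` function are hypothetical names introduced here, and the sketch assumes that each pri-miRNA has already been labeled with the type of gene it extends, if any.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PriMirna:
    """Hypothetical record for one assembled pri-miRNA transcript."""
    name: str
    # Assumed label: "protein_coding", "noncoding", or None when the
    # transcript is not an extension of any other annotated gene.
    host_gene_biotype: Optional[str]


def classify_pri_mirna(pri: PriMirna) -> str:
    """Map a pri-miRNA to Class I, II, or III as described in the article."""
    if pri.host_gene_biotype is None:
        return "Class I"    # independent transcription unit
    if pri.host_gene_biotype == "protein_coding":
        return "Class II"   # extension of a protein-coding gene
    if pri.host_gene_biotype == "noncoding":
        return "Class III"  # extension of a noncoding RNA
    raise ValueError(f"unknown biotype: {pri.host_gene_biotype!r}")


# The pri-miRNA encoding miR-99b, let-7e, and miR-125a is associated
# with the noncoding RNA SPACA6P, so it is labeled "noncoding" here.
example = PriMirna("miR-99b/let-7e/miR-125a", "noncoding")
print(classify_pri_mirna(example))
```

In practice, assigning the label would require comparing assembled transcript coordinates against an existing gene annotation, a step this sketch leaves out.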
The new annotations will allow researchers to address questions regarding the transcriptional complexity of miRNA regulation, including how a miRNA promoter may be regulated or how splicing affects miRNA expression. "It is also possible that these data will reveal that some previously identified disease-associated sequence variants actually fall within a formerly unrecognized primary miRNA gene, thereby raising the possibility that such variants may influence expression of the encoded miRNA," Mendell said.