CFIm Protein Complex in
During the interval between an architect finalizing plans and houses going up in a neighborhood, multiple crews get involved to interpret the drawings, select the materials, situate the doors and windows for the best views, and so forth. The situation is analogous in cells: a gene provides a basic blueprint for the proteins that make up our cells, but that blueprint needs to be transcribed from DNA into messenger RNA (mRNA) and then translated into protein. Or various proteins, to be more accurate, since a given gene can make multiple versions of the same basic protein, with slight alterations in function or location, rather like a basic house plan that might take on a number of different finishes in a given community. A lot of the information that governs how much or where to make the protein is contained in signature sequences in the mRNA that are the equivalent of the details in the blueprint that determine where electric outlets, lighting, and plumbing are to be installed—sounds boring, but it's essential to how the house actually works for the people living inside it.
Alternative polyadenylation (APA) is one kind of fine print that has been particularly difficult to decipher. The APA machinery binds to specific signal sequences in the mRNA and cleaves it before adding a long stretch of adenosines to protect the molecule from degradation and to halt transcription. Cleavage at different APA sites can produce different versions of the same protein, but so do other processes, such as alternative splicing. APA signal sequences are sprinkled throughout the entire mRNA, but most labs have focused exclusively on APA sequences located in the 3'UTR (untranslated region) at the far end of the mRNA. In part this is because earlier research had shown that several different types of cancer cells tend to undergo cleavage at sites that shorten the 3'UTR and thereby remove a bunch of regulatory sequences, which should in theory cause more protein to be produced. The effect of these shortened 3'UTRs has been hard to discern, however, since subsequent studies did not find a correlation with protein levels; many in the field concluded that where APA takes place has no effect on protein levels at all.
As a postdoc I was interested in the regulation of MECP2, the gene that is mutated in Rett syndrome. MECP2 is peculiar for having a very long (~8.5 kb) 3'UTR that contains two prominent polyadenylation (p(A)) sites: a proximal p(A) and a distal p(A), resulting, respectively, in short and long mRNA isoforms. We found that MeCP2 protein levels are regulated by the CFIm protein complex through these APA sites. We then identified eleven individuals with neuropsychiatric disease who have copy number variations (CNVs) spanning NUDT21, which encodes CFIm25. These individuals have autism spectrum disorder and notable developmental regression, very similar to Rett syndrome. We thus discovered that NUDT21 duplication causes a new autism spectrum disorder (eLife, 2015).
Recently, we have discovered that APA governs protein dosage. We studied global APA throughout the transcriptome; however, where most studies focus on just the 3'UTR, we expanded our study to the whole-transcript level. We found that the APA machinery chooses binding sites in the gene body or in the 3'UTR according to how much protein the cell needs at that particular moment. This new understanding of APA at the whole-mRNA and whole-organism level was made possible by having identified human subjects with mutations in a gene called CPSF6, which is part of a large multiprotein complex that carries out APA. When this gene is mutated or deleted, there is not enough of the CPSF6 protein to guide the APA machinery to the right APA signal. As a result, mRNA that normally undergoes APA at the 3'UTR instead undergoes APA within the gene body, and vice versa. To better understand the effect of CPSF6 insufficiency, we studied a zebrafish model with the help of CUIMC pediatric cardiologist Kimara Targoff, MD. Along with collaborators Marko Jovanovic (Columbia University), Eric Wagner (Univ. of Rochester), and Hari Yalamanchili (Neurological Research Institute and Baylor College of Medicine), we analyzed the pattern of APA site choice throughout the zebrafish larvae and discovered the different APA preferences of brain, skeletal, and cardiovascular tissues. Because the zebrafish larvae and human neonatal samples cover roughly the same developmental stage, we believe it is likely that the pattern of APA site choice observed in the zebrafish reflects the needs of this early stage of postnatal life in humans. During an organ's growth, cells need to produce a large number of proteins as foundational building blocks; once growth is achieved, the cells no longer need to expend so much energy making new proteins.
Abnormal APA had been previously identified in specific genes in the context of specific diseases, such as diabetes or lupus, but this is the first time APA has been found to be disrupted throughout the whole organism, causing a developmental syndrome. Given that the APA process is sensitive to input from hormones and other extracellular molecular cues, however, this work raises the possibility that anything altering APA site selection could cause disease or dysfunction, such as abnormal heart or skeletal growth or subtle forms of intellectual disability (de Prisco et al., Science Advances, 2023).
We are seeking to understand the mechanisms of the poly(A) switch in disease.