Why does 4-thiouracil labelling to map RNA-binding proteins cause a T-C change?

I am reading a paper about the purification and identification of mRNAs and their RNA-binding proteins using UV crosslinking and immunoprecipitation. I came upon this passage, which puzzled me:

Up to 28% of all uridines present in 3' UTRs showed diagnostic T-C changes in the protein occupancy profiling sequence reads. This number is reasonably high, considering observations that typically only one of a few uridines in RNA binding sites when substituted by 4SU, crosslinks to proteins.

What I do not understand is why 4-thiouridine (4SU) labelling in the cell produces T-C changes. Would the uracil-adenine base pairing not be affected?

How does the 4SU labelling really work?


The technique used in that paper is called PAR-CLIP (Photoactivatable Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation; see Ascano et al., 2012 and Spitzer et al., 2014). This technique involves substituting uridine (U) with 4-thiouridine (4SU) or guanosine (G) with 6-thioguanosine (6SG). The rationale for using these modified nucleosides is that they improve the efficiency of RNA-protein crosslinking. Moreover, PAR-CLIP uses 365 nm UV instead of the 254 nm UV used in conventional crosslinking. However, the crosslinking process causes the loss of the sulphur atom, causing the U to be read as C during reverse transcription. Because sequencing reads are reported as cDNA, this appears as a T-C change. Likewise, 6SG gets read as A.
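As a rough illustration of how such diagnostic changes might be tallied, the sketch below compares aligned reads with a reference and reports the fraction of reference T positions that show a T-to-C change in at least one read. The function names, the toy sequences, and the assumption of gapless, equal-length alignments are all illustrative; this is not the pipeline used in the cited papers.

```python
from collections import Counter


def count_conversions(reference: str, read: str) -> Counter:
    """Count mismatch types (e.g. 'T>C') between a read and its reference.

    Assumes the read is already aligned without gaps and has the same
    length as the reference segment (a simplification for this sketch).
    """
    if len(reference) != len(read):
        raise ValueError("sketch assumes gapless, equal-length alignment")
    counts = Counter()
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base != read_base:
            counts[f"{ref_base}>{read_base}"] += 1
    return counts


def t_to_c_fraction(reference: str, reads: list[str]) -> float:
    """Fraction of reference T positions with a T>C change in at least one read."""
    reference = reference.upper()
    t_positions = {i for i, base in enumerate(reference) if base == "T"}
    if not t_positions:
        return 0.0
    converted = set()
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("sketch assumes gapless, equal-length alignment")
        for i in t_positions:
            if read[i].upper() == "C":
                converted.add(i)
    return len(converted) / len(t_positions)


if __name__ == "__main__":
    ref = "ATTGCT"                 # hypothetical reference segment
    reads = ["ATCGCT", "ATTGCT"]   # hypothetical aligned reads
    print(count_conversions(ref, reads[0]))
    print(t_to_c_fraction(ref, reads))
```

In practice, T-to-C changes also have to be distinguished from sequencing errors and SNPs, which this sketch does not attempt.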

Ascano et al., 2012


New technologies have enabled systematic mapping of RNA–RNA interactions, revealing tens of thousands of such interactions.

In endogenous cellular conditions, miRNAs and lincRNAs tend to specifically target one or a few mRNAs, indicating that the entire RNA interactome is a scale-free network.

Hundreds of snoRNA genes produce miRNA-like short RNAs that interact with mRNAs, thus providing a gene repertoire of new regulatory RNAs.

Pseudogenes and transposons produce RNAs that interact with mRNAs, through base pairing. The interaction regions exhibit increased interspecies conservation levels.

New technologies enabled systematic mapping of RNA–DNA interactions, revealing hundreds of chromatin-interacting RNAs.

As transcription of the human genome is quite pervasive, it is possible that many novel functions of the noncoding genome have yet to be identified. Often the noncoding genome’s functions are carried out by their RNA transcripts, which may rely on their structures and/or extensive interactions with other molecules. Recent technology developments are transforming the fields of RNA biology from studying one RNA at a time to transcriptome-wide mapping of structures and interactions. Here, we highlight the recent advances in transcriptome-wide RNA interaction analysis. These technologies revealed surprising versatility of RNA to participate in diverse molecular systems. For example, tens of thousands of RNA–RNA interactions have been revealed in cultured cells as well as in mouse brain, including interactions between transposon-produced transcripts and mRNAs. In addition, most transcription start sites in the human genome are associated with noncoding RNA transcribed from other genomic loci. These recent discoveries expanded our understanding of RNAs’ roles in chromatin organization, gene regulation, and intracellular signaling.


Abbreviations

ADAR, adenosine deaminase acting on RNA

ChIP, chromatin immunoprecipitation

dsRBD, double-stranded RNA-binding domain

KSRP, KH-type splicing regulatory protein

lncRNAs, long noncoding RNAs

MAPK, mitogen-activated protein kinase

PAZ, Piwi/Argonaute/Zwille

RBPs, RNA-binding proteins

Rhed, RNA-binding heme domain

RIIID, RNase III domain

RISC, RNA-induced silencing complex

snRNPs, small nuclear ribonucleoproteins

TDP-43, Tar DNA-binding protein 43

Regulation of gene expression by small noncoding RNAs is at the heart of an ever-increasing number of biological pathways and can definitely not be overlooked by biologists, whatever their field of research. There are different types of small regulatory RNA, which can be classified by their genomic origin or their function. They can also come in different flavors depending on the kingdom, but for the sake of brevity, we will only mention here the different families that exist in animals. Despite some differences in their biogenesis, small RNAs share the same mode of action. Indeed, they act as sequence-specific guides for effector proteins, which belong to the Piwi/Argonaute (AGO) family [1]. Upon loading, they direct them toward their intended target RNAs. Broadly speaking, one can distinguish two main classes of small RNAs: (a) small interfering (si)RNAs and micro (mi)RNAs, which are generated by the cleavage of double-stranded (ds) RNA precursor molecules of varying size by type III ribonucleases, also called Dicer proteins [2], and (b) germline-specific piwi-associated (pi)RNAs, which do not depend on the dicing of a dsRNA molecule (for a review see [3]). Although they share some common biogenesis factors, siRNAs and miRNAs are very different in terms of their biological role in the cell. The former can be seen as a defense system against foreign or unwanted double-stranded nucleic acids, whereas the latter are constitutively expressed and play important roles as fine-tuners of gene expression. The focus of this review is on miRNAs, and we will therefore not dwell longer on si- and piRNAs.

The biogenesis of miRNAs, as described in Fig. 1, is a complex and compartmentalized process that begins with the transcription of a long primary transcript called pri-miRNA, which contains all the features of a coding mRNA. This transcription is mostly performed by RNA polymerase II (RNA pol II) [4], but there are some cases of virus-encoded miRNAs that are transcribed by RNA polymerase III (RNA pol III) [5, 6]. Upon recognition of a stem-loop structure within the pri-miRNA by the RNase III Drosha [7] and its cofactor DGCR8 [8], i.e., the Microprocessor complex, the 65-nucleotide long precursor (pre) miRNA is cleaved and handed over to the Exportin 5 factor [9] to be translocated to the cytoplasm. There, the pre-miRNA undergoes a second cleavage event, which is mediated by the RNase III Dicer [10], with the help of its cofactor TRBP [11]. The resulting small RNA duplex is then assembled into one AGO protein, where it is unwound to keep only one of the two strands [12], which becomes the mature, 22-nucleotide long miRNA. This process has been shown to require the help of chaperones such as Hsp90 [13]. In humans, there are four AGO proteins, which can indiscriminately accommodate miRNAs (for a review see [14]). The AGO protein loaded with a miRNA, also referred to as the RNA-induced silencing complex (RISC) [15], scans the population of mRNA molecules within the cell until it finds a sequence match. The targeting process is complex and has been the subject of a tremendous amount of work by several groups, and we will not cover it in detail here. Briefly, the recognition of the target by the RISC involves a handful (6–8) of nucleotides located at the 5′ end of the miRNA, termed the seed [16]. Because the requirement for miRNA–mRNA pairing that results in efficient regulation by AGO is so limited, it is no wonder that the vast majority of the coding genome can be regulated by miRNAs. Indeed, there are currently almost 2000 miRNA genes reported for human alone [17], and conservative estimates are that at least 60% of mRNAs are miRNA targets [18]. The mechanism by which the miRISC regulates its target mRNA requires a review of its own. Suffice it to say that it involves the recruitment of an adaptor protein called GW182, or TNRC6 in human, that in turn interacts with a number of other proteins, ultimately leading to the inhibition of translation initiation and destabilization of the mRNA by deadenylation (for a review see [19]).
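To make the idea of seed-based target recognition concrete, here is a minimal sketch that looks for perfect matches to the reverse complement of a miRNA seed in a target sequence. The choice of nucleotides 2–7 as the seed, the function names, and the example sequences are assumptions made for illustration; real target prediction considers additional features that this sketch ignores.

```python
def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def seed_sites(mirna: str, target: str, seed_start: int = 2, seed_end: int = 7) -> list[int]:
    """Return 0-based start positions in `target` that pair perfectly with the seed.

    `seed_start` and `seed_end` are 1-based, inclusive positions counted from
    the miRNA 5' end (positions 2-7 are an assumed default for this sketch).
    """
    seed = mirna.upper().replace("T", "U")[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)
    target = target.upper().replace("T", "U")
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # illustrative 22-nt sequence
    example_utr = "AAAUACCUCAGGGUACCUCUU"      # hypothetical 3' UTR fragment
    print(seed_sites(example_mirna, example_utr))
```

Because the match covers only six nucleotides, even a short random sequence has a fair chance of containing a site, which is consistent with the broad targeting potential described above.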

Although we described here the key steps involved in canonical miRNA biogenesis, a number of alternative routes by which these small RNAs can be matured have been reported in the literature. We already referred to the involvement of RNA pol III in the transcription of pri-miRNA, which to date has only been reported in a few viruses such as the murid herpesvirus 4 (MuHV4), which synthesizes a tRNA–pre-miRNA hybrid matured by tRNase Z [6], or the bovine leukemia virus [5]. The maturation step by Drosha is not mandatory to make a miRNA; there are a number of Drosha-independent ways to synthesize them. The best known are the mirtrons, which are generated by splicing of the pre-miRNA out of an mRNA [20, 21]. Other miRNAs are generated in a Dicer-independent manner, although they are much rarer. In this case, the pre-miRNA is directly loaded into AGO2, which cleaves one arm of the hairpin, before the resulting RNA gets shortened by an exoribonuclease [22].

These alternate pathways for miRNA biogenesis highlight the various steps that can be diverted and that are therefore under tight control by the cell. Given the regulatory power of miRNAs, it is of prime importance to maintain their expression in check and to ensure quality control at each and every step along the way. We now know that regulation of the miRNA biogenesis does occur from the transcription of the pri-miRNA all the way down to the stability of the final mature miRNA product (for a general review on miRNA biogenesis regulation see [23] ). Here, we will review the first step of miRNA maturation mediated by the Microprocessor complex in the nucleus. We will describe how it occurs before focusing on its regulation by various cofactors that help to control cellular homeostasis or stress response. We will more specifically detail the protein cofactors and their mode of action, but recent findings on alternative modes of pri-miRNA processing regulation will also be discussed.


INTRODUCTION

Localization of mRNAs is a universal mechanism to efficiently drive protein targeting in eukaryotes and prokaryotes. The targeting of mRNAs facilitates the accumulation of the locally translated proteins to specific cellular compartments and, hence, is an essential mechanism in establishing cell polarity, patterning, and fate determination as well as protein sorting (Herbert and Costa, 2019; Hughes and Simmonds, 2019; Tian et al., 2019b, 2020).

mRNA localization occurs as a multistep process. After transcription, cis-acting elements (RNA zipcodes) are recognized and bound by trans-acting factors, mainly RNA binding proteins (RBPs), to form a primary mRNA–nucleoprotein (mRNP) complex. After export to the cytoplasm, the mRNP complex undergoes extensive remodeling with recruitment of new factors and detachment of others, enabling cytoskeletal-based transport to the destination site (Blower, 2013; Weis et al., 2013; Tian and Okita, 2014).

Although extensive knowledge on mRNA localization has been acquired by studies in Drosophila melanogaster, yeast (Saccharomyces cerevisiae), and mammalian cells, only a few examples have emerged from higher plants. The best-defined model in plants is storage protein mRNA localization in developing rice (Oryza sativa) endosperm cells, where mRNAs encoding glutelin and prolamine are recognized by zipcode RBPs and transported to two distinct cortical endoplasmic reticulum (ER) subdomains, the cisternal-ER and the protein body-ER (PB-ER), respectively (Chou et al., 2019; Tian et al., 2019b). Translation of prolamine mRNAs on the PB-ER results in the assembly of prolamine intracisternal granules that form an ER-derived protein body I (PB-I), while glutelin precursors are exported to the Golgi and then transported to protein storage vacuoles (PSVs) for processing and storage (Chou et al., 2019; Tian et al., 2019b). Although several cytoskeleton-associated RBPs required for mRNA localization have been identified (Doroshenk et al., 2009, 2012), information on how these mRNAs are transported to distinct ER subdomains remains elusive.

Emerging evidence from fungal model systems reveals the intimate link of mRNA transport with membrane trafficking (Schmid et al., 2006; Jansen et al., 2014; Haag et al., 2015; Niessing et al., 2018). Several mRNAs from yeast, Candida albicans, and Ustilago maydis are co-transported with mobile ER or shuttling endosomes (Schmid et al., 2006; Jansen et al., 2014; Haag et al., 2015; Pohlmann et al., 2015; Niessing et al., 2018). ASH1 as well as other mRNAs are co-transported on tubular ER that moves to the emerging bud or daughter cell in yeast. This process is mediated by the RBPs She2p and She3p, with She2p having membrane binding properties and She3p serving as an adaptor protein linking the mRNP-cER to the Myo4p protein (Schmid et al., 2006; Niessing et al., 2018). The cdc3 mRNA is transported on shuttling endosomes in the smut fungus, U. maydis, a process requiring localization of the RBP Rrm4 on the endosomes and the interaction of a membrane-associated linker protein, Upa1, with Rrm4 (Pohlmann et al., 2015; Niessing et al., 2018). Specific adaptor proteins appear to be needed to hitch mRNPs on endosomes for active transport over long distances. More recently, neuronal RNA granules have been shown to hitchhike on moving lysosomes using annexin11 as a tether (Liao et al., 2019). Although co-transport of mRNAs with membranous compartments was proposed to be a common mechanism in higher eukaryotes (Jansen et al., 2014), whether the mechanism is utilized by higher plants remains to be determined.

Previous studies suggested that endocytosis and membrane trafficking likely play a role in mRNA localization in plants. For example, the loss of function of the small GTPase Rab5a and its cognate guanine nucleotide exchange factor (GEF) resulted in defects in endocytosis and membrane trafficking and the mis-targeting of glutelin proteins to the prolamine-containing PB-I as well as to the extracellular paramural body (PMB) in rice endosperm cells (Fukuda et al., 2011; Wen et al., 2015). As storage protein targeting is regulated by mRNA localization in rice endosperm cells, the mis-targeting of glutelin proteins in the mutant suggests a relationship between endosomal transport and glutelin mRNA localization in rice. The extracellular distribution of glutelin mRNAs within PMBs from a mutant expressing a defective GEF (Yang et al., 2018) further supports the possible involvement of endosomal trafficking in glutelin mRNA transport. However, direct evidence showing the co-transport of glutelin mRNAs with shuttling endosomes, and how endosomal trafficking is engaged in glutelin mRNA localization, has yet to be established. Such mis-targeting of glutelin mRNAs in rice lines expressing mutant Rab5a and GEF may simply be a consequence of pleiotropy.

Recent studies (Tian et al., 2018, 2019a) identified two RBPs, RBP-P and RBP-L, which contain two and three RNA recognition motif (RRM) domains, respectively. These RBPs specifically bind to the glutelin mRNA zipcode sequences and regulate glutelin mRNA localization. In this study, using these two glutelin zipcode RBPs as entry points, we identified their interacting partners, N-ethylmaleimide-sensitive factor (NSF) and the small GTPase Rab5a, which participate in endosomal membrane trafficking. The four proteins may form a quaternary complex carrying glutelin mRNAs for active transport on endosomes to the cortical ER membrane. The identification of these key linker proteins that enable endosome-mediated mRNA transport in rice endosperm cells provides new insights into how mRNAs can be distributed to specific locations in eukaryotes.


MATERIALS AND METHODS

Cell culture, DNA constructs and transfection

The HeLa cell line was obtained from the American Type Culture Collection (ATCC, CCL-2, Manassas, VA, RRID:CVCL_0030). HeLa cells were cultured in RPMI 1640 (Thermo Fisher Scientific, Monza MB, IT) supplemented with 10% fetal bovine serum (Thermo Fisher Scientific), penicillin (100 U/ml), streptomycin (100 μg/ml) and 2 mM glutamine at 37°C in 5% CO2. The plasmids encoding the sequences of the HNRNPD isoforms (p45, p42, p40 and p37) fused to the FLAG-tag were a gift from R.J. Schneider, Department of Microbiology and Radiation Oncology, NYU School of Medicine. The plasmid encoding SAF-A-FLAG wt was a gift from Nick Gilbert, MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Crewe Road, Edinburgh, UK. The plasmid encoding the human GFP-RNase H1 was a gift from Robert Joseph Crouch, Developmental Biology Division, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA. To generate the HNRNPD mutants, we cloned into the pFLAG CMV-1 vector, through the EcoRI and BamHI sites, the corresponding DNAs amplified by PCR with the primers listed in Supplementary Table S1. To generate the HNRNPD mutants for E. coli expression, we cloned into the pET-duet vector, through the EcoRI and HindIII sites, the corresponding DNAs amplified by PCR with the primers listed in Supplementary Table S1. To generate HeLa cell lines stably expressing ER-AsiSI (17), cells were seeded at 90% confluence on 6-well plates and transfected with 1 μg of pBABE ER-AsiSI (a gift from G. Legube, Center for Integrative Biology, Université Paul Sabatier, France) using Lipofectamine 2000 Reagent (Thermo Fisher Scientific). ER-AsiSI-expressing cells were selected using puromycin (Sigma Aldrich S.r.l, Milan IT) at the previously optimized concentration of 5 μg/ml.
For silencing experiments, HeLa cells were transfected with 30 nM of ON-TARGETplus siRNA (Dharmacon), either siCTR (D-001810–10) or Human HNRNPD siRNA (siHNRNPD, L-004079), using Dharmafect 1 according to the manufacturer's instructions. ON-TARGETplus siRNAs are optimized to reduce off-target effects. The HeLa HNRNPD knockout cells were generated with the CRISPR-Cas9 system. Briefly, cells were transfected with pSpCas9 (BB)-2A-Puro (PX459) V2.0 (18) (gift from Feng Zhang, Addgene plasmid #62988) containing a guide RNA targeting HNRNPD exon two (5′-TCCTATCACAGGGCGATCAA-3′) and selected with 5 μg/ml of puromycin. The generated cell clones were analyzed by western blot and sequencing to verify knockout of HNRNPD. For reconstitution experiments, we generated the PAM-resistant HNRNPD isoforms with the QuikChange II Site-Directed Mutagenesis Kit (Agilent) according to the manufacturer's instructions, using the primers indicated in Supplementary Table S1.

The pCBASceI and pDRGFP plasmids were a gift from Maria Jasin, Addgene plasmid #26477 ( 19) and Addgene plasmid #26475 ( 20), respectively.

Antibodies and western blot

The following antibodies were used: HNRNPD (1:1000, 07-260, Millipore, RRID:AB_2117338), HNRNPD (1:1000, D6O4F, Cell Signalling, Danvers, MA, RRID:AB_2616009), RPA32 (1:5000, A300–244A, Bethyl Laboratories, RRID:AB_185548), RPA32 S4/S8 (1:2000, A300–245A, Bethyl Laboratories), H3 (1:1000, #9715, Cell Signalling, RRID:AB_331563), CHK1 S345 (1:1000, #2348, Cell Signalling), CHK1 (1:1000, #2360, Cell Signalling), GAPDH (1:1000, sc-25778, Santa Cruz, Dallas), MRE11 (1:1000, NB100–142, Novus Biological), EXOI (1:1000, A302–640A, Bethyl Laboratories), CtIP (1:1000, #61142, Active Motif), SAF-A (1:1000, ab10297, Abcam), RAD17 S645 (1:2000, ab3620, Abcam), FLAG-M2 (1:1000, F1804, SIGMA Aldrich), HA-tag (1:500, sc-805, Santa Cruz Biotechnology), Lamin A/C (1:1000, #4777, Cell Signalling), GFP (1:5000, ab6556, Abcam), His-tag (1:1000, 05-531, Millipore). For total protein extraction, cells were lysed at 4°C in 50 mM HEPES pH 7.5, 1% Triton X-100, 150 mM NaCl, 5 mM EGTA, supplemented with protease and phosphatase inhibitor cocktail (Roche Applied Science). Lysates were clarified by centrifugation at 10 000 × g for 20 min. Lysates containing equal amounts of protein, estimated with the Bradford assay (Bio-Rad), were subjected to SDS-PAGE. The chemiluminescent images were obtained using the ImageQuant LAS 500 (GE Healthcare).

Immunoprecipitation

For protein co-immunoprecipitation, HeLa cells were lysed in protein extraction buffer as for western blot. The protein lysate was quantified and 2 mg, for each condition, were pre-cleared with protein G plus agarose (22851, Thermo Fisher Scientific) for 45 min at 4°C with rocking. Immunoprecipitation was carried out overnight at 4°C with rocking using either FLAG-M2 (1 μg Ab to 1 mg of proteins, F1804, Sigma Aldrich) and its negative control IgG1 (BD Pharmingen™), or HA-tag (5 μg Ab to 2 mg of proteins, 000000011583816001, Sigma Aldrich) and its negative control IgG2 (550339, BD Pharmingen™).

RPA-ssDNA/dsDNA pull-down

Biotinylated DNA pull-down assay was performed as reported by Yang and Zou ( 21) with some modifications (see Figure 1). Briefly, 87 pmol of 70 nt biotinylated ssDNA were annealed with 87 pmol of 21 nt ssDNA, partially complementary, to generate the DNA end-resection intermediate used as bait in the proteomic screening, or with the same amount of other, different length, ssDNAs to generate DNA fragments either blunt or with varying ssDNA length. Reactions were performed in annealing buffer (20mM NaCl, 10mM Tris–HCl pH7.5) for 3 min at 90°C followed by incubation 15 min at 37°C in a water bath. Annealed DNA was attached to streptavidin-coated magnetic beads (Thermo Fisher Scientific) in binding buffer (10 mM Tris–HCl pH 7.5, 100 mM NaCl, 10% glycerol, 0.01% NP-40) for 15 min rocking at RT followed by incubation with or without purified RPA at RT for 30 min. At the end of incubation, DNA-streptavidin-coated magnetic bead complexes were washed twice with the binding buffer to remove the unbound proteins and subsequently incubated with protein extract from HeLa enriched for the nuclear fraction at RT for 30 min, followed by two washes before western blot analysis or LC/MS. The DNA sequences used are listed in Supplementary Table S1 .

Proteomic screen and HNRNPD chromatin binding ability. (A) Schematic representation of the proteomic screen using a synthetic DNA structure coated with the heterotrimeric RPA wt complex, including the 70, 32 and 14 kDa isoforms, which was used as bait for the proteomic screen after being challenged with HeLa nuclear extracts. (B) DNA pull-down assay of the synthetic DNA structure coated or not with the recombinant RPA complex produced in E. coli (input), followed by western blot analysis with the indicated antibodies. HeLa nuclear extract was incubated for 30 min with 0.3 mg/ml of RNase A on ice, followed by centrifugation to remove debris (sequences are listed in Supplementary Table S1). (C) DNA pull-down of the schematically indicated biotinylated DNA structures (sequences are listed in Supplementary Table S1). Biotin is represented as black dots. Western blot analysis was performed with the indicated antibodies. The asterisk indicates a non-specific band (which appears using the Millipore antibody). (D) Chromatin enriched purification was performed upon 2 h treatment with 1 μM of CPT followed by western blot analysis. DNase I treatment of the first pellet fraction (P) was performed, when indicated, with 80 U of enzyme for 30 min at 30°C, yielding a new supernatant and pellet fraction (S1 and P1, respectively). All the purification steps were performed upon incubation for 30 min with 0.3 mg/ml of RNase A on ice, followed by centrifugation to remove debris. H3 and GAPDH were used as markers for the chromatin and soluble fractions, respectively. RAD17 S645 was used as a DNA damage control. (E) Schematic representation of HNRNPD deletion mutants. (F) HeLa cells were transfected with the indicated DNA, followed by 48 h of incubation. Chromatin enriched purification was performed upon 2 h treatment with 1 μM of CPT followed by western blot analysis. All the purification steps were performed upon incubation for 30 min with 0.3 mg/ml of RNase A on ice, followed by centrifugation to remove debris. H3 and GAPDH were used as markers for the chromatin and soluble fractions, respectively. RPA32 was used as a DNA damage control.

Cell fractionation

Cell fractionation was performed as previously described by Ishii et al. with minor modifications (22). Briefly, 3 × 10^6 cells, per condition, were collected and resuspended in 200 μl of CSK buffer (10 mM PIPES pH 6.8, 100 mM NaCl, 300 mM MgCl2, 1 mM EGTA, 1 mM DTT, 0.1% Triton X-100, 0.34 M sucrose) supplemented with protease and phosphatase inhibitors and kept for 5 min on ice. The soluble cytoplasmic fraction (S) was separated from nuclei (P) by 4 min centrifugation at 1300 × g at 4°C. The P fraction was washed with CSK buffer, then resuspended in 200 μl of ‘western blot buffer’, sonicated and centrifuged for 30 min at 4°C at 10 000 × g. Following SDS-PAGE, samples were analyzed by western blot with the indicated antibodies. For DNase I treatment, the P fraction was incubated with 80 U of DNase I (Roche Applied Science, Mannheim, Germany) for 30 min at 30°C, followed by 30 min centrifugation at 1300 × g at 4°C to obtain the solubilized fraction (S1) and the pellet post DNase I (P1).

Immunofluorescence and UVC micro-irradiation

HeLa cells, grown on glass coverslips, were fixed with 4% paraformaldehyde and permeabilized with 0.5% Triton. Samples were blocked for 10 min in 1% BSA at RT and incubated for 1 h with anti-BrdU (1:200, 347580, BD Biosciences) or anti-γH2Ax S139 (1:600, 05-636, Millipore) at 37°C. After washing, samples were incubated for 45 min at 37°C with AlexaFluor 594-conjugated chicken anti-rabbit and 488-conjugated rabbit anti-mouse or 555-conjugated goat anti-mouse IgG (H+L) (Life Technologies), and analyzed with a Zeiss LSM100 confocal microscope.

UVC micro-irradiation was performed as described by Suzuki et al. (23). The following antibodies were then used for protein detection: HNRNPD (1:200, 07-260, Millipore) and γH2Ax S139 (1:600, 05-636, Millipore).

Flow cytometry analysis

At 48 h post transfection with siRNA, HeLa cells were treated or not with 1 μM camptothecin (CPT) (Sigma Aldrich) for 2 h and processed as reported by Forment et al. (24) with some modifications. Briefly, 1 × 10^6 HeLa cells, per condition, were resuspended in CSK buffer + 0.5% Triton + 0.3 mg/ml RNase A solution with protease inhibitors and incubated for 5 min on ice. At the end of the incubation, cells were washed twice with CSK buffer followed by centrifugation for 3 min at 1300 × g at 4°C. Afterwards, cells were fixed with 4% paraformaldehyde for 15 min at RT, washed twice in PBS 1× and re-suspended in 100 μl of incubation buffer (PBS 1× + 0.5% Saponin) for 1 h with RPA (1:200, A300-244A, Bethyl Laboratories) or γH2Ax S139 (1:200, 05-636, Millipore) at 37°C. After two washes with incubation buffer, samples were incubated for 45 min at 37°C with Alexa Fluor 594-conjugated chicken anti-rabbit or 488-conjugated rabbit anti-mouse (Thermo Fisher Scientific). The percentage of positive cells was determined using the CellQuest Software (Becton Dickinson).

The ssDNA formation assay was performed as reported previously (25) with some modifications. Briefly, HeLa cells, 24 h after transfection with the indicated siRNAs, were pulse-labelled for an additional 24 h with 10 μM BrdU and treated with 1 μM CPT for 2 h. In order to quantify the amount of resected ssDNA, we worked in non-denaturing conditions. Cells were harvested by trypsinization. After washing with PBS 1×, 1 × 10^6 cells for each experimental point were fixed in PBS 1× + 4% paraformaldehyde for 15 min at RT followed by permeabilization with PBS 1× containing 0.1% Triton X-100 for 30 min at RT. Cells were washed in PBS 1× twice and re-suspended in 100 μl of incubation buffer (PBS 1× + 0.5% Saponin) with the anti-BrdU antibody (1:100, clone B44, 347580, BD Biosciences) for 1 h at RT, or in incubation buffer alone as a control. Cells were washed with incubation buffer and resuspended in 100 μl of 488-conjugated rabbit anti-mouse (Thermo Fisher Scientific) for 45 min at RT. The percentage of BrdU-positive cells was determined using the CellQuest Software (Becton Dickinson). The threshold level identifying FITC positivity was set following comparison with cells incubated with only the secondary antibody.
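The gating logic described above (a positivity threshold derived from a secondary-antibody-only control) can be sketched as follows. The text does not say how the threshold was chosen, so the 99th percentile of the control, the nearest-rank percentile method, the function names, and the toy intensities are all assumptions for illustration only; the actual analysis was done in CellQuest.

```python
import math


def nearest_rank_percentile(values: list[float], q: float) -> float:
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def percent_positive(sample: list[float], control: list[float], q: float = 99.0) -> float:
    """Percentage of sample events brighter than the control-derived threshold.

    The threshold is the q-th percentile of the secondary-only control
    (an assumed convention for this sketch).
    """
    if not sample:
        raise ValueError("sample must not be empty")
    threshold = nearest_rank_percentile(control, q)
    positives = sum(1 for intensity in sample if intensity > threshold)
    return 100.0 * positives / len(sample)


if __name__ == "__main__":
    control = [float(i) for i in range(1, 101)]   # hypothetical control intensities
    sample = [50.0, 99.0, 100.0, 150.0, 200.0]    # hypothetical stained sample
    print(percent_positive(sample, control))
```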

Nuclear extract preparation

Nuclear extracts were prepared as previously reported ( 26). The nuclear fraction was dialyzed over night at 4°C in 10 mM Tris–HCl pH 7.5, 100 mM NaCl, 10% glycerol, 0.01% NP-40.

SMART

SMART (single molecule analysis of resection tracks) was carried out as previously described by Cruz-Garcia et al. ( 27). The same number of control and HNRNPD silenced cells was spotted onto the slides, so, presumably, the same DNA content attached to the slides, although we have not assessed this. However the SMART technique was used to gain a snapshot visual assessment of the effect of HNRNPD silencing on bulk ssDNA amount upon CPT.

ER-AsiSI resection assay

Genomic DNA extraction and preparation for the measurement of resection in mammalian cells was performed as previously described (28) with some modifications. Briefly, HeLa cells stably expressing the ER-AsiSI system (gift from Gaelle Legube, CBI-Centre de Biologie Integrative, Toulouse) were treated with or without 300 nM of 4-OHT (Sigma Aldrich) for 4 h. Cells were then trypsinized and resuspended in 0.6% low-gelling agarose (Sigma Aldrich) at a concentration of 6 × 10^6 cells/ml. Fifty microliters of cells were spotted onto a piece of Parafilm to produce a solid agar ball, which was then resuspended in 1 ml of ESP buffer (0.5 M EDTA, 2% N-lauroylsarcosine, 1 mg/ml proteinase-K, 1 mM CaCl2, pH 8.0) for 20 h at 16°C with rotation, followed by treatment with 1 ml of HS buffer (1.85 M NaCl, 0.15 M KCl, 5 mM MgCl2, 2 mM EDTA, 4 mM Tris, 0.5% Triton X-100, pH 7.5) for 20 h at 16°C with rotation. After six washes of 10 min each with ice-cold PBS 1× (8 mM Na2HPO4, 1.5 mM KH2PO4, 133 mM KCl, 0.8 mM MgCl2, pH 7.4) at 4°C, the agar ball was melted at 70°C for 15 min and then diluted 15-fold with ddH2O at 70°C. The DNA was diluted with an equal volume of 2X NEB 3.1 buffer. Twenty microliters of genomic DNA were digested or mock digested with 20 units of BamHI-HF (NEB) overnight at 37°C. Four microliters were used as template for the real time quantitative PCR reaction (qRT-PCR) with the indicated primers using the SYBR Green real time master mix (Thermo Fisher Scientific). All qRT-PCR reactions were performed in a 7900HT fast-RealTimePCR (Applied Biosystems). The ΔCt value was calculated by subtracting the Ct value of the mock-digested sample from the Ct value of the digested sample. The percentage of ssDNA was then calculated as %ssDNA = 1/(2^(ΔCt − 1) + 0.5) × 100 (28). The sequences of the primers used are listed in Supplementary Table S1.
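The ΔCt and %ssDNA calculation described above can be written out as a short function. This is a minimal sketch with hypothetical function and argument names and made-up Ct values; it only restates the formula from the text.

```python
def percent_ssdna(ct_digested: float, ct_mock: float) -> float:
    """Percentage of ssDNA at a site from qPCR Ct values.

    delta_ct = Ct(digested) - Ct(mock digested), then
    %ssDNA = 1 / (2**(delta_ct - 1) + 0.5) * 100, as stated in the text.
    """
    delta_ct = ct_digested - ct_mock
    return 100.0 / (2 ** (delta_ct - 1) + 0.5)


if __name__ == "__main__":
    # Hypothetical Ct values: no difference between digested and mock samples
    # means the restriction site was fully protected, i.e. 100% ssDNA.
    print(percent_ssdna(25.0, 25.0))
    # A one-cycle delay after digestion.
    print(round(percent_ssdna(26.0, 25.0), 2))
```

As ΔCt grows, the result approaches zero, which corresponds to a site that is almost entirely double-stranded and therefore cut by the restriction enzyme.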

Chromatin immunoprecipitation (ChIP)

ChIP was carried out as reported by Lee et al. (29) with some modifications. HeLa cells expressing ER-AsiSI were mock-treated or treated with 300 nM 4-OHT for 1 h. Cells were cross-linked with 1% formaldehyde for 10 min at 37°C, followed by inactivation with 0.125 M glycine. Cells were washed twice with ice-cold 1X PBS. For each condition, 1 × 10^6 cells were resuspended in ChIP lysis buffer (50 mM Tris–HCl pH 8, 10 mM EDTA pH 8, 0.1% SDS and protease inhibitors) and incubated for 10 min on ice, followed by sonication (twelve cycles of 20 s on/10 s off at 90% amplitude using a Sonics Vibra-Cell; Sonics, Newtown, CT, USA). The lysate was clarified for 30 min at 14 000 rpm at 4°C and diluted tenfold with dilution buffer (50 mM Tris–HCl pH 8, 150 mM NaCl, 1% Triton X-100, 0.1% SDS and protease inhibitors). FLAG M2 antibody (Sigma Aldrich) was used to immunoprecipitate HNRNPD p45-FLAG or wt SAF-A-FLAG overnight at 4°C. To isolate the immunocomplexes, 20 μl of protein G plus agarose (Thermo Fisher Scientific) was added to each sample and incubated with rocking for 45 min at 4°C. The pellets were sequentially washed with low salt buffer (0.1% SDS, 1% Triton X-100, 2 mM EDTA pH 8, 20 mM Tris–HCl pH 8, 50 mM NaCl), high salt buffer (0.1% SDS, 1% Triton X-100, 2 mM EDTA pH 8, 20 mM Tris–HCl pH 8, 500 mM NaCl), LiCl buffer (0.25 M LiCl, 1% NP40, 1 mM EDTA pH 8, 1% deoxycholic acid, 10 mM Tris–HCl pH 8) and twice with TE buffer (10 mM Tris–HCl pH 8 and 1 mM EDTA), followed by cross-link reversal in elution buffer (1% SDS and 0.1 M NaHCO3) with 270 mM NaCl at 65°C for 4 h. Protein digestion was carried out by adding 12 mM EDTA pH 8, 54 mM Tris–HCl pH 8 and 30 μg of proteinase K for 1 h at 45°C. DNA was recovered by phenol/chloroform extraction and analyzed with the indicated primers using a SYBR Green real-time master mix (Thermo Fisher Scientific). All qRT-PCR reactions were performed in a 7900HT Fast Real-Time PCR system (Applied Biosystems).
Primer sequences used for qRT-PCR are listed in Supplementary Table S1 (28). Data are reported as mean ± s.d. of three independent experiments. IP efficiency was calculated as the percentage of input DNA that was immunoprecipitated.
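
The text does not spell out how the percentage of input was derived from the qPCR data. One common convention, shown here purely as an assumption, adjusts the input Ct for the fraction of chromatin that was set aside as input and then compares it with the IP Ct. The function name percent_input, the input fraction and the Ct values are all hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """IP efficiency as a percentage of input DNA (assumed percent-input convention).

    input_fraction: fraction of the chromatin kept as the input sample
                    (e.g. 0.01 for a 1% input); hypothetical parameter.
    """
    if not 0.0 < input_fraction <= 1.0:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what it would be if 100% of the chromatin were used.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    # One Ct cycle corresponds to a two-fold difference in template amount.
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


if __name__ == "__main__":
    # Hypothetical values: a 25% input and equal raw Ct values for IP and input.
    print(f"{percent_input(ct_ip=24.0, ct_input=24.0, input_fraction=0.25):.1f}% of input")
```

Values from independent experiments could then be summarized as mean ± s.d. with statistics.mean and statistics.stdev from the standard library.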

Cell cycle profile

For DNA content analysis cells were fixed in ice-cold 70% ethanol at –20°C. At least 10 000 cells were analyzed by FACS (Becton Dickinson) following staining with 5 mg/ml propidium iodide and 0.25 mg/ml RNase A treatment (Sigma-Aldrich). Data were analyzed through the CellQuest Software (Becton Dickinson).

Quantitative real time PCR

Total RNA was extracted using Trizol (Life Technologies) and treated with TurboDNase (Life Technologies). One microgram of RNA was reverse-transcribed using the SuperScript VILO cDNA synthesis kit (Life Technologies). cDNA samples were amplified by real-time quantitative reverse transcriptase-PCR (qRT-PCR) using SYBR Green PCR Master Mix (Life Technologies) with the primers listed in Supplementary Table S1. Expression levels were normalized to those of the β-actin gene. HNRNPD, MRE11, CTIP and EXOI expression levels in siHNRNPD cells were calculated by the 2^(–ΔΔCt) method relative to siCTR control cells.
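
As a worked illustration of the 2^(–ΔΔCt) method named above, the sketch below normalizes a target gene to a reference gene (β-actin in the text) and then to the control sample (siCTR). The function name and all Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both primer pairs, which the method itself presupposes.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample versus a control (2^-ddCt)."""
    # dCt: target normalized to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ddCt: sample relative to control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies two cycles later in the
    # knockdown sample while the reference gene is unchanged.
    fold = relative_expression(26.0, 18.0, 24.0, 18.0)
    print(f"Relative expression: {fold:.2f}")
```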

Colony formation assay

For clonogenic assays, 300 cells were seeded in 24-well plates, left untreated or treated with the indicated doses of CPT or olaparib (Selleckchem), and incubated for 10 days. Colonies were counted after fixation with methanol and staining with crystal violet.

In vitro DNA end-resection assay

Nuclear proteins from HeLa wt and HNRNPD KO cl10 cells were purified as described above for nuclear extract preparation, followed by dialysis overnight at 4°C in 50 mM Tris–HCl pH 7.5, 50 mM NaCl, 2 mM MgCl2, 1 mM DTT and 0.1 mg/ml BSA. A DNA plasmid vector was digested with KpnI (3′ overhangs), HindIII (5′ overhangs) or EcoRV (blunt ends), followed by column purification (Qiagen, Germantown, MD, USA). The reactions were carried out in a final volume of 20 μl with 5 μg of nuclear proteins per condition and 300 ng of linearized DNA vector for the indicated time points, followed by incubation in 10 mM EDTA, 0.25% SDS and 100 μg/ml proteinase K for 10 min at 37°C. DNA products were separated on 0.8% agarose gels and stained with ethidium bromide.

DNA–RNA immunoprecipitation (DRIP)

DNA–RNA immunoprecipitation was carried out as reported by Li et al. (15) with some modifications. HeLa ER-AsiSI cells were transfected with siCTR or siHNRNPD for 48 h, followed by treatment with 300 nM 4-OHT for 4 h. For each condition, 5 × 10^6 cells were resuspended in TE buffer (10 mM Tris–HCl pH 7.5, 0.5 mM EDTA) containing 0.5% SDS and 300 μg/ml proteinase K and incubated at 37°C overnight under agitation. At the end of the incubation, the DNA:RNA hybrids were extracted with the phenol/chloroform protocol; the precipitate was washed with 70% EtOH and air-dried. The pellet was resuspended in 100 μl of H2O and digested overnight with 50 U of each of the following restriction enzymes (BsaI, BstXI, NdeI, EcoRI, EcoRV) in 1X restriction buffer. DNA:RNA hybrids were purified by phenol/chloroform extraction. The pellet was resuspended in 50 μl of water, treated or mock-treated with 15 U of RNase H (18021071, Thermo Fisher Scientific) in a final volume of 100 μl and incubated overnight at 37°C. DNA:RNA hybrids were purified by phenol/chloroform extraction and resuspended in 50 μl of H2O. Five micrograms of digested DNA were incubated with 10 μg of S9.6 antibody (ENH001, Kerafast) in binding buffer (10 mM Tris–HCl pH 7.5, 1 mM EDTA, 10 mM NaPO4, 140 mM NaCl, 0.05% Triton X-100) overnight at 4°C. At the end of the incubation, 20 μl of protein G plus agarose (22851, Thermo Fisher Scientific) was added for 2 h at 4°C with rocking. The beads were washed three times with binding buffer, and the immunocomplexes were eluted with five volumes (relative to the protein G beads) of 50 mM Tris–HCl pH 7.5, 10 mM EDTA, 0.5% SDS and 500 μg/ml proteinase K for 45 min at 55°C. DNA:RNA hybrids were purified with the phenol/chloroform protocol and resuspended in 50 μl of H2O. Four microliters were used as template for the real-time quantitative PCR reaction (qRT-PCR) with primers amplifying a region ∼600 bp from the AsiSI-induced DSB site (listed in Supplementary Table S1, DSB-F/R ChIP and DRIP) using the SYBR Green real-time master mix (Thermo Fisher Scientific).

RPA complex cloning and purification

Molecular cloning of the wt RPA complex was performed in the pET-duet vector with the primers listed in Supplementary Table S1. Purification of wt RPA was performed as reported previously (30). Briefly, the wt RPA construct was transformed into BL21 DE3 (Rosetta) cells, followed by induction with 300 μM IPTG for 4 h. The bacterial pellet was resuspended in lysis buffer (50 mM NaH2PO4 pH 7, 300 mM NaCl, 15 mM imidazole and 10% glycerol), followed by sonication. Ni-NTA resin was used for affinity purification for 2 h at 4°C. At the end of the incubation, the resin was washed six times with 10 mM Tris–HCl pH 8, 300 mM NaCl and increasing concentrations of imidazole from 10 mM to 60 mM. The RPA complex was eluted with 10 mM Tris–HCl pH 8, 300 mM NaCl and 300 mM imidazole. Dialysis was carried out overnight at 4°C against 100 mM NaCl, 10 mM Tris–HCl pH 7.5, 10% glycerol and 0.01% NP40.

Homologous recombination reporter assay

HeLa cells stably expressing the reporter plasmid pDR-GFP were generated by transfection with the pDR-GFP plasmid (20) (gift from Maria Jasin, Addgene plasmid #26475) and selection with puromycin. HeLa pDR-GFP cell lines were co-transfected with the plasmid encoding the endonuclease I-SceI (pCBA SceI, a gift from Maria Jasin, Addgene plasmid #26477) (19) and siCTR, siHNRNPD or siMRE11 (L-009271, Dharmacon). After 48 h of incubation, GFP values (as a readout of HR frequency) were analyzed by FACS.

Identification by LC–MS/MS

Peptide sequencing was performed on a Nano-scale by LC–ESI/MS–MS ( 31). LC–MS system consists of PHOENIX 40 (ThermoQuest Ltd., Hemel Hempstead, UK) connected to LCQ DECA Ion-Trap mass spectrometer (Finnigan, San Jose, CA, USA). Twenty microliters of trypsin-digested solutions were injected in a six-port valve and were trapped in a C18 trapping column (20 mm × 100 μm ID × 360 μm OD, Nano-separations, Nieuwkoop, NL) using 100% HPLC grade water + 0.1% v/v formic acid (solvent A) at a flow rate of 5 μl/min for 10 min. A pre-column splitter restrictor enabled the flow rate to be set at 100–125 nl/min on a C18 analytical column (30 cm × 50 μm ID × 360 μm OD, Nano-separations). Analytical separation was performed using a linear gradient up to 60% acetonitrile + 0.1% (v/v) formic acid (solvent B) for 60 min. At the end of separation, trapping and analytical columns were washed for 10 min in 100% solvent B and were equilibrated for 10 min in 100% solvent A. An ESI needle, composed of gold-coated fused silica (5 cm × 25 μm ID × 360 μm OD, Nano-separations), was heated to 195°C and 2 kV was applied for stable spray operation. Xcalibur™ 1.2 software (Thermo) managed the LC pump and the automatic spectral recording. MS/MS ion search was performed in Swiss-Prot/UniprotKB databases using MASCOT. We set Homo sapiens as taxonomy, peptide precursor charge at 2+ or 3+, mass tolerance at ±1.2 Da for precursor peptide and ±0.6 Da for fragment peptides, only one missed cleavage site as acceptable, carbamidomethylation of cysteine as fixed modification, and methionine oxidation as possible modification. Peptides with individual ion scores –10*log[P] were considered significant. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE ( 32) partner repository with the dataset identifier PXD012045 and 10.6019/PXD012045.

Statistical analysis and reproducibility

A paired two-sided Student's t-test was used to compare the means of two matched groups; P < 0.05 was considered statistically significant. Representative experiments are shown out of at least two independent ones; detailed information (number of independent experiments, P-values) is listed in the individual figure legends.
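
For readers unfamiliar with the paired test, the statistic behind it can be sketched with the Python standard library alone. This is an illustrative sketch, not the software the authors used; the function name and the measurements are hypothetical. Turning the statistic into a two-sided P-value requires the t distribution with n − 1 degrees of freedom, which a statistics package such as SciPy (scipy.stats.ttest_rel) provides.

```python
import math
import statistics


def paired_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for two matched groups of measurements."""
    if len(group_a) != len(group_b) or len(group_a) < 2:
        raise ValueError("groups must be matched and contain at least two pairs")
    # The paired test works on the per-pair differences.
    diffs = [b - a for a, b in zip(group_a, group_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical matched measurements from three independent experiments.
    control = [1.0, 2.0, 3.0]
    treated = [2.0, 4.0, 6.0]
    t, df = paired_t_statistic(control, treated)
    print(f"t = {t:.3f} with {df} degrees of freedom")
```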


RESULTS

C6orf203 has a predicted S4-like RNA-binding domain

BLAST analysis revealed an S4-like RNA-binding domain within the human C6orf203 protein (Figure 1A). This domain received its name through its initial identification in the prokaryotic integral ribosomal protein S4 (RPS4) ( 14). RPS4 binds to 16S rRNA in the maturing 30S subunit of the bacterial ribosome, and promotes correct rRNA folding during ribosomal subunit assembly ( 21), as well as binding to its own mRNA to repress translation ( 22). The RPS4 protein consists of two RNA-binding domains (Figure 1A), both of which form extensive contacts with rRNA in the 30S subunit ( 23). However, it is the C-terminal domain of RPS4, termed the S4-like RNA binding domain, which demonstrates conservation to the C-terminal domain of C6orf203 (Figure 1A). Comparison of the predicted folding of the S4-like RNA-binding domain of C6orf203 to the previously resolved S4-like RNA-binding domain of Escherichia coli RPS4 (PDB:3J9Y) suggests structural conservation between the two domains (Figure 1B and C), characterized by an α-helical bundle packed against anti-parallel β-sheets ( 23).

C6orf203 contains an S4-like RNA-binding domain. (A) Domain architecture of a diverse selection of proteins containing S4-like RNA-binding domains, including Homo sapiens C6orf203 (Q9P0P8), Escherichia coli 30S ribosomal protein S4 (RPS4) (P0A7V8), Clostridium sporogenes NJ4 rRNA methyltransferase (A0A2P8MJ46), E. coli tyrosyl-tRNA synthetase (P0AGJ9) and Saccharomyces cerevisiae bifunctional protein RIB2 pseudouridine synthetase (Q12362). Proteins are aligned with respect to their S4-like domain (blue), with additional characterized functional domains indicated for each protein. Additional functional domains: NS4/S9, N-terminal S4/S9 RNA binding domain; FtsJ, FtsJ-like methyltransferase; tRNA synth, tRNA synthetase class I domain (W and Y); PseudoU synt, RNA pseudouridylate synthetase. (B) Previously resolved structure of E. coli ribosomal protein S4 (PDB: 3J9Y). The C-terminal S4-like RNA-binding domain is enlarged and highlighted in yellow. (C) Structural prediction of the S4-like RNA-binding domain of human C6orf203. Prediction was generated in SWISS-MODEL (38) using the primary sequence residues 143–216 of human C6orf203 (Q9P0P8), and modeled in UCSF Chimera (39). (D) Multiple protein sequence alignment of the S4-like RNA-binding domain of several family member proteins, including H. sapiens C6orf203. Alignment of protein sequence was constructed using Clustal Omega (40). The conserved positions are shaded using the 80% consensus rule. The amino acid classes used in building the consensus were as follows: red, polar residues; green, hydrophobic residues; gray boxes, small residues. Names of the proteins are as assigned in UniProt: RPS4_E. coli: ribosomal protein S4, Escherichia coli (P0A7V8); RS4 (chloroplast)_M. palacea: chloroplast ribosomal protein S4, Marchantia paleacea (P06358); RPS9_H. sapiens: 40S ribosomal protein S9, Homo sapiens (P46781); rRNA methylase_C. sporogenes: rRNA methyltransferase, Clostridium sporogenes (A0A2P8MJ46); TyrS_E. coli: Tyrosyl-tRNA synthetase, Escherichia coli (P0AGJ9); RIB2_S. cerevisiae: RIB2, Saccharomyces cerevisiae (Q12362). (E) RNA electrophoretic mobility shift assay (EMSA) indicating preference of C6orf203 for binding double-stranded RNA. Recombinant human C6orf203 was incubated with fluorescein-labeled RNAs (100 nM). RNA template sequences used were as indicated (nt. position in mtDNA). Protein concentrations used were 0, 0.02, 0.04, 0.08, 0.16, 0.36, 0.64, 1.28, 2.56 μM, respectively. (F) Quantification of the proportion of unbound RNA ligand relative to no addition of C6orf203 protein, as in Figure 1E. The dissociation constant (Kd) of RNA ligand with C6orf203 is indicated for experiments where it could be determined.

The S4-like RNA-binding domain consists of 60–65 amino acids, with key conserved residues coordinating RNA binding (Figure 1D). The S4-like domain has been found in a diverse range of RNA-interacting proteins across different species. The exact location of the S4-like domain can vary within protein sequence and is often found in combination with additional functional protein domains, including those which confer enzymatic activity (Figure 1A). Alignment of primary sequences of proteins across phylogeny was performed to demonstrate the diversity of S4-like domain containing proteins: H. sapiens ribosomal protein S9, E. coli tyrosyl-tRNA synthetase, Marchantia paleacea chloroplast ribosomal protein S4, C. sporogenes rRNA methyltransferase, S. cerevisiae bifunctional protein RIB2 pseudouridylate synthase (Figure 1A and D). The S4-like domain provides RNA-binding ability to these proteins, positioning the functional domains to their cognate RNAs ( 14). Whilst the biological role of these extra domains has been characterized in many of these proteins (Figure 1A), our attempts to predict previously characterized functional domains in C6orf203, aside from the S4-like domain, through either BLAST search or iTASSER prediction (Iterative Threading ASSEmbly Refinement ( 24) ), were unsuccessful.

With evidence suggesting that C6orf203 contains a functional S4-like RNA binding domain, we sought to determine if recombinant human C6orf203 possesses the ability to bind RNA in vitro by performing electrophoretic mobility shift assays (EMSA). EMSA did not indicate affinity of C6orf203 to a fluorescein-labeled single stranded (ss)RNA template (a portion of ATP6 mRNA, mtDNA nt. 8657–8685) (Figure 1E and F). In contrast, when incubated with double-stranded (ds)RNA templates, either portions of 16S mt-rRNA (mtDNA nt. 1919–1985) or mt-tRNA Pro (mtDNA 15956–16023), C6orf203 demonstrated binding affinity (Figure 1E and F), with Kd values of 0.23 and 0.29μM, respectively. In addition, we repeated our C6orf203 EMSA with fluorescein-labeled mt-tRNA Pro dsRNA in the presence of either unlabeled ssRNA or structured dsRNA molecules at a concentration 100-fold higher than the labeled mt-tRNA Pro ( Supplementary Figure S1B ). Whilst the unlabeled mt-tRNA Pro dsRNA was successful in competing with the labeled RNA, preventing binding of C6orf203, no such competition was observed for the unlabeled ssRNA template, supporting the notion that C6orf203 preferentially binds structured RNA over linear templates. Together, these data suggest that C6orf203 possesses RNA-binding ability, with specificity for binding ds and/or highly structured RNA.
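
To give a feel for what the reported Kd values imply, the sketch below evaluates a simple one-site binding isotherm at the protein concentrations listed in the Figure 1 legend. This is an assumption-laden illustration: the text does not state which model was used to derive Kd, and the isotherm treats free protein as equal to total protein, which is only approximate with 100 nM RNA at the lowest protein concentrations. The function name is hypothetical.

```python
def fraction_bound(protein_um: float, kd_um: float) -> float:
    """Fraction of RNA bound under a one-site model: [P] / (Kd + [P])."""
    if protein_um < 0 or kd_um <= 0:
        raise ValueError("protein_um must be >= 0 and kd_um must be > 0")
    return protein_um / (kd_um + protein_um)


if __name__ == "__main__":
    kd_16s = 0.23  # μM, Kd reported in the text for the 16S mt-rRNA fragment
    # Protein concentrations (μM) as listed in the Figure 1 legend.
    titration = [0, 0.02, 0.04, 0.08, 0.16, 0.36, 0.64, 1.28, 2.56]
    for conc in titration:
        unbound = 1.0 - fraction_bound(conc, kd_16s)
        print(f"{conc:5.2f} μM protein -> {unbound:.2f} of RNA unbound")
```

Under this model half of the RNA is bound when the protein concentration equals Kd, which is the sense in which a lower Kd indicates tighter binding.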

C6orf203 localizes to the mitochondrial matrix

We next sought to characterize the subcellular localization of C6orf203. According to the MitoCarta 2.0 inventory, C6orf203 is predicted to localize to mitochondria in humans ( 13). In order to confirm the prediction, we performed immunocytochemistry (ICC) of human 143B osteosarcoma (HOS) cells transiently transfected with cDNA of C6orf203 tagged with a C-terminal FLAG epitope tag (C6orf203::FLAG). Resultant ICC images indicated strong colocalization of C6orf203::FLAG with MitoTracker Red CMXRos (Figure 2A), indicating a predominantly mitochondrial localization of the recombinant C6orf203::FLAG protein.

C6orf203 localizes to the mitochondrial matrix in human cells. (A) Intracellular localization of C6orf203 via immunocytochemistry. C6orf203::FLAG cDNA was transiently transfected into HOS cells. Cell nuclei were stained with DAPI (blue). The C6orf203::FLAG protein product was detected via an anti-FLAG antibody and visualized using a secondary antibody conjugated to Alexa Fluor 488 (green). Mitochondria were visualized using MitoTracker Red CMXRos (red). A digitally merged image of DAPI, Alexa Fluor 488 and MitoTracker signals reveals colocalization of C6orf203::FLAG with the mitochondrial network; scale bar = 10 μm. (B) Subcellular fractionation of HEK293T cells expressing C6orf203::FLAG protein. HEK293T cell lysates were fractionated into cytosolic and mitochondrial fractions. In addition, aliquots of the mitochondrial fractions were treated with 25 μg/ml Proteinase K with or without treatment with 1% Triton X-100. Fractions (40 μg) were analyzed by western blotting and the localization of C6orf203::FLAG was assessed in comparison to that of protein markers of the cytosol (cytosolic ribosomal protein uL1), outer mitochondrial membrane (TOM20), and mitochondrial matrix (mitochondrial ribosomal proteins uL3m and uS15m). T, total cell lysate; D, debris; C, cytosolic fraction.

We next generated a HEK293T cell line, which allowed for inducible overexpression of C6orf203::FLAG via the Flp-In TREx system (Invitrogen). Subcellular fractionation of HEK293T cells overexpressing C6orf203::FLAG further supported our ICC findings that C6orf203 localizes to human mitochondria (Figure 2B). Although FLAG signal for the C6orf203::FLAG was too weak to detect in total lysate (T), C6orf203::FLAG was evident in the mitochondrial fraction, indicating an enriched localization of the protein to mitochondria (Figure 2B). Following Proteinase K treatment of isolated mitochondria, C6orf203::FLAG remains intact, similar to the behavior of mitoribosome subunit proteins uL3m and uS15m, while cytosolic ribosomal protein uL1 and outer mitochondrial membrane protein TOM20 are both degraded by Proteinase K treatment (Figure 2B). This indicates that the C6orf203::FLAG protein remains protected from proteolysis within mitochondria. Together, these results confirm that C6orf203 localizes to mitochondria in human cultured cells.

Mitochondrial gene expression is perturbed in the absence of C6orf203

As the EMSA results have suggested that C6orf203 may have RNA-binding capability (Figure 1E), we sought to explore the role of C6orf203 in mitochondrial gene expression. To investigate this, several knockout HEK293T cell lines (KOs) were produced using CRISPR/Cas9 technology ( 15) and knockout was confirmed via western blotting (Figure 3A) and PCR analysis ( Supplementary Figure S2B ). KO 1 and KO 4 were chosen for future analysis.

Mitochondrial gene expression is perturbed in the absence of C6orf203. (A) Western blotting to confirm knockout of C6orf203 in HEK293T lines KO 1, KO 2, KO 4 and KO 6. Total lysate from WT and KO clones was resolved via SDS-PAGE, immunoblotting was performed and membranes were probed with anti-C6orf203 antibody and anti-GAPDH as a loading control. (B) Growth curve of WT HEK293T and C6orf203 KO cells in DMEM containing 0.9 g/l galactose (three biological replicates were performed and mean cell numbers at each time point are indicated; error bars = 1 SD). (C) Western blotting of OXPHOS subunits in C6orf203 KO lines. Total lysate from WT HEK293T and C6orf203 KO 1 and KO 4 was resolved via SDS-PAGE; western blotting was performed and membranes were probed with antibodies against C6orf203 and both mtDNA-encoded (COX1, COX2) and nuclear-encoded (NDUFB8, SDHA, ATP5a) OXPHOS subunits. Anti-VDAC1 antibody was used as a loading control. (D) Western blotting was performed on the control HEK293T (WT), C6orf203 KO 1 cells and KO 1 cells expressing C6orf203::FLAG (KO 1 + C6::F) that were used for the Oxygraph measurements in (E). Antibody staining against C6orf203 was performed to confirm knockout of C6orf203 and re-expression of the C6orf203::FLAG protein. α-tubulin was used as a loading control. (E) Ratio of maximal and basal mitochondrial respiration (each expressed in pmol of oxygen flux/mg of protein) measured by Oroboros oxygraph. Maximal respiration was measured after permeabilization of the cells (with digitonin), inhibition of complex V (with oligomycin) and treatment with the protonophore (CCCP) (n = 3, * P-value = 0.0308). (F) BN-PAGE and in-gel activities of complex I, complex II, complex IV and complex V in mitochondrial protein extracts from WT, C6orf203 KO clones and KO 1 cells expressing C6orf203::FLAG (KO 1 + C6::F). Coomassie staining of the gel is shown to indicate equal loading. Asterisk indicates accumulated F1-containing sub-complexes of complex V.

To assess mitochondrial function in the C6orf203 knockout lines, clones were cultured in media containing galactose as the sole carbon source, which forces cells to rely almost entirely on mitochondria for ATP production. A reduced ability of KO clones to proliferate in galactose media relative to the growth of wild-type (WT) HEK293T cell line population was observed (Figure 3B), suggesting a mitochondrial dysfunction.

To further investigate this observation, western blotting of OXPHOS subunits was performed, indicating a mild reduction in the steady-state levels of components of complex I (NDUFB8) and IV (COX1 and COX2) (Figure 3C), suggesting that the C6orf203 knockout interferes with mitochondrial gene expression.

In order to confirm that the observed OXPHOS deficiency was due to the loss of the C6orf203 protein, we complemented the KO 1 clone via the Flp-In T-Rex system to allow for inducible expression of wild-type C6orf203::FLAG cDNA (Figure 3D). We next measured oxygen consumption rates in knockout and complemented cell lines. The ratio of maximal respiration to basal respiration was modestly decreased in the absence of C6orf203 (Figure 3E), which was in line with the moderate decrease in the steady-state levels of components of OXPHOS complexes. This mild decrease was rescued by re-expression of C6orf203::FLAG protein, also confirming that FLAG-tagged protein can functionally substitute endogenous C6orf203.

In addition, in-gel activity of OXPHOS complexes was analyzed for both knockout and complemented lines (Figure 3F). Activity of complexes I, II and IV was unchanged or mildly affected in the tested KO clones (Figure 3F). However, upon staining for in-gel activity of complex V, we observed activity derived from complexes of a reduced molecular mass relative to the F1F0 holoenzyme (Figure 3F). Because they retain ATP hydrolysis activity, these species most likely correspond to F1-containing subcomplexes. This suggests that a mild assembly defect occurs in complex V in the absence of C6orf203, which is rescued upon re-expression of C6orf203::FLAG cDNA (Figure 3F). Together, the mild phenotype observed upon C6orf203 ablation indicates that C6orf203 contributes to efficient OXPHOS function within mitochondria.

In light of the localization of C6orf203 to mitochondria, we sought to determine if the mild OXPHOS phenotype in C6orf203 KO was directly attributable to a defect in mtDNA expression. Metabolic labeling of mitochondrial translation products indicated a strong, general translation defect in independent KO clones (about 50% compared to control, n = 3), which was restored upon expression of C6orf203::FLAG cDNA in KO 1 (Figure 4A). As our earlier findings (Figures 1E and 2) suggest that C6orf203 may function as a mitochondrial RNA-binding protein, we next investigated whether alteration to the mitochondrial transcriptome was an underlying factor to the mitochondrial translation defect observed in C6orf203 KO. However, northern blotting of all mtDNA-encoded mRNAs, rRNAs and tRNAs (Figure 4B, C and D) did not indicate any change to steady-state levels, nor disturbance in RNA processing, that would explain the severe, general mitochondrial translation defect observed.

Knockout of C6orf203 leads to decreased mitochondrial translation without affecting the steady-state levels of mitochondrial RNAs. (A) Mitochondrial translation in C6orf203 KO cell lines. Following inhibition of cytosolic translation, products of mitochondrial translation were labeled with [35S]-methionine in WT, C6orf203 KO clones and KO 1 cells expressing C6orf203::FLAG (KO 1 + C6::F, numbers indicate clones 1 and 2). Mitochondrial proteins were separated by SDS-PAGE and visualized by autoradiography. Coomassie Brilliant Blue (CBB) staining is provided to confirm equal protein loading. Immunoblotting was used to show C6orf203 expression. (B–D) Northern blotting of mt-mRNAs (B), mt-rRNAs (C) and mt-tRNAs (D) for WT and C6orf203 KO clones. Nuclear-encoded 28S and 18S rRNA were used as loading controls for mt-mRNAs and mt-rRNAs, and cytosolic tRNA (Tryptophan) was used as a loading control for mt-tRNA blots.

C6orf203 interacts with the mitochondrial ribosomal large subunit

As no steady-state level change of any of the mitochondrial transcripts was observed, we next sought to identify specific interactions of C6orf203, as this may provide more insight into the underlying mechanism of the observed mitochondrial translation defect. Following C6orf203::FLAG induction in our overexpressing HEK293T line, FLAG immunoprecipitation (IP) without crosslinking was performed. To control for non-specific binding, FLAG-tagged mitochondrially targeted luciferase was expressed in HEK293T and immunoprecipitated by the same procedure (Supplementary Figure S3B). Western blotting of the eluate revealed specific enrichment of proteins of the mt-LSU (mL37 and uL3m) in C6orf203 pulldown, while protein subunits of the mt-SSU were not enriched (Figure 5A). Additionally, we found that the observed interaction with mt-LSU subunits was RNA-dependent, since RNase A treatment led to a loss of interaction of C6orf203 with mL37 and uL3m (Supplementary Figure S3C).

C6orf203 interacts with the mt-LSU. (A) Immunoblotting of C6orf203::FLAG pulldown. Input mitochondrial lysates and eluates of FLAG-IP from HEK293T expressing C6orf203::FLAG and control HEK293T without FLAG protein expression (WT) were resolved via SDS-PAGE. Western blotting was performed and membranes were subsequently probed with antibodies against FLAG and against proteins of either the mt-LSU (mL37, uL3m) or the mt-SSU (bS16m, uS15m). (B) Mass spectrometry analysis of proteins interacting with C6orf203::FLAG. Following FLAG-immunoprecipitation, eluates were analyzed by label-free quantitative mass spectrometry (LFQ) (n = 3). Volcano plot indicates only proteins found in the Mitocarta 2.0 database. Proteins with fold change >2.5 are marked. Inset: Boxplot displaying comparison of the logFC (log fold change) of proteins (as in main B) of the mt-LSU or mt-SSU in comparison to global proteins. Stated P-values indicate pair-wise significance of difference in logFC between global proteins, 39S mt-LSU proteins or 28S mt-SSU proteins as determined via Welch’s unequal variances t-test. (C) Quantitative real-time PCR to assess enrichment of mt-RNAs upon C6orf203::FLAG pulldown. Anti-FLAG immunoprecipitation was performed on mitochondrial lysate from HEK293 cells and HEK293T cells overexpressing C6orf203::FLAG. RNA was extracted from both mitochondrial lysates and elution fractions, reverse transcribed, and qPCR was performed using primer and probe pairs for mitochondrial DNA-encoded transcripts. The relative fold change of transcript abundance in the C6orf203::FLAG immunoprecipitated eluates versus the HEK293 eluates, relative to transcript abundance in their respective input mitochondrial lysates, was calculated and is shown here. The graph represents a combination of three biological replicate experiments; error bars = S.E.M.

Next, we performed label-free quantitative mass spectrometry on the eluate from the C6orf203::FLAG IP relative to a negative control (HEK293T without FLAG protein expression). This revealed enrichment of a large number of proteins of the mt-LSU that co-immunoprecipitated with C6orf203::FLAG (Figure 5B), while only a single subunit of the mt-SSU (uS15m) was observed with a logFC of >2. Indeed, when the enrichment profiles of the constituent proteins from the mt-LSU and the mt-SSU were considered separately, all mt-LSU subunits showed a consistent enrichment over global protein distribution, while mt-SSU proteins were not enriched over that of global proteins (Figure 5B, inset). We also observed enrichment of several cytosolic ribosomal proteins. However, this interaction was reduced upon Proteinase K treatment of mitochondria prior to IP, as tested via western blotting (Supplementary Figure S3A).
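As a rough illustration of the statistic behind the inset comparison, the sketch below computes Welch's t and the Welch-Satterthwaite degrees of freedom for two groups of logFC values. The function name `welch_t` and the numbers are hypothetical and chosen for illustration; they are not data or code from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variances t-test on two samples."""
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical logFC values, for illustration only (not data from the study).
lsu_logfc = [2.0, 3.0, 4.0]
global_logfc = [0.0, 1.0, -1.0, 0.0]
t_stat, dof = welch_t(lsu_logfc, global_logfc)
```

Turning `t_stat` and `dof` into a P-value requires the cumulative distribution of Student's t. In practice a statistics library would typically be used for that step, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`.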

In addition to mt-LSU components, a number of other proteins were highly enriched in C6orf203::FLAG IP, including a number of factors known to be involved in assembly of the mt-LSU (i.e. GTPBP10, DHX30, MALSU1, RPUSD4, TRMT10C) (Figure 5B). As many of these assembly factors would likely be excluded from the mature ribosome, this may indicate that the observed interaction of C6orf203 occurs within an assembly intermediate of the mt-LSU.

Curiously, one of the most highly enriched proteins observed upon C6orf203::FLAG pulldown was C12orf65, a member of a family of four mitochondrial class I peptide release factors. However, the exact role of C12orf65 in mitochondrial translation is currently unknown, with family member mtRF1a having been found to be sufficient for termination of translation in all 13 mitochondrial open reading frames (25). The interaction of C12orf65 with C6orf203 requires further investigation to determine functional relevance.

As we have earlier characterized C6orf203 as having RNA-binding affinity, we sought to investigate whether C6orf203 interacted with mitochondrial RNAs and so repeated our C6orf203::FLAG immunoprecipitations and extracted RNA from both input and eluates for the C6orf203::FLAG line and WT HEK293 control. Through quantitative RT-PCR focused on mt-mRNAs and mt-rRNAs, we observed specific enrichment of 16S mt-rRNA upon C6orf203::FLAG IP compared to control (Figure 5C). This observation further supports our proteomic findings concerning interaction of C6orf203 with either the full or a near-assembled intermediate of mt-LSU.
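The enrichment calculation can be sketched as follows. The text does not give the exact formula, so this assumes the common ΔΔCt approach with perfect doubling per cycle. The function name `ip_enrichment`, its parameters and the Ct values are hypothetical.

```python
def ip_enrichment(ct_ip_tagged, ct_input_tagged,
                  ct_ip_control, ct_input_control,
                  efficiency=2.0):
    """Fold enrichment of a transcript in the tagged IP versus the control IP.

    Each eluate is first normalised to its own input (delta Ct); the tagged
    line is then compared with the untagged control (delta-delta Ct).
    Assumption: amplification efficiency of 2.0 (perfect doubling per cycle).
    """
    delta_tagged = ct_ip_tagged - ct_input_tagged
    delta_control = ct_ip_control - ct_input_control
    return efficiency ** -(delta_tagged - delta_control)


# Hypothetical Ct values, for illustration only (not data from the study).
fold_16s = ip_enrichment(ct_ip_tagged=22.0, ct_input_tagged=20.0,
                         ct_ip_control=27.0, ct_input_control=20.0)
```

With these illustrative numbers the tagged eluate sits 2 cycles behind its input and the control eluate 7 cycles behind, a 5-cycle difference that corresponds to a 32-fold enrichment under the doubling assumption.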

Absence of C6orf203 leads to reduction of mt-mRNAs loaded onto mitoribosomes

In light of the observed interaction between C6orf203 and the mt-LSU, as well as the enrichment of a complement of known mt-LSU assembly factors (Figure 5B), we next sought to assess the integrity of the mitoribosomes in our C6orf203 KO lines. Western blotting revealed no decrease of steady-state levels of a number of constituent proteins of either the mt-LSU or mt-SSU (Figure 6A), and sucrose gradient fractionation indicated that overall integrity of both the mt-SSU and mt-LSU was retained in the C6orf203 KO clones (Figure 6B). This suggests that there is no global defect in the assembly of either subunit of the mitoribosome in the absence of C6orf203 (Figure 6B), and so C6orf203 is unlikely to play a role in the early stages of assembly of the mt-LSU. This is consistent with the observation of a near-full complement of mt-LSU proteins upon C6orf203::FLAG pulldown (Figure 5B), suggesting instead that C6orf203 may act on a later stage assembly intermediate of mt-LSU or the full mt-LSU.

C6orf203 loss affects the engagement of mt-mRNAs with mitoribosomes. (A) Western blotting of steady-state levels of mitochondrial ribosomal proteins in WT, C6orf203 KO and KO 1 cells expressing C6orf203::FLAG (KO 1 +C6::F). Total cell lysate from WT HEK293T, C6orf203 KO clones and KO 1 cells expressing C6orf203::FLAG (KO 1 +C6::F) was resolved via SDS-PAGE. Western blotting was performed and membranes were probed with antibodies against proteins of either the mt-LSU or mt-SSU; VDAC1 was used as a loading control. (B) Sedimentation of mitochondrial ribosomes on 10–30% isokinetic sucrose gradients for WT HEK293T and C6orf203 KO 1. Mitochondria were isolated from cells, and lysates were loaded onto gradients. Following centrifugation, obtained fractions were analyzed by western blotting with antibodies against proteins of the mt-LSU (uL3m and mL37), the mt-SSU (uS15m) and C6orf203. (C) Mitoribosome profiling analysis. Relative ratio of mitoribosome-protected fragments (reads per million mapped reads, RPM) for C6orf203 KO 1 versus WT HEK293T for each mt-mRNA ORF, determined via MitoRiboSeq. Individual CDSs are displayed according to their ORF length. Reads with 5′ ends mapping between the first nucleotide of the start codon and 30 nt 5′ of the stop codon were counted for each library. Overlapping regions for ORFs on bicistronic transcripts (ATP8/ATP6 and ND4l/ND4) were excluded from the analysis. Results represent data from a single MitoRiboSeq experiment.

Curiously, despite the observed translational defect in C6orf203 KO cells (Figure 4A), mitochondrial monosome formation is not affected in the absence of C6orf203 (Figure 6B). To investigate this disparity between mitochondrial monosome formation and the strong translation defect present in C6orf203 KO, we employed mitochondrial ribosome profiling (MitoRiboSeq) in one of the C6orf203 KO lines (KO 1). Through this, we observed that occupancy of all mt-mRNAs on mitochondrial monosomes was greatly reduced relative to WT HEK293T (Figure 6C), although no specific mt-mRNAs were affected to a greater extent than others (Figure 6C). The MitoRiboSeq results are in accordance with the mitochondrial translation defect observed in C6orf203 KO (Figure 4A).
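The occupancy comparison can be sketched in a few lines, following the counting rule given in the Figure 6C legend. This is a minimal illustration and not the authors' pipeline: the function names, the inclusive window boundaries and all numbers are assumptions, and the exclusion of overlapping bicistronic regions is not modelled.

```python
def count_cds_reads(five_prime_ends, cds_start, stop_codon_start):
    """Count reads whose 5' end maps within the counting window of one ORF.

    The window runs from the first nucleotide of the start codon to 30 nt
    upstream of the stop codon. Inclusive boundaries are an assumption.
    """
    window_end = stop_codon_start - 30
    return sum(1 for pos in five_prime_ends if cds_start <= pos <= window_end)


def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads."""
    return read_count * 1_000_000 / total_mapped_reads


def ko_wt_ratio(ko_count, ko_total, wt_count, wt_total):
    """Relative mitoribosome occupancy of one ORF, KO versus WT."""
    return rpm(ko_count, ko_total) / rpm(wt_count, wt_total)


# Hypothetical coordinates and counts, for illustration only.
in_window = count_cds_reads([5, 10, 50, 75, 90], cds_start=10, stop_codon_start=100)
ratio = ko_wt_ratio(ko_count=50, ko_total=2_000_000,
                    wt_count=100, wt_total=1_000_000)
```

Normalising to library size before taking the ratio keeps a difference in sequencing depth between the KO and WT libraries from being read as a difference in occupancy.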

Further analysis of MitoRiboSeq revealed that, although the engagement of mRNAs with mitoribosomes is decreased, once the mRNAs are loaded and translation is initiated, there is no major disturbance in elongation efficiency: no obvious difference was observed between KO and WT cells in mitoribosome pausing (Supplementary Figure S4A and B), nor in mitoribosome drop-off across the length of translated transcripts, which could indicate spurious loss of reading frame (Supplementary Figure S4C). Additionally, we did not observe any increase in occupancy of the mitoribosomes on specific codons in the C6orf203 knockout lines, which suggests that there is no perturbation of the availability of any individual aminoacylated mt-tRNA (Supplementary Figure S5). Taken together, in the absence of C6orf203, mitochondrial monosomes are formed, but their function is compromised, resulting in reduced mitochondrial translation.


Divergent Coding/Noncoding Promoters Provide a Unique Opportunity to Compare RNAPII Transcription Cycles Across lncRNA/mRNA Pairs

Profiling nascent transcription has been particularly informative in demonstrating that most nucleosome-depleted regions initiate transcription in both directions (Wei et al. 2011; Scruggs et al. 2015) (Figure 3). Cap analysis gene expression (CAGE) analyses demonstrate that transcription cycles in both directions begin by promoting the same 5′-cap modification to the first transcribed base (Andersson et al. 2014). However, divergent lncRNAs are often targeted for exosome-mediated degradation (Preker et al. 2008), with sense-strand mRNA being noticeably more stable. This is even the case for divergent lncRNA/mRNA pairs that show similar levels of nascent transcription (Sigova et al. 2013). Thus, most promoters appear to preferentially drive transcription in the coding direction. The mechanism(s) by which directionality is achieved are not well understood. Canonical transcript cleavage, polyadenylation, and termination sequences are enriched in some divergent lncRNAs (Almada et al. 2013). Interestingly, while these cis DNA elements lead to the formation of stable mRNA in the coding direction, divergent lncRNA termination triggers exosome-mediated lncRNA degradation (Ntini et al. 2013). The different effects of cleavage and polyadenylation cis-elements on mRNAs and lncRNAs reconcile differences in the stability of transcript pairs from such promoters. Additionally, the strength of transcription in either direction is selectively regulated at the chromatin level, since divergent lncRNA transcription can be suppressed or activated by a number of chromatin remodelling factors (Churchman and Weissman 2011; Marquardt et al. 2014; Scruggs et al. 2015). RNAPII-associated factors such as PAF1, DSIF, and Ssu72 are also implicated in controlling directionality (Tan-Wong et al. 2012; Fischl et al. 2017; Shetty et al. 2017).
Interestingly, there is evidence to suggest that the RNAPII CTD is enriched for phosphorylation of Tyr1 during divergent noncoding transcription (Descostes et al. 2014; Hsin et al. 2014), further revealing the unique properties of lncRNA transcription. Although transcription from coding/noncoding pairs is tightly controlled, our understanding of how RNAPII, RNAPII-associated factors, and chromatin-remodellers discriminate between coding and noncoding orientations is rudimentary. Moreover, the purpose of noncoding transcription from bidirectional promoters remains poorly understood. Considering that widespread divergent noncoding transcription is detected in many organisms, it may prove to be an unavoidable consequence of eukaryotic promoter structure. However, the identification of factors required for specific activation or repression of divergent lncRNA transcription illustrates tight regulation of this process, which is indicative of functional significance, although that significance is yet to be fully explained.

Bidirectional promoters. Although most promoters can initiate transcription in either direction, the sense orientation is generally favored. Several RNAPII-associated factors and chromatin changes appear to control directionality. For example, PAF1, DSIF, and Ssu72 appear to stimulate RNAPII transcription in the sense direction (Tan-Wong et al. 2012; Fischl et al. 2017; Shetty et al. 2017). Chromatin remodellers also compete to regulate divergent transcription. Notably, CAF-I suppresses divergent transcription by favoring the incorporation of nucleosomes with H3K56ac, while the SWI/SNF complex opposes this activity and thereby promotes divergent transcription (Marquardt et al. 2014). Divergent RNAPII transcription can also be enriched for Tyr1P (Y1P) (Descostes et al. 2014; Hsin et al. 2014). Finally, many divergent lncRNAs are enriched for promoter-proximal polyadenylation sites and targeted for early termination and exosome-mediated degradation in the nucleus (Preker et al. 2008; Almada et al. 2013; Ntini et al. 2013). In contrast, stable mature mRNAs are transported to the cytoplasm for protein synthesis. Abbreviations: H3, histone H3; lncRNA, long noncoding RNA; PAS, poly-(A) site; RNAPII, RNA polymerase II; SWI/SNF, switch/sucrose non-fermentable.


Functional lncRNAs Display mRNA-Like Transcriptional Properties

Predictably, there are exceptions to the dominant patterns highlighted above. Nascent transcription analyses have also uncovered lncRNAs that exhibit more mRNA-like transcriptional properties (see Figure 2B). For example, two such transcripts detected in human HeLa cells are the highly abundant NORAD and TINCR (Schlackow et al. 2017). NORAD localizes to the cytoplasm, where it controls gene expression by interacting with and regulating Pumilio RNA-binding proteins (Lee et al. 2016), while TINCR maintains somatic cell differentiation by interacting with and stabilizing differentiation-specific mRNAs (Kretz et al. 2013). Since lncRNAs generally display stronger tissue-specific expression patterns than protein-coding genes (Derrien et al. 2012; Kornienko et al. 2016), nascent transcript analyses must be expanded to other human cell types to identify more mRNA-like lncRNAs, a subclass that is highly likely to have specific and discernible functions.


Proteomics Strategies to Identify SUMO Targets and Acceptor Sites: A Survey of RNA-Binding Proteins SUMOylation

SUMOylation is a protein posttranslational modification that participates in the regulation of numerous biological processes within cells. Small ubiquitin-like modifier (SUMO) proteins are members of the ubiquitin-like protein family and, similarly to ubiquitin, are covalently linked to a lysine residue on a target protein via a multi-enzymatic cascade. To assess the specific mechanism triggered by SUMOylation, identifying the SUMO protein substrates and the precise acceptor site to which SUMO is bound is of critical relevance. Although hundreds of mammalian proteins have been described as targets of SUMOylation, the identification of the precise acceptor sites still represents an important analytical challenge because of the relatively low stoichiometry in vivo and the highly dynamic nature of this modification. Moreover, mass spectrometry-based identification of SUMOylated sites is hampered by the large peptide remnant of SUMO proteins that is left on the modified lysine residue upon tryptic digestion. The present review provides a survey of the strategies that have been exploited to enrich, purify and identify SUMOylation substrates and acceptor sites in human cells on a large scale. The success of the presented strategies has helped to unravel the numerous activities of this modification, as shown by the exemplary case of the RNA-binding protein family, whose SUMOylation is reviewed here.


Discussion

Three major conclusions from our study provide insight into how plant alternative splicing is regulated and how it impacts abiotic stress responses. First, our splicing analysis, RNA binding assays, and phylogenetic analysis show that HIN1 is a previously unknown and unexpected type of plant-specific RNA-binding protein and splicing regulator. Second, HIN1–HAI1 interaction, as well as HAI1 dephosphorylation of HIN1, provides a link between stress signaling and pre-mRNA splicing. Third, the shift toward increased splicing efficiency of IR-prone introns during acclimation to moderate-severity low ψw indicates a specific, and also unexpected, effect of drought acclimation on IR that has yet to be explored. The genes where low ψw and 35S:HIN1 led to enhanced splicing efficiency were enriched for abiotic stress and signaling-related functions. Together with the enhanced growth maintenance of 35S:HIN1 during low ψw, these results demonstrated that HIN1 and regulation of pre-mRNA splicing are important for drought acclimation.

The effect of HIN1 on IR and its binding to a GAA-containing motif enriched in intron-flanking regions indicated that HIN1 may be functionally similar to splicing enhancer and inhibitor proteins, which have been best characterized in metazoans but are not fully understood in any organism. Although we cannot rule out other possibilities, our data suggest that HIN1 acts as a splicing inhibitor. HIN1 binding to ESE-like elements may act to recruit splicing-related proteins which interact with HIN1 and may also block RNA-binding of other splicing-associated proteins which associate with the same, or similar, RNA motif (Fig. 6). Several types of plant SR proteins, including the HIN1-colocalized and HIN1-interacting proteins RSZ22, SC35, SRP30, and SRP34, are thought to bind cis-elements that regulate splicing (18). RSZ22 has been shown to interact with components of the U1 snRNP and may recruit it to the 5′ splice site (18, 48). Interestingly, RSZ22 preferentially binds to a different RNA motif than HIN1 (49); perhaps these proteins colocalize and interact when bound to adjacent RNA sites or when one or both of them is not bound to RNA. The SC35-related protein SCL30 (and perhaps SC35 itself) bound to a GAA-containing RNA motif enriched in sequences within 100 bases of SC35/SCL30-regulated introns (43). Also, a motif similar to the GAA-containing motif we identified was enriched in the intron-flanking regions of SR45-associated transcripts, including ABA and stress-related genes (20). SR45 mutant and overexpression lines had altered ABA sensitivity. Such results, along with the results presented here, hint at a complex interaction and competition between proteins that bind to GAA-containing ESE-like elements and indicate that such interaction/competition is important for abiotic stress and ABA responses (Fig. 6). Such interactions may explain why either loss of HIN1 function or ectopic HIN1 expression could lead to enhanced splicing efficiency in unstressed plants.
It is possible that ectopic expression of HIN1 sequesters other splicing factors or that higher levels of HIN1 expression lead to different posttranslational modification (such as phosphorylation) that determines HIN1 function. Such HIN1 interactions and regulation by posttranslational modification may also explain the more prominent localization of HIN1 in nuclear speckles during stress and could explain the observation that 35S:HIN1 had almost no additional effect on splicing during low ψw stress. Whether speckle-localized HIN1 is active or whether the speckles are sites where HIN1 is sequestered, as is thought to be the case for other splicing factors, is uncertain.

Possible mechanisms of HAI1-HIN1 function in splicing regulation. Dephosphorylation of HIN1 by HAI1 (and other Clade A PP2Cs) may affect HIN1 interaction with other splicing-related proteins. HIN1 binding to exon regions near intron–exon junctions may also affect the recruitment or exclusion of other splicing factors, including those that bind to similar RNA cis-element sequences as HIN1. An alternative, and not mutually exclusive, hypothesis is that HIN1 recruits HAI1 (and perhaps other Clade A PP2Cs) to specific sites in order to facilitate dephosphorylation of other target proteins, such as RSZ22 and other splicing-related proteins identified by phosphoproteomic analysis of hai1-2 (24). One or a combination of these mechanisms underlie the ability of HIN1 to affect splicing of specific IR-prone introns.

The increased prevalence of HIN1 nuclear speckles in the hai1-2 background and decreased IR of HIN1-affected genes in hai1-2 indicate that HAI1 affects HIN1 function. However, it is unlikely that this occurs via effects on HIN1 RNA binding since this occurred at the N-terminal region, while HAI1 interacted with and dephosphorylated the C-terminal portion of HIN1. Instead, HAI1 regulation of HIN1 phosphorylation could affect interaction with other splicing-related proteins. Another, not mutually exclusive, possibility is that recruitment by HIN1 brings HAI1 to the appropriate location to dephosphorylate splicing factors such as RSZ22 or others (Fig. 6). RSZ22 is known to be affected by phosphorylation as phosphatase and kinase inhibitors alter its mobility and distribution in nuclear speckles (50–52). More generally, splicing factor phosphorylation is known to be affected by abiotic stress (32, 53, 54), and our data indicate that HAI1–HIN1 interaction is part of a previously unknown mechanism connecting stress signaling to splicing factors.

Of the HIN1-related genes in Arabidopsis, only LIMYB has been characterized. LIMYB was reported to interact with Ribosomal Protein L10 and down-regulate expression of ribosomal protein genes, leading to a general down-regulation of translation to defend against viral replication (33). The authors of that study interpreted the function of LIMYB as that of a transcription factor and found enrichment of LIMYB on ribosomal gene promoters in chromatin immunoprecipitation assays. However, they did not test whether LIMYB directly bound DNA or was part of a larger protein complex associated with ribosomal genes, perhaps including proteins involved in cotranscriptional processing of ribosomal RNAs. Regardless of whether LIMYB is also an RNA binding protein, its function seems divergent from that of HIN1 as limyb did not affect growth, and LIMYB only weakly interacted with HAI1. Also, the 2 experimentally observed phosphorylation sites in the C-terminal domain of HIN1, which are likely targets of HAI1 dephosphorylation, are not conserved in LIMYB (SI Appendix, Fig. S4). It must also be noted that fresh weight and dry weight of hin1 mutants were reduced by nearly one-fourth even in the unstressed control. Whether or not this reduced growth was caused solely by effects on splicing is unknown. While our data establish a role for HIN1 in splicing regulation, we do not rule out the possibility that HIN1 has additional functions which may affect the physiological phenotypes of hin1 mutants or 35S:HIN1 plants.

Other studies of stress-induced changes in splicing patterns have not, to our knowledge, observed the shift toward increased splicing efficiency of IR-prone introns that we observed. In fact, it has been reported that short-term salt stress or heat stress increased the prevalence of IR (9, 55). A key difference between our study and others is that we exposed plants to a moderate-severity low ψw over 96 h to allow time for stress acclimation. Genes where HIN1 and low ψw led to enhanced splicing efficiency were enriched for stress and signaling-related GO terms. A number of these were Early Response to Desiccation (ERD) genes including ERD6, 7, 10, 13, and 14 (Datasets S9 and S15) as well as other genes known to be transcriptionally induced by stress or involved in stress signaling (for example, MAP Kinase 3 and many genes related to calcium signaling). Total transcript level of these genes was generally unchanged in our longer-term low ψw treatment or by 35S:HIN1. For genes such as the ERDs, the implication is that these genes were transcriptionally regulated in the early acute phase of stress response, while splicing regulation of these genes is initiated later, or persists longer, during low ψw acclimation. In the unstressed control, a high level of IR leading to the production of unstable transcripts or transcripts that do not encode an active protein may be means to ensure that production of stress-associated proteins is turned off. Increased splicing efficiency during stress would amplify the transcriptional activation or extend the duration of increased protein production from these genes during drought acclimation.

Manual inspection of stress and 35S:HIN1-affected transcripts found that nearly all of the IR transcripts had premature stop codons, consistent with other studies of plant alternative splicing (10). If the transcript is stable and translated, it is likely that the truncated protein produced is nonfunctional. However, in some cases, a protein of altered function may be produced. For example, IR of the last intron of NTL6/NAC062 (AT3G49530; Fig. 4C) produces a transcript encoding a truncated protein lacking the membrane tethering domain. This truncated protein would no longer need to be released from the plasma membrane by stress- or ABA-induced cleavage to enter the nucleus and regulate gene expression (56, 57). Interestingly, intron retention leading to a similar C-terminal truncation of an orthologous NAC transcription factor has been observed in maize (58), suggesting that alternative splicing may be a conserved mechanism to regulate NAC transcription factor function. The effect of HIN1 on growth at low ψw may represent a cumulative effect of altered function or altered abundance of signaling and stress-related proteins, such as NAC062, encoded by genes with HIN1-regulated changes in IR. Case-by-case characterization is required to determine which of the IR transcripts we observed lead to decreased protein level and which lead to the production of functionally distinct isoforms. As stated above, we do not rule out the possibility that other aspects of HIN1 molecular function could also contribute to its physiological phenotypes.

Our data also show that HAI1 may interact with several splicing factors. This is consistent with phosphoproteomic analysis of hai1-2 that identified putative HAI1 dephosphorylation targets related to splicing and RNA processing (23). Characterization of these additional HAI1-regulated proteins will be a promising approach to reveal further connections of stress signaling to pre-mRNA splicing. This could include regulation of IR events and splice donor–acceptor site changes affected by stress but not affected by HIN1. This in turn will lead to better overall understanding of how and why abiotic stress has such a pronounced effect on alternative splicing, a question which continues to become more prominent in plant stress biology.