A Protein Antagonist of Activation-Induced Cytidine Deaminase Encoded by a Complex Mouse Retrovirus

Complex retroviruses, such as human-pathogenic immunodeficiency virus type 1 (HIV-1), cause many human deaths. These retroviruses produce lifelong infections through viral proteins that interfere with host immunity. The complex retrovirus mouse mammary tumor virus (MMTV) allows for studies of host-pathogen interactions not possible in humans. A mutation preventing expression of the MMTV Rem protein in two different MMTV strains decreased proviral loads in tumors and increased viral genome mutations typical of an evolutionarily ancient enzyme, AID. Although the presence of AID generally improves antibody-based immunity, it may contribute to human cancer progression. We observed that coexpression of MMTV Rem and AID led to AID destruction. Our results suggest that Rem is the first known protein inhibitor of AID and that further experiments could lead to new disease treatments.

we performed transfection experiments in 293T cells with either untagged or tagged rem expression plasmids. Both plasmids produced Rem-CT (Fig. 1B), which has no known function.
To determine Rem-CT activity in vivo, we used the previously described infectious MMTV provirus carrying the downstream SD mutation (MMTV-SD) (Fig. 1C). The SD mutation consists of 6 mutations, resulting in a single valine-to-leucine change in the Env protein and no replication defects in tissue culture (9). MMTV requires superantigen (Sag)-mediated amplification in T cells and mature B cells for transmission to the mammary gland prior to breast cancer induction (10,11,30). The MMTV-SD mutant was previously shown to produce normal viral RNA levels as well as intracellular Gag levels equivalent to those seen with wild-type MMTV (MMTV-WT) in tissue culture (Fig. 1D) (9). As expected, reverse transcription-PCR (RT-PCR) analyses of spliced viral mRNAs indicated loss of rem upon introduction of the SD mutation (see Fig. S1A in the supplemental material). Similar levels of SP activity were produced from Env processing after transfection of mutant and wild-type MMTV proviruses (Fig. S1B), indicating that absence of Rem expression does not affect Env production in cultured fibroblasts. In agreement with previous results (9), the fraction of animals with mammary tumors decreased and levels of tumor latency increased after inoculation of MMTV-SD compared to MMTV-WT (Fig. 1E). To determine if the SD mutation affected proviral copy numbers in tumors, proviral loads were assessed by PCR. MMTV-WT-induced or MMTV-SD-induced tumors had increased loads compared to the levels of DNA from uninfected mice containing copies of endogenous Mtvs (Fig. 1F), suggesting that these tumors had acquired between 4 and 7 haploid proviruses. The average proviral load of three independent tumors from different mice revealed that the loads in tumors induced by MMTV-WT were only slightly higher than, but significantly different from, those in tumors induced by MMTV-SD.
Increased Apobec-mediated mutations in MMTV proviruses lacking Rem expression. Apobec family proteins are known restriction factors for MMTV replication (22). In specific retroviruses, Apobec-mediated cytidine deaminations lead to G-to-A transition mutations on the plus strand of proviral DNA (23,31). We considered whether Rem was an antagonist of Apobec enzymes and whether the decreased proviral load in MMTV-SD-induced tumors relative to that of MMTV-WT-induced tumors was the result of Apobec-mediated hypermutation. Therefore, we used mammary tumors from five different mice to extract DNA for mutational analysis by PCR of the env region followed by cloning and Sanger sequencing. This method allowed us to obtain a longer region from independent proviral integrations, to avoid the necessity of analysis of the related endogenous Mtvs, and to verify whether the original SD mutation was maintained. We selected ~5 to 6 clones from each tumor based on calculations of proviral loads and eliminated clones with the same mutations to avoid multiple samplings of the same proviruses. The combined plus-strand sequencing results from all proviruses indicated increases in transition and transversion mutations within MMTV-SD proviruses compared to MMTV-WT proviruses, with the largest difference in C-to-T transitions seen on the plus strand (Table 1). Such mutations are typical of Apobec family members (12), although C-to-T mutations induced by mA3 during reverse transcription are expected to occur on the proviral minus strand (12,17). Further, both transition mutations and transversion mutations were elevated on the plus strand of MMTV-SD proviruses relative to MMTV-WT proviruses (Table 1). Together, these results indicate that alteration of rem and/or sag mRNA synthesis as a consequence of the SD mutation (i) reduced the ability of MMTV to induce mammary tumors, (ii) decreased the proviral load, and (iii) diminished the genomic integrity of proviruses in these tumors.
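The tallies of plus-strand transitions and transversions described above can be illustrated with a minimal Python sketch. It assumes that each clone has already been aligned to the reference env sequence without gaps. The function names and the toy sequences are hypothetical and do not represent the software or data used in this study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref_base, alt_base):
    """Label a base change as a transition (purine<->purine or
    pyrimidine<->pyrimidine) or a transversion (purine<->pyrimidine)."""
    pair = {ref_base, alt_base}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def tally_changes(reference, clone):
    """Count each substitution type (e.g., 'C>T') between two aligned,
    equal-length plus-strand sequences."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    changes = Counter()
    for ref_base, alt_base in zip(reference.upper(), clone.upper()):
        if ref_base != alt_base and ref_base in "ACGT" and alt_base in "ACGT":
            changes[f"{ref_base}>{alt_base}"] += 1
    return changes


def summarize(changes):
    """Collapse per-substitution counts into transition/transversion totals."""
    totals = Counter()
    for change, count in changes.items():
        ref_base, alt_base = change.split(">")
        totals[classify_substitution(ref_base, alt_base)] += count
    return totals


# Toy sequences (hypothetical, not taken from the env data).
reference = "ACGTCCGATC"
clone = "ATGTCTGAAC"
changes = tally_changes(reference, clone)
print(changes["C>T"], changes["T>A"])    # 2 1
print(summarize(changes)["transition"])  # 2
```

Because the tally is made on the plus strand, a cytidine deamination on the minus strand is recorded as a G-to-A change, whereas a deamination on the plus strand is recorded as a C-to-T change.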
Next, proviral sequences were analyzed for mutations occurring in consensus sites that are preferentially targeted by different Apobec family members. Apobec3 is known to induce G-to-A hypermutation on the retroviral plus strand of some MuLVs during reverse transcription in lymphocytes (23,32). However, the Apobec family member AID acts on both host DNA strands, leading to G-to-A and C-to-T mutations on the coding strand of the variable region of the immunoglobulin genes (28). AID-induced immunoglobulin gene mutations often are followed by repair of deaminated cytidines by an error-prone process (33,34), resulting in additional mutations. AID activity also results in class switch recombination (28) and, in some cases, cancer (29). This cytidine deaminase is preferentially expressed in germinal center B cells (35), and MMTV requires replication in mature B cells for efficient mammary gland transmission (10). Interestingly, a clear selection was observed for SD site repair since 22% of the MMTV-SD proviruses isolated from mammary tumors contained a wild-type splice site, presumably resulting from recombination with endogenous Mtvs, since all 6 bp of the mutant SD site sequence were restored. Therefore, we explored whether the increased levels of mutations observed in MMTV-SD proviruses that retained the SD site mutation (MMTV-SD non-recombinants) and in those that had repaired the SD site mutation (MMTV-SD recombinants) were consistent with preferred AID and mA3 motifs.
Preliminary examination of motifs associated with Apobec-mediated mutagenesis revealed differences between MMTV-WT and MMTV-SD proviruses (Table 1). Since the variation in the number of mutations among clones was large, we applied a nonparametric statistical test to examine significant differences in the distributions of the numbers of mutations/clone. Mutations observed in MMTV proviruses were analyzed for the WRC motif, reported to be a "hot spot" for AID-induced base changes within immunoglobulin genes (36). The distribution of clones with cytidine mutations in the WRC context was not significantly different between MMTV-WT and MMTV-SD nonrecombinant proviruses (Fig. 2A). However, the clones carrying the splice site reversion (MMTV-SD recombinants), which would allow normal sag and rem mRNA splicing (1,5), had an increased number of cytidine changes compared with either the MMTV-WT or MMTV-SD nonrecombinant proviruses. In contrast, the distribution of cytidine mutations in the SYC context, which has been described previously as an AID "cold spot" (37,38), was not significantly different across any of the three groups of proviruses (Fig. 2B). We also analyzed cytidine changes in two other sequence contexts (TYC and ATC). Mutations in TYC motifs are typical of mA3 (23,39), but changes in the ATC motif were detected by our TransitionFinder software. Analysis of the distribution of TYC mutations/clone indicated a significant difference in the MMTV-SD proviruses regardless of the splice donor site reversion (Fig. 2C). Mutations in the ATC context have also been associated with mA3 in vitro (40), and analysis of the distribution of mutations/clone indicated a significant enrichment only in the MMTV-SD recombinant proviruses (Fig. 2D). Interestingly, the MMTV-SD recombinant proviruses, which correct the SD mutation by recombination with Mtvs, have multiple base changes, but lack stop codons in the envelope region (Fig. S2), suggesting selection for replication-competent viruses after Apobec-mediated mutagenesis.
Table 1 footnote a: Number of mutations/number of clones was determined on the basis of Sanger sequencing of 1,100 bp of the plus strand of at least five proviral env gene clones obtained from each of five independent BALB/c tumors induced by MMTV-WT (n = 25) or MMTV-SD (n = 32). Frequencies of WRC, SYC, TYC, and ATC-motif mutations were calculated from both strands. W = A or T; R = A or G; S = G or C; Y = C or T. For statistical analysis of the data, see scatter plots in Fig. 2. Boldface data highlight the greatest differences between MMTV-WT and MMTV-SD sequences typical of AID-mediated mutagenesis.
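The motif definitions given for WRC, SYC, TYC, and ATC (W = A or T; R = A or G; S = G or C; Y = C or T), scored on both strands, can be made concrete with a short Python sketch. This is only an illustration under stated assumptions: the mutated cytidine is taken to be the third base of each motif, a cytidine on the minus strand is read as a guanine on the plus strand, and the function names are hypothetical. It is not the TransitionFinder software mentioned in the text.

```python
# Ambiguity codes restricted to those used for the four motifs.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "AT", "R": "AG", "S": "GC", "Y": "CT"}
MOTIFS = ("WRC", "SYC", "TYC", "ATC")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def matches(motif, triplet):
    """True if a 3-base sequence fits a 3-letter ambiguity-code motif."""
    return len(triplet) == 3 and all(
        base in IUPAC[code] for code, base in zip(motif, triplet))


def motif_context(plus_strand, pos):
    """Return the motifs whose final C corresponds to plus_strand[pos].

    A plus-strand C is read with its two 5' neighbors. A plus-strand G
    marks a C on the minus strand, so the triplet is taken from the
    reverse complement of plus_strand[pos:pos + 3].
    """
    seq = plus_strand.upper()
    base = seq[pos]
    if base == "C" and pos >= 2:
        triplet = seq[pos - 2:pos + 1]
    elif base == "G" and pos + 2 < len(seq):
        triplet = seq[pos:pos + 3].translate(COMPLEMENT)[::-1]
    else:
        return []
    return [motif for motif in MOTIFS if matches(motif, triplet)]


print(motif_context("AGCTT", 2))  # ['WRC']  plus-strand C in AGC
print(motif_context("GCTAA", 0))  # ['WRC']  minus-strand C (AGC on minus)
print(motif_context("TTCAA", 2))  # ['TYC']
```

With these definitions the four motifs are mutually exclusive, so each mutated cytidine is assigned to at most one motif class.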
Increased Apobec-mediated mutations in Sag-independent MMTV (TBLV) lacking Rem expression. To address specifically whether MMTV-encoded Rem or Sag accounts for differences in Apobec-induced mutations in tumor-derived proviruses, we used the MMTV-related retrovirus, type B leukemogenic virus (TBLV). TBLV encodes a truncated, non-functional sag gene and lacks transmission to the mammary gland. Infection with TBLV, which has a T-cell-tropic LTR enhancer, induces lymphomas, rather than breast cancer, by insertional mutagenesis (41-43). The infectious TBLV molecular clone (TBLV-WT) differs from MMTV-WT only within the U3 region and does not require sag for replication in vivo (41). We introduced the SD mutation into TBLV-WT (TBLV-SD) prior to transfection into human Jurkat T cells, which lack endogenous Mtvs. As demonstrated for MMTV, env mRNA levels and SP activity remained relatively unaffected by the SD mutation, while rem mRNA production was blocked (Fig. S3A and B). No differences in the levels of virus production were observed between cells expressing TBLV-WT and those expressing TBLV-SD (Fig. S3C). In contrast to results with MMTV, injection of TBLV-WT and TBLV-SD into BALB/c mice gave no difference in incidence or latency of thymic lymphomas (Fig. 3A), although, as with MMTV-induced tumors, we observed statistically significant differences in proviral loads (Fig. 3B). These results suggest that the altered tumor incidence and latency seen with the mice inoculated with MMTV-WT relative to those inoculated with MMTV-SD were due to the differences in the levels of sag mRNA expression from the intragenic env promoter (44) as originally reported (9). Both the MMTV-SD-induced tumors and the TBLV-SD-induced tumors had lower proviral loads than their wild-type counterparts, suggesting that Rem C-terminal sequences may counteract factors that limit virus replication in BALB/c mice.
Since TBLV does not replicate in the mammary gland, restriction appears to occur at an early step after viral infection, such as during replication in B and T cells, prior to virus transmission to the mammary gland (2).
To test whether TBLV-SD proviruses also showed evidence of Apobec-mediated mutagenesis, independent T-cell lymphomas from three different wild-type TBLV-infected or SD mutant-infected mice were used for PCR amplification of two different proviral regions (the envelope-3′ LTR and polymerase regions). Because TBLV induces polyclonal tumors (45), unlike the more clonal MMTV-induced mammary cancers (46), high-throughput sequencing was used initially for the analysis of proviral mutations. We mapped proviral paired-end reads from both TBLV-WT-induced and TBLV-SD-induced tumors to the original sequence of the cloned provirus. After subtracting a 3% error rate for each base due to PCR effects, data from averaged reads indicated SD site reversion within some proviruses in the mutant-induced tumors relative to TBLV-WT-induced tumors, consistent with selection to maintain Rem expression (see Table S1 in the supplemental material).
To assess Apobec-induced mutations in proviruses from TBLV-SD-induced tumors compared to those from TBLV-WT-induced tumors, we eliminated likely PCR duplicates and quantified the number of reads aligned with each base compared to the reference sequence. The dichotomized alternative base frequency was then modeled as a function of sample group (WT or SD) using a mixed-effects logistic regression approach incorporating a random effect accounting for intersample variation within the group. The G-to-A changes on the plus strand within TBLV-SD proviruses were highly statistically significantly different from those observed within TBLV-WT proviruses (Fig. 3C), in agreement with the mA3-induced changes (12,25,32). We also observed increased A-to-G, C-to-T, and T-to-C transitions on the plus strand of TBLV-SD proviruses relative to TBLV-WT proviruses, which are not typical of mA3 (25,32) (Fig. 3D; see also Fig. S4). Differences were not observed between Gapdh gene sequences obtained from TBLV-WT-induced and SD-induced tumors after analysis by the same method (Fig. S4). Since TBLV does not induce mammary tumors (47-49), these data were consistent with generation of both mA3 mutations and non-mA3 mutations during viral replication in hematopoietic cells.
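The per-base quantification and dichotomization step can be sketched in a few lines of Python. This is a simplified illustration, not the analysis pipeline used here: it assumes that read counts per base are already available for each reference position, that the alternative base frequency is the fraction of reads differing from the reference, and that a position is flagged when this fraction exceeds the 3% allowance mentioned in the text. The mixed-effects logistic regression itself is not reproduced, and all names and counts are hypothetical.

```python
ERROR_THRESHOLD = 0.03  # 3% per-base allowance for PCR effects (from the text)


def alternative_base_frequency(counts, ref_base):
    """Fraction of reads at one position that differ from the reference base.

    counts maps each observed base to its read count at that position.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts.get(ref_base, 0)) / total


def dichotomize(pileup, reference, threshold=ERROR_THRESHOLD):
    """Return one 0/1 flag per position: 1 if the alternative base
    frequency exceeds the threshold, otherwise 0."""
    return [int(alternative_base_frequency(counts, ref_base) > threshold)
            for counts, ref_base in zip(pileup, reference)]


# Toy pileup for a 3-base reference (hypothetical read counts).
reference = "GAC"
pileup = [{"G": 90, "A": 10},   # 10% alternative reads -> flagged
          {"A": 99, "G": 1},    # 1% alternative reads  -> below threshold
          {"C": 100}]           # no alternative reads
print(dichotomize(pileup, reference))  # [1, 0, 0]
```

Flags of this kind, computed per tumor sample, are the type of binary outcome that a logistic model with a random sample effect could take as input.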
Fig. 3 legend (continued): indicated by an asterisk (P < 0.05). NS = not significant. (C) Analysis of the average number of G-to-A mutations above a 3% threshold within proviruses obtained from three independent TBLV-WT-induced or TBLV-SD-induced tumors after PCR and Illumina sequencing. (D) Analysis of the average number of T-to-C mutations above a 3% threshold within proviruses obtained from three independent TBLV-WT-induced or TBLV-SD-induced tumors after PCR and Illumina sequencing. (E to H) Analysis of independent clones obtained after PCR and Sanger sequencing. TBLV-SD proviral clones recovered from T-cell tumors were classified as non-recombinant (retaining the inoculated TBLV-SD sequence) or recombinant (carrying the wild-type SD2 sequence after recombination with endogenous Mtvs). The number of C mutations within different motifs on either proviral strand is presented graphically for each clone.
To analyze longer regions for detection of Apobec-mediated consensus site mutations present only within the envelope gene, DNA samples from three different polyclonal TBLV-induced T-cell lymphomas were used for PCR. Individual clones then were subjected to Sanger sequencing. Analysis of >50 env clones from either the TBLV-WT-induced or TBLV-SD-induced tumors revealed increased levels of transition mutations in acquired proviruses, with the highest increase in C-to-T mutations on the plus strand (Table 2). These data were slightly different from the high-throughput results since the latter data included the pol, env, and 3′ LTR regions and did not distinguish the 9% of the TBLV-SD clones with SD site reversion by recombination with endogenous Mtvs. Previous results showed that the levels of human A3G mutations are highest for single-stranded DNA mutagenesis just 5′ to the polypurine tracts, a region targeted by our Sanger sequencing (50). Moreover, four out of five recombinants (80%) had stop codons in the env gene (Fig. S5).
As a control, multiple clones of the c-Myc gene were analyzed by Sanger sequencing, and no difference was observed in the number of sequence changes between clones from TBLV-WT-induced and TBLV-SD-induced tumors (Table S2). These results suggested that, in contrast to MMTV-SD recombinants, little selective pressure was applied to maintain the integrity of TBLV proviruses after Mtv recombination, which presumably occurred in lymphoid cells.
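The presence or absence of stop codons in env clones, used above as a measure of proviral integrity, can be checked with a simple codon scan. The following Python sketch is illustrative only. It assumes gap-free alignment of each clone to the reference and a known reading frame for the sequenced region (frame 0 is a hypothetical default), and the toy sequences are not env sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(reference, clone, frame=0):
    """Return 0-based codon start positions at which the clone carries an
    in-frame stop codon that is absent from the reference."""
    hits = []
    usable = min(len(reference), len(clone))
    for start in range(frame, usable - 2, 3):
        ref_codon = reference[start:start + 3].upper()
        alt_codon = clone[start:start + 3].upper()
        if alt_codon in STOP_CODONS and ref_codon not in STOP_CODONS:
            hits.append(start)
    return hits


# Toy example: codons ATG CAG TGG AAA in the reference.
reference = "ATGCAGTGGAAA"
clone = "ATGTAGTGAAAA"   # CAG->TAG (C-to-T) and TGG->TGA (G-to-A)
print(premature_stops(reference, clone))  # [3, 6]
```

The toy clone shows how single C-to-T or G-to-A changes of the type produced by cytidine deamination can convert glutamine (CAG) or tryptophan (TGG) codons into stop codons.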
We also analyzed the distribution of mutations within specific sequence motifs in TBLV proviruses from T-cell tumors. TBLV-SD proviruses had increased mutations in Apobec-associated sequence motifs (Table 2). Statistical analysis of the distribution of WRC and SYC-motif mutations/clone indicated a significant increase within TBLV-SD recombinant proviruses compared to TBLV-WT and TBLV-SD non-recombinant proviruses (Fig. 3E and F). The distribution of TYC motif mutations was significantly different in TBLV-SD proviruses regardless of the splice-donor site reversion by recombination (Fig. 3G). Using the same type of analysis, the ATC motif mutations differed between TBLV-WT and TBLV-SD recombinant proviruses (Fig. 3H). We also examined the relationship between recombination-mediated SD site repair and the number of mutations in WRC, SYC, TYC, and ATC motifs observed in MMTV and TBLV proviruses from T-cell tumors (Fig. 3E to H). A statistically significant correlation was observed between SD site repair and mutation in each of these motifs (Fig. 3I and J; see also Fig. S7). These results suggest that viral replication in lymphoid cells allows both recombination with endogenous Mtv transcripts to repair the SD mutation and Apobec-mediated hypermutation. Together, these data are consistent with multiple Apobec-type mutations typical of AID and mA3, which occur during a common replication step for both TBLV and MMTV in hematopoietic cells (41).
Rem independence of WRC and TYC motif mutations within MMTV proviruses in AID-deficient mice. To test whether AID is responsible for increased MMTV-induced mutations in the absence of Rem, we infected AID-deficient (Aicda−/−) mice on the BALB/c background with MMTV-WT or MMTV-SD. Mammary tumors developed more slowly in the Aicda−/− mice after inoculation of MMTV-SD than after inoculation of MMTV-WT (Fig. 4A), but this result was not statistically different from that obtained in wild-type BALB/c (Fig. S6). Nevertheless, we observed that the proviral loads were unchanged between MMTV-WT-induced and MMTV-SD-induced tumors in Aicda−/− mice, whereas proviral loads in either MMTV-WT-induced or TBLV-WT-induced tumors differed from those in MMTV-SD-induced or TBLV-SD-induced tumors in wild-type BALB/c mice (Fig. 4B). These results suggested that differences in proviral load were due to AID-mediated mutagenesis in AID-expressing mice.
Table 2 footnote a: Number of mutations/number of clones was determined on the basis of Sanger sequencing of 1,100 bp of the plus strand of at least fifteen proviral env gene clones obtained from each of three independent BALB/c tumors induced by TBLV-WT (n = 50) or TBLV-SD (n = 55). Frequencies of WRC, SYC, TYC, and ATC-motif mutations were calculated from both strands. W = A or T; R = A or G; S = G or C; Y = C or T. For statistical analysis of the data, see scatter plots in Fig. 3. Boldface data highlight the greatest differences between MMTV-WT and MMTV-SD sequences typical of AID-mediated mutagenesis.
To determine whether proviral mutagenesis was affected in the absence of AID, we obtained DNA from five independent tumors induced by MMTV-WT and MMTV-SD for Sanger sequence analysis of proviruses after PCR and cloning. We observed that 62.5% of the clones from MMTV-SD proviruses had reversion of the original SD mutation (Table 3). Further, the mutation frequencies in the AID-associated WRC motif seen in MMTV-WT and MMTV-SD proviruses did not differ in Aicda−/− mice (Table 3). The mutation frequency in the WRC motif of MMTV-SD proviruses also greatly declined in Aicda−/− mice compared to wild-type BALB/c mice (compare Table 1 to Table 3). Analysis of the distributions of mutations/clone showed no significant differences between WRC, TYC, or ATC motif mutations among the MMTV-WT and either the MMTV-SD non-recombinant proviruses or the recombinant proviruses in Aicda−/− mice (Fig. 4C, E, and F). In contrast to the results from MMTV-infected BALB/c wild-type mice, the number of mutations/clone in SYC sites showed enrichment in MMTV-SD recombinant proviruses (Fig. 4D). The correlation between SYC motif mutations and SD site reversion resulting from recombination with endogenous Mtvs was maintained in both BALB/c and Aicda−/− mice (Fig. S7 and Fig. S8). Due to a lack of mutations in MMTV-SD recombinants recovered from tumors in Aicda−/− mice, correlation coefficients could not be calculated for the WRC, TYC, or ATC motifs (Fig. S8). Together, these results indicated that Rem antagonizes AID and cytidine deaminases that induce SYC, TYC, and ATC motif mutations. Our data suggest that loss of Rem activity is analogous to loss of HIV-1-encoded Vif since absence of Vif expression leads to hypermutation of the viral genome by specific Apobec3 cytidine deaminases (16-19, 51, 52).
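A correlation coefficient cannot be computed when one of the two variables shows no variation, which is one way that a lack of mutations can prevent the calculation. The Python sketch below illustrates this point with a Pearson coefficient. The correlation method is not stated in this section, so the choice of Pearson's r, the per-clone coding of SD site repair as 0 or 1, and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient, or None when either variable has
    zero variance (the coefficient is then undefined)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        return None
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return ss_xy / sqrt(ss_x * ss_y)


# Hypothetical per-clone data: 1 = SD site repaired, 0 = SD mutation retained.
sd_repaired = [0, 0, 1, 1]
motif_mutations = [1, 1, 3, 3]      # toy counts that track SD repair
no_mutations = [0, 0, 0, 0]         # no motif mutations in any clone

print(pearson_r(sd_repaired, motif_mutations))  # 1.0
print(pearson_r(sd_repaired, no_mutations))     # None
```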
To directly analyze differences in distribution of specific MMTV proviral mutations in the presence and absence of AID, we compared the numbers of mutations/clone in MMTV-WT proviruses from mammary tumors in BALB/c mice versus Aicda−/− mice (Fig. 5). None were significantly different. In contrast, the number of C-to-T mutations as well as WRC and TYC motif mutations within MMTV-SD proviruses (MMTV-SD recombinants and MMTV-SD non-recombinants combined) significantly declined in Aicda−/− mice compared to wild-type BALB/c mice (Fig. 5B, C, and E), whereas those in SYC motifs increased in AID-knockout mice (Fig. 5D). Only mutations in the ATC motif showed a marginal (but not significant) decline when comparing the number of proviral MMTV-SD mutations/clone between BALB/c and Aicda−/− mice (P = 0.08) (Fig. 5F). These results are consistent with the interpretation that the absence of Rem expression leads to increased mutations by AID, similar to HIV-1 proviral hypermutation by A3G in the absence of Vif (16,51).
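Comparisons of mutations/clone between groups of proviruses rely on a nonparametric test, which is not named in this section. As an illustration only, the Python sketch below computes the Mann-Whitney U statistic, a common rank-based choice for two independent groups. Its use here is an assumption, and the function names and toy counts are hypothetical. A P value would additionally require an exact or normal-approximation step, for example as provided by scipy.stats.mannwhitneyu.

```python
def midranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples of counts."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Toy mutations/clone values for two groups of proviruses (hypothetical).
group_1 = [0, 1, 1]
group_2 = [2, 3, 5]
print(mann_whitney_u(group_1, group_2))  # (0.0, 9.0)
```

A U value of 0 for one group means that every clone in that group had fewer mutations than every clone in the other group. Rank-based statistics of this kind are insensitive to the large clone-to-clone variation noted for these data.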
AID, but not mA3, proteasomal degradation in the presence of rem. APOBEC3 enzymes that are incorporated into HIV-1 virions in the absence of Vif activity deaminate cytidines on negative-strand DNA during reverse transcription in the target cell (12). Vif expression targets APOBEC3 for proteasomal degradation (53), leading to decreased deaminase packaging and proviral mutagenesis. Since murine AID belongs to the Apobec family (12), we determined whether AID is packaged into MMTV virions. As a control for deaminase packaging, 293T cells were transiently transfected with expression plasmids for cytomegalovirus (CMV) promoter-driven MMTV-WT as well as mA3 tagged with hemagglutinin (mA3-HA). Previous experiments have shown that the mA3 deaminase is packaged in MMTV particles isolated from virus-containing milk (22). Virus released from 293T cells was concentrated, and Western blotting was performed on concentrated cell supernatants after treatment with subtilisin to remove cellular proteins on the surface of virions. MMTV Gag-specific antibody detected capsid (CA) proteins both in the cell extracts and in the concentrated supernatants. As expected, we also observed mA3 in cell extracts and in supernatants (Fig. 6A). The presence of mA3 remained detectable after subtilisin treatment, confirming that mA3 is present within viral cores (22). Similar experiments were performed to determine whether AID was packaged in MMTV particles by cotransfection of CMV-MMTV-WT and murine AID (mAID) into 293T cells. Enrichment of the cleaved Gag protein was easily observed in cell supernatants containing virion particles compared to extracts (Fig. 6B). Under these conditions, Western blotting with AID-specific antibody easily detected mAID in cell extracts, but not in cell supernatants (Fig. 6B). To confirm the effectiveness of the subtilisin, Rev-like SP protein also was not detectable in viral cores after treatment (Fig. 6C). These experiments suggest that AID induces MMTV hypermutation without virion incorporation.
Table 3 footnote a: Number of mutations/number of clones was determined on the basis of Sanger sequencing of 1,100 bp of the plus strand from at least five proviral env gene clones obtained from five independent BALB/c Aicda−/− tumors induced by MMTV-WT (n = 25) or MMTV-SD (n = 40). Frequencies of WRC, SYC, TYC, and ATC-motif mutations were calculated from both strands. W = A or T; R = A or G; S = G or C; Y = C or T. For statistical analysis of the data, see scatter plots in Fig. 4. Boldface data highlight the greatest differences between MMTV-WT and MMTV-SD sequences typical of AID-mediated mutagenesis.
Singh et al.
Fig. 6 legend (continued): (E) Rem expression leads to decreased mAID-GFP levels and is dependent on the proteasome. Cells (293 line) were cotransfected with expression plasmids for mAID-GFP and either GFP-tagged or untagged Rem. Samples in even-numbered lanes were prepared from cells treated with the proteasomal inhibitor MG-132. Western blots of cell extracts were incubated with GFP-specific or actin-specific antibody (upper and lower panels, respectively). (F) Rem expression does not affect mA3-HA levels. Cells (293 line) were transfected with the indicated amount of mAID-GFP expression vector or mA3-HA in the presence or absence of the indicated amount of untagged Rem expression plasmid. The blot shown at upper left was incubated with GFP-specific antibody, and that shown at upper right was incubated with HA-specific antibody. Actin-specific antibody (lower panels) was used to verify protein loading.
Since HIV-1 Vif expression induces proteasomal degradation of specific APOBEC3 enzymes (53), we tested whether Rem expression from the wild-type provirus affects AID levels. Human Jurkat T cells were transfected with a plasmid expressing murine AID with a C-terminal green fluorescent protein (GFP) (mAID-GFP) in the presence or absence of the wild-type provirus (Fig. 6D). Western blotting of transfected cell extracts and incubation with GFP-specific antibody showed mAID-GFP levels that were greatly diminished by the presence of the Rem-expressing genome (compare lanes 1 and 3). In contrast, cotransfection with the Rem-null provirus did not affect AID levels (lane 5).
To confirm that Rem expression was responsible for decreased AID levels, we cotransfected HEK293 cells with N-terminally GFP-tagged Rem or untagged Rem with mAID-GFP. The presence of either form of Rem protein decreased detectable AID levels (Fig. 6E; compare lane 1 with lane 3 or lane 5); AID levels were rescued in the presence of the proteasomal inhibitor MG-132 (Fig. 6E, even-numbered lanes). As previously reported, the GFP-Rem precursor was stabilized more than the cleaved GFP-SP product due to precursor susceptibility to endoplasmic reticulum-associated degradation (ERAD) (compare lanes 5 and 6) (6). These results are consistent with Vif-like activity of Rem. Since we also detected mutations in mA3 motifs preferentially in proviruses from Rem-null virus-induced tumors, we transfected HEK293 cells with Rem expression vectors in the presence of plasmids expressing mAID-GFP or mA3 C-terminally tagged with HA (mA3-HA). As expected, higher Rem levels led to a greater reduction of mAID levels (Fig. 6F, compare lanes 1, 2, and 3). In contrast, the same Rem concentrations had no effect on mA3 expression (Fig. 6F, compare lanes 4, 5, and 6). Thus, our data indicate that mAID is targeted for proteasomal degradation in the presence of Rem, whereas mA3 is not.

DISCUSSION
Previous data from studies of mA3-insufficient mice indicated that MMTV replication and tumorigenesis are inhibited by members of the Apobec family of cytidine deaminases (12). No MMTV-specified inhibitors of these enzymes have been reported, including in a recent report that concluded on the basis of transfection experiments in 293T cells that MMTV does not encode an mA3 inhibitor (54). Our experiments are consistent with the need for Rem C-terminal sequences to antagonize the mutagenic effects of the Apobec cytidine deaminase AID during MMTV replication in lymphocytes.
Multiple results support this conclusion. (i) Tumors induced by wild-type MMTV had low, but reproducibly higher, proviral loads than tumors induced by viral strains that lack Rem expression (MMTV-SD). We found that the difference in proviral loads was abolished in tumors induced by MMTV in Aicda−/− mice, suggesting that AID is a restriction factor for MMTV (Fig. 1 and Fig. 4). (ii) Loss of Rem expression led to an increase in G-to-A mutations as well as in other transition mutations on the proviral plus strand (Table 1 and Table 2). (iii) Mutation frequency and numbers of mutations per proviral clone in the WRC motif (typical of AID) were elevated in tumors induced by MMTV-SD proviruses compared to tumors induced by MMTV-WT proviruses. This mutation pattern was observed in proviruses obtained from wild-type BALB/c mouse tumors, but not Aicda−/− BALB/c mouse tumors (Table 1 and Table 3) (Fig. 5). (iv) Tumors induced by a second Sag-independent MMTV strain, TBLV-WT, also showed an increase in proviral load compared to tumors induced by the Rem-defective virus (TBLV-SD) (Fig. 3). Increased mutations in WRC motifs were observed in TBLV-SD proviruses compared to TBLV-WT proviruses using Sanger sequencing of cloned proviruses (Fig. 3) (Table 2). Although MMTV-SD is defective for production of both sag mRNA from the intragenic env promoter and rem mRNA (Fig. 1C) (9), TBLV is a Sag-independent virus (41) and does not replicate in the mammary gland. Thus, loss of Rem expression, and not loss of Sag expression, is responsible for inhibition of MMTV replication by cytidine deamination in hematopoietic cells (Fig. 3) (Table 2). (v) Rem coexpression led to proteasomal degradation of AID, but not mA3 (Fig. 6). Our results suggest that Rem specifies a Vif-like factor (12) that antagonizes the AID restriction factor in hematopoietic cells prior to mammary gland transmission (3).
We observed increased mutations within proviruses obtained from MMTV-SD-induced mammary tumors or from TBLV-SD-induced lymphomas within TYC motifs (typical of mA3) as well as WRC motifs (typical of AID). Mutations in both motifs greatly decreased in tumor-derived MMTV proviruses from AID-insufficient mice relative to AID-expressing mice (compare Table 1 to Table 3 and Fig. 2 to Fig. 4). Since Rem coexpression did not affect mA3 levels, one interpretation is that AID causes both WRC-motif mutations and TYC-motif mutations. However, the number of WRC-motif mutations/clone increased in TBLV-SD recombinants from BALB/c mice, whereas the number of TYC mutations/clone decreased (compare Fig. 3E and G), suggesting that these mutations may be due to the activity of different enzymes. Furthermore, the number of TYC mutations/clone increased in TBLV-SD non-recombinants relative to TBLV-WT (Fig. 3G), but the same was not true for the WRC motif (Fig. 3E). Also, AID-mediated TYC-motif mutations have not been described in immunoglobulin genes (55), although the mechanism of AID-induced proviral mutations may be different.
Previous studies have suggested that sequence changes at the SYC motif represent AID-induced "cold spot" mutations (38,55). Since Aicda−/− mice showed an increase in the level of SYC-motif mutations in proviruses from MMTV-SD-induced tumors (Table 3) (Fig. 4), our results imply that these mutations are due to the activity of an unidentified cytidine deaminase. In addition, proviral mutations in the ATC motif did not segregate with WRC or TYC-motif mutations either in MMTV-SD-induced mammary tumors from Aicda−/− mice (Fig. 4) or in TBLV-SD-induced T-cell tumors from wild-type BALB/c mice (Fig. 3). Possible explanations for our data include: (i) decreased AID levels directly or indirectly affect the activity of other Apobec family members, such as mA3, or (ii) the Aicda gene knockout mutation affects the types of cells infected by MMTV. Infection of additional Apobec-knockout mouse strains should address this issue.
One striking observation is the enrichment of sequence alterations within Apobec-specific motifs in proviruses that had repaired the SD mutation. SD-site reversion occurred at all 6 modified bases, probably by recombination with one of the complete endogenous proviruses in BALB/c mice as observed previously for other MMTV-induced tumors (9,56,57). The high percentage of recombinants indicates strong selection for the restoration of splicing at the env gene SD site, which results in production of either rem mRNA from the LTR promoter or sag mRNA from the intragenic env promoter (9,44) (Fig. 1C). We believe that SD-site reversion does not select for wild-type envelope protein production since the SD mutation produces a single valine-to-leucine change. Moreover, proviral loads differed between MMTV-SD-induced and MMTV-WT-induced tumors only in BALB/c mice and not in Aicda−/− mice (Fig. 2 and Fig. 4), indicating the absence of an Env-specific replication effect in vivo when the selective effect of AID expression was removed. SD site reversion is not due to sag expression in T-cell tumors since TBLV does not encode a functional Sag protein (41). Interestingly, the number of recombinants regenerating the SD site correlated with the numbers of TBLV and MMTV proviral mutations in either WRC or TYC motifs in tumors in AID-sufficient mice (Fig. 3I and J), suggesting that recombination and mutation may occur in the same cells. These mutations likely occur in lymphocytes prior to mammary gland transmission since (i) Sag-mediated T-cell deletion is delayed in MMTV-SD-infected BALB/c mice, as we previously showed (9); (ii) endogenous Mtvs capable of recombination with exogenous MMTV are expressed specifically in lymphocytes and not in mammary gland cells (58); and (iii) TBLV replicates at high levels in T cells and does not induce mammary cancer (47,49).
We speculate that virus replicating in B and T cells is mutated by AID and other Apobec enzymes prior to mutant RNA packaging with endogenous Mtv RNA and subsequent recombination during the next round of reverse transcription. Therefore, the appearance of SD-site recombinants in both MMTV-induced and TBLV-induced tumors argues that Rem production provides a selective advantage for virus propagation of both Sag-dependent and Sag-independent viruses. The appearance of larger numbers of recombinant MMTV proviruses without stop codons in the envelope gene compared to TBLV recombinant proviruses argues that heavy selection of MMTV proviruses occurred during viral transmission to mammary tissue (see model in Fig. 7). MMTV replication-independent activation of B cells has been reported (59), and TBLV may also activate and infect B cells without superantigen.
Additional experiments analyzing proviral mutations and recombination with endogenous Mtvs in different lymphocyte subsets are needed.
Results presented here indicated that AID, but not mA3, is degraded after coexpression with GFP-tagged or untagged Rem (Fig. 6). AID degradation by Rem was rescued by the proteasomal inhibitor MG-132, a result reminiscent of HIV-1 Vif activity on human APOBEC3G (hA3G) (16-19). Vif has been shown to act as an adapter between hA3G and a Cullin5 E3 ligase complex, resulting in hA3G ubiquitylation and proteasomal degradation (52,60). These results strongly suggest that Rem is a Vif-like factor that antagonizes the restriction factor AID. Unlike Vif, Rem is synthesized in association with ER membranes (6,8,61), where this precursor is cleaved by signal peptidase to produce SP and Rem-CT (Fig. 1) or retrotranslocated to the cytosol by the p97 ATPase for ERAD (6,8). Further work will be necessary to determine whether cleaved Rem-CT or uncleaved Rem is required for AID proteasomal degradation, the nature of the E3 ligase involved in mAID degradation, the cellular location of mAID during Rem-mediated degradation, and whether Rem and mAID have a direct interaction. In addition, the lack of mAID packaging into MMTV particles suggests that this cytidine deaminase functions differently on the MMTV genome from human APOBEC enzymes on the HIV genome (62,63). One possibility is that mAID acts on the proviral genome during preintegration complex transit into the nucleus (64).
In summary, our data are consistent with the conclusion that Rem is the first retrovirus-encoded protein antagonist of a deamination-dependent AID activity to be identified. AID is believed to be the primordial member of the Apobec family (12), which likely evolved to antagonize the mutagenic activity of retrotransposons (65), and yet has also been shown to be important for restricting replication of herpesviruses in
PCR and high-throughput sequencing. DNA extracted from tumors induced by TBLV-WT and TBLV-SD (3 tumors each) was used for PCR with primers env7254(+) (5′-ATC GCC TTT AAG AAG GAC GCC TTC T-3′) and LTR9604(−) (5′-GGA AAC CAC TTG TCT CAC ATC-3′) for the envelope region, whereas the primers used for the polymerase region were pol4235(+) (5′-GAA GAG AGC AAT AGC CCT TG-3′) and pol5835(−) (5′-GAT GAT GTA GTG CGT GGC-3′). For DNA extracted from tumors induced by MMTV-WT and MMTV-SD, C3H LTR420− (5′-GAT TCA TTT CTT AAC ATA GTA AC-3′) was used as the reverse primer for env gene amplification. The primers for GAPDH have been described previously (1). c-Myc was amplified using primers c-Myc(+) (5′-ATG CCC CTC AAC GTG AAC TT-3′) and c-Myc(−) (5′-AGG AGG TCC ATC CAA CCT CT-3′). Reactions were performed with JumpStart RED Accutaq LA polymerase (Sigma-Aldrich) in a reaction mixture consisting of the supplied buffer, 500 ng of tumor DNA, 50 pmol of each primer, and a 0.5 mM concentration of each deoxynucleoside triphosphate in 20 μl. PCR parameters were as follows: 94°C for 1 min, followed by 10 cycles of 94°C for 10 s, 53°C for 30 s, and 68°C for 2 min, then 25 cycles of 95°C for 15 s, 50°C for 30 s, and 68°C for 2 min, and a final incubation at 68°C for 7 min. Five independent reaction mixtures corresponding to each tumor DNA were pooled for each primer set (except for GAPDH) and used for Illumina sequencing at the UT Austin Genomic Sequencing and Analysis Facility.
These results were confirmed by independent cloning of PCR fragments encompassing the SD site and the env gene followed by Sanger sequencing. Sequencing also enabled identification of proviruses containing SD recombinants. Semi-quantitative PCR was performed with Mtvr2 as the single-copy gene standard using primers Mtvr2(+) (5′-TCT GGG ATC CGC TTC CTC AT-3′) and Mtvr2(−) (5′-CCA GTC CTT GGC CCT CAT TTA-3′). MMTV primers pol4235(+) and pol5835(−) in the viral polymerase gene were used to measure proviral sequences.
Sequencing motifs and identification software. Mutations in the WRC and SYC sequence motifs have been associated with "hot spots" and "cold spots", respectively, for AID-mediated hypermutations (36-38). Mutations in the TYC sequence motif have been linked to mA3 activity (23,39); mutations in the ATC sequence motif have also been associated with mA3 activity on synthetic templates (40). The software TransitionFinder was developed to identify sites of G-to-A and C-to-T mismatches between a given reference FASTA sequence and a user-generated set of FASTA sequences (test sequences). The script is publicly available at https://github.com/haridh/Dudley_Lab_Collab.
Statistical analysis. All experiments were performed at least twice with similar results. Statistical differences between tumor development induced by the wild-type and mutant viruses and Kaplan-Meier curves were calculated using SPSS software and the log rank (Mantel-Cox) test with the consulting services of the Department of Statistics and Data Sciences at The University of Texas at Austin. Two-tailed t tests were used for pairwise comparisons. Differences in distributions of numbers of mutations/clone for scatter plots were assessed by nonparametric Mann-Whitney tests. The wide range of variations of mutations/clone prevented statistical comparison of means or medians. Correlation analysis of nonparametric data was performed by calculating Spearman's correlation coefficients. Statistical significances of differences are indicated, and a P value of <0.05 was considered to be significant. Values for numbers of samples analyzed (n) are indicated within individual figures.
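The motif definitions given under "Sequencing motifs and identification software" can be illustrated with a minimal Python sketch. This is not the TransitionFinder script, and all function names are hypothetical. It assumes that the reference and test sequences are already aligned to equal length, that the deaminated C is the last base of each three-base motif (W = A/T, R = A/G, S = C/G, Y = C/T), and that G-to-A changes are scored in the context of the opposite strand.

```python
# Illustrative sketch only; this is NOT the TransitionFinder script.
# Assumption: reference and test sequences are pre-aligned and of equal length.

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "AT", "R": "AG", "S": "CG", "Y": "CT"}
MOTIFS = ("WRC", "SYC", "TYC", "ATC")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def matches(motif, triplet):
    """Return True if a 3-base string fits an IUPAC motif."""
    return len(triplet) == 3 and all(
        base in IUPAC[code] for code, base in zip(motif, triplet))


def classify(triplet):
    """Name the motif that a triplet ending in the target C fits, if any."""
    for motif in MOTIFS:
        if matches(motif, triplet):
            return motif
    return "other"


def find_transitions(reference, test):
    """List (1-based position, change, motif) for C-to-T and G-to-A mismatches."""
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    ref, alt = reference.upper(), test.upper()
    hits = []
    for i, (r, a) in enumerate(zip(ref, alt)):
        if r == "C" and a == "T":
            # Context is the two bases 5' of the C plus the C itself.
            context = ref[max(i - 2, 0):i + 1]
        elif r == "G" and a == "A":
            # Assumed deamination on the opposite strand: read the
            # reverse complement of the G and the two bases 3' of it.
            context = ref[i:i + 3].translate(COMPLEMENT)[::-1]
        else:
            continue
        hits.append((i + 1, f"{r}>{a}", classify(context)))
    return hits


if __name__ == "__main__":
    for hit in find_transitions("AGCATTCGTA", "AGTATTTATA"):
        print(hit)
```

Changes too close to a sequence end to have a full three-base context are reported as "other" in this sketch; a real analysis would need an explicit rule for such positions.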
For high-throughput Illumina data, the paired-end sequences from the env-3′ LTR and pol PCR products were aligned to the reference HYB-TBLV molecular clone using BWA-MEM (79). Likely PCR or optical duplicates were marked using Picard MarkDuplicates (https://broadinstitute.github.io/picard/) and removed using SAMtools (80). The counts of reads aligning to each base in the reference were determined using bam-readcount (https://github.com/genome/bam-readcount). All positions lacking reads were filtered out and were not included in further analysis. Read counts were visualized using ggplot2 (81) and then modeled with respect to dichotomized alternative base frequency using a 3% threshold for PCR error as a function of sample group (either TBLV-WT or TBLV-SD). Modeling was performed with a mixed-effect logistic regression model for each combination of reference base (A, C, G, or T) and alternative base. A similar analysis was performed for the Mus musculus Gapdh sequences (NM_008084.2) obtained by PCR from TBLV-WT-induced or TBLV-SD-induced tumors.
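The dichotomization step described above can be sketched in Python as follows. The sketch is illustrative and does not reproduce the mixed-effect logistic regression. The input structure is hypothetical and is not the bam-readcount output format, and treating a frequency strictly greater than 3% as above threshold is an assumption, since the handling of values exactly at the threshold is not specified.

```python
# Illustrative sketch; the input tuples are hypothetical and do not
# reproduce the bam-readcount output format.

PCR_ERROR_THRESHOLD = 0.03  # 3% threshold for PCR error stated in the text


def dichotomize(positions, threshold=PCR_ERROR_THRESHOLD):
    """Flag each (position, alternative base) as above or below the threshold.

    `positions` is a list of (position, reference_base, counts) tuples, where
    `counts` maps each base to the number of reads supporting it.
    """
    calls = []
    for pos, ref_base, counts in positions:
        depth = sum(counts.values())
        if depth == 0:
            continue  # positions lacking reads are excluded from analysis
        for alt_base, n_reads in sorted(counts.items()):
            if alt_base == ref_base:
                continue
            frequency = n_reads / depth
            # Assumption: "above threshold" means strictly greater than 3%.
            calls.append((pos, ref_base, alt_base, frequency > threshold))
    return calls


if __name__ == "__main__":
    example = [
        (101, "C", {"A": 0, "C": 950, "G": 0, "T": 50}),
        (102, "G", {"A": 2, "C": 0, "G": 998, "T": 0}),
        (103, "A", {"A": 0, "C": 0, "G": 0, "T": 0}),
    ]
    for call in dichotomize(example):
        print(call)
```

The resulting binary calls, grouped by reference base and alternative base, are the kind of outcome variable that could then be modeled as a function of sample group.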
Data availability. We have submitted the high-throughput data to GEO (GSE134189). We also submitted to GenBank the primary sequence data for the TBLV molecular clone used to produce tumors in mice (GenBank accession no. MN126120). The Sanger sequence data are available by request.