Protein functionality as a potential bottleneck for somatic revertant variants

CAPSULE SUMMARY
Somatic revertants frequently occur in the slipstream of pathogenic germline mutations, but protein functionality may act as a bottleneck in the selection for functional variants.


To the Editor: Somatic genome stability has been considered the rule rather than the exception; yet, somatic mutations are driving factors in many diseases with a genetic background. Conversely, somatic reversion of germline mutations has been shown to mitigate pathogenicity by partially or completely restoring gene function. Such somatic revertant variants have been observed across a broad spectrum of immunodeficiencies and often result in atypical phenotypes that present a diagnostic and therapeutic challenge.1 In the majority of cases, one or a few reverting somatic variants have been identified per patient, and the limitations of this "natural form of gene therapy" are unknown.1 We describe a 1-year-10-month-old patient with a germline-encoded, homozygous 2-bp deletion in the CD247 gene (CD247 c.43_44delCA; p.Gln15ValfsTer72) (Fig 1, A and B), which encodes the CD3ζ chain, and dramatically reduced T-cell numbers (Fig 1, C and D; see Fig E1 and Table E1 in this article's Online Repository at www.jacionline.org) combined with a relatively mild clinical phenotype (see the Case Report in this article's Online Repository at www.jacionline.org). The mutation maps to the signal peptide of the protein and results in a premature stop codon at amino acid position 72. CD3ζ is required for assembly of T-cell receptor (TCR) complexes and the subsequent shuttling to the cell membrane.2 Given the essential role in TCR expression, CD3ζ deficiency presents as a classic T−/B+/NK+ severe combined immunodeficiency with reduced peripheral T lymphocytes devoid of TCR/CD3 surface coexpression.3,4 Unexpectedly, we identified a small population of T cells with TCRαβ/CD3 coexpression in the patient (Fig 1, E and F). We hypothesized that the mild clinical phenotype was caused by a somatic reversion of the mutation in this population and therefore performed deep sequencing of the first exon of the CD247 gene in CD3+, TCRαβ+ T cells to identify potential revertant variants. We discovered 52 unique somatic variants, 49 of which restored the reading frame of the gene and thus prevented the premature stop during translation (see Fig E2 and Table E2 in this article's Online Repository at www.jacionline.org). The variants were mostly located downstream of the CD247 c.43_44delCA mutation within the region that encodes the signal peptide and mainly comprised frame-restoring insertions of 2 or 5 nucleotides (Fig 2, A; Fig E2). To follow up, we repeated the analysis with a sample that had been obtained 1 year later, at the age of 2 years and 10 months. At this time point, we detected 23 somatic variants, including 9 novel variants that had not been present earlier (Fig 2, A; Fig E2; Table E2). Twenty of these variants restored the disrupted reading frame of the protein. The disease-causing germline mutation was found in approximately one half of the reads of the CD3+ population at both time points (Table E2), suggesting that most CD3-expressing T cells were compound heterozygous for the germline mutation and a somatic variant, which is also reflected by the reduced CD3 and TCR expression (Fig 1, G and H). Of note, we detected only a few rare somatic variants in the CD3− T-cell population, and none reverted the germline mutation (see Table E3 in this article's Online Repository at www.jacionline.org). Furthermore, we identified only the germline mutation in the natural killer cells of the patient and no variants in T cells of healthy controls (see Table E4 in this article's Online Repository at www.jacionline.org).
FIG 1. Genetic diagnosis and immunophenotyping of patient. A, Sequencing chromatogram showing homozygous and heterozygous germline mutation in patient and parents, respectively. B, Schematic representation of wild-type (WT) and patient CD3ζ chain with germline mutation indicated by black arrow. C-H, Flow cytometric immunophenotyping of PBMCs. C, CD3+ T and CD19+ B cells within lymphogate. D, Proportion of CD4+ and CD8+ T cells after pregating on CD3+, TCRαβ+ cells. E and F, Expression of CD3 and TCRαβ after exclusion of B and natural killer cells and pregating on CD4+ and CD8high T lymphocytes, respectively. G and H, Histogram plots showing expression of CD3 and TCRαβ within CD4+ and CD8high populations, respectively. ITAM, Immunoreceptor tyrosine-based activation motif.
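The frame-restoring effect of the somatic variants reported above comes down to simple arithmetic: the germline deletion shifts the reading frame by 2 nucleotides, so a second indel restores the frame whenever the combined length change is a multiple of 3. The following is a minimal Python sketch of that check. The function name and the sign convention (insertions positive, deletions negative) are illustrative assumptions and not part of the original analysis.

```python
# Minimal sketch of the reading-frame arithmetic (illustrative only).
# Convention assumed here: insertions are positive, deletions negative.

GERMLINE_SHIFT = -2  # CD247 c.43_44delCA removes 2 nucleotides


def restores_frame(somatic_shift: int, germline_shift: int = GERMLINE_SHIFT) -> bool:
    """Return True if the combined length change is a multiple of 3."""
    return (germline_shift + somatic_shift) % 3 == 0


# Insertions of 2 or 5 nucleotides compensate for the 2-bp deletion,
# whereas a 1-nucleotide insertion leaves the frame disrupted.
for shift in (2, 5, 1):
    print(shift, restores_frame(shift))
```

Under this convention, a 1-nucleotide deletion would also restore the frame (net change of 3 nucleotides), at the cost of one additional missing codon.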
Signal peptides serve to shuttle proteins to the secretory pathway, membranous cellular compartments, or the plasma membrane prior to their cleavage from the mature protein.5,6 Interestingly, signal peptides usually do not have a fixed primary sequence but begin with a few positively charged amino acids followed by a long stretch of hydrophobic residues. Most somatic variants retained such a characteristic distribution of amino acids despite a difference in the primary structure, and several variants were found recurrently (Fig 2, B; Fig E2). Bioinformatic analysis of each somatic variant indicated that the majority of variants conferred adequate signal peptide function, based on which we consider them somatic reverting variants (Fig 2, C; Fig E2).7 These observations raise the question of the stage at which the revertant variants arose. Given that the spectrum of variants narrowed after 1 year (Fig 2, D; Fig E2), one could speculate that dominant clones with the potential for self-renewal had been selected. However, the spectrum of variants differed between the 2 time points, and all except 1 were found at a lower frequency in the second sample (Fig 2, D; Fig E2), suggesting that the generation of T cells with revertant variants was an ongoing process that occurred at later stages of development. In addition, varying antigen exposure could have shaped the distribution of functional T-cell clones with somatic revertants.
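The general signal-peptide profile described above (a positively charged start followed by a hydrophobic stretch) can be illustrated with a crude heuristic. The Python sketch below is not SignalP and does not reproduce the bioinformatic analysis used in this study. It uses the Kyte-Doolittle hydropathy scale, and the window size, the length of the N-terminal region, the hydropathy threshold, and both toy peptides are arbitrary assumptions chosen for illustration.

```python
# Crude illustration of the signal-peptide profile; NOT a substitute for SignalP.
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def n_region_charge(peptide: str, n: int = 5) -> int:
    """Net charge of the first n residues (K/R count +1, D/E count -1)."""
    return sum((aa in "KR") - (aa in "DE") for aa in peptide[:n])


def max_window_hydropathy(peptide: str, window: int = 7) -> float:
    """Highest mean Kyte-Doolittle hydropathy over any window of the given size."""
    if len(peptide) < window:
        raise ValueError("peptide is shorter than the window")
    scores = [KD[aa] for aa in peptide]
    return max(
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    )


def has_signal_like_profile(peptide: str, threshold: float = 1.6) -> bool:
    """Assumed rule of thumb: positive N-terminus plus a hydrophobic core."""
    return n_region_charge(peptide) > 0 and max_window_hydropathy(peptide) >= threshold


# Hypothetical toy peptides, not sequences from this study.
TOY_SIGNAL = "MKRLLLLAVLLAAS"
TOY_ACIDIC = "MDEDSDESDENDS"
print(has_signal_like_profile(TOY_SIGNAL), has_signal_like_profile(TOY_ACIDIC))
```

A heuristic of this kind only shows why very different primary sequences can share the same profile; predictions of cleavage site and signal peptide likelihood require a dedicated tool.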
Somatic mosaicism caused by revertant variants has been observed in many cases of severe combined immunodeficiency, including several reports on CD247 deficiency, but never to this extreme extent, which may be explained by the lack of a deep-sequencing-based approach in earlier reports.3,8,9 We believe that the unique location of the disease-causing mutation in the signal peptide sequence is central to understanding the pathogenicity of this case. Disease-causing germline mutations are mostly located in regions that are integral to protein function. To revert the deleterious effects of such a mutation, somatic reverting variants must therefore either directly restore the primary sequence of the protein, encode an amino acid with similar properties, or remove an adjacent nonsense or frameshift mutation without affecting the functionality of the domain. Because signal peptides are cleaved from most mature proteins and lack a highly conserved motif,5 they may tolerate a certain degree of mutation as long as the reading frame of the mature protein remains unaffected, which we have observed in our case for most revertant variants. We therefore postulate that the unique location of the germline mutation is key to unmasking the striking spectrum of revertant mutations within the T-cell pool.
The true degree of mosaicism may be difficult to assess because the majority of variants in coding regions could potentially alter protein function and would therefore subject the mutated cell to positive or negative selection, thereby affecting its frequency within the cell pool. However, if the affected domain, such as a signal peptide, can buffer a certain degree of genetic variability, the true spectrum of mosaicism may be unmasked. We hypothesize that somatic revertant variants in highly proliferative tissues, such as the hematopoietic system, are common and suggest that they occur frequently in the slipstream of pathogenic germline mutations but are limited by the necessity for protein functionality, which acts as the bottleneck.

STUDY DESIGN
Peripheral blood and clinical data were obtained according to the guidelines of the Medical Ethics Committees of Meram Medical Faculty at Selçuk University in Konya and the Erasmus Medical Center in Rotterdam. The family gave written informed consent to the study. Detailed information regarding the methodology follows.

Sanger sequencing
Candidate genes (CD3D, CD3E, CD3G, and CD247) were amplified by PCR with AmpliTaq Gold DNA Polymerase (Thermo Fisher Scientific, Waltham, Mass) and subjected to Sanger sequencing as part of routine diagnostics.

Deep sequencing of CD247
Sorted CD3+, TCRαβ+ T cells were centrifuged in a microcentrifuge tube and the cell pellet was resuspended in 20 µL lysis buffer (10 mmol/L Tris-HCl [pH 7.6], 50 mmol/L NaCl, 6.25 mmol/L MgCl2, 0.045% NP40, 0.45% Tween-20). A total volume of 1 µL proteinase K (20 mg/mL) was added and samples were incubated for 1 hour at 56°C before heat-inactivation for 15 minutes at 95°C. Exon 1 of CD247 was amplified by PCR and subjected to deep sequencing on either a 454 GS Junior instrument (Roche, Branford, Conn) or the MiSeq System (Illumina, San Diego, Calif).
For Roche 454 sequencing, exon 1 and flanking parts of the intron were amplified using primers that were adapted with Roche Lib-A adapters and sample-specific multiplex identifier (MID) tags:
Forward: 5'-CAGACAGATACATACACACACCCCAA-3'
Reverse: 5'-AAGGAGACCCCAGCCCCTCAC-3'
PCR products were purified by means of gel extraction using the QIAgen Gel Extraction Kit (Qiagen, Hilden, Germany) and Agencourt AMPure XP beads (Beckman Coulter). Subsequently, DNA concentrations of the libraries were measured using the Quant-iT Picogreen dsDNA Assay Kit (Invitrogen, Thermo Fisher Scientific).
For the MiSeq System, the coding region of CD247 was amplified by PCR over 30 cycles with the following primers:
Forward: 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCTCAGCCTCTTTCTGAGGGAAA-3'
Reverse: 5'-TCGCGAGTTAATGCAACGATCGTCGAAATTCGCTCACTTGCCCATTGATTTGA-3'
Subsequently, PCR products were purified using AMPure XP beads (Beckman Coulter) followed by a nested PCR reaction (10 cycles) to include the sample-specific indices and Illumina sequencing adapters, using primers from the TruSeq Custom Amplicon Index Kit (Illumina). The final concentrations of the libraries were measured using the Quant-iT Picogreen dsDNA Assay Kit (Invitrogen). Libraries were paired-end sequenced (2 × 221 bp) on a MiSeq System with use of a MiSeq Reagent Kit v3, according to the manufacturer's protocol (Illumina). Paired sequences were merged using paired-end read merger (PEAR) to create a FASTQ file of each sample.E1 Sequences were filtered and analyzed using Microsoft Excel 2016. Only sequences with an exact match of the first and last 8 nucleotides of the coding region of exon 1 were included for analysis. Per variant, the frequency of variant reads and the average quality score per base were calculated. Only variants that were present in more than 0.1% of reads and had an average quality score above 20 were included for analysis. All detected variants were compared to the reference sequence and analyzed independently by 2 persons.
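The filtering criteria described above were applied in Microsoft Excel; the following Python sketch only illustrates the same logic and is not the procedure used in the study. It rests on several assumptions: reads are given as pairs of sequence and per-base Phred scores, the frequency denominator is the number of reads that pass the 8-nucleotide anchor check, and the quality criterion is read as the mean score over all bases of all reads carrying a variant. The function name and the toy reference and reads are hypothetical.

```python
from collections import Counter, defaultdict


def filter_variants(reads, reference, anchor=8, min_freq=0.001, min_quality=20.0):
    """Return {variant sequence: (frequency, mean quality)} for variants that pass.

    reads: iterable of (sequence, list of per-base Phred scores).
    """
    head, tail = reference[:anchor], reference[-anchor:]
    # Keep only reads whose first and last `anchor` nucleotides match the reference.
    kept = [(seq, quals) for seq, quals in reads
            if seq.startswith(head) and seq.endswith(tail)]

    counts = Counter(seq for seq, _ in kept)
    quals_by_seq = defaultdict(list)
    for seq, quals in kept:
        quals_by_seq[seq].extend(quals)

    variants = {}
    for seq, n in counts.items():
        if seq == reference:
            continue  # reads identical to the reference are not variants
        freq = n / len(kept)
        mean_q = sum(quals_by_seq[seq]) / len(quals_by_seq[seq])
        if freq > min_freq and mean_q > min_quality:
            variants[seq] = (freq, mean_q)
    return variants


# Toy data (hypothetical; not CD247 sequence).
REFERENCE = "AAAACCCCGGGGTTTTAAAACCCC"
INSERTION = "AAAACCCCGGGGGGTTTTAAAACCCC"   # 2-nt insertion, good quality
LOW_QUALITY = "AAAACCCCGGGTTTTAAAACCCC"    # 1-nt deletion, poor quality
TRUNCATED = "CCCCGGGGTTTTAAAACCCC"         # fails the 5' anchor check

reads = (
    [(REFERENCE, [30] * len(REFERENCE))] * 6
    + [(INSERTION, [30] * len(INSERTION))] * 3
    + [(LOW_QUALITY, [10] * len(LOW_QUALITY))]
    + [(TRUNCATED, [30] * len(TRUNCATED))]
)
calls = filter_variants(reads, REFERENCE)
print(calls)
```

In this toy run only the insertion variant is retained: the truncated read is discarded by the anchor check and the deletion variant fails the quality threshold.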

Bioinformatic prediction of signal-peptide function
Prediction of signal peptide function was performed with the software SignalP 5.0 according to the authors' instructions (http://www.cbs.dtu.dk/services/SignalP/).E2

CASE REPORT
We describe a girl born to consanguineous parents, with a history of recurrent infections and lymphopenia, who was admitted to our hospital at the age of 1 year and 10 months with severe tachypnea, productive coughing, and fever. Prior to admission, the patient had suffered from recurring bouts of oral moniliasis, sinusitis, otitis media, and pneumonia. She regularly required intravenous antibiotic therapy for the control of her infections. After BCG vaccination, she developed suppurative lymphadenitis. Her mother and father had not shown any signs of immunodeficiency. Her paternal aunt had passed away as an infant of unknown cause.
Whereas the majority of CD4+ and CD8+ T cells showed neither CD3 nor TCR expression, a small subset expressed reduced levels of the CD3 complex and TCRαβ. Given the essential role of the CD3 complex in TCR surface expression and signal transduction, we performed targeted DNA sequencing of potential candidate genes and identified a homozygous 2-bp deletion in CD247 (CD247 c.43_44delCA; p.Gln15ValfsTer72) that is located in the N-terminal signal peptide of the protein and results in a premature stop codon at amino acid position 72.
Six months after admission, she developed a generalized maculopapular rash, neutropenia, and bilateral arthritis in the ankles. Her clinical presentation was accompanied by an elevated erythrocyte sedimentation rate (74 mm/h) and increased CRP levels (19.1 mg/dL). Antinuclear antibodies were not detected.
Serology revealed Borrelia burgdorferi-specific IgM. In the bone marrow, we detected a reduction of mature neutrophils. She was treated with penicillin for 2 weeks. Given that the symptoms did not improve after 6 weeks, she was diagnosed with juvenile rheumatoid arthritis and treated with ibuprofen. Three months after initiation of treatment, her symptoms waned. Shortly thereafter, she developed arthritis in the ankles, wrists, and fingers. She was treated with ibuprofen, corticosteroids, and cyclosporine, which substantially improved her condition.
At the age of 7 years, the patient developed autoimmune hemolytic anemia and thrombocytopenia. She was diagnosed with Evans syndrome and successfully treated with corticosteroids, intravenous immunoglobulins, and cyclosporine.
At the age of 9 years, she developed cervical lymphadenopathy and was diagnosed with stage 4 non-Hodgkin lymphoma. She was treated according to the LMB89 Group B protocol. Four months after therapy, she developed a cervical mass that was confirmed as a relapse. She received treatment with rituximab, ifosfamide, carboplatin, and etoposide (ICE protocol) for 2 months and underwent bone marrow transplantation at the age of 10 years, using an unrelated, partially HLA-mismatched (9/10 matched) donor and carmustine, etoposide, and cytarabine as the conditioning regimen. She received cyclosporine and methotrexate for graft-versus-host disease prophylaxis. Two weeks after transplantation, she developed cutaneous and intestinal graft-versus-host disease and was treated with tacrolimus and corticosteroids for 6 months. Two years after bone marrow transplantation, she has 100% donor chimerism and has received all appropriate vaccinations. She has remained well and has not shown any signs of relapse.

Somatic variants that restore the reading frame in CD3+, TCRαβ+ T cells. Spectrum of functional revertant variants that restore the reading frame of the protein with indicated frequency at each time point, amino acid sequence that differs from WT protein, and signal peptide analysis. Amino acids are symbolized as white circles for moderate hydropathy, blue circles for hydrophilic amino acids, and red circles for hydrophobic amino acids. Black outline denotes neutral charge, whereas blue and red outlines refer to positive and negative charges, respectively. Signal peptide analysis was performed as recommended for SignalP 5.0, including likelihood of signal peptide function, cleavage site, and probability of cleavage. Variants with signal peptide score <0.5 are not predicted to be signal peptides.