No kissing in the nucleus: Unbiased analysis reveals no evidence of trans chromosomal regulation of mammalian immune development


Summary
It has been proposed that interactions between mammalian chromosomes, or transchromosomal interactions (also known as kissing chromosomes), regulate gene expression and cell fate determination. Here we aimed to identify novel transchromosomal interactions in immune cells by high-resolution genome-wide chromosome conformation capture. Although we readily identified stable interactions in cis, and also between centromeres and telomeres on different chromosomes, surprisingly we identified no gene regulatory transchromosomal interactions in either mouse or human cells, including previously described interactions. We suggest that advances in the chromosome conformation capture technique and the unbiased nature of this approach allow more reliable capture of interactions between chromosomes than previous methods. Overall our findings suggest that stable transchromosomal interactions that regulate gene expression are not present in mammalian immune cells and that lineage identity is governed by cis, not trans chromosomal interactions.

Introduction
Each chromosome contains just one DNA molecule. Recent technological advances have allowed characterisation of the elaborate three-dimensional structures that form from this DNA (1). These structures include globules or domains, which partition the chromosome, and elegant DNA loops that link gene promoters to distant enhancers. In addition to these intrachromosomal structures formed within the same DNA molecule, there are transchromosomal interactions formed between different chromosomes. Relative to intrachromosomal interactions, the frequency, nature and function of transchromosomal interactions are poorly understood (2).
In contrast to the multitude of intrachromosomal interactions known to regulate gene expression, only a handful of transchromosomal interactions have been described. For example, transchromosomal interactions were reported to be crucial for the appropriate expression of a single olfactory gene amongst the 1300 within the genome (3,4) and for X chromosome inactivation (5-7).
Interestingly, a large number of the reported transchromosomal interactions have been characterised in cells of the immune system. For example, in both mouse and human T cells the insulin-like growth factor 2 (Igf2) locus was reported to interact with a number of loci on different chromosomes (8-10). Also in T cells, a regulatory region on mouse chromosome 11 (the T helper 2 locus control region; LCR) was suggested to interact with loci encoding the cytokine interferon gamma (Ifng) on chromosome 10 (11) and interleukin 17 (IL-17) on chromosome 1 (12). Perturbation of these interactions was associated with altered expression of Ifng and IL-17, respectively. In mouse B cell progenitors, the interaction between the immunoglobulin heavy chain (Igh) locus on chromosome 12 and the immunoglobulin light chain (Igk) locus on chromosome 6 was important for the rearrangement of the heavy chain locus (13).
bioRxiv preprint doi: https://doi.org/10.1101/212985; this version posted November 2, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
These transchromosomal interactions were all identified by either chromatin conformation capture, in which crosslinking, dilution of a ligation reaction and PCR are used to deduce the relative physical proximity of two loci in three dimensions, or DNA FISH, in which microscopy and labelled probes are used to locate loci within individual nuclei, or both. These techniques are targeted approaches. Here we aimed to use an unbiased, genome-wide approach to identify novel gene regulatory transchromosomal interactions in three distinct mouse and human immune cell populations. Unexpectedly, we found very few interactions between chromosomes, and none were gene regulatory or conserved. Overall, our findings question the existence of stable, gene-regulatory transchromosomal interactions underlying immune cell identity.

Results
To elucidate novel transchromosomal interactions, we generated in situ HiC libraries from both mouse and human B cells and CD4+ and CD8+ T cells of the immune system (Supplemental Figure 1A, B). The resulting 200 million paired-end reads were then mapped to the appropriate genome, filtered for artefacts, such as dangling ends and self-circling reads, and counted into 50 kb bins with the diffHic software package (14). After iterative correction of the generated contact matrices, DNA-DNA interactions were detected by comparing the interaction intensity in each bin to those surrounding it to determine significant interactions relative to background (15).
Using this pipeline we detected hundreds of interactions between chromosomes in each cell population (Supplemental Table 1). Consistent with previous literature (16), these transchromosomal interactions are enriched in gene-rich, centrally located chromosomes (Figure 1A, Supplemental Figure 1C). However, closer examination of these interactions reveals that a high percentage (74-90% in mouse and 82-94% in human) contain regions recommended to be removed, or 'blacklisted', from analyses due to their high or low mappability, repeated nature, or location within telomeres or centromeres, among other reasons (17,18). After application of blacklisting the majority of transchromosomal interactions are removed (Figure 1B-C, Supplemental Table 2). This is in stark contrast to intrachromosomal interactions, of which less than 3% contain blacklisted regions. Telomeres and centromeres physically cluster not just during preparation for mitosis, but throughout the cell cycle (19). Thus it appears that the majority of the transchromosomal interactions detected in mammalian immune cells may be a consequence of this telomeric and centromeric clustering. Importantly, the detection of these interactions confirms that in situ HiC is able to detect interactions between chromosomes.
To determine if any of the detected transchromosomal interactions, whether associated with telomeres or centromeres or not, have a gene regulatory function, we examined the relationship between lineage-specific transchromosomal interactions (those found in only one of the cell populations) (Supplemental Table 2) and expression of genes associated with these interactions (20). In the mouse, we found that the 15 lineage-specific [...] Table 2) associated with any protein-coding genes differentially expressed (>5 RPKM) in the lineage exhibiting the lineage-specific transchromosomal interaction. This suggests that none of the detected lineage-specific transchromosomal interactions perform a gene regulatory function in mouse or human B or T cells.
It has been suggested that if transchromosomal interactions were functionally important they would be evolutionarily conserved (2). Therefore, we examined the handful of genes and genomic regions associated with all transchromosomal interactions in mouse and human B and T cells. We found that none of the lineage-specific transchromosomal interactions link orthologous regions in mouse and human.
As we were able to detect transchromosomal interactions, but none of a gene regulatory nature, we examined regions previously reported to be involved in regulatory interactions between chromosomes. We examined our CD4+ T cell data for interactions between the previously mentioned LCR region on mouse chromosome 11 and loci encoding the cytokine interferon gamma (Ifng) on chromosome 10 (11) and interleukin 17 (IL-17) on chromosome 1 (12).
Curiously, no interactions were detected between the LCR and Ifng or IL17 loci in mouse CD4+ T cells ([...]). To determine if the depth of sequencing of our in situ HiC had inhibited detection of the previously reported transchromosomal interactions, we examined publicly available promoter capture HiC data from human CD4+ T cells (21).
The LCR-Ifng or IL17 interactions were also undetectable in these extremely high-resolution data (Supplemental Figure 2E, F).
We then attempted to detect another previously reported transchromosomal interaction suggested to occur between the immunoglobulin heavy (Igh) and light chain (Igk) loci in mouse B cell progenitors (13). Our transchromosomal interaction detection pipeline was applied to in situ HiC libraries generated from two B cell progenitors: pro-B cells and immature B cells. Curiously again, using our unbiased, genome-wide approach, we found no interactions between Igh on [...]
Incongruous results from FISH versus 3C within cell types, or in fact from the same technique between studies, have been a persistent issue when examining transchromosomal interactions. For example, as previously mentioned, 3C indicated that X chromosomes were in contact prior to X chromosome inactivation (5-7) whereas FISH revealed little to no contact (7). In another example, the two studies reporting transchromosomal interactions between Igf2 and loci on other chromosomes in mouse T cells found no common interactions (9,10), while two further studies of the reported interaction in human T cells (8) found no evidence of interaction (24,25).
To address this vexed issue, we used the in situ HiC technique to search for transchromosomal interactions across two species and three distinct cell populations. With this unbiased, genome-wide approach, we were unable to detect any conserved, gene regulatory transchromosomal interactions. While our findings are clear and suggest gene regulatory transchromosomal interactions do not function in the mammalian immune system, it is not possible to be totally conclusive about a negative finding. For example, we cannot rule out gene regulatory interactions that are weak, transient, present in highly repetitive regions or in regions without MboI restriction sites. Furthermore, because we used only male-derived DNA we could not examine interactions reported to occur between X chromosomes during X chromosome inactivation (26).
Physiologically relevant transchromosomal interactions that are transient and/or weak may not be detectable by in situ HiC. However, this does not explain the absence of the interactions between LCR and Ifng or IL17 loci in T cells, or the immunoglobulin loci in B cell progenitors, as these interactions are reported to occur in 40-50% of cells (11,13) and the interactions are reported to be as strong as intrachromosomal interactions (11).

Cell isolation
All animal experiments were performed using male C57BL/6 mice at 6-8 weeks of age. Flow cytometric analyses were performed on a BD FACSCanto with sorting on the BD Aria or Influx (BD Bioscience). Antibodies were purchased from BD Bioscience or eBioscience (Supplemental Table 3).

HiC
HiC was performed as previously published (15). Primary immune cell libraries for both human and mouse were generated in biological duplicate. Libraries were sequenced on an Illumina NextSeq 500 to produce 75 bp paired-end reads.
Between 160 million and 375 million valid read pairs were generated per sample (Supplemental Table 4). Hi-C sequencing data for mouse pro-B cells and immature B cells were obtained from Gene Expression Omnibus accession number GSE99163.

Total RNA isolation
RNA was isolated using the miRNeasy Micro Kit (QIAGEN) following manufacturer's instructions.

RNA-seq analysis
All samples were acquired from two male human donors. Each donor provided one sample per biological condition, giving each condition two replicates. RNA libraries were prepared using Illumina's TruSeq Total Stranded RNA kit with Ribo-Zero Gold (Illumina) according to the manufacturer's instructions. The rRNA-depleted RNA was purified and reverse transcribed using SuperScript II reverse transcriptase (Invitrogen). Total RNA-seq libraries were sequenced on the Illumina NextSeq 500, generating 80 base pair paired-end reads. The reads were aligned to the human genome (GRCh38/hg38) using the Rsubread aligner (33). The number of fragments overlapping Ensembl genes was summarized using featureCounts (34).
Differential expression analyses were undertaken using the edgeR (35) and limma (36) software packages. Any gene which did not achieve a count per million mapped reads (CPM) of 0.1 in at least 2 samples was deemed to be unexpressed and subsequently filtered from the analysis. Compositional differences between libraries were normalized using the trimmed mean of log expression ratios (TMM) (37) method. Counts were transformed to log2-CPM with associated precision weights using voom (38). Differential expression was assessed using linear models and robust empirical Bayes moderated t-statistics (39). P-values were adjusted to control the false discovery rate (FDR) below 5% using the Benjamini and Hochberg method. To increase precision, the linear model incorporated a correction for a donor batch effect.
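As a rough illustration of the expression filter described above, the following Python sketch computes counts per million and keeps genes that reach a CPM of 0.1 in at least two samples. The analysis itself was carried out with edgeR and limma in R; the function names and toy counts here are hypothetical, and library sizes are taken as simple column sums, which is an assumption of this sketch.

```python
def counts_per_million(counts):
    """counts: dict mapping gene -> list of raw counts, one per sample."""
    n_samples = len(next(iter(counts.values())))
    # Library size per sample, assumed here to be the plain column sum.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {gene: [1e6 * c / lib_sizes[j] for j, c in enumerate(row)]
            for gene, row in counts.items()}


def expressed_genes(counts, min_cpm=0.1, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return sorted(gene for gene, values in cpm.items()
                  if sum(v >= min_cpm for v in values) >= min_samples)


# Hypothetical toy data: each library sums to exactly one million counts.
toy_counts = {
    "geneA": [500_000, 600_000, 550_000],
    "geneB": [0, 1, 0],
    "geneC": [499_999, 399_999, 450_000],
    "geneD": [1, 0, 0],
}
print(expressed_genes(toy_counts))
```

In this toy example geneB and geneD each reach the threshold in only one sample, so they would be treated as unexpressed.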

HiC data processing
Read processing and alignment: Reads from each sample were aligned using the presplit_map.py script in the diffHic package v1.4.0 (14). Briefly, reads were split into 5' and 3' segments if they contained the MboI ligation signature (GATCGATC), using cutadapt v0.9.5 (40) with default parameters. Segments and unsplit reads were aligned to the GRCm38/mm10 build of the Mus musculus genome or the GRCh38/hg38 build of the Homo sapiens genome using bowtie2 v2.2.5 (41) in single-end mode. All alignments from a single library were pooled together and the resulting BAM file was sorted by read name. The FixMateInformation command from the Picard suite v1.117 (https://broadinstitute.github.io/picard/) was applied to synchronise mate information for each read pair. Alignments were resorted by position and potential duplicates were marked using the MarkDuplicates command, prior to a final resorting by name. This was repeated for each library generated from each sample in the data set. Each BAM file was further processed to identify the MboI restriction fragment that each read was aligned to. This was performed using the preparePairs function in diffHic, after discarding reads marked as duplicates and those with mapping quality scores below 10. Thresholds were applied to remove artefacts in the libraries (Supplemental Table 4). Read pairs were ignored if one read was unmapped or discarded, or if both reads were assigned to the same fragment in the same orientation. Pairs of inward-facing reads or outward-facing reads on the same chromosome separated by less than a certain distance (min.inward and min.outward respectively) were also treated as dangling ends and were removed. For each read pair, the fragment size was calculated based on the distance of each read to the end of its restriction fragment. Read pairs with fragment sizes above 1200 bp (max.frag) were considered to be products of off-site digestion and removed. In this manner, approximately 70-75% of read pairs were successfully assigned to restriction fragments in each library. An estimate of alignment error was obtained by comparing the mapping location of the 3' segment of each chimeric read with that of the 5' segment of its mate. If the two segments were not inward-facing and separated by less than 1200 bp (chim.dist), then a mapping error was considered to be present. Of all the chimeric read pairs for which this evaluation could be performed, around 1-5% were estimated to have errors, indicating that alignment was generally successful. Technical replicates of the same library from multiple sequence runs were then merged with the mergePairs function of diffHic.
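To make the read-splitting and filtering steps more concrete, a minimal Python sketch is given here. It is not the presplit_map.py or preparePairs implementation: the function names are hypothetical, the read is split with a plain string search rather than cutadapt, and splitting at the centre of the signature (so that each segment keeps one copy of the GATC site) is an assumption of this sketch.

```python
MBOI_SIGNATURE = "GATCGATC"  # MboI ligation signature quoted in the text


def split_at_ligation(read, signature=MBOI_SIGNATURE):
    """Split a read into 5' and 3' segments at the first ligation signature.

    Returns (five_prime, three_prime); three_prime is None when the read
    contains no signature and is therefore left unsplit.
    """
    pos = read.find(signature)
    if pos == -1:
        return read, None
    mid = pos + len(signature) // 2  # assumed split point: centre of the signature
    return read[:mid], read[mid:]


def keep_pair(mapq1, mapq2, frag_size, min_mapq=10, max_frag=1200):
    """Apply the two thresholds quoted in the text to one read pair."""
    return mapq1 >= min_mapq and mapq2 >= min_mapq and frag_size <= max_frag


print(split_at_ligation("AAAAGATCGATCTTTT"))  # ('AAAAGATC', 'GATCTTTT')
print(keep_pair(30, 42, 350))                 # True
print(keep_pair(30, 5, 350))                  # False
```

The min.inward and min.outward thresholds are library-specific (Supplemental Table 4) and are therefore left out of the sketch.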
Data correction and detecting loop interactions: Loop interactions were detected using methods in the diffHic package. Read pairs were counted into 50 kbp bin pairs (with bin boundaries rounded up to the nearest MboI restriction site) using the squareCounts function. Only read pairs mapped to a placed scaffold were included; unlocalized and unplaced scaffolds were therefore excluded.
Mitochondrial read pairs were also excluded. Using the diffHic function correctedContact, we applied an iterative correction procedure (42) with some modifications to account for differences in sequenceability, mappability and restriction site frequency in the libraries. The modification to the iterative correction procedure involves performing an additional correction to each bin pair by the negative binomial mean of the replicate libraries.
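For readers unfamiliar with iterative correction, the basic idea can be sketched in a few lines of Python. This is a plain, unmodified version of the procedure on a small dense matrix; it does not include the negative binomial modification described above and is not the diffHic implementation. The function name, the toy matrix and the number of iterations are assumptions.

```python
def iterative_correction(matrix, n_iter=50):
    """Balance a symmetric contact matrix so that all row sums become equal.

    Returns (corrected_matrix, bias), where
    corrected[i][j] = matrix[i][j] / (bias[i] * bias[j]).
    """
    n = len(matrix)
    m = [row[:] for row in matrix]
    bias = [1.0] * n
    for _ in range(n_iter):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        mean = sum(nonzero) / len(nonzero)
        # Per-bin correction factor for this round; empty bins are left alone.
        step = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= step[i] * step[j]
        bias = [b * s for b, s in zip(bias, step)]
    return m, bias


# Hypothetical toy contact matrix with uneven coverage across three bins.
toy = [[10.0, 4.0, 2.0],
       [4.0, 8.0, 6.0],
       [2.0, 6.0, 20.0]]
corrected, bias = iterative_correction(toy)
print([round(sum(row), 6) for row in corrected])
```

After correction every bin has the same total coverage, so differences in bin-level biases no longer dominate the comparison of interaction intensities.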
Looping interactions were detected using a method similar to that described previously (15). Specifically, read pairs were counted in bin pairs for all libraries of a given cell type or condition. For each bin pair, the log-fold change over the average abundance of each of several neighbouring regions was computed.
Neighbouring regions in the interaction space included a square quadrant of sides 'x+1' that was closest to the diagonal and contained the target bin pair in its corner; a horizontal stripe of length '2x+1' centred on the target bin pair; a vertical stripe of '2x+1', similarly centred; and a square of sides '2x+1', also containing the target bin pair in the centre. The enrichment value for each bin pair was defined as the minimum of these log-fold changes, i.e., the bin pair had to have intensities higher than all neighbouring regions to obtain a large enrichment value. These enrichment values were calculated using the enrichedPairs function in diffHic, with 'x' set to 5 bin sizes (i.e., 250 kbp). Putative loops were then defined as those with enrichment values above 0.5, with average count across libraries greater than 10, and that were more than 1 bin size away from the diagonal.
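The neighbourhood comparison described above can be sketched as follows. This is an illustrative Python version operating on a small dense matrix, not the enrichedPairs implementation: the function names are hypothetical, raw counts stand in for abundances, and excluding the target bin pair from each neighbourhood average, the prior count of 1 and the base-2 logarithm are all assumptions.

```python
import math


def region_mean(mat, rows, cols, target):
    """Mean over a neighbourhood, clipped to the matrix and excluding the target."""
    vals = [mat[r][c] for r in rows for c in cols
            if 0 <= r < len(mat) and 0 <= c < len(mat[0]) and (r, c) != target]
    return sum(vals) / len(vals) if vals else 0.0


def enrichment(mat, i, j, x=5, prior=1.0):
    """Minimum log2-fold change of bin pair (i, j) over its four neighbourhoods."""
    target = (i, j)
    full_rows = range(i - x, i + x + 1)
    full_cols = range(j - x, j + x + 1)
    regions = [
        (range(i, i + x + 1), range(j - x, j + 1)),  # quadrant toward the diagonal (assumes i <= j)
        ([i], full_cols),                            # horizontal stripe of length 2x+1
        (full_rows, [j]),                            # vertical stripe of length 2x+1
        (full_rows, full_cols),                      # square of sides 2x+1
    ]
    log_target = math.log2(mat[i][j] + prior)
    return min(log_target - math.log2(region_mean(mat, rows, cols, target) + prior)
               for rows, cols in regions)


def is_putative_loop(enrich, avg_count, bins_from_diagonal):
    """Thresholds quoted in the text for calling a putative loop."""
    return enrich > 0.5 and avg_count > 10 and bins_from_diagonal > 1


# Hypothetical toy matrix: uniform background of 1 with one strong bin pair.
toy = [[1.0] * 21 for _ in range(21)]
toy[5][15] = 31.0
print(enrichment(toy, 5, 15))
print(is_putative_loop(enrichment(toy, 5, 15), 31.0, 10))
```

Taking the minimum across the four neighbourhoods means a bin pair lying on a bright stripe or inside a dense domain scores poorly, because at least one of its neighbourhoods is as intense as the bin pair itself.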
Blacklisted regions and removal of centromere and telomere loops: Blacklisted genomic regions were obtained from ENCODE for hg38 and mm10 (18). Loops that had at least one anchor in a blacklisted genomic region were removed.
Additionally, loops with an anchor within a centromere or telomere region, as defined by the UCSC genome annotation, were removed.
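The anchor-based filtering described above amounts to an interval overlap test. A minimal Python sketch, assuming half-open (chromosome, start, end) coordinates and using hypothetical function names and toy regions, might look like this:

```python
def overlaps(region, interval):
    """True if two half-open (chrom, start, end) intervals share any base."""
    chrom, start, end = region
    other_chrom, other_start, other_end = interval
    return chrom == other_chrom and start < other_end and other_start < end


def remove_blacklisted(loops, blacklist):
    """Drop loops with at least one anchor overlapping an excluded region.

    loops: list of (anchor1, anchor2); blacklist: list of excluded regions
    (blacklisted, centromeric or telomeric intervals).
    """
    return [(a1, a2) for a1, a2 in loops
            if not any(overlaps(a, b) for a in (a1, a2) for b in blacklist)]


# Hypothetical toy loops and excluded region.
toy_loops = [
    (("chr1", 0, 50_000), ("chr2", 100_000, 150_000)),
    (("chr1", 200_000, 250_000), ("chr3", 0, 50_000)),
]
toy_blacklist = [("chr2", 120_000, 130_000)]
print(remove_blacklisted(toy_loops, toy_blacklist))
```

Only the second toy loop survives, because the first has an anchor that contains the excluded chr2 interval.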
Finding overlaps between bin pairs: Overlaps between bin pairs were performed using the overlapsAny function in the InteractionSet package with type = equal and maxgap = 100kb (43). This considers an overlap to be present if anchors have a separation of less than the maxgap value and if both anchors of the bin pairs overlap.
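One simplified reading of this overlap rule is sketched below in Python. It is not the InteractionSet implementation: the function names are hypothetical, anchors are assumed to be stored in a consistent order within each bin pair, and treating "equal within maxgap" as start and end coordinates each differing by at most maxgap (inclusive) is an assumption.

```python
MAXGAP = 100_000  # 100 kb, as quoted in the text


def anchors_match(a, b, maxgap=MAXGAP):
    """Anchors (chrom, start, end) match if starts and ends differ by at most maxgap."""
    return (a[0] == b[0]
            and abs(a[1] - b[1]) <= maxgap
            and abs(a[2] - b[2]) <= maxgap)


def bin_pairs_overlap(p, q, maxgap=MAXGAP):
    """Two bin pairs overlap only if both of their anchors match."""
    return anchors_match(p[0], q[0], maxgap) and anchors_match(p[1], q[1], maxgap)


# Hypothetical toy bin pairs (50 kb bins) from two cell populations.
pair_a = (("chr1", 1_000_000, 1_050_000), ("chr5", 3_000_000, 3_050_000))
pair_b = (("chr1", 1_050_000, 1_100_000), ("chr5", 2_950_000, 3_000_000))
pair_c = (("chr1", 1_050_000, 1_100_000), ("chr5", 3_500_000, 3_550_000))
print(bin_pairs_overlap(pair_a, pair_b))  # True
print(bin_pairs_overlap(pair_a, pair_c))  # False
```

Allowing a gap tolerates small shifts in the called anchor position between samples, while requiring both anchors to match prevents two interactions that merely share one end from being counted as the same.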

Promoter capture Hi-C data processing
Promoter capture Hi-C sequencing data for human naive CD4+ T cells were obtained from EGA (https://www.ebi.ac.uk/ega) accession number EGAS00001001911. Read processing and alignment used the same methods as for the Hi-C data, except that, because the restriction enzyme HindIII was used in the assay, the reads were split at a ligation signature of AAGCTAGCTT.
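The two ligation signatures quoted in this paper follow from the recognition sequence and cut position of each enzyme. The short Python sketch below shows the derivation, assuming a palindromic site that leaves a 5' overhang which is filled in before blunt-end ligation; the function name is hypothetical, and the cut offsets used (MboI cutting before GATC, HindIII cutting after the first A of AAGCTT) are the standard published positions rather than values stated in the text.

```python
def ligation_signature(site, cut):
    """Junction sequence formed after fill-in and blunt-end ligation.

    site: recognition sequence on the top strand (assumed palindromic).
    cut: number of bases 5' of the top-strand cut (assumed to leave a 5' overhang).
    """
    filled_left = site[:len(site) - cut]  # left end after the overhang is filled in
    right = site[cut:]                    # right end as cut on the top strand
    return filled_left + right


print(ligation_signature("GATC", 0))    # MboI    -> GATCGATC
print(ligation_signature("AAGCTT", 1))  # HindIII -> AAGCTAGCTT
```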

Visualization of results
Plaid plots were constructed using the plotPlaid function from the diffHic package. The range of colour intensities in each plot was scaled according to the library size of the sample, to facilitate comparisons between plots from different samples. Heatmaps of the loops between chromosomes were generated using the R package gplots with the function heatmap.2. Circos plots were generated with the R package RCircos (44).