Animal models
All animals used in this study were handled in accordance with protocols approved by the Institutional Animal Care and Use Committees (IACUC) at the University of California San Francisco. All protocols were conducted in accordance with guidelines from the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Wild-type C57Bl6/J mice (Stock: 000664) were obtained from The Jackson Laboratory. Dp1Tyb transgenic mice were provided by V. Tybulewicz, and Hmgn1-knockout mice were provided by M. Bustin and T. Furusawa18,53. All mice were housed in a temperature- and humidity-controlled facility under a 12 h:12 h light:dark cycle and given ad libitum access to standard chow and water unless otherwise documented.
hiPSC culture and cardiomyocyte differentiation
Trisomy 21 lines DS1 (UWWC1-DS1) and M (WC-24-02-DS-M), and their isogenic control disomic lines DS2U (UWWC1-DS2U) and B (WC-24-02-DS-B), were obtained through WiCell28. Cell lines were maintained with mTeSR1 Plus medium (Stemcell Technologies, 85850) or E8 medium (ThermoFisher Scientific, A1517001) and grown on hESC-Qualified Matrigel (Millipore, CLS354277). Cells were passaged by enzymatic dissociation using Accutase (ThermoFisher Scientific, 00-4555-56) when 70–80% confluent in E8 medium containing Y-27632 for the first day of passaging (Stemcell Technologies, 72308). Cardiomyocyte differentiation was performed using the Stemdiff Cardiomyocyte Differentiation Kit (Stemcell Technologies, 05010) according to the manufacturer’s protocols. First, cells were seeded in 12-well plates at ~180,000–200,000 cells per well in medium with ROCK inhibitor (Stemcell Technologies, Y-27632). After two days, differentiation medium with supplement A and 1:100 diluted Matrigel was added to each plate. Subsequent medium changes were performed every two days as follows: differentiation medium with supplement B (day 2), supplement C (days 4 and 6) and, finally, maintenance medium, which was replaced every two days, for the remainder of the differentiation.
Karyotyping
DS1 and DS2U hiPSCs were tested by G-band karyotyping by Cell Line Genetics. Results did not detect microdeletions in DS1 or DS2U cells and confirmed trisomy 21 within all cells in DS1 samples. Because trisomic cells have been shown previously to lose the third copy of HSA21 over multiple cell culture passages, we tested both DS1 and DS2U lines at a consistent interval of about once a year.
Whole-mount RNAscope of E10.5 mouse embryos
Timed matings of wild-type C57Bl6/J male mice with C57Bl6/J female mice were set up and the females were checked for the presence of vaginal plugs the next morning (the day of plug detection was considered E0.5). Females were checked for pregnancy at E6.5 by echocardiography (Vevo 3100, Visual Sonics). Embryos were collected at E10.5 by euthanizing pregnant females according to approved protocols and collecting embryos on ice in ice-cold PBS-FBS: PBS (Life Technologies, 14190250) with 1% fetal bovine serum (Thermo Fisher Scientific, 10439016). After dissection, embryos were fixed overnight with 4% paraformaldehyde (Fisher Scientific, 50-980-487). The next day, embryos were washed through a graded series of methanol in PBST (PBS with 0.1% Tween (ThermoFisher Scientific, 28320)) containing 0, 25%, 50%, 75% and 100% methanol (Millipore Sigma, 34860) and stored at −20 °C overnight. Rehydration was performed through the same methanol/PBST series in reverse order. Processing for in situ hybridization followed the RNAscope manufacturer’s protocols (ACD Biosciences, 323100). Embryos were incubated with Protease Plus solution for 15 min, washed in PBST 3 times and then fixed in 4% PFA for 20 min. Embryos were washed again in PBST and then incubated overnight with probes for Bmp2 and Irx4 (ACD Biosciences, 406661-C2 and 504831). The remaining steps were performed according to the manufacturer’s protocols, using Opal 480 and Opal 620 fluorescent dyes to label probes in channel 1 and channel 2 (Akoya Biosciences, FP1495001KT and FP1500001KT). Embryos were then kept in DAPI solution overnight.
Light sheet microscope imaging
After the RNAscope procedure was completed, stained E10.5 embryos were embedded in 1.5% low melt agarose gel (ThermoFisher Scientific, 16520050) dissolved in ddH2O and loaded into a 1 ml syringe. The agarose was allowed to solidify inside the syringe for at least 1 h at room temperature. A volume of 5 ml of clearing solution, EasyIndex (LifeCanvas Technologies, NC2379687), was added to a separate 15 ml conical tube. The 1 ml syringe containing the embryo in solidified agarose gel was carefully placed in the 15 ml tube, ensuring the samples were partially submerged in the EasyIndex solution. To shield the samples from light exposure and evaporation, the 15 ml tube with the sample was wrapped in foil and kept at 4 °C for 48 h to ensure proper clearing. Following clearing, the samples were imaged using the SmartSPIM Light Sheet Microscope (LifeCanvas Technologies). Samples were glued to a sample holder and immersed in an index-matched imaging solution (EasyIndex, refractive index 1.52). Acquired images were analysed using Imaris software (Oxford Instruments).
Generation of CRISPRa knock-in hiPSCs
The pC13N-dCas9-BFP-KRAB transgenic construct was modified for CRISPRa42,43 (Addgene Plasmid #127968). To generate dCas9-VPR, the pC13N-dCas9-BFP-KRAB plasmid was digested with SpeI and MluI (New England Biolabs, R3133S and R3198S, respectively). PCR primers were designed to amplify the dCas9-VPR fragment from SP-dCas9-VPR (Addgene Plasmid #63798) with 21-bp overlapping regions to the digested vector pC13N-dCas9-BFP-KRAB for cold fusion cloning (System Biosciences, MC010B-1). Successful clones were verified by restriction digest and Sanger sequencing.
Nucleofection of DS2U hiPSCs for dCas9-VPR integration was performed using the P3 Primary Cell 4D-Nucleofector X Kit S (Lonza, V4XP-3032). TALEN vectors with upstream and downstream sites of homology to the CLYBL locus were used as previously described. Three days after nucleofection, medium containing 50 µg ml−1 G418 (ThermoFisher, 10131035) was added to select for clones containing any integration of donor plasmid. Single cell colonies were manually picked and screened for correct integration into the CLYBL locus at 5′, 3′ and internal sites using PCR primers: 5Arm-Ver-R, gaacctgcgtgcaatccatctt; 5Arm-Ver-F, catctccacaccctcctgtagt; 3Arm-Ver-F, ccctcttctcttatggagatcaccggt; 3Arm-Ver-R, ggaagtgactagaggatgtact; Seq-7394-F, gatccgagacaagcagagtggaaa; Seq-8798-R, ggagaatcgaatccgccgtatttc.
Clonality and discrimination of heterozygous or homozygous integration of transgenes were assessed by droplet digital PCR using ddPCR Supermix for Probes (Bio-Rad, 1863026). Probes for wild-type and knock-in alleles were generated with FAM and HEX fluorescence reporters (Integrated DNA Technologies). Sequences of primers and probes are: Common Fwd, GAGGATTGAGCTCTCTTACCC; WT Rev, ACATGGCTCAGTTGTGAAAAT; KI Rev, CAGATCTCTCGAGGCCCT; WT ddPCR probe, tc+c+ca+t+ttcc+t+ca (FAM dye); KI ddPCR probe, CGAAGTTATCTGACCTCTTCTCTTCCTCCC (HEX dye).
CRISPRa functionality was validated by Taqman qPCR analysis of target gene perturbations using sgRNAs delivered at the hiPSC stage through lentivirus infection. In brief, sgRNA sequences were designed using either the ChopChop algorithm or the Broad CRISPick web portal54,55. Primers were designed to clone sgRNA sequences into the pU6-sgRNA EF1Alpha-puro-T2A-BFP vector (Addgene Plasmid #60955). Cloned sgRNA-containing vectors were transfected into HEK293T cells using Fugene HD (Promega, E2311) in combination with lentivirus packaging vectors pMD2G (Addgene Plasmid #12259) and psPAX2 (Addgene Plasmid #12260). Two days after transfection, the lentivirus-containing medium was filtered, concentrated with Lenti-X (Takara, 631232) and resuspended in PBS. Purified virus was then added at various densities to DS2U-dCas9-VPR cells and uninfected cells were eliminated by the addition of 0.5 µg ml−1 puromycin (ThermoFisher, A1113803). Target gene expression change at the hiPSC stage was assessed by RNA extraction using Direct-zol RNA miniprep kit (Zymo Research, R2050) and cDNA synthesis using SuperScript III First-Strand Synthesis SuperMix for IVT (ThermoFisher, 11752250) prior to qPCR with reverse transcription.
CROP-seq experimental design and computational analyses
Modification of CROP-seq vector
To facilitate estimation of lentivirus integration events per cell, we modified the previously described CROP-seq vector (Addgene, Plasmid #86708) to add a T2A-mCherry reporter gene (pHR-SFFV-KRAB-dCas9-P2A-mCherry, Addgene Plasmid #60954) directly downstream of the puromycin resistance gene45. In brief, the original CROP-seq plasmid was digested with FseI and MluI, and the following primers were used to generate a fused transgene with cold fusion cloning: T2A-mCherry-S, CCgGATCCgagggcagaggaagtcttctaacatgcggtgacgtggaggagaatcccggccctatggtgagcaagggcgagga; mCherry-MluI-AS: GTTGATTGTCGACTTAACGCGTttacttgtacagctcgtccat.
Design of sgRNA library
We selected genes on chromosome 21 to target based on: (1) presence in three copies in Dp1Tyb mice; and (2) expression in a previously reported dataset of mouse heart development (average log2 normalized expression >0, and present in more than 4% of cells in atlas)24. For all genes, three sgRNA sequences were chosen using the Broad CRISPick web portal (n = 198 guides). We then selected 20 sgRNA control sequences from a previously validated CROP-seq library45 (~10% of the library, n = 218 guides total). All guide sequences were submitted to a commercial source (VectorBuilder) to generate pooled lentivirus sgRNA-containing libraries. Amplicon sequencing of lentivirus libraries was performed by VectorBuilder and purified libraries (>10^8 TU ml−1) were generated.
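The library size follows directly from the selection rules above. The following is a minimal sketch of the filter and the arithmetic, assuming a hypothetical per-gene record with the fields `in_dp1tyb`, `avg_log2_expr` and `pct_cells`; these names are illustrative and are not taken from the original analysis.

```python
from dataclasses import dataclass

GUIDES_PER_GENE = 3     # three sgRNAs per target gene (from the text)
N_CONTROL_GUIDES = 20   # control sgRNA sequences (from the text)


@dataclass
class GeneRecord:
    # Hypothetical record layout; field names are assumptions for illustration.
    symbol: str
    in_dp1tyb: bool        # present in three copies in Dp1Tyb mice
    avg_log2_expr: float   # average log2 normalized expression in the atlas
    pct_cells: float       # percentage of atlas cells expressing the gene


def select_targets(genes):
    """Keep genes that pass both selection criteria described in the text."""
    return [
        g.symbol
        for g in genes
        if g.in_dp1tyb and g.avg_log2_expr > 0 and g.pct_cells > 4.0
    ]


def library_size(n_targets):
    """Return (targeting guides, total guides) for a given number of targets."""
    targeting = n_targets * GUIDES_PER_GENE
    return targeting, targeting + N_CONTROL_GUIDES
```

With 66 target genes, `library_size(66)` gives 198 targeting guides and 218 guides in total, matching the counts reported above.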
Infection of sgRNA lentivirus library to DS2U-dCas9-VPR
To maximize single lentivirus integration events while maintaining the diversity of the sgRNA library, DS2U-dCas9-VPR cells were seeded at 1 × 10^5 cells and infected with varying concentrations of pooled lentivirus libraries. Three days post infection, cells were dissociated and 1 × 10^4 cells were set aside for flow cytometry of mCherry reporter gene expression on a BD LSRFortessa X-20, with data analysed in FlowJo (v.9.2). The remaining cells were plated on a 10 cm2 tissue culture dish and selected with 0.5 μg ml−1 puromycin.
Embryo collection and genotyping
For all timed mating experiments, male and female mice were housed together overnight and female mice were visually inspected for a vaginal plug in the morning. The day of plug detection was considered E0.5. Female mice were checked for pregnancy at E6.5 by echocardiography (Vevo 3100, Visual Sonics). Embryos were collected at E15.5 by euthanizing pregnant females according to approved protocols and collecting embryos on ice in ice-cold PBS-F: PBS (Life Technologies, 14190250) with 1% fetal bovine serum (Thermo Fisher Scientific, 10439016). For experimental crosses leading to microCT analysis, yolk sacs were dissected and a small fraction was placed in QuickExtract DNA Extraction Solution (Lucigen, QE09050) and processed according to the manufacturer’s protocols.
Genotyping was performed by PCR using Phire Green Hot Start II DNA Polymerase (Thermo Fisher Scientific, F124L), according to the manufacturer’s protocols for the following lines.
Hmgn1: Hmgn_1, CCC CGC GCC GCC ACG ATG CCC AAG AGG AAG; Hmgn_2, CCA TCC GCG CTA ACC TGC ACG AGA AAG CAC; Hmgn_3, CGA CTG CAT CTG CGT GTT CGA AT.
Genotyping of Dp1Tyb animals was performed by Taqman qPCR using the TaqMan Fast Advanced Master Mix for qPCR (ThermoFisher, 4444557). Primers for the duplicated transgenic region on chromosome 16 as well as a wild-type control region were run on all samples to distinguish wild-type from heterozygous Dp1Tyb animals. Oligo sequences for the wild-type and Tg alleles are: Zfp295_1, CTAACCCTAACCCTAAGTCCTTGTC; Zfp295_2, TGAGGAGAGTTTTCTGGGAGAA; Dot1l_1, GCCCCAGCACGACCATT; Dot1l_2, TAGTTGGCATCCTTATGCTTCATC.
qPCR probes were synthesized as PrimeTime 5′ 6-FAM/ZEN/3′ IBFQ probes (Integrated DNA Technologies): Zfp295, /56-FAM/CTCACAGCA/ZEN/GTGCAGATCACGGC/3IABkFQ/; Dot1l, /56-FAM/CCAGCTCTC/ZEN/AAGTCG/3IABkFQ/.
MicroCT
E15.5 embryonic hearts were collected and fixed in 4% paraformaldehyde overnight at 4 °C. Samples were rinsed in cold 1× PBS twice followed by dehydration in cold 70% ethanol. Samples were stored at 4 °C until further processing. Samples were incubated in 1% phosphotungstic acid (PTA, Sigma P4006-10G) in 70% ethanol solution at 4 °C for at least 72 h. Samples were then washed with 70% ethanol to remove excess PTA and embedded in 0.5% agarose gel (UltraPure Agarose, Invitrogen, 15510-027) in 15 ml Falcon tubes. Samples were kept on ice during transportation and then scanned at the UCSF Skeletal Biology and Biomechanics Core facility (µCT 50 cabinet microCT scanner, SCANCO Medical). The images were processed with OsiriX MD (Pixmeo SARL) and were examined for clear identification of a ‘hole/gap’ in the region of the ventricular septum. The presence of septal defects was scored double-blinded by two researchers working independently.
Postnatal survival assay
To assess postnatal survival, offspring from Dp1Tyb × Hmgn1 heterozygous crosses were monitored from birth (P0) through postnatal day 7 (P7). Pups were toe-clipped and genotyped at P0 using the Hmgn1 PCR primers detailed above. Litters were monitored daily, and the number of surviving pups of each genotype was recorded at P7. Any pups found deceased prior to P7 were included in the lethality analysis. Survival rates were calculated as the proportion of live pups per genotype at P7 relative to the number of pups identified at P0. Statistical significance of genotype-specific survival differences was assessed using the chi-square test.
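The survival calculation and test described above can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming a genotype-by-outcome contingency table (alive at P7 versus dead before P7) and a Pearson chi-square test without continuity correction; the original analysis software and options are not stated, and the function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (closed form)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    terms = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


def survival_rates(counts):
    """counts maps genotype -> (alive_at_P7, dead_before_P7)."""
    return {g: alive / (alive + dead) for g, (alive, dead) in counts.items()}


def survival_chi_square(counts):
    """Pearson chi-square test of independence between genotype and outcome.

    Assumes at least one surviving and one dead pup overall, so that no
    expected count is zero.
    """
    rows = list(counts.values())
    total_alive = sum(r[0] for r in rows)
    total_dead = sum(r[1] for r in rows)
    grand = total_alive + total_dead
    stat = 0.0
    for alive, dead in rows:
        n = alive + dead
        for observed, col_total in ((alive, total_alive), (dead, total_dead)):
            expected = n * col_total / grand
            stat += (observed - expected) ** 2 / expected
    df = len(rows) - 1  # (genotypes - 1) x (outcomes - 1)
    return stat, df, chi2_sf(stat, df)
```

Passing a dictionary of per-genotype counts to `survival_chi_square` returns the test statistic, the degrees of freedom and the P value; `survival_rates` gives the per-genotype proportion alive at P7.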
scRNA-seq
Sample preparation for day 20 cardiomyocytes
DS1 and DS2U cells at day 10 and day 20 of cardiomyocyte differentiation were first prepared for scRNA-seq by enzymatic dissociation with 0.25% Trypsin (ThermoFisher Scientific, 25200056). For all scRNA-seq replicates, 10,000 cells per sample were loaded onto the 10X Genomics Chromium instrument according to the manufacturer’s protocols (Chip G, PN-1000120). All experiments were conducted with v.3.1 NEXT GEM reagents (PN-1000121), using 9 cycles of cDNA amplification for the GEM kit (PN-1000123) and 9 cycles of library amplification for library kit (PN-1000157). After library preparation, samples were sequenced on Illumina NextSeq 500 (Illumina, software 4.0.2), Novaseq S4 and/or Novaseq X (Illumina, software v.1.5), according to the manufacturer’s guidelines for scRNA-seq sequencing protocols (read 1: 28 cycles, i7 and i5: 10 cycles; and read 2: 90 cycles).
Sample preparation for CROP-seq experiment
Dissociation of DS1 and DS2U at hiPSC stage (day 0) was achieved using Accutase, whereas day 2, day 4, day 6, day 8, day 10 and day 20 cardiomyocytes were treated with 0.25% Trypsin. Preparation of CROP-seq samples was similar at both hiPSC and day 20 cardiomyocyte stages and CROP-seq amplicon libraries were prepared as previously described45,56. In brief, after the cDNA isolation step of 10X scRNA-seq library preparation, 10 ng of cDNA was separated from the remaining library and subjected to three rounds of nested PCR using the primers listed below. Amplicon libraries were purified using SPRI clean up beads and their purity was assessed with the Bioanalyzer HS DNA kit. Amplicon libraries were pooled and sequenced on Illumina NextSeq 500 (Illumina, software 4.0.2) using similar cycling conditions as the endogenous transcriptome libraries mentioned above.
CROP-seq nested PCR sequences: Amp_sgRNA_1F TTTCCCATGATTCCTTCATATTTGC; Amp_sgRNA_1R ACACTCTTTCCCTACACGACG; Amp_sgRNA_2F GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCTTGTGGAAAGGACGAAACAC; Amp_sgRNA_2Rand3R AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC; Amp_sgRNA_3F-Index1 CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index2 CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index3 CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index4 CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index5 CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index6 CAAGCAGAAGACGGCATACGAGATCATGCCTAGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index7 CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index8 CAAGCAGAAGACGGCATACGAGATCCTCTCTGGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index9 CAAGCAGAAGACGGCATACGAGATAGCGTAGCGTCTCGTGGGCTCGG; Amp_sgRNA_3F-Index10 CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTCTCGTGGGCTCGG.
Sample preparation for E11.5 Dp1Tyb × Hmgn1 embryonic hearts experiment
E11.5 mouse embryos were collected from timed pregnant females, and genotyping was conducted using yolk sac DNA to identify wild-type, Dp1Tyb, Hmgn1+/− and Dp1Tyb;Hmgn1+/− genotypes. For each embryo, the heart was micro-dissected in ice-cold PBS under a stereomicroscope. Individual hearts were enzymatically dissociated using TrypLE Express (Thermo Fisher Scientific), with samples incubated at 37 °C in a water bath for 10 min. Gentle trituration was performed using a 200 µl pipette at both the 5 min and 10 min time points to promote dissociation. The enzymatic reaction was quenched by adding ice-cold PBS containing 1% fetal bovine serum (FBS). Each sample was then passed through a 40 µm cell strainer to obtain a single-cell suspension, followed by centrifugation at 400g for 10 min at 4 °C. The cell pellet was washed once with 1% FBS in PBS and resuspended in 50 µl of the same buffer. Cell viability and concentration were assessed using trypan blue exclusion. Samples with greater than 85% viability were used for library preparation. Approximately 10,000 cells per sample were loaded onto the 10X Genomics Chromium platform using the 3′ v.4 chemistry.
Data preparation
The Cell Ranger pipeline was used for processing all samples post-sequencing (10X Genomics, v.3.1–5.0). Sample demultiplexing of bcl outputs to fastq files was performed using cellranger mkfastq. All samples were then individually aligned to the human reference genome Hg38 using cellranger count with the intron = true flag. We then ran cellranger aggr to normalize all samples to mapped read depth of the least sequenced sample, as previously described44. A single counts matrix directory containing all samples for a particular analysis was used as input for Seurat analysis in R (4.1.1)31,57. Output quality control metrics from Cellranger count and Cellranger aggr for all samples are found in Supplementary Table 2.
Data analysis
The read depth-normalized aggregated counts matrix output from cellranger aggr was inputted to Seurat (v.4.1.1) using the functions Read10X and CreateSeuratObject. Each sample used in aggregation was identified by time point and replicate and assigned a unique name in metadata as ‘gem.group’. For day 10 and day 20 differentiation samples, quality control filtering included removal of outliers based on the number of genes detected (nFeature_RNA > 2,000 and nFeature_RNA < 8,000), unique molecular identifier (UMI) counts (<30,000) and mitochondrial percentage (<15%). Cell cycle scores were added using the function CellCycleScoring. SCT normalization was then performed with regression based on cell cycle scores (vars.to.regress = c(“S.Score”, “G2M.Score”)). Principal components analysis and batch correction using FastMNN were then performed on a per-sample basis using split.by = “gem.group”. Clustering was then run using the functions RunUMAP, FindNeighbors and FindClusters and the output UMAP graphs were generated by DimPlot. Marker genes were identified by the function FindAllMarkers with standard settings. After initial processing, we performed iterative rounds of filtering poor quality clusters and re-running clustering workflows. Cluster annotation, based on expression of known marker genes, was performed manually to assign clusters with an identity. Differential gene expression of disomic and trisomic cells was performed using the Wilcoxon test between two groups with the function FindMarkers (logfc.threshold = 0.25 and min.pct = 0.1). DEGs were assessed for biological function enrichment using GO analysis via Panther DB58. Dot plot of curated genes based on GO analysis was performed using the function DotPlot. Venn diagrams were generated using the R package VennDiagram and volcano plots were created with the R package EnhancedVolcano.
Comparisons of multiple samples with sets (or modules) of genes were performed with the function AddModuleScore in Seurat as previously described48.
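The quality control thresholds described above can be expressed as a simple per-cell predicate. This is a language-neutral sketch in Python rather than the Seurat code used in the analysis; it assumes that the stated ranges are the values retained, and the keys `nCount_RNA` and `percent_mt` are assumed metadata names (only `nFeature_RNA` appears in the text).

```python
def passes_qc(cell, min_genes=2000, max_genes=8000, max_umis=30000, max_pct_mt=15.0):
    """Return True if a cell falls inside all quality control ranges.

    `cell` is a dict of per-cell metadata. Thresholds are taken from the
    text; treating them as retention bounds is an assumption.
    """
    return (
        min_genes < cell["nFeature_RNA"] < max_genes
        and cell["nCount_RNA"] < max_umis
        and cell["percent_mt"] < max_pct_mt
    )


def filter_cells(cells):
    """Keep only cells that pass every threshold."""
    return [c for c in cells if passes_qc(c)]
```

A cell with 3,500 detected genes, 12,000 UMIs and 6% mitochondrial reads would be retained, whereas a cell failing any single criterion would be removed.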
CROP-seq amplicon library analysis
Amplicon libraries were first processed through cellranger mkfastq and cellranger count to generate fastq files and bam files, respectively, connected to each sample’s endogenous transcriptome. Following the generation of bam alignment files as output of cellranger count, we used the preconfigured function get_barcodes.py as well as a whitelist of all 218 sgRNA sequences contained within our library to first generate a table of cells, barcodes (sgRNA sequences), read counts and UMI counts56. We then filtered the final output for a minimum and maximum number of read counts to eliminate noise in barcode assignment, allowing us greater confidence in attributing cells with single guide integration. All cells with no guides or more than one guide assignment were not considered for further analysis. Lastly, we matched cell barcodes in the Seurat object to the output of filtered amplicon barcodes and assigned the sgRNA sequence to each cell as a metadata value. All cells were associated with one of the 66 candidate chromosome 21 genes and all 20 ‘non-targeting control’ sequences were considered together as control cells. Fold changes were calculated by averaging the exported log-normalized values of each target gene across all cells carrying a cognate sgRNA sequence and comparing this average to that of all control cells. For the main CROP-seq screen results, clustering was first performed to identify the AVCM population and then log-normalized values for all cells containing a single guide RNA were exported for statistical modelling to develop a trisomy versus disomy classifier. To calculate the Gini coefficient for evaluation of sgRNA evenness, we used the R package DescTools and applied the Gini function.
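The guide assignment and evenness steps above can be sketched as follows. This is an illustrative Python version, not the get_barcodes.py or DescTools code used in the analysis; the row layout, function names and read-count bounds are assumptions, and the Gini value here is the uncorrected sample estimate, so library implementations that apply a small-sample correction may return slightly different values.

```python
from collections import defaultdict


def assign_single_guides(rows, min_reads, max_reads):
    """Assign one sgRNA per cell from (cell, guide, reads, umis) rows.

    Rows outside the read-count bounds are treated as noise. Cells left
    with zero guides or more than one guide are dropped.
    """
    guides_per_cell = defaultdict(set)
    for cell, guide, reads, _umis in rows:
        if min_reads <= reads <= max_reads:
            guides_per_cell[cell].add(guide)
    return {
        cell: next(iter(guides))
        for cell, guides in guides_per_cell.items()
        if len(guides) == 1
    }


def gini(counts):
    """Uncorrected Gini coefficient of per-guide cell counts (0 = even)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

The dictionary returned by `assign_single_guides` can be attached to cells as metadata, and `gini` applied to the number of cells recovered per guide gives a single evenness summary for the library.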
Initial attempts to perform hit selection in CROP-seq involved using edgeR, WGCNA and traditional differential expression tests in Seurat (Wilcoxon rank-sum test)59,60,61. We also performed Euclidean distance measurements: to compare CRISPRa perturbations with disomic and trisomic reference states, we projected cells into a one-dimensional axis defined by centroids of disomic and trisomic controls using log-normalized expression values of DEGs. For each cell, we computed a Euclidean similarity score bounded between −1 and +1, where −1 indicates maximal similarity to the disomic centroid, +1 indicates maximal similarity to the trisomic centroid and 0 indicates equidistance between the two. Density distributions of these scores were generated for all CRISPRa guides (grey) overlaid with control groups (green, disomic; red, trisomic).
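One way to obtain a score with the stated properties is the normalized difference of distances to the two centroids. The sketch below assumes this formula, which is not spelled out in the text; it is bounded between −1 and +1, equals −1 at the disomic centroid, +1 at the trisomic centroid and 0 when a cell is equidistant from both. Function names are hypothetical.

```python
import math


def centroid(cells):
    """Column-wise mean of a list of equal-length expression vectors."""
    n = len(cells)
    return [sum(values) / n for values in zip(*cells)]


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_score(cell, disomic_centroid, trisomic_centroid):
    """Assumed score: (d_disomic - d_trisomic) / (d_disomic + d_trisomic)."""
    d_dis = euclidean(cell, disomic_centroid)
    d_tri = euclidean(cell, trisomic_centroid)
    if d_dis + d_tri == 0:
        return 0.0  # centroids coincide with the cell; no direction defined
    return (d_dis - d_tri) / (d_dis + d_tri)
```

Scores computed for every cell carrying a given guide can then be summarized as a density distribution and compared with those of the disomic and trisomic control groups.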
Owing to limitations in cell number and likely contributions of dropout effects, we were unable to achieve significant results in differential testing. Therefore, we adopted an approach that involved a statistical classifier described in the next section.
Statistical model generation
A penalized generalized linear model (logistic regression with L1 penalty) selected genes with expression predictive of labelled disomic or trisomic cells (scikit-learn pipeline: SelectKBest(n = 5000), RobustScaler(), LogisticRegression(C = 0.1, penalty = ‘l1’, solver = ‘liblinear’, class_weight = ‘balanced’, random_state = 0)). Training data were log-normalized scRNA-seq counts combined from three separate iPSC differentiations to increase power. The model identified nine genes (TIMP1, NDUFV3, ATP5PF, BANCR, APP, REEP5, CHCHD2, DGKB and BRWD1) sufficient for accurate prediction on held-out cells. A shallow tree-based gradient boosting model obtained similar results. The third iPSC differentiation was most similar to the CROP-seq cells; therefore, an unpenalized linear model was trained on only the third differentiation using these nine genes and used to score how trisomic each cell was in unlabelled CROP-seq data.
Each CRISPRa target gene had three different guide RNAs. Cells received a low, disomic-like score if: (1) they were negative controls; (2) their CRISPRa target gene did not drive the cell into a trisomic-like state; or (3) their CRISPRa target gene drove the cell into a trisomic-like state, but the effect had diminished. To identify candidates in the presence of noise caused by case 3, we removed all cells with a score less than the 75th percentile of negative controls and then removed guides having fewer than 20 cells after filtering. We sorted remaining guides by the median trisomic score of their cells and tested the top five candidates.
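The filtering and ranking procedure above can be sketched as follows. This is a minimal illustration, assuming per-cell trisomic scores are already available; the function name is hypothetical and the percentile uses linear interpolation (the "inclusive" method of `statistics.quantiles`), which may differ from the method used in the original analysis.

```python
from statistics import median, quantiles


def rank_guides(scores_by_guide, control_scores, min_cells=20):
    """Rank guides by the median score of their high-scoring cells.

    scores_by_guide maps guide -> list of per-cell trisomic scores.
    control_scores is the list of scores of negative-control cells.
    """
    # 75th percentile of negative controls (interpolation method assumed).
    cutoff = quantiles(control_scores, n=4, method="inclusive")[2]
    medians = {}
    for guide, scores in scores_by_guide.items():
        high = [s for s in scores if s >= cutoff]  # drop low-scoring cells
        if len(high) >= min_cells:                 # drop guides with <20 cells
            medians[guide] = median(high)
    ranked = sorted(medians, key=medians.get, reverse=True)
    return cutoff, ranked
```

Taking the first five entries of `ranked` corresponds to selecting the top five candidates for testing.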
A UMAP projection was fit using expression of model-selected genes in differentiation 3 (umap-learn; n_neighbors = 15, metric = ‘chebyshev’, init = ‘spectral’). We then projected labelled cells from differentiation 3 as well as unlabelled CROP-seq cells into 2D space, showing which CROP-seq cells co-clustered with disomic and trisomic cells, as well as those in a partially trisomic state.
scATAC–seq
Sample preparation
Day 20 cardiomyocytes from DS1 and DS2U cells were prepared for scATAC–seq according to the manufacturer’s protocols (Chromium Next GEM Single Cell ATAC Library & Gel Bead Kit, 16 rxns PN-1000175). In brief, day 20 cardiomyocytes were isolated by dissociation with 0.25% trypsin as previously described. Nuclei isolation was performed according to the manufacturer’s protocols (CG000169, Rev E). For each replicate, 10,000 cells were loaded per sample and, after library preparation, samples were sequenced on Illumina Novaseq S4 and/or Novaseq X (Illumina, software v.1.5), according to the manufacturer’s guidelines for scATAC–seq sequencing protocols (reads 1 and 2: 50 cycles, i7 and i5: 8 and 16 cycles, respectively).
Data preparation and analysis
scATAC–seq analysis was previously described using the R package ArchR24,62,63. In brief, indexed fragments files for all samples were first used as inputs to generate a sample-specific Arrow file, including both sequencing-derived data and sample metadata. An aggregated project (ArchRProject) was then created after filtering for quality control (transcription start site (TSS) score and minimum and maximum number of fragments per cell). Clustering using iterative latent semantic indexing (LSI) was then performed on the 500 bp TileMatrix followed by batch correction with Harmony and visualization with UMAP. GeneScore plots incorporating information derived from accessible fragments contained within the gene body and surrounding loci were used for cluster annotation, based on previously obtained scRNA-seq data. Peak calling was performed using MACS2 and differentially accessible peaks marking each cluster were obtained64. These peaks were then extracted and used to refine the peak sets used for CUT&RUN analysis by removing those marker peaks associated with contaminating endoderm and fibroblast clusters.
Generation of HMGN1–3×Flag knock-in line
As the endogenous HMGN1 stop site (TAA) is followed by a T, it provides a perfect TTAA site for a piggyBac transposon insertion and excision. Taking advantage of this, a piggyBac transposon-based targeting construct was designed to insert a C-terminal 3×Flag tag before the HMGN1 stop site in exon 6 of the HMGN1 locus in the DS2U-dCas9-VPR line. The targeting construct had a piggyBac transposon containing a CAG promoter driving a puromycin-thymidine kinase (PuroΔTK) as a positive and negative selection cassette, inserted between a 477 bp 5′ homology arm conjugated with the 3×Flag tag and a 466 bp 3′ homology arm. The 3×Flag tag and the piggyBac transposon selection cassette were installed into the HMGN1 locus via CRISPR–Cas9–gRNA (gRNA targeting sequence: ATATGGTTATTAATCAGACTTGG)-based homologous recombination. Successfully targeted clones were selected using puromycin (0.8 µg ml−1, Thermo Fisher Scientific) and verified by PCR (primer pair for 5′ arm verification: 5arm-ver-F: CACATCGAACTCACTACCCAGT and 5arm-Ver-R: GCGTACTTGGCATATGATACACTT; primer pair for 3′ arm verification: 3arm-ver-F: GATGCGGTGGGCTCTATGGCTT and 3arm-ver-R: GCACTCTCTATGATGTTCACGCA) and Sanger sequencing. Following confirmation of targeting, the correctly targeted clone was transfected with piggyBac transposase mRNA and subjected to negative selection with 1 mM ganciclovir and 0.25 μM FIAU (both from Sigma Aldrich) for transposon excision. Clones with successful transposon excision and a seamless 3×Flag integration upstream of the HMGN1 stop site were verified by PCR (forward primer: GTGACCTCAGCTTGGAGTGTACA; reverse primer: CTGACACCCGAGACAGTCAGAG) and Sanger sequencing. 3×Flag-tagged HMGN1 expression was further confirmed by cDNA PCR with reverse transcription (forward primer: GCCCAAGAGGAAGGTCAGCT, reverse primer: CGAGACAGTCAGAGCCTCCCAT) and Sanger sequencing.
CUT&RUN genome occupancy analysis of day 20 cardiomyocytes
Cardiomyocytes differentiated from hiPSCs were collected at day 20 for CUT&RUN analysis. Cells were dissociated using 0.25% Trypsin-EDTA (1×) (Gibco, 25200-072) and neutralized with 1% FBS in PBS. For each condition, 150,000 cells per sample were processed according to the manufacturer’s instructions using the CUTANA CUT&RUN Kit (Epicypher, 14-1048). Cells were first washed and bound to activated Concanavalin A (ConA)–coated magnetic beads in the bead activation buffer. Nuclei were permeabilized with digitonin-containing buffer, and all buffers were supplemented with cOmplete Mini, EDTA-free Protease Inhibitor Cocktail (Millipore Sigma, 11836170001).
Samples were incubated with 0.5 µg of H3K4me3 primary positive control antibody (Epicypher 13-0060), 1 µl of IgG control antibody, and 1 µg of monoclonal anti-Flag M2 (Sigma Aldrich, F1804-200µG) per sample at 4 °C overnight on a rotating mixer. After antibody binding, samples were washed with cell permeabilization buffer. Protein A-MNase fusion enzyme was added to each sample and incubated at 4 °C for 1 h with mixing. Targeted chromatin digestion was initiated by adding calcium-containing digestion buffer and incubating at 0 °C for 30 min. Reactions were stopped with stop buffer supplemented with glycogen and RNase A, followed by incubation at 37 °C for 10 min to release digested chromatin fragments.
DNA was purified using phenol-chloroform extraction or DNA Clean & Concentrator kit (Zymo Research), followed by SPRIselect bead cleanup (Beckman Coulter, B23317) and 80% ethanol washes. Elution was performed with 0.1× TE buffer, and DNA concentration was measured using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Q32851). Libraries were prepared using the NEBNext Ultra II DNA Library Prep Kit (NEB, E7645), indexed with i5/i7 dual indices, and amplified by PCR. Library DNA size distribution was assessed using the Agilent Bioanalyzer High Sensitivity DNA kit (Agilent, 5067-4626). Final libraries were pooled and sequenced on an Illumina NextSeq 500/550 using the Mid Output v.2 kit (Illumina, 20024904) with the following cycle configuration: read 1: 75 bp, index 1: 8 bp, index 2: 8 bp, read 2: 75 bp. CUT&RUN was performed in two biological replicates for all three sample types: disomic control, disomic with HMGN1 upregulation, and trisomic control.
CUT&RUN data analysis
After sequencing, fastq files were first demultiplexed with sample-specific index primers using bcl2fastq and aligned to GRCh38 using the nf-core/cutandrun pipeline v.3.2.1 with standard settings, Nextflow v.23.10.1.589165. Peak calling was performed using the SEACR algorithm for all samples and IgG controls using the stringent peak calling format with no normalization and an FDR threshold66 of 4 × 10^−2. Peaks from the IgG control of each sample replicate, as well as marker peaks from scATAC–seq fibroblast and endoderm clusters, were subtracted from the peak calls for each sample using bedtools subtract -A67. BigWig files from two biological replicates for each sample were averaged together to yield a single bigWig file using bigWigCompare v.3.5.0 --operation mean. De-duplicated IgG bigWig signal was then subtracted from averaged sample bigWig files using bigWigCompare v.3.5.0 --operation subtract. Box and whisker plots were generated by first converting gene names into region bed files using the R packages biomaRt, data.table, dplyr and GenomicRanges. Gene regions of interest were manually constructed where chr = getBM chromosome name, start = getBM_start_position-10000, and end = getBM_end_position+10000. Exported bed files were used as input to deeptools multiBigwigSummary, yielding coverage intensity matrix files in .npz format, which were then read into R using numpy and reticulate. Statistical significance of differences in mean coverage intensity was calculated using the aov() and TukeyHSD(aov_result) functions from stats v.4.3.3 (ANOVA with Tukey’s post hoc test). Compact-letter display annotations of statistical significance were created using the multcompLetters4() function from the multcompView v.0.1-10 package. For plotting, the geom_boxplot() function from ggplot2 v.3.5.0 was used to plot coverage values and their respective statistical significance annotations.
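The construction of gene regions of interest described above amounts to padding each gene by 10 kb on both sides. The following is a small Python sketch of that step (the original used R with biomaRt); the function names are hypothetical, the start is clipped at zero as an added safeguard, and no conversion between annotation and BED coordinate conventions is applied.

```python
def gene_window(chrom, start, end, flank=10_000):
    """Return (chrom, start - flank, end + flank), clipping the start at 0."""
    return (chrom, max(0, start - flank), end + flank)


def to_bed_lines(genes, flank=10_000):
    """Format (symbol, chrom, start, end) tuples as tab-separated BED lines."""
    lines = []
    for symbol, chrom, start, end in genes:
        c, s, e = gene_window(chrom, start, end, flank)
        lines.append(f"{c}\t{s}\t{e}\t{symbol}")
    return lines
```

The resulting lines can be written to a file and supplied as the region list for coverage summarization.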
Whole-genome sequencing
To define the sgRNA sites on chromosome 21 flanking HMGN1 in disomic and trisomic cells, whole-genome sequencing was performed through Azenta Life Sciences (Chelmsford, MA). In brief, 500,000 cells were pelleted, flash frozen in liquid nitrogen and stored on dry ice for shipping to Azenta. DNA extraction, quality control assessment, quantification and library preparation were performed according to the manufacturer’s protocols. Next-generation sequencing and bioinformatics analysis were also performed by Azenta. Sequencing adapters and low-quality bases in raw reads were trimmed using Trimmomatic 0.39. Cleaned reads were then aligned to the Homo sapiens GRCh38 reference genome using Sentieon 202112.01. Alignments were then sorted, and PCR and optical duplicates were marked. Single-nucleotide variants and small insertion–deletions were called using Sentieon 202112.01 (DNAscope algorithm). The VCF files generated by the pipeline were then normalized (left alignment of insertion–deletions and splitting multiallelic sites into multiple sites) using bcftools 1.13. Overlapped transcripts were identified for each variant and the effects of the variants on the transcripts were predicted by Ensembl VEP 104. BAM and VCF files were then loaded into IGV for assessment of the locus surrounding HMGN1.
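To make the two normalization steps concrete, the sketch below shows a toy version of indel left alignment and multiallelic splitting in Python. It is not the bcftools implementation: the left-alignment routine follows the commonly described trim-and-extend procedure on a plain reference string with 0-based positions, the splitter rewrites only the ALT column (a real tool also adjusts INFO and genotype fields), and the reference sequence and variant records are hypothetical.

```python
def left_align(ref_seq: str, pos: int, ref: str, alt: str) -> tuple[int, str, str]:
    """Left-align one variant; `pos` is the 0-based index of ref[0] in ref_seq."""
    changed = True
    while changed:
        changed = False
        # Trim a shared trailing base from both alleles.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if (not ref or not alt) and pos > 0:
            pos -= 1
            base = ref_seq[pos]
            ref, alt = base + ref, base + alt
            changed = True
    # Trim shared leading bases while both alleles keep at least one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_multiallelic(vcf_line: str) -> list[str]:
    """Emit one record per ALT allele (simplified: only the ALT column changes)."""
    fields = vcf_line.rstrip("\n").split("\t")
    records = []
    for alt in fields[4].split(","):
        record = fields.copy()
        record[4] = alt
        records.append("\t".join(record))
    return records


# Hypothetical reference with a CA repeat; a 'CA' deletion written at the
# right end of the repeat is shifted to its leftmost equivalent position.
toy_reference = "GGCACACAT"
aligned = left_align(toy_reference, 5, "ACA", "A")

# Hypothetical record with two ALT alleles.
split = split_multiallelic("chr21\t100\t.\tA\tC,T\t50\tPASS\t.")
```

Left alignment matters here because the same deletion inside a repeat can be written at several positions, and a consistent representation is needed before variants are compared across alleles.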
Allele-specific HMGN1 editing
Allele-specific editing of the HMGN1 gene was performed using the CRISPR–Cas9 system as described68. In brief, two synthetic sgRNAs (Synthego) were designed to mediate a 5.7-kb deletion on a single allele. One sgRNA (5′-AAGGGAAACCAACTGTAATT-3′) targeted a PAM site created by a single-nucleotide variant unique to one allele; the second sgRNA (5′-GTCGGCAAGAAAGGCTATCC-3′) targeted all three alleles, albeit in an intronic region. Ribonucleoprotein complexes were nucleofected into DS1 human iPSCs (passage 52) using the P3 Primary Cell 4D-Nucleofector X Kit (Lonza, V4XP-3032). Cells were maintained on hESC-qualified Matrigel in E8 Flex medium (Gibco) and, 48 h after nucleofection, single cells were sorted by fluorescence-activated cell sorting into 96-well plates using a FACSAria III Cell Sorter (BD Biosciences) to establish clonal cell lines.
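The allele-specific design relies on a PAM that exists on only one allele. A minimal Python sketch of that check is shown below, assuming an SpCas9-style NGG PAM (the text says only CRISPR–Cas9). The protospacer is the first sgRNA sequence given above; the flanking bases and both allele strings are hypothetical and do not represent the real HMGN1 locus.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_target_site(allele_seq: str, protospacer: str) -> bool:
    """True if the protospacer is immediately followed by NGG on either strand."""
    # Assumption: SpCas9-style NGG PAM directly 3' of the protospacer.
    pattern = re.compile(re.escape(protospacer) + "[ACGT]GG")
    return bool(pattern.search(allele_seq) or pattern.search(revcomp(allele_seq)))


PROTOSPACER = "AAGGGAAACCAACTGTAATT"  # allele-specific sgRNA from the text

# Hypothetical alleles: a single-base difference (G vs A) creates or removes the PAM.
allele_with_snv = "TTC" + PROTOSPACER + "TGG" + "ACT"
allele_without_snv = "TTC" + PROTOSPACER + "TGA" + "ACT"

targetable = {
    "with_snv": has_target_site(allele_with_snv, PROTOSPACER),
    "without_snv": has_target_site(allele_without_snv, PROTOSPACER),
}
```

In this toy case only the allele carrying the variant is cut by the first guide, while a second guide that matches every allele defines the other end of the deletion.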
Genomic DNA was extracted using QuickExtract DNA Extraction Solution (Lucigen), and a 3-primer PCR was used to screen for heterozygous deletions (primers: 5′-CCCCACGCATCTTGTTCAAG-3′, 5′-CGGGGTGAATTGGGTTCGA-3′, 5′-CCAGAAACGACCCTCGTGCA-3′). Positive clones were validated by long-read sequencing (Oxford Nanopore Technologies; Plasmidsaurus) with custom analysis. Restoration of HMGN1 expression to biallelic levels was confirmed by quantitative PCR with reverse transcription using TaqMan Fast Advanced Master Mix (Applied Biosystems) with the HMGN1 probe Hs01633572_g1 and normalization to GAPDH (probe Hs02786624_g1).
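The text does not state how the GAPDH-normalized expression was computed. As an illustration only, the sketch below assumes the widely used 2^(−ΔΔCt) method and uses hypothetical Ct values chosen so that an edited clone shows roughly two-thirds of the parental trisomic HMGN1 level, which is what losing one of three copies would predict.

```python
def relative_expression(
    ct_target: float,
    ct_reference: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Fold change of the target gene versus a calibrator sample (2^-ddCt).

    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Ct values, for illustration only.
parental = {"HMGN1": 24.0, "GAPDH": 18.0}  # trisomic parental line (calibrator)
edited = {"HMGN1": 24.585, "GAPDH": 18.0}  # clone with one HMGN1 allele deleted

fold_change = relative_expression(
    edited["HMGN1"], edited["GAPDH"], parental["HMGN1"], parental["GAPDH"]
)
```

With these illustrative numbers the edited clone expresses about 0.67 times the parental level; measured values would need replicate wells and an efficiency check before interpretation.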
Ethics statement
No human research participants, human embryos, gametes or human stem cells in contexts requiring ethical oversight were included in this study.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
First published on: www.nature.com
Publication date: 2025-10-22 03:00:00
Author: Sanjeev S. Ranade
