-
Bug fixes
- Fixes a bug where upgrading to Illumina NovaSeq control software v1.8 (which changed a reagent name in the recipe XML file) caused a silent cellranger-arc mkfastq error: the orientation of i5 (Index2) could not be autodetected, so a significant number of reads went into Undetermined/.
-
Bug fixes
- Fixes an issue where the metric "post-normalization number of reads" in the aggr web summary was incorrect.
- Fixes an issue where integer overflow could occur on large aggr or reanalyze runs.
- Fixes an issue where sample indices in the FASTQ header line were ignored unless the indices were reverse complemented.
- Fixes an issue where the gex_umis_count values in per_barcode_metrics.csv were all zeros.
-
New wavelet-based peak caller:
- Eliminates peaks much larger than 5 kb. Peak sizes now have a tighter distribution around ~1 kb.
- Improves detection of cluster-specific peaks.
- Improves reproducibility between technical replicates.
- Consistent performance across a range of cell loads.
- Fixes crashes in the signal-background fitting procedure.
-
Changed the duplicate marking algorithm:
- Two read pairs are duplicates if they share the same start, end and cell barcode. Previously, only the start and end were used.
- Boosts median fragments per cell by as much as 25% at high cell loads.
- PCR and sequencer duplicates are no longer distinguished.
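The change in the duplicate key can be illustrated with a minimal sketch. It is not the pipeline's implementation: the tuple layout, the function name `count_unique_fragments`, and the inclusion of the chromosome in the key are assumptions made for the example.

```python
def count_unique_fragments(fragments, use_barcode=True):
    """Count fragments that remain after duplicate removal.

    fragments: iterable of (chrom, start, end, barcode) tuples (hypothetical layout).
    use_barcode=True mimics the new rule (start, end and cell barcode);
    use_barcode=False mimics the previous rule (start and end only).
    """
    seen = set()
    for chrom, start, end, barcode in fragments:
        # The chromosome is part of the key by assumption: a start/end pair
        # is only meaningful on a given contig.
        key = (chrom, start, end, barcode) if use_barcode else (chrom, start, end)
        seen.add(key)
    return len(seen)


fragments = [
    ("chr1", 100, 300, "AAAC"),
    ("chr1", 100, 300, "AAAC"),  # same position, same barcode: duplicate under both rules
    ("chr1", 100, 300, "TTTG"),  # same position, different barcode: kept under the new rule
    ("chr1", 500, 700, "AAAC"),
]
new_rule = count_unique_fragments(fragments, use_barcode=True)
old_rule = count_unique_fragments(fragments, use_barcode=False)
```

Under the new rule, fragments from different cells that happen to share coordinates are no longer collapsed, which is consistent with the reported gain in fragments per cell at high cell loads.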
-
Improved computational performance:
- Up to 4x faster, with 0.5x the disk requirements.
- Complete rewrite of read processing and differential accessibility analysis in Rust.
- Minimized disk I/O.
-
Added the aggr pipeline to aggregate multiple libraries from multiple GEM wells.
-
Added the reanalyze pipeline to enable customized reanalysis.
-
Changed cell caller override: when --min-atac-count=X and --min-gex-count=Y are specified, we choose all barcodes with ATAC count >= X and GEX count >= Y (previously, the two conditions were combined with "or").
-
The raw feature barcode matrix now only contains barcodes with at least one ATAC count or one GEX count; previously, all allowed barcodes were present.
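The two barcode rules above (the cell caller override and the raw matrix filter) can be sketched as simple predicates. This is an illustrative sketch only: the dictionary layout and the function names `override_cells` and `raw_matrix_barcodes` are hypothetical, not part of the software.

```python
def override_cells(counts, min_atac, min_gex):
    """Barcodes selected by the override: both thresholds must be met (AND)."""
    return sorted(
        bc for bc, (atac, gex) in counts.items()
        if atac >= min_atac and gex >= min_gex
    )


def raw_matrix_barcodes(counts):
    """Barcodes kept in the raw matrix: at least one ATAC or one GEX count."""
    return sorted(
        bc for bc, (atac, gex) in counts.items()
        if atac > 0 or gex > 0
    )


# Hypothetical per-barcode counts: barcode -> (ATAC count, GEX count)
counts = {"A": (10, 0), "B": (10, 5), "C": (0, 0), "D": (0, 7)}
cells = override_cells(counts, min_atac=5, min_gex=5)
raw = raw_matrix_barcodes(counts)
```

With these made-up counts, only barcode B passes the override, while A, B and D appear in the raw matrix and C (no counts at all) is dropped.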
-
Change in barcode error correction: correct Hamming distance 1 errors (previously 2).
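As a rough illustration of what a Hamming distance bound means for barcode correction, the sketch below corrects an observed barcode to a whitelist entry only when exactly one entry lies within the allowed distance. It is a simplification under stated assumptions: the function names and the tie-handling rule are hypothetical, and the actual pipeline may also weigh base qualities or barcode abundance.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist, max_dist=1):
    """Return the corrected barcode, or None if no unambiguous match exists."""
    if observed in whitelist:
        return observed
    candidates = [
        bc for bc in whitelist
        if len(bc) == len(observed) and hamming(observed, bc) <= max_dist
    ]
    # Assumption: ambiguous matches (more than one candidate) are rejected.
    return candidates[0] if len(candidates) == 1 else None


whitelist = {"AAAA", "CCCC"}
one_error = correct_barcode("AAAT", whitelist)               # one mismatch from AAAA
two_errors = correct_barcode("AATT", whitelist)              # two mismatches: not corrected
two_errors_old = correct_barcode("AATT", whitelist, max_dist=2)
```

Setting `max_dist=2` reproduces the more permissive earlier behavior in this toy model.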
-
Added header lines beginning with # to the fragments.tsv.gz and peaks.bed files; these lines contain version, reference and sample information.
-
Change to ATAC peak annotation TSV format:
- The peak column is now split into three columns: chrom, start, end.
- When a peak has multiple gene annotations, the same peak appears in multiple rows with each annotation. Previously, each row represented one peak and multiple annotations were expressed using ; separators in the same row.
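Downstream scripts that expected one row per peak can regroup the new layout. The sketch below collects annotations per peak from tab-separated text; the chrom, start and end column names come from the notes above, while the `gene` column name, the example rows, and the function name `genes_by_peak` are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict


def genes_by_peak(tsv_text):
    """Group annotation rows by (chrom, start, end), preserving row order."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    grouped = defaultdict(list)
    for row in reader:
        key = (row["chrom"], int(row["start"]), int(row["end"]))
        grouped[key].append(row["gene"])  # "gene" is a hypothetical column name
    return dict(grouped)


# Made-up example in the new layout: the chr1 peak appears once per annotation.
example = (
    "chrom\tstart\tend\tgene\n"
    "chr1\t100\t600\tGENE_A\n"
    "chr1\t100\t600\tGENE_B\n"
    "chr2\t50\t900\tGENE_C\n"
)
grouped = genes_by_peak(example)
```

Joining each list with ";" would give a view similar to the previous single-row representation.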
-
Improvements to reference construction:
- Restrictions on the number of contigs or gene-containing contigs (primary contigs) in the reference have been eliminated.
- Bug fix for GTFs that do not contain a gene_name attribute.
- Contig names are allowed to contain - or _ characters.
- Eliminated discrepancies between reference checks in mkref and preflight checks in count. Previously, it was possible to pass checks in mkref and fail checks in count.
-
Eliminated secondary alignments from the position-sorted BAM.
-
Loupe Browser files generated by the pipeline can only be opened by Loupe Browser version 5.0 or later.
- Increases the upper limit on primary contigs (those that have at least one gene annotation) from 100 to 500. The pipeline will error out if more than 1000 total contigs are present in the reference.
- Disables multithreading in mkref to address an issue where mkref would fail on hardware without AVX support. This will be fixed in a future release.
- Creates cellranger-arc count for the analysis of 10x Chromium Single Cell Multiome ATAC + Gene Expression data generated from a single GEM well.
- Creates cellranger-arc mkfastq for demultiplexing of ATAC or GEX flow cell data for analysis.
- Creates cellranger-arc mkref to construct reference packages for use with cellranger-arc count starting with a reference FASTA file and a gene annotations GTF file.
- Note: the software cannot be used for the analysis of 10x Chromium Single Cell ATAC, 10x Chromium Single Cell 3' Gene Expression data, or for any kind of joint analysis of the two.