Clonal analysis performance:
Added IUPAC parameter to hierarchicalClones. When IUPAC=TRUE, sequences
containing IUPAC ambiguity codes are retained and distances are computed using an
IUPAC-aware substitution matrix (via alakazam::pairwiseDist). This was
the default behavior of previous versions of hierarchicalClones.
hierarchicalClones now uses a new C++ Hamming distance implementation
(fastDist_rcpp) when IUPAC=FALSE (default) and method="nt". With IUPAC=FALSE,
sequences containing IUPAC ambiguity codes (e.g., R, Y, W, S, M, K) are
rejected; only standard bases (A, T, C, G), N, and ? are allowed.
Bug fixes:
Fixed an issue in hierarchicalClones with average
linkage that caused hclust to fail. Distances are now rounded before
retrying if the initial call fails.
General:
scoper has moved to GitHub: https://github.com/immcantation/scoper.
The hclust function from the fastcluster package is now used instead of the one from stats to perform hierarchical clustering on large vjl groups (more than 65,536 unique sequences).
Clonal analysis:
only_heavy and split_light. All clonal identification methods now cluster by heavy chain only. To split clones by light chain groups, use dowser::resolveLightChains.
In hierarchicalClones, the default value of summarize_clones is now FALSE for improved performance. Users can set summarize_clones=TRUE when needed.
Documentation:
General:
Added locus column to the package's example data, to satisfy
alakazam >= 1.3.0 requirements.
Bug fixes:
Fixed a bug in defineClonesScoper where the bulk clonal clustering was using
light chain sequences. Now, light chain sequences are removed based on the
locus information. If the column locus does not exist, it is created with
alakazam::getLocus(v_call).
Fixed a bug in defineClonesScoper where the split_light part of the
algorithm was reanalyzing heavy chain v_calls and using this information to
sometimes split clone_id groups into subgroups. Only light chain v_calls
should be used for this. The bug could be observed in situations where
first=FALSE and the 'linker' ambiguous heavy chain v_calls were left out of
the same clone_id group because of the junction distance threshold.
Fixed parallelization setup for defineClonesScoper.
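The locus-based filtering described in the first fix above can be sketched in base R. This is only an illustration under stated assumptions, not scoper's implementation: the toy data frame is invented, the data are assumed to be BCR sequences, and `guess_locus` is a hypothetical stand-in that takes the first three characters of `v_call`, whereas the package itself relies on `alakazam::getLocus`.

```r
# Toy AIRR-style table; the rows are invented for illustration.
db <- data.frame(
  sequence_id = c("seq1", "seq2", "seq3"),
  v_call = c("IGHV1-2*01", "IGKV1-5*01", "IGLV2-14*01"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for alakazam::getLocus(): assumes the locus is the
# first three characters of the V gene call (e.g. "IGH", "IGK", "IGL").
guess_locus <- function(v_call) substr(v_call, 1, 3)

# Create the locus column only if it is missing.
if (!"locus" %in% names(db)) {
  db$locus <- guess_locus(db$v_call)
}

# Keep heavy chain rows only, so light chains do not enter bulk clustering.
heavy <- db[db$locus == "IGH", , drop = FALSE]
```

Here `heavy` keeps only the rows whose derived locus is `IGH`; for real data, the package function handles ambiguous and multi-gene calls that this stand-in ignores.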
General:
General:
N to any characters except [ATCG].
Cloning:
Added fields argument to identicalClones, hierarchicalClones and
spectralClones to allow for data partitioning prior to clonal assignment.
cell_id column.
identicalClones, hierarchicalClones and
spectralClones when specifying nproc > 1.
Backwards Incompatible Changes:
Where functions previously used V_CALL (Change-O) as the default to identify
the field storing the V gene calls, they now use v_call (AIRR). Scripts that
relied on the old default (v_call="V_CALL") will now fail unless the function
calls are updated to match the data. If data are in the Change-O format, the
current default v_call="v_call" will fail to identify the column with the
V gene calls, because the column v_call doesn't exist. In this case, specify
v_call="V_CALL" in the function call.
ExampleDb converted to the AIRR Rearrangement standard and examples updated
accordingly.
Split the defineClonesScoper function into three functions: identicalClones,
hierarchicalClones, and spectralClones.
General:
Cloning:
Deprecated:
analyzeClones is deprecated. The clonal analysis has been added
to the main function defineClonesScoper as an argument analyze_clones.
analyze_clones set to be true, otherwise a single dataframe is
returned.
plot_neighborhoods from clonal analysis has been deprecated.
neighborhoods from clonal analysis has been deprecated.
General:
hierarchical for hierarchical-clustering based, and identical
for clustering among identical junction sequences are added.
vj in argument method.
Clonal analysis:
calculateInterVsIntra function. Now, "inter" is the label for distances
between clones, and "intra" is the label for distances within each clone.
Changed plotInterVsIntra output from a density plot to a histogram.
Initial public release.