Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,8 @@ See [#2611](https://github.com/Rdatatable/data.table/issues/2611) for details. T

26. Grouping by a factor with many groups is now fast again, fixing a timing regression introduced in [#6890](https://github.com/Rdatatable/data.table/pull/6890) where UTF-8 coercion and level remapping were performed unnecessarily, [#7404](https://github.com/Rdatatable/data.table/issues/7404). Thanks @ben-schwen for the report and fix.

27. `dogroups()` no longer reads beyond the resized end of over-allocated data.table list columns, [#7486](https://github.com/Rdatatable/data.table/issues/7486). While this didn't crash in practice, it is now explicitly checked for in recent R versions (r89198+). Thanks @TimTaylor and @aitap for the report and @aitap for the fix.

### NOTES

1. The following in-progress deprecations have proceeded:
Expand Down
8 changes: 4 additions & 4 deletions src/dogroups.c
Original file line number Diff line number Diff line change
Expand Up @@ -308,17 +308,16 @@ SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEX
error(_("RHS of := is NULL during grouped assignment, but it's not possible to delete parts of a column."));
int vlen = length(RHS);
if (vlen>1 && vlen!=grpn) {
SEXP colname = isNull(VECTOR_ELT(dt, INTEGER(lhs)[j]-1)) ? STRING_ELT(newnames, INTEGER(lhs)[j]-origncol-1) : STRING_ELT(dtnames,INTEGER(lhs)[j]-1);
SEXP colname = INTEGER(lhs)[j] > LENGTH(dt) ? STRING_ELT(newnames, INTEGER(lhs)[j]-origncol-1) : STRING_ELT(dtnames,INTEGER(lhs)[j]-1);
error(_("Supplied %d items to be assigned to group %d of size %d in column '%s'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code."),vlen,i+1,grpn,CHAR(colname));
// e.g. in #91 `:=` did not issue recycling warning during grouping. Now it is error not warning.
}
}
int n = LENGTH(VECTOR_ELT(dt, 0));
for (int j=0; j<length(lhs); ++j) {
int colj = INTEGER(lhs)[j]-1;
target = VECTOR_ELT(dt, colj);
RHS = VECTOR_ELT(jval,j%LENGTH(jval));
if (isNull(target)) {
if (colj >= LENGTH(dt)) {
// first time adding to new column
if (R_maxLength(dt) < colj+1) internal_error(__func__, "Trying to add new column by reference but tl is full; setalloccol should have run first at R level before getting to this point"); // # nocov
target = PROTECT(allocNAVectorLike(RHS, n));
Expand All @@ -332,7 +331,8 @@ SEXP dogroups(SEXP dt, SEXP dtcols, SEXP groups, SEXP grpcols, SEXP jiscols, SEX
UNPROTECT(1);
SET_STRING_ELT(dtnames, colj, STRING_ELT(newnames, colj-origncol));
copyMostAttrib(RHS, target); // attributes of first group dominate; e.g. initial factor levels come from first group
}
} else
target = VECTOR_ELT(dt, colj);
bool copied = false;
if (isNewList(target) && anySpecialStatic(RHS, specials)) { // see comments in anySpecialStatic()
RHS = PROTECT(copyAsPlain(RHS));
Expand Down
Loading