Skip to content

MDEV-22269 GROUPING function#5314

Draft
jaeheonshim wants to merge 2 commits into
MariaDB:mainfrom
jaeheonshim:MDEV-22269-grouping
Draft

MDEV-22269 GROUPING function#5314
jaeheonshim wants to merge 2 commits into
MariaDB:mainfrom
jaeheonshim:MDEV-22269-grouping

Conversation

@jaeheonshim

Copy link
Copy Markdown

MDEV-22269: Add support for the GROUPING() function (SQL:2008 feature T431).

@jaeheonshim

Copy link
Copy Markdown
Author

Note: GROUPING is currently broken and segfaults if nested inside a function (e.g.: select name, size, bin(grouping(name, size)), sum(quantity) from t group by name, size with rollup;). This is because in JOIN::rollup_init, we only check the outer-level select list Items, so Item_func_grouping::setup_rollup doesn't get called on any nested GROUPING functions. Working on a fix.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the SQL GROUPING() function, introducing the Item_func_grouping class, adding parser support for the GROUPING keyword, and integrating it with rollup processing. The feedback highlights critical safety improvements: initializing member variables in the Item_func_grouping constructor to prevent undefined behavior, limiting the number of arguments to 64 to avoid bit-shifting overflow, and adding a null-pointer check along with using bitwise OR operations in val_int() for safer bitmask construction.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread sql/item_func.h
uint current_level;

public:
Item_func_grouping(THD *thd, List<Item> &args) : Item_int_func(thd, args) {}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The member variables min_rollup_levels and current_level are not initialized in the constructor. In C++, primitive and pointer member variables are not automatically initialized and will contain garbage values. If val_int() or other methods are called before these are explicitly set, it will lead to undefined behavior or a server crash. Please initialize them in the constructor's initializer list.

  Item_func_grouping(THD *thd, List<Item> &args)
    : Item_int_func(thd, args), min_rollup_levels(NULL), current_level(0) {}

Comment thread sql/item_func.cc
Comment on lines +840 to +846
bool Item_func_grouping::setup_rollup(THD *thd, ORDER *group_list,
uint send_group_parts)
{
if (!(min_rollup_levels= thd->alloc<uint>(arg_count)))
{
return true;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The GROUPING() function returns a 64-bit integer bitmask. If arg_count exceeds 64, constructing the bitmask in val_int() will cause undefined behavior due to shifting by 64 or more bits (e.g., 1ULL << (arg_count - i - 1)). To prevent this, please add a check to limit the number of arguments to at most 64.

bool Item_func_grouping::setup_rollup(THD *thd, ORDER *group_list,
                                      uint send_group_parts)
{
  if (arg_count > 64)
  {
    my_error(ER_WRONG_PARAMCOUNT_TO_NATIVE_FCT, MYF(0), "grouping");
    return true;
  }

  if (!(min_rollup_levels= thd->alloc<uint>(arg_count)))
  {
    return true;
  }

Comment thread sql/item_func.cc
Comment on lines +870 to +881
longlong Item_func_grouping::val_int()
{
longlong result= 0;
for (uint i= 0; i < arg_count; i++)
{
if (current_level <= min_rollup_levels[i])
{
result+= 1ULL << (arg_count - i - 1);
}
}
return result;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To ensure defensive programming and avoid potential null pointer dereferences (e.g., if setup_rollup was not called or failed), we should check if min_rollup_levels is null. Additionally, using bitwise OR (|=) on an unsigned ulonglong accumulator is more idiomatic for constructing bitmasks and avoids signed overflow undefined behavior when the MSB (bit 63) is set.

longlong Item_func_grouping::val_int()
{
  if (!min_rollup_levels)
    return 0;

  ulonglong result= 0;
  for (uint i= 0; i < arg_count; i++)
  {
    if (current_level <= min_rollup_levels[i])
    {
      result|= 1ULL << (arg_count - i - 1);
    }
  }
  return (longlong) result;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant