@e-ago e-ago commented Jul 2, 2018

Libmp:

  • mp_register() now uses ibv_exp_reg_mr instead of ibv_reg_mr to take advantage of the new IBV_EXP flags
  • mp_register() now enables implicit ODP when (exp_flags & IBV_EXP_ACCESS_ON_DEMAND) is set

Comm:

  • new function comm_register_odp(), which calls mp_register() with IBV_EXP_ACCESS_ON_DEMAND
  • comm_pingpong test reworked to take command-line options instead of environment variables

e-ago added a commit to gpudirect/gdasync that referenced this pull request Jul 2, 2018
@e-ago e-ago requested a review from drossetti July 2, 2018 20:34
return ret;
}

int comm_register_odp(comm_reg_t *creg)
Collaborator

The name of this API is not expressive enough.
You are effectively exposing all/most of the local process memory.

Explicit and implicit ODP support has limitations (see https://community.mellanox.com/docs/DOC-2898); for example, the device capability has to be checked first.
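To make the capability point concrete, here is a minimal sketch of the kind of guard meant here. Everything prefixed with `sketch_` / `SKETCH_` is a hypothetical stand-in, not a verbs or libmp name: real code would fill the capability bits from the device query that mp_register() already issues, and the exact bit names would come from the vendor headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the capability bits a device query would
 * report; the real names and values live in the vendor headers. */
enum {
    SKETCH_CAP_ODP          = 1u << 0, /* device supports ODP at all */
    SKETCH_CAP_ODP_IMPLICIT = 1u << 1  /* device supports implicit ODP */
};

/* Hypothetical container for the queried capability bits. */
struct sketch_dev_caps {
    uint32_t cap_flags;
};

/* Returns true only when both the general ODP bit and the implicit ODP
 * bit are set; a NULL pointer is treated as "not supported". */
static bool sketch_implicit_odp_supported(const struct sketch_dev_caps *caps)
{
    const uint32_t need = SKETCH_CAP_ODP | SKETCH_CAP_ODP_IMPLICIT;

    return caps != NULL && (caps->cap_flags & need) == need;
}
```

With a guard like this, the registration path can fail early with a clear error instead of relying on the registration call itself to reject the request.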


if (!*reg) {
DBG("registering implicit ODP MR\n");
MP_CHECK(mp_register(NULL, 0, reg, IBV_EXP_ACCESS_ON_DEMAND));
Collaborator

The documentation says:
"To register an Implicit ODP MR, in addition to the IBV_EXP_ACCESS_ON_DEMAND access flag, use in->addr = 0 and in->length = IBV_EXP_IMPLICIT_MR_SIZE."
So 0 is not a good size.
I would provide a high level API in libmp, one which does all the appropriate tests.
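A rough sketch of what such a high-level entry point could look like follows. It is an illustration under stated assumptions, not the libmp implementation: all `sketch_` / `SKETCH_` names are hypothetical, the registration call is a stub that only records its arguments, and the all-ones value used for the length sentinel is an assumption standing in for IBV_EXP_IMPLICIT_MR_SIZE.

```c
#include <stddef.h>

/* Assumption: stands in for the documented implicit-MR length sentinel. */
#define SKETCH_IMPLICIT_MR_SIZE (~(size_t)0)
/* Hypothetical access bit standing in for the on-demand access flag. */
#define SKETCH_ACCESS_ON_DEMAND 0x1

/* Records what the wrapper asks the (stubbed) registration call to do. */
struct sketch_reg_request {
    void  *addr;
    size_t length;
    int    access;
};

/* Stub standing in for the verbs registration call: copies the request. */
static int sketch_do_register(const struct sketch_reg_request *req,
                              struct sketch_reg_request *out)
{
    *out = *req;
    return 0;
}

/* High-level entry point: runs the checks first, then builds the implicit
 * ODP request with addr = 0 and the sentinel length, as the quoted
 * documentation describes. Returns 0 on success, negative on failure. */
static int sketch_register_implicit_odp(int device_supports_implicit_odp,
                                        struct sketch_reg_request *out)
{
    if (out == NULL)
        return -1; /* invalid argument */
    if (!device_supports_implicit_odp)
        return -2; /* capability missing: refuse instead of guessing */

    struct sketch_reg_request req = {
        NULL, SKETCH_IMPLICIT_MR_SIZE, SKETCH_ACCESS_ON_DEMAND
    };
    return sketch_do_register(&req, out);
}
```

The point of the wrapper is that callers never pass an address or a size for the implicit case, so a value like 0 cannot leak through.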

Collaborator

I noticed that the size parameter is ignored by mp_register() when ODP is enabled, so technically this code is correct.

Collaborator

I wonder what happens if the app calls comm_register_odp() multiple times. Does ibv_reg_mr() succeed? Does it return the same MR/lkey?
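One way to make the answer to that question irrelevant is to register at most once and hand back the cached handle on later calls. The sketch below illustrates the idea only; it says nothing about what the verbs call actually does on repeated registration. The handle type, the stub backend, and every `sketch_` name are hypothetical, and the code is not thread-safe.

```c
#include <stddef.h>

/* Hypothetical handle type; a real library would hold an MR pointer. */
struct sketch_mr {
    int lkey;
};

static struct sketch_mr  sketch_odp_mr;            /* storage for the single MR */
static struct sketch_mr *sketch_cached = NULL;     /* NULL until first success */
static int               sketch_backend_calls = 0; /* counts real registrations */

/* Stub for the underlying registration; counts how often it runs. */
static int sketch_backend_register(struct sketch_mr *mr)
{
    sketch_backend_calls++;
    mr->lkey = 42; /* arbitrary illustrative value */
    return 0;
}

/* Registers on first use and returns the cached handle afterwards. */
static int sketch_register_odp_once(struct sketch_mr **out)
{
    if (out == NULL)
        return -1;

    if (sketch_cached == NULL) {
        int ret = sketch_backend_register(&sketch_odp_mr);
        if (ret != 0)
            return ret; /* leave the cache empty so a later call can retry */
        sketch_cached = &sketch_odp_mr;
    }

    *out = sketch_cached;
    return 0;
}
```

Callers that invoke the function twice get the same pointer, and the backend is hit only once.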

static void usage()
{
printf("Options:\n");
printf(" -g allocate GPU intead of CPU memory buffers\n");
Collaborator

Typo: "intead" should be "instead".

tot_iters = MAX_ITERS;
int c;

while (1) {
Collaborator

The indentation is broken here.

void mp_finalize();

-int mp_register(void *addr, size_t length, mp_reg_t *reg_t);
+int mp_register(void *addr, size_t length, mp_reg_t *reg_t, uint64_t exp_flags);
Collaborator

No need for 64-bit flags; just use the int type as in the other APIs.

Collaborator

You are effectively changing both the API and the ABI of libmp, so we'll need to bump the major version.

Collaborator

I would define an MP_ flag here to express the concept of "register all process memory", instead of forwarding exp_flags to ibv_exp_reg_mr().


-int mp_register(void *addr, size_t length, mp_reg_t *reg_)
+int mp_register(void *addr, size_t length, mp_reg_t *reg_, uint64_t exp_flags)
{
Collaborator

As mentioned above, please change the type of exp_flags to int
and introduce a new enum { MP_REGISTER_FULL_VIRTUAL_ADDRESS_SPACE }; or similar.
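As a sketch of that suggestion: the public API takes plain int flags from a library-owned enum, and the translation to verbs access bits stays internal. The enum name is the one proposed above, but its value is an assumption. The `SKETCH_VERBS_*` bits and `sketch_translate_flags()` are hypothetical stand-ins with illustrative values, not the real verbs constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Library-level flag proposed in this review; the value is an assumption. */
enum {
    MP_REGISTER_FULL_VIRTUAL_ADDRESS_SPACE = 1 << 0
};

/* Hypothetical stand-ins for verbs access bits (illustrative values only). */
#define SKETCH_VERBS_LOCAL_WRITE (1ULL << 0)
#define SKETCH_VERBS_ON_DEMAND   (1ULL << 1)

/* Translates public int flags into internal verbs access bits.
 * Returns 0 on success, -1 on a NULL output or an unknown flag. */
static int sketch_translate_flags(int mp_flags, uint64_t *verbs_flags)
{
    const int known = MP_REGISTER_FULL_VIRTUAL_ADDRESS_SPACE;

    if (verbs_flags == NULL || (mp_flags & ~known) != 0)
        return -1;

    /* Baseline access for an ordinary registration in this sketch. */
    *verbs_flags = SKETCH_VERBS_LOCAL_WRITE;

    if (mp_flags & MP_REGISTER_FULL_VIRTUAL_ADDRESS_SPACE)
        *verbs_flags |= SKETCH_VERBS_ON_DEMAND;

    return 0;
}
```

Rejecting unknown bits keeps the public flag space under libmp's control, so verbs-specific flags cannot be smuggled through the API.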


if(exp_flags & IBV_EXP_ACCESS_ON_DEMAND)
{
dattr.comp_mask = IBV_EXP_DEVICE_ATTR_ODP | IBV_EXP_DEVICE_ATTR_EXP_CAP_FLAGS;
Collaborator

We should not query the capabilities on every call, but rather do that lazily once and cache the result.

Collaborator

It's OK to address this performance issue in a future fix.
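For whenever that follow-up happens, the lazy query-once pattern could be as small as the sketch below. The device query is replaced by a stub that counts its invocations, all `sketch_` names are hypothetical, and the code is not thread-safe as written (something like pthread_once() would be one option if registration can race).

```c
#include <stdbool.h>

static int sketch_query_count = 0; /* how many times the stub query ran */

/* Stub for the expensive device capability query. */
static bool sketch_query_device_odp(void)
{
    sketch_query_count++;
    return true; /* pretend the device reports ODP support */
}

/* Returns the cached answer, querying the device only on the first call. */
static bool sketch_odp_supported(void)
{
    static bool queried = false;
    static bool supported = false;

    if (!queried) {
        supported = sketch_query_device_odp();
        queried = true;
    }
    return supported;
}
```

Every registration can then call the cached accessor without paying for a device query each time.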

mp_err_msg("ibv_reg_mr returned NULL for addr:%p size:%zu errno=%d(%s)\n",
addr, length, errno, strerror(errno));

#ifdef DADO_DEBUG
Collaborator

feel free to remove that stale code
