prov/lnx: add FI_MSG and FI_RMA support#12209
Open
aingerson wants to merge 13 commits into ofiwg:main from
@amirshehataornl @jfillers FYI this is what I have so far
This patch doesn't include any functional changes. Just cleans up the code in various ways:
- rename lnx_ops.c to lnx_srx.c to include only shared receive code
- move tag ops to lnx_msg.c to prepare for adding more functionality while not having lnx_ops.c get too out of hand
- reduce extern functions declared in lnx.h/add static to appropriate functions
- remove unused functions and definitions
- add missing definition of LNX_SUB_ID_BITS instead of hardcoded value
- standardize function declaration formatting
- fix formatting changes (line length, alignment, trailing whitespace)
- remove a few unnecessary comments
- cleanup unnecessary headers

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
…cess Consolidate environment variables into a single struct that is initialized in one place on getinfo. This helps code organization and makes the environment variables easy to find. This also removes redundant environment variable lookups (like the lookup happening on every single fi_av_insert).

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
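A minimal sketch of the "read once into a single struct" pattern this commit describes. Every name here (`sketch_env`, `SKETCH_LNX_PROV_LINKS`, `SKETCH_LNX_DISABLE_SHM`) is hypothetical and not taken from the lnx source, and plain `getenv` stands in for whatever parameter API the provider actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container: all environment knobs live in one place. */
struct sketch_env {
	const char *prov_links;  /* raw string value, NULL when unset */
	int disable_shm;         /* boolean knob, parsed once */
	int initialized;         /* guards against repeated lookups */
};

static struct sketch_env sketch_env;

/* Parse an integer-style boolean; unset or malformed values yield def. */
static int sketch_env_bool(const char *name, int def)
{
	const char *val;
	char *end;
	long parsed;

	assert(name != NULL);
	val = getenv(name);
	if (!val || !*val)
		return def;

	parsed = strtol(val, &end, 10);
	return (*end == '\0') ? (parsed != 0) : def;
}

/* Assumed to be called from the getinfo path; later code only reads
 * the struct, so hot paths such as AV insertion never call getenv. */
static void sketch_env_init(void)
{
	if (sketch_env.initialized)
		return;

	sketch_env.prov_links = getenv("SKETCH_LNX_PROV_LINKS");
	sketch_env.disable_shm = sketch_env_bool("SKETCH_LNX_DISABLE_SHM", 0);
	sketch_env.initialized = 1;
}
```

Keeping the lookups behind one init function also gives a single spot to document each variable and its default.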
fi_mr_test requires FI_RMA which lnx does not support yet

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
LNX has limitations in capabilities (for example, it does not support FI_RMA). Even if the linked providers all support a requested capability, it does not mean the lnx provider can be used. We need to check against the supported lnx capabilities and properly return -FI_ENODATA if the application requested something the provider does not support.

This adds a call to ofi_check_info and updates the lnx capabilities to the correct subset of supported capabilities for validation. It also modifies the shared tx/rx ctx attributes, since lnx does not support those, and the mr mode, because lnx does not require FI_MR_RAW.

In addition to the improper checking, lnx was not setting the info->caps, tx_attr->caps, and rx_attr->caps returned to the application and was always returning 0 for all capabilities. This also adds checking of linked provider capabilities during generation to properly set the capabilities returned to the application.

Request FI_PEER and FI_AV_USER_ID for linked providers. Providers must support the peer API to be linked together by the lnx provider.

Switch lnx_generate_link_info params to match the convention of (input, output).

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
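The two-stage check described above (lnx's own limits first, then the linked providers) can be sketched with plain bitmasks. This is an illustration only: the `SKETCH_*` constants are stand-in values rather than the libfabric definitions, `sketch_check_caps` is a hypothetical helper and not `ofi_check_info`, and returning the full intersection as the granted set is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in bit values, NOT the real FI_* capability bits. */
#define SKETCH_CAP_TAGGED (1ULL << 0)
#define SKETCH_CAP_MSG    (1ULL << 1)
#define SKETCH_CAP_RMA    (1ULL << 2)
#define SKETCH_ENODATA    61  /* stand-in for FI_ENODATA */

/* Validate requested caps against lnx and every linked provider.
 * On success, *granted holds the caps usable across the whole link. */
static int sketch_check_caps(uint64_t requested, uint64_t lnx_caps,
			     const uint64_t *core_caps, size_t num_core,
			     uint64_t *granted)
{
	uint64_t usable = lnx_caps;
	size_t i;

	assert(core_caps != NULL && granted != NULL);

	/* Stage 1: lnx itself must implement everything requested,
	 * regardless of what the linked providers can do. */
	if (requested & ~lnx_caps)
		return -SKETCH_ENODATA;

	/* Stage 2: a capability is usable only if every linked
	 * provider also supports it. */
	for (i = 0; i < num_core; i++)
		usable &= core_caps[i];

	if (requested & ~usable)
		return -SKETCH_ENODATA;

	*granted = usable;
	return 0;
}
```

The point of stage 1 is that a link of fully capable providers still fails the request when lnx has no code path for the capability.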
Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
The current method for registering MRs with the core providers works only if there is a single domain or if the domains can somehow use each other's keys. However, the keys for the domains could be different and, since there is only one stored core mr fid, lnx will always use the mr fid from the first domain it was used on.

Change the core mr fids into an array so we can register on every domain. The ep/domain will contain the index so we can make sure to register the MR on the correct domain and return the correct core fid.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
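A rough sketch of the per-domain array layout described in this commit. The types and helpers (`sketch_lnx_mr`, `sketch_core_mr`, `sketch_mr_reg_all`, `sketch_get_core_mr`) and the fixed `SKETCH_MAX_DOMAINS` bound are hypothetical; the real code would hold core provider MR fids and call each provider's registration routine.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DOMAINS 8  /* assumed upper bound for the sketch */

/* Stand-in for a core provider's MR fid. */
struct sketch_core_mr {
	size_t domain_idx;
};

/* One lnx MR now tracks a core MR per domain instead of a single one. */
struct sketch_lnx_mr {
	struct sketch_core_mr *core_mr[SKETCH_MAX_DOMAINS];
	size_t num_domains;
};

/* The endpoint remembers which domain slot it belongs to. */
struct sketch_lnx_ep {
	size_t domain_idx;
};

/* "Register" on every domain by filling each slot of the array. */
static int sketch_mr_reg_all(struct sketch_lnx_mr *mr,
			     struct sketch_core_mr *core, size_t num_domains)
{
	size_t i;

	if (num_domains > SKETCH_MAX_DOMAINS)
		return -1;

	for (i = 0; i < num_domains; i++) {
		core[i].domain_idx = i;
		mr->core_mr[i] = &core[i];
	}
	mr->num_domains = num_domains;
	return 0;
}

/* Pick the core MR matching the endpoint's domain, not the first one. */
static struct sketch_core_mr *
sketch_get_core_mr(const struct sketch_lnx_mr *mr,
		   const struct sketch_lnx_ep *ep)
{
	assert(ep->domain_idx < mr->num_domains);
	return mr->core_mr[ep->domain_idx];
}
```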
lnx was just taking the first iov/descriptor but advertising support for multiple IOVs. Support for multiple IOVs requires translating the array of descriptors into an array of core provider descriptors.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
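The descriptor translation can be sketched as a simple per-entry mapping. This assumes, for illustration only, that each application-visible descriptor points at an lnx-side structure holding one core descriptor per domain; `sketch_mr_desc` and `sketch_translate_descs` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DOMAINS 8  /* assumed upper bound for the sketch */

/* Hypothetical lnx-side descriptor: one core descriptor per domain. */
struct sketch_mr_desc {
	void *core_desc[SKETCH_MAX_DOMAINS];
};

/* Translate every entry of the application's descriptor array into the
 * descriptor of the core provider selected by domain_idx. NULL entries
 * (no registration for that IOV) are passed through unchanged. */
static void sketch_translate_descs(void **app_desc, size_t count,
				   size_t domain_idx, void **core_desc)
{
	size_t i;

	assert(domain_idx < SKETCH_MAX_DOMAINS);

	for (i = 0; i < count; i++) {
		const struct sketch_mr_desc *d = app_desc[i];

		core_desc[i] = d ? d->core_desc[domain_idx] : NULL;
	}
}
```

Translating all `count` entries, rather than only index 0, is what makes the advertised multi-IOV support hold.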
We shouldn't be relying on global resources. There's no reason to have the entries come from different locations. We can just use the lep receive bufpool. We also don't need a separate lock for accessing the bufpool; we can just use the util_ep lock, which has the bonus of being able to be optimized out when not necessary.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
Add support for the FI_RMA APIs. This is done by requiring FI_MR_RAW if FI_RMA support is requested. The keys for all underlying core providers are stored in an array (accessed by domain index), so the application key size is 8 * num_domains bytes (thus requiring the larger key).

The app will exchange the raw key and then map it on the remote side to get a local uint64_t key for use in the RMA calls. This key will be a pointer to an internal structure (lnx_mr_key) which holds all the core provider keys for use in the actual RMA calls.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
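A sketch of the raw-key flow described above: the raw key is an array of 8-byte core keys, and mapping it yields a `uint64_t` that is really a pointer to an internal structure. The struct layout and helper names below are hypothetical (the commit only names `lnx_mr_key`), and the raw key is assumed to be a packed array of `uint64_t` values in domain order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout standing in for lnx_mr_key. */
struct sketch_mr_key {
	size_t num_domains;
	uint64_t core_key[];  /* one core provider key per domain */
};

/* Raw key size: one 8-byte key per domain. */
static size_t sketch_raw_key_size(size_t num_domains)
{
	return sizeof(uint64_t) * num_domains;
}

/* Map a received raw key to a local 64-bit key. The returned key is
 * the address of a heap-allocated sketch_mr_key. */
static int sketch_map_raw(const uint8_t *raw, size_t num_domains,
			  uint64_t *key)
{
	struct sketch_mr_key *mk;

	mk = malloc(sizeof(*mk) + sketch_raw_key_size(num_domains));
	if (!mk)
		return -1;

	mk->num_domains = num_domains;
	memcpy(mk->core_key, raw, sketch_raw_key_size(num_domains));
	*key = (uint64_t)(uintptr_t)mk;
	return 0;
}

/* At RMA time, recover the core key for the domain actually used. */
static uint64_t sketch_core_key(uint64_t key, size_t domain_idx)
{
	const struct sketch_mr_key *mk =
		(const struct sketch_mr_key *)(uintptr_t)key;

	assert(domain_idx < mk->num_domains);
	return mk->core_key[domain_idx];
}

/* Release the mapping once the key is no longer needed. */
static void sketch_unmap(uint64_t key)
{
	free((void *)(uintptr_t)key);
}
```

With two domains the raw key is 16 bytes, which no longer fits in a single 64-bit key; that is the reason the commit gives for requiring FI_MR_RAW.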
Add FI_MSG support by using (existing) regular message queues.

Consolidates and refactors some code to be used in both sets of functions.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
Ubertest was sometimes skipping the MR raw attr/map steps for FI_MR_RAW, causing a map failure with providers that required the mapping.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
This lets the fi_av_xfer test pass. It was failing on re-insert because the buffer was already allocated and could not be allocated again for the re-insert.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
Remove exclusions for tests now valid with addition of FI_MSG and FI_RMA.

Add fi_ubertest configurations for new functionality.

Signed-off-by: Alexia Ingerson <alexia.ingerson@intel.com>
This is on top of the refactor in #12188
Fixes various bugs in lnx in addition to adding FI_MSG and FI_RMA support. Opening up for CI testing and initial comments, but this is not finalized. There are still some lingering holes (for example, supporting FI_MR_VIRT_ADDR properly).