Open
Description
Currently, masking on these is applied after the fact, so a lot of compute can be wasted on values that won't survive the mask. A better approach would be to pass the mask into the compiled code, iterate over it in linalg.generic as a third input, and apply the mask inside the loop.
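
To make the proposal concrete, here is a minimal sketch in plain Python (not MLIR) that models the two strategies on flat lists. The function names `mask_after` and `mask_in_loop`, the use of `None` for masked-out entries, and the call counter are all hypothetical and exist only for illustration; the loop body stands in for the scalar region of a linalg.generic with the mask as a third input.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Result = Tuple[List[Optional[float]], int]


def mask_after(a: Sequence[float], b: Sequence[float],
               mask: Sequence[bool],
               op: Callable[[float, float], float]) -> Result:
    """Model of the current behavior: compute every element, then mask."""
    calls = 0
    full = []
    for x, y in zip(a, b):
        full.append(op(x, y))  # computed even if the mask will drop it
        calls += 1
    # Masking happens as a separate pass over the finished result.
    out = [v if m else None for v, m in zip(full, mask)]
    return out, calls


def mask_in_loop(a: Sequence[float], b: Sequence[float],
                 mask: Sequence[bool],
                 op: Callable[[float, float], float]) -> Result:
    """Model of the proposal: the mask is a third input to the same loop."""
    calls = 0
    out: List[Optional[float]] = []
    for x, y, m in zip(a, b, mask):
        if m:
            out.append(op(x, y))  # only computed where the mask is set
            calls += 1
        else:
            out.append(None)
    return out, calls


if __name__ == "__main__":
    a = [1, 2, 3, 4]
    b = [10, 20, 30, 40]
    mask = [True, False, False, True]
    add = lambda x, y: x + y
    print(mask_after(a, b, mask, add))    # ([11, None, None, 44], 4)
    print(mask_in_loop(a, b, mask, add))  # ([11, None, None, 44], 2)
```

Both functions return the same values, but the in-loop version invokes the operation only where the mask is set. Whether a real lowering actually skips the work (rather than computing and selecting) would depend on how the masked body is expressed and lowered, so the saving shown here is an upper bound on what the change could achieve.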