Co-authored-by: Danny Friar <dannyfriar@hotmail.co.uk>
…make it clear what we are skipping.
|
@Balandat After more discussion with @hughsalimbeni and @dannyfriar here is our initial proposal to expand the library as per the previous discussion. |
|
Sorry for the delay @corwinjoy - this PR is on my todo list for Monday. |
…re advanced tests. This allows us to create and test operators that only support core operations.
|
@gpleiss Thanks for all the great feedback! I believe I have addressed all these and would appreciate a second look when you have time. |
|
@corwinjoy after playing around with this PR some more, and dealing with the typeguard errors, I made 2 commits that change some of the internals. The first is hopefully not too controversial; the second one might be :)
I rearranged some of the logic, in what I hope won't affect anything that you've already written.
My proposed solution: the user passes in a flattened list of … Anyways, both commits are now part of this branch. Let me know your thoughts, and we can always revert and figure out different solutions. |
|
@gpleiss Thanks for the detailed review. It all sounds good except possibly the constructor, which I need to think about. I am traveling right now but should be able to take a detailed look by Wednesday. @dannyfriar any thoughts? @hughsalimbeni |
I also think that passing in a list of lists is more natural. One thing we'd like to do with this is have the ability to rotate the blocks - this is simpler with the list of lists but I'm sure it can be made to work with a flat list too. @gpleiss out of curiosity what are the hacks that you mentioned that require LOs/tensors? |
|
@gpleiss Thanks for the code changes! Looking at these, they make sense. I like the better matmul function. The flattened list of operators for the constructor is not my first choice, but if that is what needs to happen for compatibility I can live with it. We do have a from_tensor helper constructor already, and we can add a nested list helper later if we find we need it. Like @dannyfriar I am a bit curious about the hacks that require this format, if you have time to explain. At any rate, I am happy with the changes as they are. |
|
This looks pretty close to something that I also need: a block-Toeplitz linear operator, which arises in multi-output GPs with kernels that are stationary over their 1-D input space. This is relevant to LLM work, where the input space is sequence location and the output space is token categorical probability. If I were to simply replicate LazyTensor blocks as required in the list of N**2 blocks to make a block-Toeplitz structure, would that work, or would it break something (like autograd treating the blocks as independent variables when they are not)? |
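To make the replication in the question above concrete, here is a minimal sketch of how a flat, row-major list of T*T blocks could be filled from only 2T-1 distinct blocks. The helper name `block_toeplitz_flat` and its arguments are hypothetical and are not part of this PR; the blocks are treated as opaque objects. The point is only that every repeated position holds a reference to the same object rather than a copy.

```python
def block_toeplitz_flat(first_col, first_row, T):
    """Return a row-major flat list of T*T block references.

    Block (i, j) depends only on i - j: entries on or below the block
    diagonal come from first_col, entries above it from first_row.
    first_row[0] is ignored; first_col[0] supplies the diagonal.
    (Hypothetical helper for illustration only.)
    """
    blocks = []
    for i in range(T):
        for j in range(T):
            d = i - j
            blocks.append(first_col[d] if d >= 0 else first_row[-d])
    return blocks


# Opaque placeholder blocks; real code would use tensors or operators.
col = [["c0"], ["c1"], ["c2"]]
row = [["c0"], ["r1"], ["r2"]]
flat = block_toeplitz_flat(col, row, T=3)

# The three diagonal positions all point at the very same object.
assert flat[0] is flat[4] is flat[8]
```

In plain PyTorch, gradients from repeated uses of one tensor accumulate on that tensor. Whether the operator in this PR preserves that sharing through its internals is the open question, and this sketch does not answer it.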
Idea
Represent [TN, TM] tensors as T x T blocks of N x M lazy tensors. Block matrices are currently supported, but an efficient representation exists only when the structure over the T dimensions is diagonal.
Pitch
Add a block linear operator class that can keep track of the [T, T] block structure, represented as T^2 lazy tensors of the same shape. Implement matrix multiplication between block matrices as the appropriate linear operators on the blocks.
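The block-wise product described in the pitch can be sketched as follows. This is a pure-Python illustration, not the code in this PR: blocks are plain nested lists, `dense_matmul` and `dense_add` stand in for whatever operator-level matmul and addition the library provides, and the flat row-major layout of the T*T blocks is an assumption taken from the constructor discussion. The inner block dimensions must agree for the product to be defined.

```python
from typing import List

Matrix = List[List[float]]


def dense_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Dense product of two blocks (stand-in for an operator matmul)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def dense_add(a: Matrix, b: Matrix) -> Matrix:
    """Elementwise sum of two blocks of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_matmul(a_blocks: List[Matrix], b_blocks: List[Matrix], T: int) -> List[Matrix]:
    """Multiply two T x T block matrices stored as flat row-major lists.

    Output block (i, j) is the sum over k of A[i, k] @ B[k, j].
    """
    out = []
    for i in range(T):
        for j in range(T):
            acc = dense_matmul(a_blocks[i * T], b_blocks[j])
            for k in range(1, T):
                term = dense_matmul(a_blocks[i * T + k], b_blocks[k * T + j])
                acc = dense_add(acc, term)
            out.append(acc)
    return out


# 2 x 2 grid of 1 x 1 blocks: equivalent to [[1, 2], [3, 4]] @ [[5, 6], [7, 8]].
a = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
b = [[[5.0]], [[6.0]], [[7.0]], [[8.0]]]
c = block_matmul(a, b, T=2)
```

Each output block costs T block products, so a structure-aware operator can skip blocks known to be zero or shared, which is where the efficiency over a dense [TN, TM] tensor would come from.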
Previous Discussion
Issue #54
Additional Considerations
In pursuing this, it seems that the base test class checks for many operations beyond what is required to create a LinearOperator. I propose a refactoring of the test class into required / core operations and optional operations. For now, I have created a new core test class CoreLinearOperatorTestCase and have shown what has been excluded by commenting out the relevant code. This idea could also use a review for accuracy.
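One way to picture the proposed split is as a pair of test mixins, where an operator opts into only the tier it supports. The sketch below uses the standard `unittest` module; `ScaleOperator`, `CoreOperatorTests`, and `FullOperatorTests` are hypothetical names for illustration and do not mirror the actual `CoreLinearOperatorTestCase` in this PR.

```python
import unittest


class ScaleOperator:
    """Toy operator (hypothetical): multiplies a vector by a scalar."""

    def __init__(self, scale, size):
        self.scale = scale
        self.shape = (size, size)

    def matmul(self, vec):
        return [self.scale * v for v in vec]


class CoreOperatorTests:
    """Mixin holding tests for operations every operator must support."""

    def create_operator(self):
        raise NotImplementedError

    def test_shape_is_2d(self):
        self.assertEqual(len(self.create_operator().shape), 2)

    def test_matmul_preserves_length(self):
        op = self.create_operator()
        vec = [1.0] * op.shape[1]
        self.assertEqual(len(op.matmul(vec)), op.shape[0])


class FullOperatorTests(CoreOperatorTests):
    """Mixin adding tests for optional operations on top of the core set."""

    def test_solve_inverts_matmul(self):
        op = self.create_operator()
        vec = [1.0] * op.shape[1]
        self.assertEqual(op.solve(op.matmul(vec)), vec)


class TestScaleOperator(CoreOperatorTests, unittest.TestCase):
    """ScaleOperator has no solve, so it opts into the core tests only."""

    def create_operator(self):
        return ScaleOperator(2.0, 3)
```

An operator that does implement the optional operations would inherit from `FullOperatorTests` instead and pick up both tiers, so nothing needs to be commented out.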