
Conversation

@sneakers-the-rat
Contributor

This is #40 rebased against the reset main branch for a clean diff. I didn't want to just force push against @daharoni's branch since it's not mine.

the prior merge of #34 handled most of the removals that made the earlier diff so large, so this branch now includes all the other open branches and is caught up with main:

[Screenshot 2026-01-22: branch graph showing sparse-matrix-compression... with the other feature branches pointing at commits contained within it]

see the sparse-matrix-compression... branch in cyan, and note how add-profiler-and-vis, refactor-pipeline, refactor-deconv, and dataclass-refactor are all just references to commits also contained within sparse-matrix...

we could do each of those one by one, rebasing each time, but because each is built on top of the others rather than branched independently from main, it would be much more work to actually review and make changes to each of them: if we did code review and made changes to refactor-deconv, we would then need to rebase refactor-pipeline, add-profiler-and-vis, and sparse-matrix... on top of those changes.

so the goal now is to get the repo caught up and into a state where we can make incremental improvements as independent PRs that are reviewed and merged, without drifting too far from main.

@sneakers-the-rat
Contributor Author

sneakers-the-rat commented Jan 22, 2026

OK, first we made the CI tests actually run all the way through. Since they are split into different steps, if one failed the later ones wouldn't run, so we never had a complete picture of all our test failures to act on. Adjusted the workflow to run each step regardless of earlier failures (they should probably just be one step, but whatevs).
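For reference, continuing each step even when an earlier one fails can be done with an `if: always()` condition in GitHub Actions. This is a hedged sketch: the step names, paths, and pytest invocations here are illustrative assumptions, not the repo's actual workflow file.

```yaml
# Sketch of a test job whose steps all run even if an earlier one fails.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Unit tests
        run: pytest tests/unit
      - name: Integration tests
        if: always()   # run even if the unit step failed
        run: pytest tests/integration
      - name: Other tests
        if: always()
        run: pytest tests/other
```

`always()` makes a step run regardless of the status of earlier steps, while the job as a whole still reports failure if any step failed.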

So then the first run here: https://github.com/Aharoni-Lab/indeca/actions/runs/21265407119/job/61203238082

  • unit: 1 failed/42 passed
  • integration: 228 failed/0 passed
  • other: 36 failed/18 passed

so that's a lot of failing tests; so many that the GitHub interface can't really load the whole stdout.

the problem for pretty much all of those is totally trivial: the pydantic model expects a float, and the method that wraps creation of the pydantic model is annotated as taking a float, but it actually (incorrectly) sets the default to None, so passing that explicit None fails validation. one-line fix: d32d635
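As a minimal sketch of this failure mode (the model, field, and function names here are made up for illustration, not indeca's actual classes):

```python
# Minimal reproduction of the bug class: a wrapper annotated as taking a
# float but defaulting to None, which the pydantic model then rejects.
from pydantic import BaseModel, ValidationError


class Params(BaseModel):
    threshold: float  # the model genuinely requires a float


def make_params(threshold: float = None) -> Params:  # buggy: default is None
    return Params(threshold=threshold)


def make_params_fixed(threshold: float = 0.0) -> Params:  # the one-line fix
    return Params(threshold=threshold)


try:
    make_params()  # forwards the bad None default into the model
except ValidationError:
    print("validation failed")

print(make_params_fixed().threshold)  # prints 0.0
```

The type checker is happy with the buggy wrapper's signature, which is why the mismatch only surfaces at runtime as a validation error.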

Now in the latest run:

  • unit: 0 failed/43 passed
  • integration: 8 failed/220 passed (with a stunning 10,000 warnings)
  • other: 0 failed/54 passed

so 265 failing tests down to 8 with a single line change, not bad.

The remaining 8 tests seem like "actual" failures, in the sense that they're failing for a nontrivial reason. Normally what we would want to do is a) run and look at the tests as we commit code, so that we resolve bugs either before they are committed or immediately after, or b) if bugs were introduced somewhere in a series of commits, do a git bisect: a binary search through the commit history for the commit that introduced the bug.

however the bug was introduced in 58fc9a5, which is the first commit in this branch (corresponding to #35), and that commit is enormous. so to bisect we would have to rebase this branch again, cherry-picking the one-line fix into that commit, and do a bunch of other fun git things to retroactively rewrite history.
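For reference, here is a self-contained toy demo of the bisect workflow: it builds a throwaway repo where one commit introduces a "bug", then lets `git bisect run` do the binary search automatically. In the real repo the check command would be the failing integration tests rather than a grep; everything here is illustrative.

```shell
set -eu
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email demo@example.com
git config user.name demo

echo ok > app.txt
git add app.txt
git commit -qm "c1: initial, good"

for i in 2 3 4 5 6; do
    if [ "$i" -eq 4 ]; then
        echo bug >> app.txt   # c4 silently introduces the bug
    fi
    echo "change $i" >> notes.txt
    git add .
    git commit -qm "c$i"
done

# endpoints: HEAD is known bad, the root commit is known good
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
# check command: exit 0 when app.txt is clean, nonzero when the bug is present
git bisect run sh -c '! grep -q bug app.txt' > /dev/null

# record and show the culprit commit's subject, then clean up
git show -s --format=%s refs/bisect/bad > "$dir/culprit.txt"
cat "$dir/culprit.txt"
git bisect reset > /dev/null
```

With a scriptable check like this, bisect needs only O(log n) test runs to locate the offending commit, which is exactly what's ruled out here by the bug living in the very first, enormous commit.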

so now the situation is that

  • the first commit off the trunk branch introduced the bug, and it was never fixed in the series of commits that followed.
  • all the prior CI runs are uninformative about when this bug was introduced, because a) they would fail trivially due to the above, and b) the CI exited early and never ran the integration tests, which is where the current failures are
  • someone could rebase this branch onto a commit history where the trivial bug was resolved, but i'll leave that for someone else
  • failing that, the bug could have been introduced in any of these commits and their +6.4k/-2.9k line changes, so it's time for some good old-fashioned debugging

in the future to avoid this we should

  • avoid long-running branches that branch off other long-running branches, and instead make incremental changes to main
  • run the tests and look at the test results as we work, addressing bugs as they are introduced

i think one of the problems here is that the tests take eons to run locally, and that's a real problem worth resolving: fast tests are tests that get run frequently, and tests that are run frequently are tests that help us write good code. when i'm working on a package i expect to be constantly running the tests to check my work, so having them take this long materially contributes to the package being difficult to work on.
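One common mitigation (a sketch, assuming pytest; the marker and test names here are invented, not indeca's actual suite) is to mark the slow integration tests so a fast subset can be run constantly during development while CI still runs everything:

```python
# Sketch: mark slow tests so `pytest -m "not slow"` gives a fast local loop.
import pytest


@pytest.mark.slow
def test_full_pipeline_integration():
    # stand-in for a long-running end-to-end run
    assert True


def test_unit_scaling():
    # fast unit test, cheap enough to run on every edit
    assert True
```

Registering the marker under `[tool.pytest.ini_options]` in pyproject.toml (e.g. `markers = ["slow: long-running tests"]`) avoids unknown-marker warnings; then `pytest -m "not slow"` runs only the fast subset locally and CI runs the full suite.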

