Implement geom_beeswarm using a quasi-random algorithm #1068

Open
const-ae wants to merge 1 commit into has2k1:main from const-ae:beeswarm
@const-ae const-ae commented Jun 7, 2026

Fixes #318.

This PR implements a beeswarm algorithm that, in my opinion, produces visually more appealing output than the random sampling implemented in geom_sina.

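import numpy as np
import pandas as pd
# geom_beeswarm is the geom added by this PR
from plotnine import ggplot, aes, geom_sina, geom_beeswarm

n = 200
# Same seed for both groups, so each geom draws identical y values
df = pd.concat([
    pd.DataFrame({'y': np.random.RandomState(125).normal(0, 10, n), 'x': "beeswarm"}),
    pd.DataFrame({'y': np.random.RandomState(125).normal(0, 10, n), 'x': "sina"}),
])
(ggplot(df, aes(x = 'x', y = 'y')) +
    geom_beeswarm(data = lambda df: df[df['x'] == 'beeswarm'], maxwidth = 0.4) +
    geom_sina(data = lambda df: df[df['x'] == 'sina'], maxwidth = 0.4)
)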
[Image: output of the code above, comparing geom_beeswarm and geom_sina]

The code and unit tests are mostly copied from [geom|stat]_sina. I removed the random_state parameter because it is no longer needed (except when jitter is called). I wasn't sure what the best approach is here, so any feedback would be appreciated.

I could also refactor the PR so that stat_sina and stat_beeswarm share a common code base.

The method uses the van der Corput algorithm to produce a low-discrepancy sequence between 0 and 1.
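
For readers unfamiliar with it, here is a minimal sketch of a van der Corput sequence. It illustrates the general technique only; the function name `van_der_corput`, the choice of base 2, and the mapping to horizontal offsets are assumptions made for this example, not necessarily what the PR's code does.

```python
import numpy as np


def van_der_corput(n, base=2):
    """Return the first n terms of the van der Corput sequence.

    Hypothetical helper for illustration. Each term is obtained by
    writing the index in the given base and mirroring its digits
    around the radix point, e.g. 6 = 110 (base 2) -> 0.011 = 0.375.
    """
    seq = np.zeros(n)
    for i in range(n):
        k = i + 1  # start at index 1 so the first term is 1/base, not 0
        denom = 1.0
        value = 0.0
        while k > 0:
            k, digit = divmod(k, base)
            denom *= base
            value += digit / denom
        seq[i] = value
    return seq


u = van_der_corput(7)
# u is [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]

# Assumed mapping: centre the values on 0 and scale to a total width of 0.4
offsets = (u - 0.5) * 0.4
```

Unlike independent uniform draws, successive terms fill the unit interval evenly, which is presumably why the resulting point layout looks more regular and why no random_state is required.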