General changes to generate negative samples #1

yutuyt01 · 2025-06-10T15:10:22Z

I wrote this quickly before leaving on Friday afternoon and forgot to submit the pull request, sorry.

We talked about making a more general solution possible, so I believe these changes allow for both the original functionality and the functionality I need in BEELINE. I also lowered the Python dependency versions to ones BEELINE allows. I'm not sure whether this breaks any other function; to my knowledge it doesn't, based on some limited testing of the splitting, verification, and negative sample generation. I removed the parameters for an undirected graph and the source column, but I can put those back if you're planning to implement that as a feature.

Also, the changes fix what I believe to be a minor bug in the random selection of negative samples. Specifically, two edges with exactly the same target set will always be assigned the same negative targets, because the sampling uses a fixed seed (for reproducibility) that never changes. For example, if TFs a and b each occur once in a dataset and target the same gene c, the negative samples generated for them will always share the same target: (a, random) = (b, random). This is unlikely to be a problem in most datasets, but I changed the seed for each gene pair iterated over. The results are still reproducible; the change just ensures "randomness".
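To illustrate the seeding issue, here is a minimal sketch, not the code in this PR. It uses the standard library's `random.Random` as a stand-in for whatever RNG the package uses, and the function names, the `edges`/`targets` arguments, and the `seed + i` scheme are all hypothetical choices made for the example.

```python
import random


def _candidates(tf, edges, targets):
    """Genes that `tf` is NOT known to target, in a stable order."""
    known = {t for s, t in edges if s == tf}
    return sorted(t for t in targets if t not in known)


def negatives_shared_seed(edges, targets, seed=0):
    """Re-creates the RNG from the same seed for every edge (the behaviour described above)."""
    out = []
    for tf, _ in edges:
        rng = random.Random(seed)  # identical state for every edge
        out.append((tf, rng.choice(_candidates(tf, edges, targets))))
    return out


def negatives_per_pair_seed(edges, targets, seed=0):
    """Derives a distinct seed per edge (hypothetical scheme: base seed + edge index)."""
    out = []
    for i, (tf, _) in enumerate(edges):
        rng = random.Random(seed + i)  # differs per edge, still deterministic
        out.append((tf, rng.choice(_candidates(tf, edges, targets))))
    return out


# TFs a and b each target only gene c, so their candidate sets are identical.
edges = [("a", "c"), ("b", "c")]
targets = ["c", "d", "e", "f"]

shared = negatives_shared_seed(edges, targets)
per_pair = negatives_per_pair_seed(edges, targets)
```

With the shared seed, `a` and `b` always receive the same negative target because both draws start from the same RNG state and the same candidate list. With a per-pair seed the draws are independent of each other, yet repeated calls with the same base seed still return the same list.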

Let me know if this works; I can make any changes. I was also thinking it may be a good idea to set up automatic publishing to PyPI with a GitHub Action, and I can look into that if you think it would make maintenance easier. Thanks for the help with these scripts!
Tim

yutuyt01 added 7 commits June 6, 2025 19:04

changes to pyproject.toml

2fe3178

negative_samples consolidation

2a67faf

fixes after testing to generate_negative_samples

b155611

whoops

fda0314

changes to random

0c4fe8c

changes rng seed to randomstate

5cfae3a

change variable name for clarity (minor change)

0832b54
