Description
- Select all binding samples (set up all binding samples in the VirtualDB config).
  If a dataset cannot be configured (no annotatedfeature dataset), note it in the
  issue discussion and @cmatKhan will address it. Do the analysis with the
  available binding data.
- After selecting all binding samples from all datasets with DTO P<=0.01 compared
  to either Hackett-2020-ZEV or Kemmeren-2014-TFKO, investigate whether some TFs
  pass in most datasets while others pass in almost none.

  Challenge: Deciding which Hackett condition to use requires examining the Hackett data
  and setting filters such that there is 1 Hackett sample per regulator, OR explaining how multiple
  conditions per regulator affect the results. In particular, we are interested in the effect over time:
  which timepoint is best, and what is the effect of time? The DTO distribution over time is a good output here.
  Additionally, it should be possible to set different filters on a per-regulator basis, so that if there is
  both a ZEV and a GEV sample, we can choose the one that performs best.

  Analysis steps:
i. Select binding samples from all datasets with DTO vs. Hackett-2020-ZEV P<=0.01
ii. Select binding samples from all datasets with DTO vs. Kemmeren-2014-TFKO P<=0.01
iii. Intersect the previous two sets (this is probably a composition of the filters above)
iv. For each regulator in any active-set sample, present the number of active samples:
- As a table: one row per regulator + count
- As a distribution: across TFs of the count above
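
The analysis steps above could be sketched roughly as follows. This is a minimal pandas sketch, not the VirtualDB API: it assumes the DTO results have already been exported to a long-format table with one row per (binding sample, perturbation dataset) comparison. The column names (`binding_dataset`, `binding_sample`, `regulator`, `perturbation_dataset`, `dto_pvalue`), the binding dataset names, and the toy values are all hypothetical placeholders.

```python
import pandas as pd

P_THRESHOLD = 0.01
# Hypothetical columns that together identify one binding sample.
KEY_COLS = ["binding_dataset", "binding_sample", "regulator"]


def passing_samples(dto, perturbation_dataset, threshold=P_THRESHOLD):
    """Steps i/ii: binding samples with DTO P <= threshold vs. one perturbation dataset.

    If a regulator has several perturbation conditions (e.g. several Hackett
    timepoints), a binding sample passes here when ANY of its rows passes.
    """
    mask = (dto["perturbation_dataset"] == perturbation_dataset) & (
        dto["dto_pvalue"] <= threshold
    )
    return set(dto.loc[mask, KEY_COLS].itertuples(index=False, name=None))


def active_sample_counts(dto):
    """Steps iii/iv: intersect the two passing sets and count samples per regulator."""
    zev = passing_samples(dto, "Hackett-2020-ZEV")
    tfko = passing_samples(dto, "Kemmeren-2014-TFKO")
    active = zev & tfko  # use `zev | tfko` instead for an "either" criterion

    # Table: one row per regulator plus its number of active samples.
    table = (
        pd.DataFrame(sorted(active), columns=KEY_COLS)
        .groupby("regulator")
        .size()
        .rename("n_active_samples")
        .reset_index()
    )
    # Distribution: how many regulators have 1, 2, ... active samples.
    distribution = table["n_active_samples"].value_counts().sort_index()
    return table, distribution


# Toy data for illustration only; these are not real DTO results.
toy = pd.DataFrame(
    [
        ("callingcards", "s1", "GCN4", "Hackett-2020-ZEV", 0.001),
        ("callingcards", "s1", "GCN4", "Kemmeren-2014-TFKO", 0.005),
        ("chipexo", "s2", "GCN4", "Hackett-2020-ZEV", 0.01),
        ("chipexo", "s2", "GCN4", "Kemmeren-2014-TFKO", 0.002),
        ("harbison", "s3", "GCN4", "Hackett-2020-ZEV", 0.2),
        ("harbison", "s3", "GCN4", "Kemmeren-2014-TFKO", 0.001),
        ("callingcards", "s4", "RAP1", "Hackett-2020-ZEV", 0.003),
        ("callingcards", "s4", "RAP1", "Kemmeren-2014-TFKO", 0.009),
        ("chipexo", "s5", "MSN2", "Hackett-2020-ZEV", 0.5),
        ("chipexo", "s5", "MSN2", "Kemmeren-2014-TFKO", 0.6),
    ],
    columns=KEY_COLS + ["perturbation_dataset", "dto_pvalue"],
)

table, distribution = active_sample_counts(toy)
print(table)
print(distribution)
```

Regulators with no active sample are omitted from the table, matching step iv. Restricting to one Hackett sample per regulator (or choosing between ZEV and GEV) would be an additional filter applied to the input table before calling `passing_samples`.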