The way we handle ROIs in the neuprint data model has always rubbed me the wrong way. There's got to be a better way.
Disclaimer: I wasn't involved in the development of the original data model, so perhaps the tweaks I'd like to propose were already considered and rejected.
The Problems
There are two independent reasons handling ROI information is awkward in neuprint:
1. ROIs are permitted to overlap. That sounds flexible, but it introduces unnecessary complications and inconveniences. This forces us to distinguish between "primary" and non-primary ROIs. If one is interested in a non-primary ROI, one must be very careful when constructing queries and interpreting the results to ensure that duplicate results are accounted for properly. I'm certain this is not intuitive to newcomers.
2. ROI information is hidden away in a JSON property (roiInfo) within certain nodes. This makes filtering or otherwise manipulating the ROI information awkward. One must rely on special-purpose Cypher functions (e.g. apoc.convert.fromJsonMap()) or simply download a lot of little roiInfo JSON objects on the client and perform the filtering/manipulation on the client side. Yuck.
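To make the second problem concrete, here is a rough sketch of the client-side filtering we end up doing today. It assumes roiInfo arrives as a JSON string mapping each ROI name to a small dictionary of counts with keys like "pre" and "post"; the helper name and the example values are made up for illustration.

```python
import json


def rois_with_min_post(roi_info_json, min_post):
    """Return the ROI names whose 'post' count is at least min_post.

    roi_info_json is assumed to be a JSON string of the form
    {"<roi>": {"pre": <int>, "post": <int>, ...}, ...}.
    """
    roi_info = json.loads(roi_info_json)
    return sorted(
        roi
        for roi, counts in roi_info.items()
        if counts.get("post", 0) >= min_post
    )


# Made-up example value, not taken from a real dataset.
example_roi_info = '{"PB": {"pre": 12, "post": 40}, "EB": {"pre": 3, "post": 5}}'
selected = rois_with_min_post(example_roi_info, 10)
```

Every client has to repeat some version of this decode-then-filter step for each node it downloads, which is the part I'd like the data model to make unnecessary.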
Possible Remedies
To address problem 1, I think we should simply require that ROIs are strictly hierarchical, and track each :Element, :Synapse, etc. according to the single bottom-level ROI it is contained in. When fully qualified, ROI names will include the complete hierarchy, e.g. CX.PB.PB(L1), or possibly even hemibrain.CX.PB.PB(L1). Where necessary, convenience functions can be provided to map from a simple ROI name to its fully qualified name. In client libraries such as neuprint-python, we'll make use of Cypher's regular expression features to refer to higher-level ROIs, e.g. CX.PB.* to capture everything in the PB. Note that under this scheme, there is no need to assign some ROIs a "primary" status.
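A minimal sketch of what those convenience functions might look like on the client side. Nothing here exists in neuprint-python; the hierarchy, the function names, and the pattern format are all hypothetical. One detail worth noting: ROI names contain regex metacharacters (the dots and the parentheses in PB(L1)), so the sketch escapes the name before building a pattern rather than using CX.PB.* literally.

```python
import re

# Hypothetical strict hierarchy: each key is an ROI, each value its children.
HIERARCHY = {"CX": {"PB": {"PB(L1)": {}, "PB(R1)": {}}, "EB": {}}}


def leaf_paths(tree, prefix=()):
    """Yield the fully qualified name of every bottom-level ROI."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield ".".join(path)


def qualify(simple_name, tree, prefix=()):
    """Map a simple ROI name to its fully qualified name (first match), or None."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == simple_name:
            return ".".join(path)
        found = qualify(simple_name, children, path)
        if found is not None:
            return found
    return None


def subtree_pattern(qualified_name):
    """Regex matching the ROI itself and everything nested beneath it."""
    return re.escape(qualified_name) + r"(\..*)?"


LEAVES = list(leaf_paths(HIERARCHY))
pb_pattern = subtree_pattern(qualify("PB", HIERARCHY))
pb_leaves = [leaf for leaf in LEAVES if re.fullmatch(pb_pattern, leaf)]
```

The sketch uses re.fullmatch because, as far as I know, Cypher's =~ operator also requires the pattern to match the whole string. It also assumes simple names are unique within the hierarchy; if they aren't, qualify() would have to return all matches instead of the first.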
To address problem 2, I think we can encode ROI information as additional nodes or edges in the data graph. There are probably multiple ways to do that, but the simplest that comes to mind is to add parallel nodes or edges in every place where we'd normally use an roiInfo.
For :Element nodes (including :Synapse), non-overlapping ROIs as described above would allow us to replace roiInfo with a simple string property.
Looking at the data model diagram, I think we might want to add parallel :ConnectsTo edges (one per ROI) between :Neuron nodes and also parallel edges between :SynapseSet nodes. Alternatively, we could add parallel :SynapseSet nodes themselves, but I'm not sure if that would make things more or less confusing.
[Edit: There are other possibilities. One is to add properties to each :ConnectsTo edge, one per ROI, each holding that ROI's synapse total for the connection.]
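Whichever representation we pick (parallel edges or per-ROI properties), the non-overlapping requirement is what keeps aggregation simple: each synapse is counted in exactly one bottom-level ROI, so totals for higher-level ROIs are plain sums with no duplicates to correct for. A small sketch, using made-up counts for a single connection and a hypothetical rollup() helper:

```python
from collections import defaultdict


def rollup(leaf_counts):
    """Sum bottom-level ROI counts into totals for every ancestor ROI.

    leaf_counts maps fully qualified bottom-level ROI names to synapse
    counts for one connection. Because the ROIs are assumed not to
    overlap, each synapse contributes to exactly one leaf.
    """
    totals = defaultdict(int)
    for qualified_name, count in leaf_counts.items():
        parts = qualified_name.split(".")
        # Credit the count to the leaf and to each of its ancestors.
        for depth in range(1, len(parts) + 1):
            totals[".".join(parts[:depth])] += count
    return dict(totals)


# Made-up per-ROI synapse counts for one hypothetical connection.
connection_counts = {"CX.PB.PB(L1)": 4, "CX.PB.PB(R1)": 6, "CX.EB": 5}
connection_totals = rollup(connection_counts)
```

Under this scheme the overall connection weight is just the sum over the top-level entries, which is not true today when a synapse can fall in several overlapping ROIs.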