│ ├── links/0/<i.j.k> # intra-level link rows (delta=0)
we had a conversation about whether or not to allow for multiple links, possibly with different cardinality
│ ├── links/+1/<i.j.k> # optional: fine→coarse pyramid edges
│ │ # (only when cross_level_storage != "none")
this might be semantics but somehow it feels odd having links for pyramid coarsening and those for say mesh faces living in the same namespace
9.2 Downsampling Strategies
i wonder about writing down in pseudo-code what a reader has to do here, e.g. coming from an object query, how it finds what spatial chunks/fragments, order of the different range requests etc.
This unprecedented scale
wow!
Edge-wise Properties
i guess i fundamentally didn't understand what you were doing with these two kernels. i thought it was the l1 and l2 of the adjacency matrices elementwise
Fig. 2
could you label what the networks are? like which species? and likewise for the mice, which genotypes
ASE and LSE each captured a different, but still true, truth about the underlying network.
i would use a different color map for core periphery
Discriminability Plot
Fig. 9
something like look at distribution of odds ratios
Fig. 4 Th
need left/right caption
reminds some people of something else... makes people think of post hoc pairing...
$H^{kl}_0: b_{left,kl} = b_{right,kl}, \quad H^{kl}_A: b_{left,kl} \neq b_{right,kl}$
bad notation
Plot the number of edges for each lateral type with 99% confidence intervals
run fisher's exact test?
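A minimal sketch of what the Fisher's exact test suggestion could look like: compare edge counts on the two sides using a 2x2 table of (left, right) by (edge present, edge absent). The counts are made up for illustration and the function name `fisher_exact_two_sided` is hypothetical; `scipy.stats.fisher_exact` would be the usual choice in a notebook.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c

    # Unnormalized hypergeometric weight of the table whose top-left cell is x,
    # with all row and column totals held fixed.
    def weight(x):
        return comb(row1, x) * comb(row2, col1 - x)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(a)
    # Add up every table that is no more likely than the observed one.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)


# Illustrative (made-up) counts: rows are left/right, columns are present/absent.
table = [[8, 2], [1, 5]]
p_value = fisher_exact_two_sided(table)
print(f"p = {p_value:.4f}")
```

Working with integer weights avoids floating-point ties when deciding which tables count as "at least as extreme".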
Plot the network split out by left/right
I don't like the way this looks currently, kinda overemphasizes the contralateral connections because they by definition end up longer
Plot neurons and connectivity
Test for bilateral symmetry - latent distribution test
this can be part of figure 2
Plot the 2-sample test p-values by varying dimension
does this suggest that they're matched?
Look at what the components mean in the probability space
sort it the same way as phat above
not sure about this either. it is the only undirected graph though
G = G.to_undirected()
not what we want for symmetrizing. i know it isn't used anymore but would like to clean up this code
From these plots, we can tell that the degree distribution remains relatively static across age.
will need to think about whether I agree here
def vis_degree_dist(dd, max_degree, ages, dd_name):
same comment about using seaborn to simplify
vis_degree_dist(stats.in_degree, stats.graph_origin, stats.sex, "In Degree")
how can count be < 1 ?
vis_stat(stats.density, "Density", use_log=False)
maybe do want to see log, or at least I want to know density of neuprint
ax.xaxis.set_ticklabels(syntypes, rotation=45)
need to set the anchor mode to get these to look right, i can dig up some code somewhere
def vis_stat(stat, stat_name, use_log=True):
would like this to use seaborn, will make code easier to read/modify later
Again the lowest -BIC for each developmental stage is set to 0, allowing for easy comparison.
i really don't like putting BIC for different datasets on the same set of axes, even when setting the min to 0 these are not comparable
The lowest -BIC is set to 0, since the scale doesn’t really matter here.
in some sense, sure, but if the data changes, no longer true. I find this a bit misleading
Drosophila Hemibrain Connectome
we may want to restrict to neurons, not just all fragments. that is why the number of nodes is quite high, I was expecting ~25,000
We can’t visualize this connectome using graspologic layouts because it is too large! We can still get the basic stats though:
I was able to run this in <30 seconds with some of my stuff https://docs.neurodata.io/notebooks/pedigo/graspologic/connectome/2021/05/06/hemibrain-layout.html I am guessing the bottleneck in graspologic is the no-overlap portion, which is also now optional
C. Elegans
lower case, italics
Elegens
elegans
1707
want to double check this value, I don't remember this being non-monotonic when I looked at the data a long time ago
for i in range(len(layouts)):
we should figure out a way to do a consistent layout between timepoints (a fine way may just be to get node positions by embedding the average over time points, and then plot edges for each graph; there are also aligned-umap approaches).
data = {'Number of Nodes': [], 'Number of Edges': [], 'Density': [], 'Max Out Degree': [], 'Max In Degree': []}
num_nodes = len(connectome.nodes)
data['Number of Nodes'].append(num_nodes)
num_edges = len(connectome.edges)
data['Number of Edges'].append(num_edges)
data['Density'].append(num_edges / (num_nodes * (num_nodes - 1)))
data['Max Out Degree'].append(max(connectome.out_degree, key=lambda x: x[1])[1])
data['Max In Degree'].append(max(connectome.in_degree, key=lambda x: x[1])[1])
df = pd.DataFrame.from_dict(data)
df.index = ['Ciona Intestinalis',]
df.head()
something like this needs to be a function because it is used in many places in this notebook
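One possible shape for such a helper, as a sketch only. It assumes the connectome has already been converted to an adjacency matrix with rows as sources and columns as targets (for example via networkx's `to_numpy_array`) and that there are no self-loops. The name `summarize_directed_graph` is hypothetical.

```python
import numpy as np


def summarize_directed_graph(adjacency):
    """Return basic statistics for a directed, loopless graph given as an adjacency matrix."""
    adjacency = np.asarray(adjacency)
    num_nodes = adjacency.shape[0]
    edges = adjacency != 0  # treat any nonzero weight as an edge
    num_edges = int(edges.sum())
    return {
        "Number of Nodes": num_nodes,
        "Number of Edges": num_edges,
        "Density": num_edges / (num_nodes * (num_nodes - 1)),
        "Max Out Degree": int(edges.sum(axis=1).max()),  # row sums = out-degree
        "Max In Degree": int(edges.sum(axis=0).max()),  # column sums = in-degree
    }


# Toy 3-node graph with edges 0 -> 1, 0 -> 2, 1 -> 2
toy = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]])
summary = summarize_directed_graph(toy)
print(summary)
```

Returning a plain dict keeps the helper independent of pandas; dicts from several connectomes can then be stacked as rows of one `pd.DataFrame` for the comparison table.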
connectome.to_undirected()
to_undirected() arbitrarily keeps one of the two directed edges' weights (to or from) as the new edge weight, which is basically never what we want
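An explicit alternative, sketched under the assumption that the graph is available as a weighted adjacency matrix: average the two directions so the result does not depend on which edge happens to be kept. The helper name `symmetrize_avg` is hypothetical, and averaging is only one reasonable rule (summing or taking the maximum are others).

```python
import numpy as np


def symmetrize_avg(adjacency):
    """Symmetrize a directed weighted adjacency matrix by averaging A[i, j] and A[j, i]."""
    adjacency = np.asarray(adjacency, dtype=float)
    return (adjacency + adjacency.T) / 2


# Toy directed graph: 0 -> 1 (weight 4), 1 -> 0 (weight 2), 2 -> 0 (weight 1)
A = np.array([[0, 4, 0],
              [2, 0, 0],
              [1, 0, 0]])
S = symmetrize_avg(A)
print(S)
```

Whatever rule is chosen, stating it in code makes the symmetrization reproducible across notebooks.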
side_map = {'L': '#FF0000', 'DL': '#B22222', 'VL': '#F08080', 'R': '#0000FF', 'DR': '#00008B', 'VR': '#87CEEB', 'DM': '#800080', 'D': '#800080', 'VM': '#BA55D3', None: '#A9A9A9'}
this color map needs to be stored in a central location in the repo and consistent between notebooks
Now we generate a 2-dimensional node layout using Graspologic’s layout_tsne.
TODO I'd like to compare to UMAP or spectral embeddings also
Generalized Random Dot Product Graph (GRDPG)
this isn't GRDPG as far as I know https://arxiv.org/pdf/1709.05506.pdf
I think you're just talking about directed RDPG
Recreate fig 4 from FAQ Paper
I vote add a shaded background above and below zero for this figure with different colors, and text that says (GM better/GOAT better)
Plot the adjacency matrices sorted by fit partition
TODO add lines showing the fit partitions
label
this is essentially a cell type or as close as we can get
Matching objective function: 6229.611251407659
# collapse
dataset2_intra_matched = dataset2_intra[perm_inds][:, perm_inds][: len(dataset1_ids)]
dataset2_meta_matched = dataset2_meta.iloc[perm_inds][: len(dataset1_ids)]
this indicates that the matching when using the lineage annotations as a soft prior (at least the way we are doing it) isn't much better than the 6227 we got for the uninformed method
Graph matching methods figure (I don’t think these results from Youngser/Carey ever went to a paper anywhere? so I assume CEP would be okay with them here? And we should be able to replicate/improve in python now.)
maybe we should have a version of Ali's GM score figure here
Create a color palette by department/institute
thoughts on using most of the color wheel for depts at hopkins, and then reserving a few colors for the external partners? or even just one distinct color?
heatmap(A[[vtx_perm]] [:,vtx_perm])
you want A[vtx_perm][:, vtx_perm]
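A small check of the row-and-column permutation on a toy matrix (the matrix and permutation here are made up). It also shows the equivalent single-step form with `np.ix_`.

```python
import numpy as np

A = np.arange(16).reshape(4, 4)
vtx_perm = np.array([2, 0, 3, 1])

# Permute rows first, then columns, so entry (i, j) becomes A[perm[i], perm[j]].
permuted = A[vtx_perm][:, vtx_perm]

# np.ix_ builds the same row/column selection in one indexing step.
permuted_ix = A[np.ix_(vtx_perm, vtx_perm)]

print(np.array_equal(permuted, permuted_ix))
```

Either form keeps the matrix square and applies the same ordering to both axes, which is what a sorted adjacency heatmap needs.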
Plot the alignment for d=7 dimensions
not sure what it means here that the seedless procrustes one looks somewhat better to me than the orthogonal procrustes. does this suggest that some of the pairs might be pretty wrong (at least from a graph point of view)?
Graph matching methods figure (I don’t think these results from Youngser/Carey ever went to a paper anywhere? so I assume CEP would be okay with them here? And we should be able to replicate/improve in python now.)
carey is doing something on this - we should talk to him about it
Figure Panels
add a model that brings all of this together?
A priori modeling
graspy paper was also a priori
How similar are the RDPG models (nonpar/semipar)?
have some code for this, need to port
A posteriori modeling
i am not sure what to show here as a lot of this will be in the main paper - perhaps if we figure out the model complexity stuff it'd be interesting?
(Do we know how to do this?)
question
Discriminability
Could run this on the full data. Compute similar metrics to what I normally do on the full data clustering, like
Neuron 16977881 has more than one soma, removing
this is a random KC (1 claw)
Neuron 15571194 has more than one soma, removing
this is APL left
Discriminability plotted as a function of stage 1 and stage 2 dimension
there's a bug in this cell, the cosine/morphology results look just like the euclidean (have since fixed)