272 Matching Annotations
  1. Mar 2022
    1. Well, at some level, every aspect of reality seems to be made of interconnected parts. Atoms and molecules are connected to each other with chemical bonds. Your neurons connect to each other through synapses, and the different parts of your brain connect to each other through groups of neurons interacting with each other. At a larger level, you are interconnected with other humans through social networks, and our economy is a global, interconnected trade network. The Earth’s food chain is an ecological network, and larger still, every object with mass in the universe is connected to every other object through a gravitational network.

      pagerank algorithm?

  2. Feb 2022
  3. Dec 2021
    1. Our model could say that this distribution of a Gaussian, for example, in which case we would have to estimate the parameters of this Gaussian

      rephrase

      "We could assume, for example, that our data are normally distributed. Then, our only goal would be to estimate the parameters of this distribution -- this is called a parametric model. On the other hand, we might not know what the distribution is at all, and we're just trying to fit our data the best we can. This situation is called a nonparametric model."

    2. We consider the distribution where our observations are sampled from as really anything

      The distribution our observations are sampled from could be anything

    3. The first is given two graphs, and their corresponding latent positions, are their positions the same?

      Both tests compare the latent positions of the two networks, in slightly different ways. The first type of test is intended to figure out if the latent positions themselves are exactly the same between the two networks. The second is intended to determine whether the distributions of the latent positions between the two networks are the same.

    1. $a \rightarrow a$, $b \rightarrow b$, $c \rightarrow d$, $d \rightarrow c$

      i'd just write this out instead of doing this arrow stuff

  4. Nov 2021
    1. choose a particular percentile, and divide it by 100 to obtain the quantile

      it's a little unclear to me why we're dividing by 100.

      if we have edge weights 3, 5, 7, and we pick 20%, then we'd get 0.2 when dividing by 100. But we wouldn't set any edge weights below 0.2 to 0, because there are no edge weights below 0.2. I don't think that's what you mean to do, but I think the phrasing implies that?
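
      Here's roughly what I think is meant (a quick sketch, assuming the goal is to truncate edge weights below the chosen percentile; the weights and the 20th percentile are made up):

      ```python
      import numpy as np

      # hypothetical edge weights and a chosen percentile
      weights = np.array([3.0, 5.0, 7.0])
      percentile = 20
      quantile = percentile / 100   # 0.2 -- this is a quantile, not a weight threshold

      # the cutoff is the *value of the weights* at that quantile,
      # not the number 0.2 itself
      cutoff = np.quantile(weights, quantile)

      # weights below the cutoff get truncated (here, set to 0)
      truncated = np.where(weights < cutoff, 0, weights)
      print(cutoff, truncated)
      ```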

    2. it might in fact overfit the training data and model spurious noise, which raises the variance

      A low-bias model, for instance, might fit our training data too well. This fit would just model noise, raising the variance when applied to new data.

    3. $A' = \frac{1}{2}(A + A^\top) = \frac{1}{2}\left(\begin{bmatrix} a_{11} & ... & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & ... & a_{nn} \end{bmatrix} + \begin{bmatrix} a_{11} & ... & a_{n1} \\ \vdots & \ddots & \vdots \\ a_{1n} & ... & a_{nn} \end{bmatrix}\right) = \begin{bmatrix} \frac{1}{2}(a_{11} + a_{11}) & ... & \frac{1}{2}(a_{1n} + a_{n1}) \\ \vdots & \ddots & \vdots \\ \frac{1}{2}(a_{n1} + a_{1n}) & ... & \frac{1}{2}(a_{nn} + a_{nn}) \end{bmatrix} = \begin{bmatrix} a_{11} & ... & \frac{1}{2}(a_{1n} + a_{n1}) \\ \vdots & \ddots & \vdots \\ \frac{1}{2}(a_{n1} + a_{1n}) & ... & a_{nn} \end{bmatrix}$

      feels a little bulky, maybe unnecessary
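
      If it stays, it could probably just be shown in code rather than spelled out entry-by-entry (a quick sketch, assuming A is a square numpy array):

      ```python
      import numpy as np

      # a small asymmetric example matrix
      A = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [1, 0, 0]])

      # averaging A with its transpose keeps the diagonal as-is and replaces
      # each off-diagonal pair a_ij, a_ji with their average (a_ij + a_ji) / 2
      A_sym = (A + A.T) / 2
      print(A_sym)
      ```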

    4. node $i$ being stimulated leading to node $j$ does not necessarily mean that node $j$ being stimulated leads to node $i$ being stimulated.

      rephrase

    1. Next, we can select a different neighbor of node $i$, for which there are $d_i - 1$ total. This gives us a triplet consisting of node $i$, one of $d_i$ possible nodes, and one of $d_i - 1$ possible nodes, since there will exist at least two edges between them (one edge from node $i$ to one of its $d_i$ neighbors, and the other edge from node $i$ to one of its other $d_i - 1$ neighbors). Therefore, the number of open and closed triplets is the quantity $\sum_i d_i (d_i - 1)$.

      didn't really understand this description too well

      Here's how you can find an arbitrary triplet:

      1. Pick a neighbor for node $i$
      2. Pick a different neighbor for node $i$
      3. Since node $i$ has edges with both of these neighbors, the triplet consisting of $i$ and its two neighbors will have at least two edges.
      4. If those neighbors are connected to each other, the triplet will be closed, and if they aren't, it will be open
      5. we can figure out how many total triplets there are by counting the number of times we can go through this process

      (maybe revise point 5, or reorganize into paragraphs, and/or add more numerical specifics -- see the sketch below)
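
      Here's a rough sketch of the counting I have in mind (assuming an undirected, unweighted adjacency matrix with no self-loops; the example matrix is made up):

      ```python
      import numpy as np

      A = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 1],
                    [1, 1, 0, 0],
                    [0, 1, 0, 0]])

      degrees = A.sum(axis=0)

      # each node i is the center of d_i * (d_i - 1) ordered pairs of neighbors,
      # so this counts every triplet (open or closed)
      n_triplets = np.sum(degrees * (degrees - 1))

      # closed triplets are the ones whose two neighbors are themselves connected;
      # trace(A^3) counts each triangle 6 times, i.e. once per ordered closed triplet
      n_closed = np.trace(A @ A @ A)

      print(n_triplets, n_closed, n_closed / n_triplets)  # 10, 6, 0.6 (the clustering coefficient)
      ```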

    2. nodes from our example network. To do this, we will look only at the boroughs Staten Island, Manhattan, Brooklyn, and Queens. Our network looks like this:

      Let's look at only Staten Island, Manhattan, Brooklyn, and Queens in our example network.

    3. Here, we cover some useful quantities we might want to compute about a network.

      These properties are called network summary statistics. Although this book will be more focused on finding and using representations for networks than using summary statistics, they're useful to know about.

    4. where $a_{ij}$ takes the value of 1 if nodes $i$ and $j$ are connected, and the value 0 if nodes $i$ and $j$ are not connected.

      ?

    5. /opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/site-packages/sklearn/utils/validation.py:585: FutureWarning: np.matrix usage is deprecated in 1.0 and will raise a TypeError in 1.2. Please convert to a numpy array with np.asarray. For more information see: https://numpy.org/doc/stable/reference/generated/numpy.matrix.html warnings.warn(

      delete

    6. summing all of the adjacencies corresponding to a potential edge incident a node $i$:

      overcomplicated sentence

      "We can get the degree of a node $i$ by counting all the edges incident to it. To do this, we can just sum along the $i$th row (or column) of the adjacency matrix:

      (equation)"
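
      e.g. something like this (a minimal sketch, assuming an undirected binary adjacency matrix; the example is made up):

      ```python
      import numpy as np

      A = np.array([[0, 1, 1],
                    [1, 0, 0],
                    [1, 0, 0]])

      i = 0
      # the degree of node i is just the number of 1s in row i
      # (or column i, since the network is undirected)
      d_i = A[i, :].sum()
      print(d_i)            # 2

      # all the degrees at once
      print(A.sum(axis=1))  # [2 1 1]
      ```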

    7. From the description above, we learned that every edge incident a node $i$ will have $a_{ij}$ take the value of one. Therefore,

      Since every edge incident to $i$ will have $a_{ij}$ take the value of 1, we can count...

    8. For most purposes, we will largely be considered with binary networks, which are also more traditionally called

      For most purposes, we'll primarily consider unweighted or binary networks.

    1. Thus, $A$ and $B$ are said to be isomorphic.

      don't think introducing the term "isomorphic" is necessary here - can just say "So A and B are the same network, but the nodes just have different indices"

    2. fig, axs = plt.subplots(1, 3, figsize=(20, 20))
       heatmap(A, ax=axs[0], cbar=False, title = r'$A_T$')
       heatmap(B, ax=axs[1], cbar=False, title = r'$A_F$')
       heatmap(P@B@P.T, ax=axs[2], cbar=False, title = r'$A_F$

      same deal as before, separate out plotting code

    3. fig, axs = plt.subplots(1, 3, figsize=(20, 20))
       heatmap(B, ax=axs[0], cbar=False, title = r'Original Matrix $B$')
       heatmap(P@B, ax=axs[1], cbar=False, title = r'Row Permutation $PB$:')
       heatmap(B@P.T, ax=axs[2], cbar=False, title = r'Row Permutation $BP^T$:')
      • separate plotting stuff out into a new codeblock and hide the cell
      • add ; at the end of the code block to prevent the <AxesSubplot:title={'center':'Row Permutation $BP^T$:'}>
    4. As you can imagine, there are a very large number of these possible mappings

      There are a ton of ways to match the nodes in F to the nodes in T. In fact...
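
      ...and the count itself is easy to show, since a one-to-one matching of the nodes is just a permutation (quick sketch, the n is made up):

      ```python
      from math import factorial

      n = 10  # a hypothetical number of nodes
      # number of possible one-to-one matchings between the nodes of two n-node networks
      print(factorial(n))  # 3628800
      ```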

  5. Oct 2021
    1. arred sections, these sections will assume familiarity with more advanced mathematical and probability concepts.

      consider a paragraph that just introduces all the models with like a sentence-description

  6. Sep 2021
    1. a posteriori Stochastic Block Model, Recap We just covered many details about how to perform statistical inference with a realization of a random network which we think can be well summarized by a Stochastic Block Model. For this reason, we will review some of the key things that were covered, to better put them in context: We learned that the Adjacency Spectral Embedding is a key algorithm for making sense of networks we believe may be realizations of networks which are well-summarized by Stochastic Block Models, as inference on the estimated latent positions is key for learning about community assignments. We learned how unsupervised learning allows us to use the estimated latent positions to learn community assignments for nodes within our realization. We learned how to align the labels produced by our unsupervised learning technique with true labels in our network, using remap_labels. We learned how to produce community assignments, regardless of whether we know how many communities may be present in the first place.

      I think this recap should be the introductory paragraph, and should be expanded

    2. Unlike the SBM example, the scatter plots for the adjacency spectral embedding of a realization of an ER network no longer show the distinct separability into individual communities.

      Unlike with the SBM, we can't see any obvious clusters in this pairs plot

    3. histograms of the indicated values for the indicated dimension.

      "the indicated values for the indicated dimension" I don't feel like I understand the histograms better after reading this sentence

    4. we will find reasonable “guesses” at community assignments further down the line.

      that we'll be able to guess our community assignments reasonably well

    5. Remember that as we learned in the single network models section, even though the communities eachh node is assigned to look obvious, this is an artifact of the ordering of the nodes.

      consider:

      Remember that if we reorder the nodes, the community each node is assigned to won't be as visually obvious

    6. the approach we will take will be to use $A$ to produce a best guess as to which community each node of $A$ is from, and then use our best guesses as to which community each node is from to learn about $\vec\pi$ and $B$.

      I have had to linger on this sentence for the past 15 seconds to understand it - rewrite

    7. l = RDPGEstimator(loops=False) # number of latent dimensions is not known
       model.fit(A)
       Phat = model.p_mat_

      i'd make a "d_hat" variable somewhere in this code
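
      e.g. something like this (a sketch -- I'm assuming RDPGEstimator accepts an n_components argument, and the simulated network is just a stand-in for the section's A):

      ```python
      from graspologic.simulations import sbm
      from graspologic.models import RDPGEstimator

      # stand-in for the example's adjacency matrix A
      A = sbm([50, 50], [[0.5, 0.2], [0.2, 0.5]])

      d_hat = 2  # our guess (or estimate) of the number of latent dimensions
      model = RDPGEstimator(loops=False, n_components=d_hat)
      model.fit(A)
      Phat = model.p_mat_
      print(Phat.shape)
      ```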

    8. What if we did not know that $d$ was 2 ahead of time

      what if we didn't know that there were two latent dimensions ahead of time?

      I would also visualize the latent position matrix somewhere
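
      e.g. (a sketch -- I believe AdjacencySpectralEmbed picks the number of dimensions automatically when n_components isn't given, and the simulated network is just a stand-in for this example's A):

      ```python
      import matplotlib.pyplot as plt
      from graspologic.simulations import sbm
      from graspologic.embed import AdjacencySpectralEmbed

      # stand-in for the example's adjacency matrix A
      A = sbm([75, 75], [[0.6, 0.2], [0.2, 0.4]])

      ase = AdjacencySpectralEmbed()   # n_components=None -> chosen automatically
      Xhat = ase.fit_transform(A)
      d_hat = Xhat.shape[1]
      print("estimated number of latent dimensions:", d_hat)

      # quick look at the estimated latent position matrix
      plt.imshow(Xhat, aspect="auto", cmap="Purples")
      plt.xlabel("latent dimension")
      plt.ylabel("node")
      plt.colorbar();
      ```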

    9. from graphbook_code import plot_latents
       fig, axs = plt.subplots(1, 2, figsize=(12, 6))
       heatmap(Phat, vmin=0, vmax=1, font_scale=1.5, title="$\hat P_{RDPG}$", ax=axs[0])
       heatmap(P, vmin=0, vmax=1, font_scale=1.5, title="$P_{RDPG}$", ax=axs[1])
       fig;

      hide this?

    10. We will evaluate the performance of the RDPG estimator agaigraspologic.plot the estimated probability matrix, $\hat P = \hat X \hat X^\top$, to the true probability matrix, $P = XX^\top$.

      this sentence is confusing

    11. st 1) and when people are very far apart, we think that they will have a very low probability of being friends (almost 0). We define $X$ to have rows given by: $\vec x_i = \begin{bmatrix} \left(\frac{60 - i}{60}\right)^2 \\ \left(\frac{i}{60}\right)^2 \end{bmatrix}$

      this latent position matrix doesn't really line up with the street situation you're describing - I'd either change the latent position matrix or change the example

    12. We define $X$ to have rows given by: $\vec x_i = \begin{bmatrix} \left(\frac{60 - i}{60}\right)^2 \\ \left(\frac{i}{60}\right)^2 \end{bmatrix}$

      this doesn't feel like it transitions from the street example well enough: I don't immediately see the connection between X and the street and this equation
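
      One way to make (or sanity-check) the connection is to just build the X above and look at the dot products it produces (a quick sketch):

      ```python
      import numpy as np

      n = 60                      # 60 people along the road
      i = np.arange(1, n + 1)

      # the latent positions defined in the text
      X = np.column_stack((((n - i) / n) ** 2, (i / n) ** 2))

      # edge probabilities are the dot products of latent positions
      P = X @ X.T

      # neighbors near the start of the road, neighbors in the middle,
      # and the two people at opposite ends
      print(P[0, 1], P[29, 30], P[0, 59])
      ```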

    13. Let’s assume that we have 60 people who live along a very long road that is 60 miles long, and each person is 1 mile apart.

      I would literally have this be the first sentence of the section

    14. The $\hat\cdot$ symbol just means that $\hat d$ is an estimate of the number of latent dimensions $d$, and not necessarily the actual number of latent dimension

      consider rewriting to something like

      "the hat symbol above the $d$ means that it's our best guess for the number of dimensions (using some reasonable estimation method), rather than it being the actual number of dimensions."

    15. We might have a reasonable ability to "guess" what $d$ is ahead of time, but this will often not be the case

      in some situations we can guess $d$, but we want some way of picking it automatically

    16. he estimate of $X$ is produced by using the Adjacency Spectral Embedding, by embedding the observed network $A$ into $d$ (if the number of latent dimensions is known) or $\hat d$ (if the number of latent dimensions is not known) dimensions.

      "... by embedding the observed network $A$ into $d$ or $\hat d$, depending on whether the number of latent dimensions is known or not."

    17. We estimate $X$ extremely simply for a realization $A$ of a random network $\mathbf A$ which is characterized using the a priori Random Dot Product Graph.

      I didn't understand this sentence on first read, would rephrase

    18. That expression, it turns out, is a lot more complicated than what we had to deal with for the a priori Stochastic Block Model. Taking the log gives us that:

      give the reader warning in advance

    19. , in fact, additional steps on top of how we estimate parameters for an a priori Random Dot Product Graph (RDPG).

      run-on sentence and confusing phrasing

    20. $\mathbb P_\theta(A) = \sum_{\vec\tau \in \mathcal T} \prod_{k=1}^K \left[\pi_k^{n_k} \cdot \prod_{k'=1}^K b_{k'k}^{m_{k'k}} (1 - b_{k'k})^{n_{k'k} - m_{k'k}}\right]$

      I would organize this whole section like:

      "look how complicated the thing below is:

      <the equation>

      this equation is way too complicated, so we're going to have to get a bit creative here"

    1. original matrix B
       [[ 1  2  3  4]
        [ 5  6  7  8]
        [ 9 10 11 12]
        [13 14 15 16]]
       row permutation:
       [[ 5  6  7  8]
        [ 9 10 11 12]
        [13 14 15 16]
        [ 1  2  3  4]]
       column permutation:
       [[ 2  3  4  1]
        [ 6  7  8  5]
        [10 11 12  9]
        [14 15 16 13]]

      the formatting is a bit visually chaotic

      I might replace with a figure that uses color to show row/column movement & a heatmap with all 0's and a row of 1s
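
      Roughly what I have in mind (a sketch using graspologic's heatmap; the B and P here are made up):

      ```python
      import numpy as np
      import matplotlib.pyplot as plt
      from graspologic.plot import heatmap

      # a matrix of all 0s with a single row of 1s, so the movement is easy to see
      B = np.zeros((4, 4))
      B[0, :] = 1

      # a permutation matrix sending index 0 -> 1, 1 -> 2, 2 -> 3, 3 -> 0
      P = np.eye(4)[[3, 0, 1, 2]]

      fig, axs = plt.subplots(1, 3, figsize=(15, 5))
      heatmap(B, ax=axs[0], cbar=False, title="Original $B$")
      heatmap(P @ B, ax=axs[1], cbar=False, title="Row permutation $PB$")
      heatmap(B @ P.T, ax=axs[2], cbar=False, title="Column permutation $BP^T$");
      ```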

    2. Due to the one-to-one nature of these matchings, they are also known as bijections

      this sentence will be meaningless to many readers

    1. Stated another way, our observed network is assumed to be a realization of a governing random network. From now on, when we say the word network without the word random in front of it, we are referring to the realizations of random networks.

      dax: this sentence is doing a ton of heavy lifting and needs more pomp+circumstance.

      maybe in its own section, and another paragraph (or two)?

    1. the diagonal is entirely 0

      "edges can't connect to themselves" - I wouldn't talk about the diagonal, that's not about the network, its about the adjacency matrix

    1. et’s try an example of an a priori RDPG. We will use the same example that we used in the single network models section, where we

      I'd add a big header here that says something like "Fitting Models For Random Dot Product Graphs"

    2. $\frac{\partial}{\partial b_{l'l}} \log \mathbb P_\theta(A) = 0 + \frac{\partial}{\partial b_{l'l}}\left[m_{l'l}\log b_{l'l} + (n_{l'l} - m_{l'l})\log(1 - b_{l'l})\right] = \frac{m_{l'l}}{b_{l'l}} - \frac{n_{l'l} - m_{l'l}}{1 - b_{l'l}} = 0 \;\Rightarrow\; b^*_{l'l} = \frac{m_{l'l}}{n_{l'l}}$

      too mathy

    3. $\frac{\partial}{\partial b_{l'l}} \log \mathbb P_\theta(A) = \frac{\partial}{\partial b_{l'l}} \sum_{k, k' \in [K]} \left[m_{k'k}\log b_{k'k} + (n_{k'k} - m_{k'k})\log(1 - b_{k'k})\right] = \sum_{k, k' \in [K]} \frac{\partial}{\partial b_{l'l}}\left[m_{k'k}\log b_{k'k} + (n_{k'k} - m_{k'k})\log(1 - b_{k'k})\right]$

      too mathy

    4. $\mathbb P_\theta(A) = \prod_{k, k' \in [K]} b_{k'k}^{m_{k'k}} \cdot (1 - b_{k'k})^{n_{k'k} - m_{k'k}}$ where $n_{k'k} = \sum_{i < j}\mathbb 1_{\tau_i = k}\mathbb 1_{\tau_j = k'}$ was the number of possible edges between nodes in community $k$ and $k'$, and $m_{k'k} = \sum_{i < j}\mathbb 1_{\tau_i = k}\mathbb 1_{\tau_j = k'}a_{ij}$ was the number of edges in the realization $A$ between nodes within communities $k$ and $k'$.

      way too mathy, nobody in industry is gonna know what an indicator function is

    5. ve, which we omit since it is rather mathematically tedious, we see that the second derivative at $p^*$ is negative, so we indeed have found an estimate of the maximum, and will be denoted by $\hat p$. This gives that the Maximum Likelihood Estimate (or, the MLE, for short) of the probability $p$ for a random network $\mathbf A$ which is ER is: $\hat p = \frac{m}{\binom{n}{2}}$

      I don't think we should be talking about MLEs right when we introduce ER, way too in-depth

    6. xt, we take the derivative with respect to $p$, set equal to 0, and we

      I don't think there's any reason to talk about derivatives in the first couple paragraphs of the introduction to an ER model
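
      Agreed -- the end result can just be shown directly, e.g. (sketch, assuming an undirected binary A with no self-loops):

      ```python
      import numpy as np

      A = np.array([[0, 1, 1],
                    [1, 0, 0],
                    [1, 0, 0]])
      n = A.shape[0]

      m = np.triu(A, k=1).sum()      # number of edges in the network
      n_possible = n * (n - 1) / 2   # number of possible edges, "n choose 2"
      p_hat = m / n_possible         # the fraction of possible edges that actually exist
      print(p_hat)                   # 2/3 here
      ```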

    7. $\frac{d}{dp}\log \mathbb P(\mathbf x_1 = x_1, ..., \mathbf x_n = x_n; p) = \frac{\sum_{i=1}^n x_i}{p} - \frac{n - \sum_{i=1}^n x_i}{1 - p} = 0 \;\Rightarrow\; \frac{\sum_{i=1}^n x_i}{p} = \frac{n - \sum_{i=1}^n x_i}{1 - p} \;\Rightarrow\; (1 - p)\sum_{i=1}^n x_i = p\left(n - \sum_{i=1}^n x_i\right) \;\Rightarrow\; \sum_{i=1}^n x_i - p\sum_{i=1}^n x_i = pn - p\sum_{i=1}^n x_i \;\Rightarrow\; p^* = \frac{1}{n}\sum_{i=1}^n x_i$

      way way way too mathy, anyone who isn't an academic will not read any of this

    8. $\mathbb P_\theta(\mathbf x_1 = x_1, ..., \mathbf x_n = x_n; p) = \prod_{i=1}^n \mathbb P(\mathbf x_i = x_i) = \prod_{i=1}^n p^{x_i}(1 - p)^{1 - x_i} = p^{\sum_{i=1}^n x_i}(1 - p)^{n - \sum_{i=1}^n x_i}$

      too mathy imo, a non-academic's eyes will glaze over

    9. We estimate $X$ extremely simply for a realization $A$ of a random network $\mathbf A$ which is characterized using the a priori Random Dot Product Graph.

      no idea what this means

    10. Whereas the log of a product of terms is the sum of the logs of the terms, no such easy simplification exists for the log of a sum of terms. This means that we will have to get a bit creative here. Instead, we will turn first to the a priori Random Dot Product Graph, and then figure out how to estimate parameters from a a posteriori SBM using that.

      I'm lost here because I didn't read the 'why use statistical models' section recently

      (which implies that someone would have had to read and understand the 'why use statistical models' section to understand this)

    11. $\mathbb P_\theta(A) = \sum_{\vec\tau \in \mathcal T} \prod_{k=1}^K \left[\pi_k^{n_k} \cdot \prod_{k'=1}^K b_{k'k}^{m_{k'k}}(1 - b_{k'k})^{n_{k'k} - m_{k'k}}\right]$ That expression, it turns out, is a lot more complicated than what we had to deal with for the a priori Stochastic Block Model. Taking the log gives us that: $\log \mathbb P_\theta(A) = \log\left(\sum_{\vec\tau \in \mathcal T} \prod_{k=1}^K \left[\pi_k^{n_k} \cdot \prod_{k'=1}^K b_{k'k}^{m_{k'k}}(1 - b_{k'k})^{n_{k'k} - m_{k'k}}\right]\right)$

      I don't think anyone who is non-mathy will get anything from this

      (my eyes glazed over when I saw those equations)

  7. Aug 2021
  8. Jul 2021
    1. In particular, what this result says is that if we were to look at the sum of $A$ expressed above, and only look at the sum of the first $k$ of those terms, that rank $k$ matrix $A_k$ is the most similar rank $k$ matrix to $A$, according to the Frobenius norm.

      This is telling us that $A_k$ is the matrix which is as similar as possible to $A$, but is less complicated (meaning, it's only rank $k$ instead of being full-rank).
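
      This could even be shown numerically (a sketch; the random symmetric matrix is just for illustration):

      ```python
      import numpy as np

      rng = np.random.default_rng(0)
      A = rng.normal(size=(10, 10))
      A = (A + A.T) / 2              # make it symmetric

      U, s, Vt = np.linalg.svd(A)

      k = 3
      # keep only the first k singular values/vectors: the best rank-k
      # approximation of A in Frobenius norm
      A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

      print(np.linalg.matrix_rank(A_k))      # 3
      print(np.linalg.norm(A - A_k, "fro"))  # the approximation error
      print(np.linalg.norm(A, "fro"))        # compared to the size of A itself
      ```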

    2. Discussing the below, we will delve briefly into the concept of matrix rank, which we will define now.

      needs more of a transition: why are we talking about matrix rank now? What's the connection to what we were just talking about?

    3. Let’s illustrate

      Add a sentence describing orthogonality geometrically

      so, " it's useful to think about the fact that the columns are orthogonal geometrically. If you think of each column as a vector in space, those vectors will all be at 90 degree angles from each other - and so every pair of columns will have a dot product of 0." (maybe you can go into what having a dot product of 0 implies about the information that the columns contain)

    4. hat is, $A \in \mathbb R^{n \times n}$, and for any $i, j \in [n]$, $a_{ij} = a_{ji}$

      I would put a figure here that shows a small symmetric matrix, with an arrow pointing to the symmetry and text that says something like "the upper and lower portions of the matrix are the same"

    5. This description of the SVD has been modified to fit our purposes: particularly, the description we provide applies only

      This description only works for square matrices because it's easier and more intuitive to think through, but you can (and should!) find other descriptions that generalize to nonsquare matrices

    6. Note that for these and successive sections, we will present a simplified, and non-rigorous, review of the SVD and many results that are important for developing intuition around this decomposition.

      feels pretty formal

  9. Jun 2021
    1. ax=adjplot(A[tuple([vtx_perm])][:,vtx_perm], meta=meta, color="School", palette="Blues")

      I might add something that explains what the blue lines mean

    2. ting_context("talk", font_scale=1):
           ax = sns.heatmap(X, cmap="Purples", ax=ax, cbar_kws=dict(shrink=1), yticklabels=False, xticklabels=False, vmin=0, vmax=1)
           ax.set_title(titl

      have text annotations in the center of this matrix and then remove the colorbar

    3. import matplotlib.pyplot as plt
       import seaborn as sns
       import numpy as np

      I'd change the figure a bit to map the colors to numbers a bit better so that the reader understands how the figure represents a vector

    4. Next, let’s plot what $\vec\tau$ and $B$ look like:
       import matplotlib.pyplot as plt
       import seaborn as sns
       import numpy as np
       import matplotlib

       def plot_tau(tau, title="", xlab="Node"):
           cmap = matplotlib.colors.ListedColormap(["skyblue", 'blue'])
           fig, ax = plt.subplots(figsize=(10,2))
           with sns.plotting_context("talk", font_scale=1):
               ax = sns.heatmap((tau - 1).reshape((1,tau.shape[0])), cmap=cmap, ax=ax, cbar_kws=dict(shrink=1), yticklabels=False, xticklabels=False)
               ax.set_title(title)
               cbar = ax.collections[0].colorbar
               cbar.set_ticks([0.25, .75])
               cbar.set_ticklabels(['School 1', 'School 2'])
               ax.set(xlabel=xlab)
               ax.set_xticks([.5,149.5,299.5])
               ax.set_xticklabels(["1", "150", "300"])
               cbar.ax.set_frame_on(True)
           return

       n = 300  # number of students
       # tau is a column vector of 150 1s followed by 150 2s
       # this vector gives the school each of the 300 students are from
       tau = np.vstack((np.ones((int(n/2),1)), np.full((int(n/2),1), 2)))
       plot_tau(tau, title="Tau, Node Assignment Vector", xlab="Student")

      I would move image to right where you first explained it

    5. To describe the a priori SBM, we will use a latent variable model. To do so, we will assume there is some vector-valued random variable, $\vec{\pmb\tau}$, which we will call the node assignment vector. This random variable takes values $\vec\tau$ which are in the space $\{1,...,K\}^n$. That means that each element of a realization $\vec\tau$ takes one of $K$ possible values. Each node receives a community assignment, so we say that $i$ goes from 1 to $n$. Stated another way, each node $i$ of our network receives an assignment $\tau_i$ to one of the $K$ communities. This model is called the a priori SBM because we use it when we have a realization $\vec\tau$ that we know ahead of time. In our social network example, for instance, $\tau_i$ would reflect that each student can attend one of two possible schools. For a single node $i$ that is in community $\ell$, where $\ell \in \{1, ..., K\}$, we write that $\tau_i = \ell$.

      We want to assign each node to one of some number of communities, which we're calling K.

      We can designate which community each node belongs to with a big list. The list will have the same length as the number of nodes. If there are three communities, for instance, and the first two nodes were in community 1, the third was in community 2, and the fourth was in community 3, then the list would look like this: [1, 1, 2, 3]

      This list doesn't have to be set in stone. In fact, we can draw our community values randomly bla bla bla probability distributions etc
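
      e.g. (a minimal sketch; the community probabilities are made up):

      ```python
      import numpy as np

      n = 10                  # number of nodes
      K = 3                   # number of communities
      pi = [0.5, 0.3, 0.2]    # probability of a node landing in each community

      rng = np.random.default_rng(0)
      # draw a community assignment for every node at random
      tau = rng.choice(np.arange(1, K + 1), size=n, p=pi)
      print(tau)  # one community label (1, 2, or 3) per node
      ```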

    6. school 1 or school 2. Our network has 100 nodes, and each node represents a single student. The edges of this network represent whether a pair of students are friends. Intuitively, if two students go to the same school, it might make sense to say that they have a higher chance of being friends than if they do not go to the same school. If we were to try to characterize this network using an ER network, we would run into a problem very similar to when we tried to capture the two cluster coin flip example with only a single coin. Intuitively, there must be a better way! The Stochastic Block Model, or SBM, captures this idea by assigning each of the $n$ nodes in the network to one of $K$ communities. A community is a group of

      I would add a layouts-type figure to show the schools visually here

    7. If we were to try to characterize this network using an ER network

      If we were to say that every single student has the same probability of being friends, like we would have to if we described this situation with an erdos-renyi model, we'd be wrong

    8. we would run into a problem very similar to when we tried to capture the two cluster coin flip example with only a single coin.

      delete if coin stuff is getting deleted

    9. Add something like:

      "You could view this setup as a network!. In this network, there would be 100 nodes, and each node would correspond to a particular student"

  10. May 2021
    1. So now we have a combined representation for our separate embeddings, but we have a new problem: our latent positions suddenly have way too many dimensions. In this example they have eight (the number of columns in our combined matrix), but remember that in general we’d have $m \times d$. This somewhat defeats the purpose of an embedding: we took a bunch of high-dimensional objects and turned them all into a single high-dimensional object. Big whoop. We can’t see what our combined embedding look like in euclidean space, unless we can somehow visualize $m \times d$ dimensional space (hint: we can’t). We’d like to just have $d$ dimensions - that was the whole point of using $d$ components for each of our Adjacency Spectral Embeddings in the first place!

      talk about rotational invariance

    1. __drawArrow__

      let's not use this __method__ syntax in class methods?

      that's usually reserved for magic/dunder methods which change stuff like what happens when you use operators on instances of the class

      https://www.tutorialsteacher.com/python/magic-methods-in-python

      if you want a method to be private, let's use the _method syntax, or if you really want it to be private, the __method syntax (the double-underscore at the beginning actually causes python to change the name of the method in the namespace)
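
      (a quick illustration of the difference, if useful:)

      ```python
      class Plotter:
          def _draw_arrow(self):        # "private by convention"
              return "single underscore"

          def __draw_arrow(self):       # name-mangled to _Plotter__draw_arrow
              return "double underscore"

      p = Plotter()
      print(p._draw_arrow())             # works; the underscore just signals "internal"
      print(p._Plotter__draw_arrow())    # the mangled name python actually stores
      # p.__draw_arrow()                 # AttributeError: no attribute '__draw_arrow'
      ```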

  11. Apr 2021
    1. reflect

      dont like this word, maybe just "we might want to say that..."

      also, I think could restructure the group of sentences, might make more sense to say that Alice has a lower probability of being friends with Bob than a random person, since Bob is unpopular

    2. def plot_lp(X, title="", ylab="Student"):

      I like the idea, but it looks like the plot is continuous rather than discrete. Again a big fan of having only 30 students (or less), and making lines under each row, that way the reader can think through this in terms of smaller numbers

    3. $\vec x_1 = \begin{bmatrix}1 \\ 0\end{bmatrix}$,

      I think it's a bit of a cognitive load on readers to see column vectors and then have to realize that those column vectors correspond to the row vectors of X

    4. Let’s assume, for instance,

      i'd put the "let's assume" part as the first sentence, remove the "for instance", and restructure to account for that, that way we begin with a scenario

    5. We write that $X \in \mathbb R^{n \times d}$, which means that it is a matrix with real values, $n$ rows, and $d$ columns.

      hopefully we also explain this notation earlier

    6. What is so special about this formulation of the SBM problem?

      latent positions are really, really important, and this feels kind of tucked away, I'd give this part its own heading, and multiple figures, and say something at the beginning of the section like, "and now we get to one of the most important ideas in network modeling: that there's a way to give the nodes of networks a location in the coordinate space that other machine learning algorithms use".

    7. $\mathbf a_{ii} = 0$.

      instead of statements like "a_{ii} = 0", in general, I'd say something like "there are zeroes along the diagonal"? seems more clear to non-advanced readers

    8. $\tau_i = \ell$ and $\tau_j = k$, that $\mathbf a_{ij} \sim Bern(b_{\ell k})$.

      not convinced that someone without a math background would understand this statement, I'd write it with words instead

    9. Say we have 300 students, and we know that each student goes to one of two possible schools.

      i'd put the figure directly below this sentence to illustrate

    10. In the first set of twenty coin flips, all of the coin flips are performed with the same coin. Stated another way, we have a single cluster, or a set of coin flips which are similar. On the other hand, in the second set of twenty coin flips, twenty of the coin flips are performed with a fair coin, and ten of the coin flips are performed with a different coin which is not fair. Here, we have two clusters of coin flips, those that occur with the first coin, and those that occur with the second coin. Since the first cluster of coin flips are with a fair coin, we expect that coin flips from the first cluster will not necessarily have an identical number of heads and tails, but at least a similar number of heads and tails. On the other hand, coin flips from the second cluster will tend to have more heads than tails. What does this example have to do with networks? In the above examples, the two sets of coin flips differ in the number of coins with different probabilities that we use for the example. The first example has only one unique coin, whereas the second example has two unique coins with different probabilities of heads or tails. If we were to assume that the second example had been performed with only a single coin when in reality it was performed with two different coins, we would be unable to capture that the second ten coin flips had a substantially different chance of landing on heads than the first ten coin flips. Just like coin flips can be performed with fundamentally different coins, the nodes of a network could also be fundamentally different. The way in which two nodes differ (or do not differ) sometimes holds value in determining the probability that an edge exists between them.

      this should all probably be slimmed down a lot, I read the whole thing out of willpower but my brain was telling me to skim it

    11. Basically, the approach is to look at each entry of $A$ which can take different values, and multiply the total number of possibilities by 2 for every element which can take different values

      i'd approach this kinda like the intro does it, using example of 2, then 4, etc
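
      e.g. the counts for small n could be shown directly (sketch):

      ```python
      from math import comb

      # number of possible simple undirected networks on n nodes:
      # each of the (n choose 2) possible edges is either present or absent
      for n in [2, 3, 4, 5]:
          print(n, 2 ** comb(n, 2))   # 2, 8, 64, 1024
      ```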

    12. If $A$ is $2 \times 2$, there are $\binom{2}{2} = 1$ unique entry of $A$, which takes one of 2 values. There are 2 possible ways that $A$ could look: $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ or $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ If $A$ is $3 \times 3$, there are $\binom{3}{2} = \frac{3 \times 2}{2} = 3$ unique entries of $A$, each of which takes one of 2 values. There are 8 possible ways that $A$ could look: $\begin{bmatrix}0&1&1\\1&0&1\\1&1&0\end{bmatrix}$ or $\begin{bmatrix}0&1&0\\1&0&1\\0&1&0\end{bmatrix}$ or $\begin{bmatrix}0&0&1\\0&0&1\\1&1&0\end{bmatrix}$ or $\begin{bmatrix}0&1&1\\1&0&0\\1&0&0\end{bmatrix}$ or $\begin{bmatrix}0&0&1\\0&0&0\\1&0&0\end{bmatrix}$ or $\begin{bmatrix}0&0&0\\0&0&1\\0&1&0\end{bmatrix}$ or $\begin{bmatrix}0&1&0\\1&0&0\\0&0&0\end{bmatrix}$ or $\begin{bmatrix}0&0&0\\0&0&0\\0&0&0\end{bmatrix}$

      I like this stuff

    13. n = 300
        xs_1 = np.random.normal(size=n)
        pi = 0.5
        ys = np.random.binomial(1, pi, size=n)
        xs_2 = np.random.normal(loc=ys*5)
        sex=["Male", "Female"]

      might actually rename variables and show this block of code
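
      e.g. something like this (a sketch of the same block with more descriptive names -- my reading, based on the surrounding text, is that ys is the sex indicator and xs_2 is the sex-dependent feature):

      ```python
      import numpy as np

      n_students = 300
      p_female = 0.5

      # a feature that doesn't depend on sex: same distribution for everyone
      feature_unrelated = np.random.normal(size=n_students)

      # 0 = male, 1 = female, assigned to each student with probability 1/2
      is_female = np.random.binomial(1, p_female, size=n_students)

      # a feature whose mean depends on sex: mean 0 for males, mean 5 for females
      feature_by_sex = np.random.normal(loc=is_female * 5)

      sex_labels = ["Male", "Female"]
      ```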

    14. it has clearly delineated communities, which are the vertices that comprise the obvious “squares” in the above adjacency matrix.

      it has distinct communities. Remember that the connections for each node is represented by a row of this heatmap. The first half of the rows have strong connections with the first half of the columns, so the students in the first school are likely close friends

      ^ I don't like this much, would like to see more simplified, but probably how I would describe it

    15. We notice this from the fact that there are more connections between people from school 1 than from school 2.

      unnecessary sentence (basically the same as the previous one, but reworded)

    16. =False)
        meta = pd.DataFrame(
            data = {"School": tau.reshape((n)).astype(int)}
        )
        ax=adjplot(A, meta=meta, color="School", palette="Blues")

      add legend that says which color belongs to which school

    17. meta = pd.DataFrame(
            data = {"School": tau.reshape((n)).astype(int)}
        )
        ax=adjplot(A, meta=meta, color="School", palette="Blues")

      this code is confusing for a reader who hasn't played with graspologic's adjplot section

      I'd break into two sections and hide

    18. 0.20.2.

      Now, let's turn this scenario into a network, which we can model with a Stochastic Block Model. Students can represent nodes, and their friendships can represent edges.

    19. $i$ is male, is a realization of a Gaussian random variable with mean 0 and variance 1, and if $i$ is female, is a realization of a Gaussian random variable with mean 5 and variance 1.

      sentence structure is confusing to me

    20. For instance, if we see a half of the nodes have a very high degree, and the rest of the nodes with a much lower degree, we can reasonably conclude the network might be more complex than can be described by the ER model.

      I'd replace this with "For instance, if we see that half the nodes have a ton of edges (meaning, they have a high degree), and half don't, we should probably use a more complicated model than an Erdos-Renyi"

    21. <hypothesis-highlight class="hypothesis-highlight">$\vec\tau$</hypothesis-highlight>

      rendering weird

    1. The goal of MASE is to embed the networks into a single space, with each point in that space representing a single node

      you want to separately identify homogeneity and heterogeneity

  12. Mar 2021
    1. let’s imagine we are tossing a coin 100 times, and we want to determine what the probability of the coin landing on heads is.

      Maybe start the note with this example, and then lead into "the traditional framework..." after? homl does this a lot, where the first thing you read will be an example or a thought experiment