519 Matching Annotations
  1. Apr 2022
    1. heatmap

      range: -1 to +1

    2. Since most of these metrics already exist in networkx, you’ll just pull from there. You can check the networkx documentation for details.

      use graspologic

    3. community

      estimated community

    4. features

      these features

    5. for reference, gives an indication of how clustered the network is, and works by counting the proportion of triangles out of all edges that share a node.

      refer to previous definition
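
      To make "refer to previous definition" easy, a minimal networkx sketch of exactly this quantity could sit nearby (illustrative only; graph and parameters are arbitrary):

        import networkx as nx

        # Sketch: clustering measures the proportion of closed triangles among
        # pairs of edges that share a node.
        G = nx.erdos_renyi_graph(n=50, p=0.3, seed=42)
        print(nx.transitivity(G))        # global: 3 * triangles / connected triples
        print(nx.average_clustering(G))  # per-node fractions, averaged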

    6. triangles

      clarify earlier that we always mean closed after

    7. statistics

      for the 4 statistics we show

    8. $10^{350}$

      cite # atoms in universe, number of chess games, number of go games on a standard go board
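
      Commonly cited figures for that comparison (worth double-checking before citing): roughly $10^{80}$ atoms in the observable universe, a game-tree complexity of about $10^{120}$ for chess (the Shannon number), and about $2 \times 10^{170}$ legal positions on a standard $19 \times 19$ Go board. For the network count itself, there are $2^{\binom{n}{2}}$ simple networks on $n$ nodes, so already at $n = 50$ we get $2^{1225} \approx 10^{369}$.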

    9. How do you define a distance between one node and another node?

      How do you divide a network by the number 6

    1. largest

      mention strongly and weakly connected (in a sentence)

    2. (SI, BK), (BK, MH), (MH, Q), (MH, BX), (Q, BX)

      use 1-letter acronyms for each borough

    3. Island

      ,

    4. $A$

      for undirected graphs

    5. figure

      move MH

    6. see

      change layout so they are not linear

    7. degree

      do directed case too

    8. matrix

      do directed case too

    9. π‘‘π‘’π‘”π‘Ÿπ‘’π‘’(𝑖)=βˆ‘π‘—=1π‘›π‘Žπ‘–π‘—=βˆ‘π‘—:π‘Žπ‘–π‘—=1π‘Žπ‘–π‘—+βˆ‘π‘—:π‘Žπ‘–π‘—=0π‘Žπ‘–π‘—=βˆ‘π‘—:π‘Žπ‘–π‘—=11+0

      remove
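
      If the identity stays instead, a tiny numpy sketch (illustrative only) shows it is just a row sum:

        import numpy as np

        # degree(i) is the i-th row sum of the adjacency matrix: entries with
        # a_ij = 1 contribute 1 each, entries with a_ij = 0 contribute nothing.
        A = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [1, 0, 0]])
        print(A.sum(axis=1))  # [2 1 1]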

    10. adjacency

      weighted

    11. $j$

      for example

    12. matrix

      hollow

    13. trite

      why?

    14. In the context of this book, you will usually only worry about the undirected case, or when the presence of an arrow implies that the other direction exists, too. A network is undirected if an edge between node $i$ and node $j$ implies that node $j$ is also connected to node $i$. For this reason, you will usually omit the arrows entirely, like you show below:

      we

    15. red

      super not salient

    16. you look

      we see

    1. The

      An Adjacency Matrix not The Adjacency Matrix

    2. Matrix

      there are n! different adjacency matrices, and we are looking at 1 of them.

    3. Laplacian

      mention directed graphs

      and mention all the ones in graspologic
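
      A sketch of "all the ones in graspologic" as I recall the API (the form names should be verified against the graspologic docs):

        import numpy as np
        from graspologic.utils import to_laplacian

        A = np.array([[0., 1., 1.],
                      [1., 0., 0.],
                      [1., 0., 0.]])
        L_dad = to_laplacian(A, form="DAD")     # D^{-1/2} A D^{-1/2}
        L_idad = to_laplacian(A, form="I-DAD")  # I - D^{-1/2} A D^{-1/2}
        L_rdad = to_laplacian(A, form="R-DAD")  # regularized variant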

    4. This lets you make inferences with $L^{normalized}$ that you couldn't make with $L$.

      true?

    5. code

      there should be a function in graspologic to convert an adjacency to a laplacian and back.
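
      graspologic.utils.to_laplacian covers one direction; the reverse seems to need a manual step, since the degrees must come from somewhere. A hedged round-trip sketch for the normalized form:

        import numpy as np
        from graspologic.utils import to_laplacian

        A = np.array([[0., 1., 1.],
                      [1., 0., 0.],
                      [1., 0., 0.]])
        L = to_laplacian(A, form="DAD")             # L = D^{-1/2} A D^{-1/2}
        D_half = np.diag(np.sqrt(A.sum(axis=1)))    # recover D from A's degrees
        print(np.allclose(D_half @ L @ D_half, A))  # invert the normalization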

    6. below

      ?

    7. AttributeError Traceback (most recent call last)

      fix

    8. 4.1.2. The Incidence Matrix. Instead of having values in a symmetric matrix represent possible edges, like with the Adjacency Matrix, we could have rows represent nodes and columns represent edges. This is called the Incidence Matrix, and it's useful to know about - although it won't appear too much in this book. If there are n nodes and m edges, you make an n x m matrix. Then, to determine whether a node is a member of a given edge, you'd go to that node's row and the edge's column. If the entry is nonzero (1 if the network is unweighted), then the node is a member of that edge, and if there's a 0, the node is not a member of that edge. You can see the incidence matrix for our network below. Notice that with incidence plots, edges are (generally arbitrarily) assigned indices as well as nodes.

        from networkx.linalg.graphmatrix import incidence_matrix
        cmap = sns.color_palette("Purples", n_colors=2)
        I = incidence_matrix(nx.Graph(A)).toarray().astype(int)
        fig, axs = plt.subplots(1, 2, figsize=(12, 6))
        plot = sns.heatmap(I, annot=True, linewidths=.1, cmap=cmap, cbar=False,
                           xticklabels=True, yticklabels=True, ax=axs[0])
        plot.set_xlabel("Edges")
        plot.set_ylabel("Nodes")
        plot.set_title("Incidence matrix", fontsize=18)
        nx.draw_networkx(G, with_labels=True, node_color="tab:purple", pos=pos,
                         font_size=10, font_color="whitesmoke", arrows=True,
                         edge_color="black", width=1, ax=axs[1])
        ax2 = plt.gca()
        ax2.text(.24, 0.2, s="Edge 1", color='black', fontsize=11, rotation=65)
        ax2.text(.45, 0.01, s="Edge 0", color='black', fontsize=11)
        ax2.set_title("Layout plot", fontsize=18)
        sns.despine(ax=ax2, left=True, bottom=True)

    When networks are large, incidence matrices tend to be extremely sparse - meaning, their values are mostly 0's. This is because each column must have exactly two nonzero values along its rows: one value for the first node its edge is connected to, and another for the second. Because of this, incidence matrices are usually represented computationally in Python as scipy's sparse matrices rather than as numpy arrays, since this data type is much better-suited for matrices which contain mostly zeroes. You can also add orientation to incidence matrices, even in undirected networks, which we'll discuss next.

    4.1.3. The Oriented Incidence Matrix. The oriented incidence matrix is extremely similar to the normal incidence matrix, except that you assign a direction or orientation to each edge: you define one of its nodes as being the head node, and the other as being the tail. For undirected networks, you can assign directionality arbitrarily. Then, for the column in the incidence matrix corresponding to a given edge, the tail node has a value of -1, and the head node has a value of 1. Nodes which aren't a member of a particular edge are still assigned values of 0. We'll give the oriented incidence matrix the name N.

        from networkx.linalg.graphmatrix import incidence_matrix
        cmap = sns.color_palette("Purples", n_colors=3)
        N = incidence_matrix(nx.Graph(A), oriented=True).toarray().astype(int)
        fig, axs = plt.subplots(1, 2, figsize=(12, 6))
        plot = sns.heatmap(N, annot=True, linewidths=.1, cmap=cmap, cbar=False,
                           xticklabels=True, yticklabels=True, ax=axs[0])
        plot.set_xlabel("Edges")
        plot.set_ylabel("Nodes")
        plot.set_title("Oriented Incidence matrix $N$", fontsize=18)
        plot.annotate("Tail Node", (.05, .95), color='black', fontsize=11)
        plot.annotate("Head Node", (.05, 1.95), color='white', fontsize=11)
        nx.draw_networkx(G, with_labels=True, node_color="tab:purple", pos=pos,
                         font_size=10, font_color="whitesmoke", arrows=True,
                         edge_color="black", width=1, ax=axs[1])
        ax2 = plt.gca()
        ax2.text(.24, 0.2, s="Edge 1", color='black', fontsize=11, rotation=65)
        ax2.text(.45, 0.01, s="Edge 0", color='black', fontsize=11)
        ax2.set_title("Layout plot", fontsize=18)
        sns.despine(ax=ax2, left=True, bottom=True)
        ax2.text(-.1, -.05, s="Tail Node", color='black', fontsize=11)
        ax2.text(.9, -.05, s="Head Node", color='black', fontsize=11)

    Although we won't use incidence matrices, oriented or otherwise, in this book too much, we introduced them because there's a deep connection between incidence matrices, adjacency matrices, and a matrix representation that we haven't introduced yet called the Laplacian. Before we can explore that connection, we'll discuss one more representation: the degree matrix.

      put in appendix
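
      If this moves to the appendix, the "deep connection" teased at the end can still be shown in one line (hypothetical snippet): the oriented incidence matrix satisfies N N^T = D - A, the Laplacian.

        import numpy as np
        import networkx as nx

        G = nx.erdos_renyi_graph(10, 0.4, seed=0)
        N = nx.incidence_matrix(G, oriented=True).toarray()
        L = nx.laplacian_matrix(G).toarray()
        print(np.allclose(N @ N.T, L))  # oriented incidence gives N N^T = L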

    9. ax2.text(0, 0.2, s="Nodes 0 and 2 \nare connected", color='black', fontsize=11, rotation=63)

      add nodes 0 and 1 are connected

    10. The most

      A

  2. Mar 2022
    1. 1.1.2. Viewing Networks Statistically

      Network machine learning v. network data science

    2. 1.1. What Is A Network?

      What is Network Machine Learning

    3. features

      give example, but then show something else

    4. For instance, each node might have a set of features attached to it: extra information that comes in the form of a table. For instance

      2 for instances

    1. Acknowledgements

      add all the other crap he included

    2. models

      i wouldn't use the word 'model' in this book

    3. The theoretical underpinnings of the techniques described in the book

      i'd put in appendix

    4. network data science

      maybe make a venn diagram including networks, data science, and machine learning

    5. Network Machine Learning and You

      start with something awesome that happened in the world because of network data science?

    6. a statistical object

      avoid 'statistics' everywhere

    7. This book assumes you know next to nothing about how networks can be viewed as a statistical object

      about how you can learn from network data.

      not 'viewed as a stats object'

    8. With the nodes as employees, edges as team co-assignment between employees, and node covariates as employee performance, you can isolate groups within your company which are over or underperforming employee expectations.

      move to end before brains

    9. So

      delete 'so'

  3. Dec 2021
    1. $\rho$

      let's mention Rho too

    2. 5.5.3. Correlated Network Models

      put time-series after this

    3. $R^{(1)}$ and

      let these be the identity matrix. i think you can just QR these, and then inverse QR R^3

    4. score matrices

      clarify that they will be defined below

    5. Let's try this first by looking at the latent position matrices for $\mathbf A^{(1)}$ and $\mathbf A^{(2)}$ from the random networks for Monday and Tuesday first:

      they are implicit and not necessarily the same, rather, only X'X is the same.

    6. $N$

      M

  4. Oct 2021
    1. Pseudoinverse

      go end to end
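
      "End to end" might look like this hypothetical RDPG sketch: sample an adjacency vector, then map it back to latent space with the pseudoinverse (all parameter values are illustrative):

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(0.1, 0.5, size=(100, 2))  # assumed latent positions (n x d)
        P = X @ X.T                               # RDPG edge-probability matrix
        a = rng.binomial(1, P[0])                 # sampled adjacency vector for node 0
        x_hat = np.linalg.pinv(X) @ a             # least-squares estimate of X[0]
        print(X[0], x_hat)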

    2. array([0.74745577, 0.49019859])

      too many sig digs

    3. If you think about it, however, you can think of this adjacency vector as kind of an estimate for edge probabilities. Say you sample a node's adjacency vector from an RDPG, then you sample again, and again. Averaging all of your samples will get you closer and closer to the actual edge connection probabilities.

      not quite

    4. The average values were .775 and .149 - so, pretty close!

      say we computed

    5. took a long time

      computation

  5. Sep 2021
    1. their paper is called β€œOn a two-truths phenomenon in spectral graph clustering” - it’s an interesting read for the curious)

      up

    2. - in 2018 -

      ,

    3. Adjacency

      combine with the below

    4. useful information

      careful

    5. them

      explicitly refer to the figure

    6. Heck

      goes earlier

    7. Laplacian

      never write 'the laplacian' always write "normalize" (etc) Laplacian

    8. two

      is this a rank 2 matrix?

    9. Stochastic Block Model

      no caps

    10. Spectral Embedding

      no caps

    11. $\sigma_1 \begin{bmatrix} \uparrow \\ \vec u_1 \\ \downarrow \end{bmatrix} \begin{bmatrix} \leftarrow & \vec u_1^T & \rightarrow \end{bmatrix} + \sigma_2 \begin{bmatrix} \uparrow \\ \vec u_2 \\ \downarrow \end{bmatrix} \begin{bmatrix} \leftarrow & \vec u_2^T & \rightarrow \end{bmatrix} + ... + \sigma_n \begin{bmatrix} \uparrow \\ \vec u_n \\ \downarrow \end{bmatrix} \begin{bmatrix} \leftarrow & \vec u_n^T & \rightarrow \end{bmatrix}$

      confusing
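
      One way to un-confuse the display: verify numerically that the weighted rank-one terms rebuild the matrix (illustrative sketch):

        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.binomial(1, 0.3, size=(20, 20))
        A = np.triu(A, 1) + np.triu(A, 1).T   # symmetric, hollow adjacency matrix
        U, s, Vt = np.linalg.svd(A)
        A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(len(s)))
        print(np.allclose(A, A_rebuilt))      # the displayed sum, term by term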

    12. -

    13. close

      in MSE

    14. ninety-degree

      maybe not so correct, eg, 90 degrees in high dimensions does not specify the other angles?

    15. Singular Value Decomposition

      lower case

    16. and another which rotates them back

      i wouldn't say this

    17. $\begin{bmatrix} x_{11} & & & " \\ & x_{22} & & \\ & & \ddots & \\ " & & & x_{nn} \end{bmatrix}$

      makes me think this is diagonal

    18. you’ll

      the alg

    19. submatrices

      not them

    20. break a single matrix

      decompose a matrix

      or factorize?

    21. default

      for undirected graphs

    22. Laplacians

      such a thing as directed laplacians

  6. Aug 2021
    1. whether you’re using a neural network, or a decision tree, or whether your goal is to classify observations or to predict a value using regression -

      those 'or's are not parallel

    2. machine learning

      CS and physics

    1. 6.3.1. The Coin Flip Example

      starred? maybe mention MLE in section title?

    2. π‘βˆ—pβˆ—p^*

      ptilde

    3. two plots look almost nothing alike

      i wouldn't say that

    4. β„™πœƒ(𝐀=𝐴)=β„™(𝐚11=π‘Ž11,𝐚12=π‘Ž12,...,πšπ‘›π‘›=π‘Žπ‘›π‘›)=βˆπ‘–,π‘—β„™πœƒ(πšπ‘–π‘—=π‘Žπ‘–π‘—)

      don't forget punctuation

    5. (in network statistics, often just one)

      not quite

    6. we can never hope to understand the true distribution of $\mathbf A$.

      strong

    1. ⟨π‘₯βƒ—Β βˆ’π‘¦βƒ—Β ,π‘₯βƒ—Β βˆ’π‘¦βƒ—Β βŸ©

      remove

    2. With π‘₯βƒ—Β xβ†’\vec x defined as above in the sum example, we would have that: βˆπ‘–=13π‘₯𝑖=6.12∏i=13xi=6.12\begin{align*} \prod_{i = 1}^3 x_i = 6.12 \end{align*} if we were to use ={1,3}I={1,3}\mathcal I = \{1,3\}, then: βˆπ‘–βˆˆξˆ΅π‘₯𝑖=3.4

      redundant

    3. If we wanted to multiply all the elements of $\vec x$, we would write $\prod_{i=1}^d x_i = x_1 \times x_2 \times ... \times x_d$, where $\times$ is just multiplication like you are probably used to. Again, we have the exact same indexing conventions, where $\prod_{i \in [d]} x_i = \prod_{i=1}^d x_i = x_1 \times x_2 \times ... \times x_d$

      redundant

    4. In the general case, for a set $\mathcal S$, we would say that $S \in \mathcal S^{r \times c}$ if (think this through!) for any $i \in [r]$ and any $j \in [c]$, $s_{ij} \in \mathcal S$

      overkill?

  7. Jun 2021
    1. . We Like previously, there are two types of RDPGs: one in which 𝑋XX is treated as known, and another in which 𝑋XX is treated as unknown.

      delete

    2. Symbolically

      this is just 1 word.

    3. such

      such as

    4. 300

      be consistent. let's stick with 100 students

    5. , since they are undirected; if we relax the requirement of undirectedness (and allow directed networks) $B$ no longer need be symmetric.

      delete

    6. simple

      undirected

    7. Imagine that we are flipping a single fair coin. A fair coin is a coin in which the probability of seeing either a heads or a tails on a coin flip is 1/2. Let's imagine we flip the coin 20 times, and we see 10 heads and 10 tails. What would happen if we were to flip 2 coins, which had a different probability of seeing heads or tails? Imagine that we flip each coin 10 times. The first 10 flips are with a fair coin, and we might see an outcome of five heads and five tails. On the other hand, the second ten flips are not with a fair coin, but with a coin that has a 4/5 probability to land on heads, and a 1/5 probability of landing on tails. In the second set of 10 flips, we might see an outcome of nine heads and one tails. In the first set of 20 coin flips, all of the coin flips are performed with the same coin. Stated another way, we have a single group, or a set of coin flips which are similar. On the other hand, in the second set of twenty coin flips, ten of the coin flips are performed with a fair coin, and ten of the coin flips are performed with a different coin which is not fair. Here, we have two clusters of coin flips, those that occur with the first coin, and those that occur with the second coin. Since the first cluster of coin flips are with a fair coin, we expect that coin flips from the first cluster will not necessarily have an identical number of heads and tails, but at least a similar number of heads and tails. On the other hand, coin flips from the second cluster will tend to have more heads than tails.

      put a simplified version of this after the description.
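
      The simplified version could be as short as this (numbers match the prose; the snippet itself is illustrative):

        import numpy as np

        rng = np.random.default_rng(0)
        fair = rng.binomial(n=1, p=1/2, size=10)    # cluster 1: fair coin
        biased = rng.binomial(n=1, p=4/5, size=10)  # cluster 2: biased coin
        print(fair.sum(), biased.sum())             # heads counts per cluster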

    8. happens

      put below equation on 1 line

    9. ER network

      ER random network

    10. nework

      network

    11. Also, we let $\mathbf a_{ii} = 0$, which means that all self-loops are always unconnected

      Also, we assume it is not possible for anyone to be friends with themselves, which means that a_ii = NaN for all i.

    12. When $i > j$, we allow $\mathbf a_{ij} = \mathbf a_{ji}$. This means that the connections across the diagonal of the adjacency matrix are all equal, which means that we have built-in the property of undirectedness into our networks

      too complicated.

      We assume here that edges are undirected meaning that if there is an edge from i to j then there is also an edge from j to i

    13. Erdös Rényi network

      no such thing as an ER network

    14. The Erdös Rényi model formalizes this relatively simple model with a single parameter:

      change something, either this sentence or the table, iid belongs somewhere

    15. model

      use a different word

    16. If $A$ and $A''$ are members of different equivalence classes; that is, $A \in E$ and $A'' \in E'$ where $E, E'$ are equivalence classes, then $\mathbb P_\theta(A) \neq \mathbb P_\theta(A'')$.

      delete

    17. ($A$

      otherwise, ...

    18. $E$

      use different notation

    19. For the case of one observed network $A$, an estimate of $\theta$ (referred to as $\hat\theta$) would simply be for $\hat\theta$ to have a $1$ in the entry corresponding to our observed network, and a $0$ everywhere else. Inferentially, this would imply that the network-valued random variable $\mathbf A$ which governs realizations $A$ is deterministic, even if this is not the case.

      MLE
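
      For contrast, the standard closed forms (stated from general knowledge, not the book): the point mass above is the MLE of this saturated model, while under the Erdös Rényi model the MLE collapses to the edge density,

        $\hat\theta_{A'} = 1$ if $A' = A$ (the observed network) and $0$ otherwise, versus $\hat p = m / \binom{n}{2}$.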

    20. variaable

      typo

    21. parameters

      distribution

    22. ,

      above figure goes before section header

    23. To

      directed with loops for simplicity

    24. think

      first think about nodes

    25. first

      first edge?

      edges are not ordered

    1. Fundamentally

      and it borrows strength

    2. Now, the only question is how to actually pull the separate latent positions for each network from this matrix.

      clarify

    3. big

      do we want to use 4 different symbols for the 4 different numbers?

    4. rows

      blocks

    5. you don’t have to rotate or flip your points to line them up across embeddings.

      assuming no multiplicity of eigenvalues

    6. big circle

      big red circle. i prefer no outlines

    7. bad

      not very bad

    8. –

      remove '-' here

    9. rotations of each other

      i don't think this is english

    10. none of the embeddings are rotations of each other

      not quite right. talk to pedigo.

  8. May 2021
    1. two

      d_i

    2. So

      ask pedigo

    3. In this example they have eight (the number of columns in our combined matrix), but remember that in general we'd have $m \times d$.

      the problem is that they are not in the same subspace

    4. averaging

      where did .5 go

    5. great

      often a great idea (for example....)

    6. node

      these embeddings should look different

    7. the edges of

      remove

    8. Remember

      new section heading: averaging

    9. Grouping the Networks Separately

      different ways to do joint analysis

    10. normal

      remove

  9. Apr 2021
    1. Below, we turn our list of networks into a 3-D numpy array, and then do the above multiplication to get a new 3-D numpy array of score matrices. Because we embedded into two dimensions, each score matrix is $2 \times 2$, and the four score matrices are "slices" along the 0th axis of the numpy array.

      generate multiple networks
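
      Generating the networks could look like this sketch (graspologic's sbm simulator; the shared basis V and all parameter values are assumptions for illustration):

        import numpy as np
        from graspologic.simulations import sbm

        B = [[0.8, 0.2], [0.2, 0.8]]
        networks = np.array([sbm(n=[50, 50], p=B) for _ in range(4)])  # (4, 100, 100)
        V = np.linalg.svd(networks.mean(axis=0))[0][:, :2]  # assumed shared 100 x 2 basis
        scores = V.T @ networks @ V                         # (4, 2, 2) score matrices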

    2. Any

      explain how to get each V first

    3. Combining the networks

      add omni as another one

    4. Combining the classifiers

      remove

    5. Multiple Adjacency Spectral Embedding

      3.5.1

    6. communities

      we are learning representations and we can do whatever we want with them; we illustrate with classification just to show something.

    7. product

      find a better word

    8. Model

      clarify label swapping

    9. Adjacency Spectral Embedding

      only under some strict models. clarify

    10. For example, say you have nine networks of normal brains and nine networks of abnormal brains.

      nine networks from each population.

      To illustrate this point, we simulate 18 brains using the below code.

    11. The normal and abnormal brains are all drawn from stochastic block models, but they come from different distributions. If you look at the code, you’ll see that the normal brains were set up to have strong connections within communities, whereas the abnormal brains were set up to have strong connections between communities. Below is a plot of the adjacency networks for every normal and every abnormal brain we’ve created.

      this paragraph goes below this figure?

    12. types

      groups

    13. create

      estimate

    14. algorithm

      please no

    1. $\mathbb R$

      (0,1)

    2. (π‘₯βƒ—Β βŠ€π‘–π‘₯⃗ 𝑗

      define transpose notation

    3. 2

      maybe 1

    4. We will call the matrix $P$ the probability matrix whose $i^{th}$ row and $j^{th}$ column is the entry $p_{ij}$, as defined above. Stated another way, $P = (p_{ij})$

      this goes before any of the actual models, it is the fundamental object associated with any independent edge model

    5. $p$

      for each model, write out the P matrix.
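
      Writing these out takes one line per model (standard forms): $p_{ij} = p$ for Erdös Rényi, $p_{ij} = b_{\tau_i \tau_j}$ for the SBM with assignments $\vec\tau$, and $p_{ij} = \vec x_i^\top \vec x_j$ for the RDPG.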

    6. We showed above the model for the random node-assignment variable, $\vec{\pmb \tau}$, above.

      remove

    7. process

      variable

    8. π‘š11

      connect to tau better. and remove the \mathcal{E}

    9. R.V.

      write out random variable

    10. 50

      what happened to the other 100?

    11. Likelihood

      can you also write out

      P = tau' B tau

      use this later to show that you can always rewrite the above as

      P = x' x, where x = tau * sqrt(B)
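
      Spelled out (a sketch; here $C$ is the $n \times K$ one-hot matrix built from $\vec\tau$, and $\sqrt B$ requires $B$ to be positive semidefinite):

        $P = C B C^\top = (C \sqrt{B})(C \sqrt{B})^\top = X X^\top$, where $X = C \sqrt{B}$.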

    12. $\ell$

      little k

    13. πœβƒ—

      not a parameter, given realization of a random variable

    14. known

      forgot an asterisk

    15. Logically

      not logic

    16. unique

      remove

    17. cluster

      group

    18. 2

      write out numbers < 10

    19. $E_i$,

      maybe add the union of these equivalence classes is A_n

    20. 1. and 2.

      remove periods

    21. if it is totally unambiguous what $\theta$ refers to,

      remove

    22. As there are no diagonal entries to our matrix $A$, this means that there are $n^2 - n = n(n - 1)$ values that are definitely not $0$ in $A$, since there are $n$ diagonal entries of $A$ (due to the fact that it is hollow)

      wrong, might not be 0

      don't talk about AdjMat as the network

    23. To summarize

      before this, put the 'formally, ...." stuff

    24. $A$ is an $n \times n$ matrix with 0s and 1s, $A$ is symmetric, $A$ is hollow

      this should be formal if you write 'formally'

      a_ij \in {0,1}, a_ij = a_ji, a_ii = 0, \forall i,j \in [n]

    25. $|\mathcal E|$

      m, and then use it, not m(A)

    26. This means that all the networks are hollow.

      hollow is a property of the adjacency matrix

    1. right

      appropriate/ justified / reasonable

    2. learn

      explore and exploit

    3. If you care about what’s under the hood you should also have a reasonable understanding of college-level math, such as calculus, linear algebra, probability, and statistics.

      add, 'but you won't need it to understand the content, except certain sections that we will highlight as 'advanced material'

    4. http://docs.neurodata.io/graph-stats-book/intro.html

      hyperlink

    5. Johns Hopkins University

      the NeuroData lab at

    6. for example

      capitalize.

      for each example, define the nodes and edges

    7. Network Machine Learning

      no caps