I would have a question about the ex 5.2.a in the midterm. If we look at nodes 1 and 5, I understand that node 1 has a high score since it is "endorsed" by node 7 which has a the highest score. But 5 is endorsed by more nodes than node 1, so it also has a relatively high score too. However, I'm not sure how to clearly separate them. Would you have an advice for that kind of situation, please ?
Also about part b, the solution says that there is almost no teleportation with \(\theta = 0.001\) but on slide 16 of lecture 5, the factor in front of the teleportation matrix is \(1- \theta\). So I would say that there is almost only teleportation, maybe it was stated differently in previous years or am I missing something ?
Still about part b, I'm not sure to understand the solution. Since the matrix H does not change (the distribution over the nodes is just more uniform due to teleportation), I would have said that node 1 would still be the closest to 7. Could you explain the reasoning a bit further please ?
Thank you for your time,
First of all, yes the solution has a typo, case (b) should read "almost only teleportation occurs".
For (a), when teleportation is very rare, you can basically reason about the long-term behavior of the random surfer. In this specific graph, it goes through 7 then 1 in every "round", so their score is almost identical and greatest (1 is slightly smaller because of the residual teleportation probability - fyi we accepted 1=7 in the exam, this is subtle). Then 1's outflow splits equally among 2, 5, and 3. But then 5 also gets 1/2 of 2's inflow, and 1/2 of 3's inflow, so 5 clearly gets more flow than 2,3,4,6. 2=3 and 4=6 is clear from symmetry.
I have a question about this too. In part (a), when there is almost no teleportation (according to lecture, not the solution provided here), then I would argue that 1 and 7 actually have identical rank, since 1 gets all the flow from 7, and that the teleportation part contributes equally to both 1 and 7, so by symmetry they should have identical rank, no?
For this exercise, I do not really understand why does that the out-degree impact the PageRank score of a node. The justification for node 5 is a bit strange for me. I would have said that node 5 has a higher than 2 and 3 as "the flow is more likely to go through 5 than through 2 and 3". Could you explain the justification a bit more please ?
One more comment on (b) here: the explanation in the solution is a bit imprecise. To be more specific, when we have almost only teleportation, this pretty much forces a uniform stationary distribution, with only a small "error term" due to the occasional link being followed. The first-order approximation of this error term is dominated by a single link traversal (because two, three, etc. links in a row is exponentially less likely). Given that the stationary distribution is almost uniform, we can simply ask how much "error flow" comes through links to each node. Node one gets 100% from 7, while 5 gets 1/3 from 1, and 1/2 from 2 and 3 each -> 5 wins.
Alright, thank you !