I will be posting a video every Sunday on the other channel, so bell DABAAAAA dena: ru-vid.com/show-UCvEKHATlVq84hm1jduTYm8g I have commented the intuition in a lot of comments, however you can check this link: stackoverflow.com/questions/37253739/intuition-behind-the-algorithm-to-compute-single-source-shortest-path-for-every The reason of using topo sort is simple, we save a lot of time by starting from the start point, instead of starting from any random point. Since we know via topo sort we get to know the starting point, hence it becomes optimal to use topo sort!
This is a beautiful explanation. I always used to use DP for shortest path in a DAG. But here, you've done the same processing on the topological ordering ! I never imagined of such a beautiful solution.
This problem, can be solved through BFS too.But there will be too many redundancies in it. We start from source=0, it relaxes its adjacents, and pushes these adjacent nodes along with their distances into the queue. Then these adjacent nodes will further relax their adjacent nodes and push the relaxed nodes. Consider this graph: first of pair is the adjacent node and second of pair is the edge weight 1->[(2,2), (3,1)] 2->(4,2) 3->(5,1) 5->(4,1) 4->(6,1) Final queue will be like - (first of pair is the node and second of pair is the distance from source to this node) (1,0)(2,2)(3,1)(4,4)(5 2)(6 5)(4 3)(6 4) These all will be popped out when they relax their adjacent nodes. For ex, (1,0) will be popped out before (2,2)(3,1) are pushed into queue and so on. As we can see, there is redundancy, node 4 first came in the queue with a distance of 4 from source, and then again node 4 came with a distance of 3 from the source, which increases time complexity, because, now (4,3) will have to again relax its adjacent nodes to reduce the distances of its adjacent nodes. So, if the pre-requisites of any node(say, cur) are relaxed as minimum as possible before the relaxation of node cur.Then redundancy will never occur. Taking the same example as above, if 1 2 3 5 are relaxed as minimum as possible before the relaxation of node 4. Then redundancy will never occur. The solution to the above observation is Topological sort. If we have Topo sort sequence, then we will take the source node first and relax its adjacent nodes.After that, we take next node in the topo sort, and will do the same. TC - O(N+E) SC-O(N)
Although there will be redundancies in simple BFS method, but the time complexity is still the same i.e O(N+E), also we dont have to find topSort, so isn't BFS better?
Correct me if I am wrong. I think we don't need to find the full topological sort, we can just pass the src as node to the topological sort function, and don't care about the unvisited nodes after we have returned from the function because if they were unvisited even after the end of function that implies they were unreachable from src, so we can remove the loop. Replace //Start for (int i = 0; i < N; i++) if (!vis[i]) findTopoSort(i, vis, st, adj); //End with just "findTopoSort(src vis, st, adj);" This will greatly improve the time complexity for graphs with many nodes unreachable from src.
Checking whether the distance of node is infinity or not is not necessary, because as we are coming to this current node (say, cur), the pre-requisite nodes of cur (prerequisites nodes are those whose cur is an adjacent node ) have already relaxed cur (atleast 1 time). So we can remove that INF condition.
Thanku striver.I completed this graph series in 8 days and learned so much things.you explained very well.Thankuu again. If you have free time then pls pls pls do make a series on DP...You know there are 3 topics which are most difficult recurssion,graph and dp.I learned recurssion in depth from your live video on codeverse.so i also want to learn DP by u.so pls pls pls if u have time then pls make a series on DP or take live class on codeverse
For someone getting confused why we are bothering about finding the Topological Sort and the statement "Please check if the distance to the node is not Infinity", Read this comment to understand: 1. Why do we need to find the topological sort? The answer is very simple. For understanding the answer, ask yourself what is the type of the given graph. Yes, it is a DAG. Which means there are no cycles. Right? Of course! That means there are no back edges (an edge from node to itself or its ancestor). Now ask yourself what is a Topological Sort? Yes. It is the ordering of the nodes in the increasing order of the their indegrees. That means all the nodes having less indegree would be on top in this ordering. Now, suppose we want to find a shortest path to a node say 4 in the following topological sort ordering [0, 1, 2, 3, 4, 5, 6]. What is the first thing that comes in your mind? Yes. We need to make sure that before visiting the node 4 all the incoming edges to 4 have been taken into account to find the shortest path to it. This is the property of the topological sort. Isn't it? That's the reason why we should find Topological Sort. 2. Coming to the second point. Which is why do we need to check if the distance to the node in the traversal of the Topological Sort is not infinity. We know that the distance from source node to itself is 0. We also know that the graph is a DAG which do not have any back edge. So, it is obvious that all the nodes before the source node in the topological ordering are unreachable from the source node as there are no back edges. The very first node in the topological ordering which is not having the distance an infinity is the source node itself with the distance of 0. That's the reason why we need to check if the distance is infinity or not. P.S.: In case you need the source that I have used to get the intuition behind this, here is the link to it: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-BNpWnXUhMC4.html&ab_channel=5MinutesEngineering
This method is only applicable if the source node is a node with degree 0 and all other nodes are reachable from source node. You can use simple bfs or dfs but the topo sort is used to save time so that you don't have to traverse the same node again as using topo we can maintain a proper sequence of updating the nodes distance from the source!
No, this method can still be used even when other nodes are not reachable from source node, bcz the dist array will remain unchanged and dist of every node from source remains infinity.
Thanks! *Why do we require topological sort instead of normal BFS?* § Unlike unweighted graphs we can't use "only BFS" for weighted graphs. The algorithm fails certain neighbors have dependency on each other. See below graph (blues ones are nodes. Node (0) is directed to node (4) with weight '1' and node (4) is directed to node (1) with weight '1') 0️⃣--'3'-->1️⃣--'2'-->2️⃣ '1'\ '1'/ 4️⃣ □ Now consider neighbors of node (0) = (1), (4). BFS doesn't guarantee which out of (1) or (4) will be processed first (since it depends on the order in which they appear in adjacency list). Lets say (1) is processed first. It updates the distance of node (2) to '5'. □ Now node (4) gets processed and it updates the distance of node (1) to '2'. Ideally, the distance of node (2) should also be updated to '4', but it never happens. Why? because node (1) is already visited and so BFS will not push it in the queue again, and thus it never gets processed again. So, now we have an error in calculating distance for node 2. □ To avoid this error, we have to make sure that all the nodes that (1) depends on, in our case (4), are processed before (1), so that by the time we process (1), we are sure that it will never get updated again. □ How can we do that? By traversing in order of dependencies. That is, for every node all its dependencies should be processed before reaching it. Topological sort provides us with exactly this! □ If we do BFS traversal in order of how they appear in topolgical sorting, we can make sure we don't run into the dependency issues we saw above.
the modified bfs discussed in the previous video even works on this example but the only problem is that it takes more turns in queue but the previous video algo works here also.
Why topological sort? If we found a node with a distance of 7 from src and we updated the other paths in the graph taking 7 into consideration then later we found the distance for the same node from src is 4, in this case, previous paths will be wrong and we have to update them again. what topological sort ensures is that after we update the distance of the source node, since the topological order node will have a node with the maximum number of adjacents at last, we update those adjacents first which prevents the above problem
Correct me if I'm wrong, this algorithm finds the shortest distance from unreachable nodes to reachable nodes, right? That is why we use Topo sort correct? But what if the question contains a source node that is different from 0 in this case, like node 2. Should we in that case perform a normal BFS to find the shortest path?
@@anmolsharma9539 this problem can be solved using bfs this technique gives the shortest path from the node with indegree 0 to all other nodes , because we are doing a topo sort at the start. But if the question needs a diff node just proceed with bfs or other shortest path algorithms.
@@rithikdutt1332 he has explained that case in the video bro actually then we will print infinity specifically "INF" as in the video u can see the code also in which he has explained this. Now if you will pass 2 as source node the you will get the following output: INF INF 0 6 INF INF here you can see the unreachable node are printed at INF which actually represent the max distance or infinity distance. Hope you got your answer and please clear me if I am wrong!!
if(dis[node]!=inf) is useful in case of disconnected graph am i right? if we dont use it then it will also calculate the distance for unreachable nodes?
No, it's not useful for disconnected graph, it is useful in case of more than one strongly connected components. What I mean to say is, if the source node is there is let's say x. and if the topmost node of the stack (first topo sort node) is y. "If(dis[node]! =inf)", this is just used to check if we are able to reach the y from the x or not. If y is not able to reach from x (as it is a directed graph), then the distance will be kept as infinity itself, we won't change the distance. It's true what you've written, it's just that "disconnected" is not a proper word here.
@@samkr200 Why do we need to have this check "if(dis[node]!=INF)", can't we do something like "ans[n.first] = min(ans[n.first], ans[curr] + n.second)" As by doing this, we will make sure that we keep the minimum possible distance for visiting each node? And this we will do only for the adjacent nodes of the node popped from the stack?
I have a question. How do we know that the very first element that we are popping out of stack will be s whose dist we initialised as 0, because even when we do topological sort, we don't start with s, we start with 0, so are we assuming here that s=0? Because if top element of stack i.e. topological sort isn't s, then the statement, if dist[node]!=Integer.MAX_VALUE, wont be satisfied
yes you are right..but we are initializing dist[src]=0 so the nodes before source which are present in the stack will be of no use they will not follow the condition dist[node]!=inf..and simply be popped out from the stack
@@rohanrastogi1077 so basically the topo sort ordering ensures that if there are elements which come before the source , then actually there wont be a path from the source to those vertices (which occurred previously), and hence the distance of those vertices will always remain infinity from the source..right ?
do we need to have topological sort of whole graph. can'nt we just do it with dfs call from src (for toposort) because nodes in other components are at inf distance anyways.
Java Code /* Shortest path in Weighted DAG from a given source node */ import java.util.*; class WDAG{ public static void main(String[] args) { Scanner sc = new Scanner(System.in); int v = sc.nextInt(); int e = sc.nextInt(); ArrayList adj = new ArrayList(); for(int i = 0 ;i < v; i++){ adj.add(new ArrayList()); } for(int i = 0 ; i < e; i++){ int u = sc.nextInt(); int vv = sc.nextInt(); int w = sc.nextInt(); ArrayList temp = new ArrayList(); temp.add(vv);temp.add(w); adj.get(u).add(temp); } //Code to print Adj list System.out.println(""); System.out.println("Adjacency List is "); int i = 0; while(i < v ){ for(ArrayList ar : adj.get(i)){ Integer n = ar.get(0), w = ar.get(1); System.out.print(i + " --> ("+ n +" , "+ w +") || "); } System.out.println(""); i++; } System.out.println(""); int src = 0; int[] dis = shorttestPath(v,adj,src); for(int it : dis) if(it != Integer.MAX_VALUE) System.out.print(it+" "); else System.out.print("INF "); System.out.println(""); sc.close(); } public static int[] shorttestPath(int v,ArrayList adj,int src){ boolean[] vis = new boolean[v]; int[] d = new int[v]; Stack st = new Stack(); Arrays.fill(d,Integer.MAX_VALUE); d[src] = 0; for(int i = 0 ; i < v;i++){ if(!vis[i]) dfs(i,vis,st,adj); } while(!st.isEmpty()){ int cur = st.pop(); if(d[cur] != Integer.MAX_VALUE){ for(ArrayList it : adj.get(cur)){ int n = it.get(0), w = it.get(1); if(d[cur] + w < d[n]) d[n] = d[cur] + w; } } } return d; } public static void dfs(int v,boolean[] vis,Stack st, ArrayList adj){ vis[v] = true; for(ArrayList it : adj.get(v)){ int node = it.get(0); if(!vis[node]) dfs(node,vis,st,adj); } st.add(v); } }
@Shubam Gopal Vyas That's not correct, there are some test cases where overhead still remains as it is. And here striver has take the test case where overhead is not seen. See this graph : each edge u->v is denoted like this [ u , v , wt ] Graph[][] : [ 0 , 1 , 1 ] [ 1 , 2 , 3 ] [ 1 , 3 , 5 ] [ 1 , 4 , 4 ] [ 0 , 3 , 20 ] [ 0 , 4 , 30 ]
What if in this example , the source node is 4 and the distination is 3, then this algorithm will not work i guess. Should we pop the nodes from the stack until we have reached the source node? Thanks
Yeah, we will pop the nodes until we reach the source node(and those nodes which are unreachable from source will be INF) and the algorithm will then compute the distance from the source node to all the reachable nodes from it.
I have answered in the other comments as well. Please check them out, or you can check this article: stackoverflow.com/questions/37253739/intuition-behind-the-algorithm-to-compute-single-source-shortest-path-for-every
I've a different sort of doubt. Will the "src" provided always be the first element of the graph? Or there's no such rule and we can be provided with any element from the middle as "src"?
@@adityanarayangantayat7133 Nope. say if src is a 1 in this graph that striver has mentioned. then the nodes 0,4 will be left having infinite distance as they can't be reached. and we would just pop these nodes of the topologically ordered stack till we reach the src node which is having 0 distance. and then from the nodes after this src node we would update the distance array. Try Dry running once with 1 you'll get it better. Also Try with BFS to see why this is so much better.
simply do topological sort for source node only then there would be those node in stack which are accessible by source and hence no need to check inf condition. Am i right or missing some case??
I'm inquisitive to know about the weight of an edge in the Graph. Suppose, If I add just 3 nodes where W=Weight 0 => [(N=1, W=5), (N=2, W=3)] 1 => [(N=2, W=2)] Why in the Graph doesn't use W=1 or W=10 or any other value? How were the values calculated or defined here? Why not there's logic to set the values of weight through the program not manually?
So as we cant use the topo sort on the undirected graph, Can we assume that in worst case TC for find distance in Undirected is greater than directed(TC is reduced using topo sort).
what if the first node is not a source node in our stack as there can be multiple topological order so should we continue without doing any operation? eg - topological order gives you -> 4 - 0 - 5 - 1 - 2 - 3 .. and the source node is 0 in this case 4 itself has INT_MAX in it's dist array should we continue this step without doing any operation? Once again excellent explanation you're making one of the advance data structure easy to understand thanks a lot ;)
Yes. Since it's the first element of topo sort, it has in-degree 0 which means it is not reachable from 0 (if you are considering 0 as src node). So it should print "INF" for node 4.
Correct me if I am wrong can we make the dist[st.top()]=0 instead of passing the src node and making dist[src]=0 because we can't say that the src that we are passing is having indegree as 0 and it is also not guarenteed that the src node that we are passing would be on the top of the stack after the topo sort
Bhai you have attached the wrong code links. Like i guess, links are missplaced. Do check. Some people may have some issues. Last video contained code link of this one. This contain code link of next one. Btw, thanks for contributing so much to Community ❤️
Sir if we are going according to the topological sort, then why do we need to check if distance of a node is not infinity. Because we'll only be going to the nodes that are adjacent to the current node{the accessed node's distance will not be infinity as it will be already updated by then}, so no way of reaching a node that is inaccessible(of distance infinity) from the current node. Please explain sir. Or in what kind of case will it happen ?
Sir, why can't we say it is better algo than dijkstras... As it is O( N+E) but in case of dijkstras it was O( N log N) , but on internet it is mentioned dijkstras is best algo for shortest path
@@takeUforward Hi striver, I need some justification on my thoughts regarding shortest paths 1) Dijkstra works perfectly well for weighted undirected graphs(Cyclic and Acyclic) with no negative cycles(may contain some negative edges but not negative cycles). 2) Dijkstra works perfectly well for weighted directed (Cyclic and Acyclic) graphs with no negative weights(only positive weights) 3) The Above discussed TopoOrder Relaxing technique works for Directed Acyclic Graph with negative Edges. I need your view on above points. Hope You will answer!!!
in the example graph striver took 1 node that has indegree 0 that is the node 0 which is also an src , but what if we have 2 's indegree node for e.g. 0 & 5 and our topologiacal order came such a way that in the stack's top we have zero and our question have given that the src should be 5 Then what should i do can anyone tell me
why we are using topological sort here..i didn't get the intuition behind sorting first ?? what's wrong in iterating from 0-->N ? Can anybody Explain ?
This distance array we got contains min distance from node 0 to any other node , what if they has asked min distance from node 4 ? How will we get this from this array ?
as we are setting dist[src] = 0 , it will start from there, the condition dis[node] != infinity will not let any other node than src to enter that condition first. Because of topological sort notes coming before src will be discarded.
@@zacian4941 actually in previous algo if you observe you were getting the shortest path at the first visit of that node only after that there were happening comparisons but the value wouldn't had been changed means you are pushing the value of node in queue only when the dist of it (node) is INF so i think it will work in the same tc of bfs of O(V+E) ,correct me if i am wrong
I tried it and it seems to work. I added the edge weights instead of adding 1. Here's my python code: class Graph: def __init__(self, vertices): self.graph = [ [] for i in range(vertices)] self.vertices = vertices def addEdge(self, src, dest, wt): self.graph[src].append((dest, wt)) def dispGraph(self): for i in range(len(self.graph)): print(i, end=" -> ") for edge in self.graph[i]: print(edge, end = ", ") print() def findShortestPath(self, src, dest): distances = [float('inf')] * self.vertices distances[src] = 0 q = [] q.append(src) while q: node = q.pop(0) for nbr in self.graph[node]: if distances[node] + nbr[1]
Yes, I also solved the problem without using topological sort and it did work, am not getting the correct intuition to use topological sort here, can someone explain me?
@@adityapundir5549 When you use a bfs, you are not sure which is the starting point, hence it might happen you have to iterate in the queue many times, when you use topo sort, you are sure this is the first node which has no incoming nodes. Hence you start from the starting point, hence taking lesser amount of time. While using BFS, you are not sure if your source is the starting point or not, if it is not, then you will be taking extra turns.
So why topo sort work This is what i think The nodes which are above src in stack are non reachable so the check of dis[node]!=inf works. after that all nodes are visited in order of their dependencies like a node 3 will be reahced by all the nodes pointing to it and we will just take the minimum out of them.
as we are setting dist[src] = 0 , it will start from there, the condition dis[node] != infinity will not let any other node than src to enter that condition first. Because of topological sort notes coming before src will be discarded.