To a non-technical person like me, the simple explanation looks like double Dijkstra, one from each end. The measure of distance is basically giving a weight to the roads from the end state, as in making it choose things based on how close it is to the finish. -- The only problem I see is you can't do things "as the crow flies", because that might put you on the other side of a river because it's so close, while a bridge across the river is a long way away. Basically, in some instances it would follow to the bottom of a "cup" and fill the entire "cup" trying to access nodes on the other side simply because the nodes are close, before spilling over the edges, where a regular Dijkstra might not go into the cup, and if it does it would quickly "drop" entire sections of the cup for being too "far/hard/slow". I guess it depends on the strength of the "closeness", but it seems like a single setting doesn't account for concave search spaces.
(Almost) any cup that would be filled up by A* would also be filled up by Dijkstra's. Remember, A* is still factoring in the distance travelled to get to a given node, so any path that is extremely long will be ignored, unless it brings it substantially closer to the destination. Meanwhile, Dijkstra's would check out every node in the cup the moment those paths are shorter than the path it is taking around the cup.
Never in my life did I think that I'd be cracking up at a video about an A* Search Algorithm implementation. An entertaining video for sure 😂 I have a project due in less than 24 hours where we need to code A* from scratch, so thanks for reducing my stress while teaching me this algorithm. I feel a lot better now.
One neat trick is to "prefer" one metric over another and use power notation to calculate the overall heuristic. E.g. for a node with distance 7 but weight 2, we added them as 7+2 = 9. But instead of that, if we prefer shorter distance over smaller weight, then the weight should be the base raised to the power of the distance. This way we can choose easily between two nodes that would otherwise yield the same heuristic if we add them: with the new rule, if one node has a weight of 2 and distance of 7 (2^7 = 128) and another has a distance of 2 and weight of 7 (7^2 = 49) ... we choose the latter as 49 < 128, because we preferred the one closer to the end node. Google Maps often uses this trick.
I've been watching your videos over the last few days, in order to solve a Pacman algorithm of Ghosts taking the shortest route, and found your explanations and content to be very educational and easy to follow. Many thanks and keep up the great work! Fingers crossed that I can now implement my version of A* on an adjacency list of nodes I've created for the maze...
Many thanks for the good video. However, I think you forgot to highlight one thing: The heuristic *always underestimates* the distance. I saw people questioning why it can't be that the path through the right side is shorter when we did not calculate the cost. The actual shortest path is always longer than the heuristic distance. Here this lies in the nature of the problem: the Euclidean distance (straight path) is always shorter than the length when driving zig-zag.
1st: '~ just adds a heuristic to Dijkstra' was the best statement!! Further, no usage of stupid unnecessary words like 'open list' and 'closed list'. Everything nice and simple. Also, the animations help overcome the handwriting. And the handwriting is there to keep the explanation realistic. 6 out of 5 stars
Search from both ends at the same time and stop when the two searches meet. Instead of one search to depth N, now you have two searches to depth N/2. In a graph with many nodes and many connections, the number of nodes at each depth increases with the depth, so each search tree is less than half the size of the original, and the total number of nodes searched is reduced.
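A rough sketch of that idea for an unweighted, undirected graph, in Python. This is a plain bidirectional breadth-first search rather than A*; the function name `bidirectional_bfs` and the adjacency-dict input format are assumptions made for illustration. It grows whichever frontier is currently smaller, one full level at a time, and stops at the first level where the two searches touch.

```python
def bidirectional_bfs(graph, start, goal):
    """Return the number of edges on a shortest start-goal path, or None.

    `graph` maps each node to a list of neighbours (undirected, unweighted).
    """
    if start == goal:
        return 0
    dist_s, dist_g = {start: 0}, {goal: 0}      # distances from each end
    frontier_s, frontier_g = [start], [goal]    # current BFS level per side
    while frontier_s and frontier_g:
        # Grow the smaller frontier so neither search gets much deeper.
        from_start = len(frontier_s) <= len(frontier_g)
        if from_start:
            frontier, dist, other = frontier_s, dist_s, dist_g
        else:
            frontier, dist, other = frontier_g, dist_g, dist_s
        best = None
        next_frontier = []
        for u in frontier:
            for v in graph[u]:
                if v in other:
                    # The two searches meet across the edge u-v.
                    candidate = dist[u] + 1 + other[v]
                    best = candidate if best is None else min(best, candidate)
                elif v not in dist:
                    dist[v] = dist[u] + 1
                    next_frontier.append(v)
        if best is not None:
            return best  # shortest meeting found on this level
        if from_start:
            frontier_s = next_frontier
        else:
            frontier_g = next_frontier
    return None  # one side ran out of nodes: no path exists
```

For weighted graphs the stopping rule is more delicate than "stop when the searches meet", so treat this only as an illustration of the depth N/2 argument.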
Love the content, but it's extremely difficult to stay focused with the camera jumping on and off the paper with the graph. I feel a little bit dizzy every 3 minutes, so I had to take frequent breaks.
I'm very close to having tears in my eyes from laughing when 'super sneaky' was introduced as 'not an official term you will find in the literature.' Brilliant side note.
I am impressed by the algorithm, which is slightly better than Dijkstra. I've almost understood the idea behind the algorithm, but I don't really understand one thing: how are we gonna compute the distance from a node to the destination node when we implement the algorithm? By the way, an implementation of the algorithm would be nice in the description of the video.
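One common answer, sketched in Python: if every node has a known position (map coordinates, grid cell, and so on), the distance to the destination can be computed on demand as the straight-line distance between the two positions. The `coords` table and its values below are made up for illustration and are not the numbers from the video.

```python
import math

# Hypothetical (x, y) positions for three nodes; not taken from the video.
coords = {"S": (0.0, 0.0), "A": (3.0, 4.0), "E": (6.0, 8.0)}

def heuristic(node, goal, positions=coords):
    """Straight-line (Euclidean) distance from `node` to `goal`."""
    (x1, y1), (x2, y2) = positions[node], positions[goal]
    return math.hypot(x2 - x1, y2 - y1)

print(heuristic("S", "E"))  # 10.0
print(heuristic("A", "E"))  # 5.0
print(heuristic("E", "E"))  # 0.0
```

If the nodes have no meaningful positions, a heuristic has to come from somewhere else in the problem, or you fall back to a heuristic of 0.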
Emphasis on heuristic. Dijkstra's algorithm guarantees solution quality: once a node is explored, the minimum distance from the starting node to that node is known. So A* doesn't address the problems of Dijkstra; it is a heuristic single source shortest path algorithm, while Dijkstra's algorithm uses the principle of optimality: once the algorithm terminates, you have an optimal solution, e.g. min distance.
Great example. But I'm wondering, what if the path from C -> E was a direct path, with just a cost of 1? That would mean it was much 'cheaper' than the path through B-H-G. But you never expand C, so you'll never know. Or am I missing something?
My thought as well, I’m about to try and implement this myself. Instinctively I’d say you’d probably have to expand every other node until you reach a higher combined weight than your initially optimised path. Because yes if C -> E had distance 8 and weight 8 it would be more efficient to go via C.
They were missing one important detail: The heuristic *always underestimates* the distance. So the actual shortest path is always longer than the heuristic distance. Here this lies in the nature of the problem: the Euclidean distance (straight path) is always shorter than the length when driving zig-zag.
Let's say a node "P" does not lead to target E, what would be the heuristic of it? But we can only know if a node is leading to the destination or not by traversing it, right?
Just to clarify, your heuristic isn't consistent. The path you took had lower cost than your heuristic was at the start node. It seems you have lucked out, though.
It seems that if C-L, L-I, I-K, & K-E all had a weight of 1 and G-E had a weight of 3 the described method would have failed. Did he forget any steps or am I missing something? : \
Surely, the units you store for the distance metric have to at least be in the same ballpark as the weights. Is there a way to calibrate them by range mapping and maybe multiplying instead of summing? Sounds like a loophole requiring manual tuning otherwise. Hrm...
But what if one of the last nodes is blocked? Distance to E will still be the shortest, but you can't get there. I guess you would have to alter the node distances and try again or something like that.
I don't understand how you don't check the rest of the path past C. Maybe initially C is more expensive to go to, but what if the rest of the nodes after C are very cheap? You have to check for that..?? Or is this algorithm only trying to find the destination node as fast as possible, not the shortest path?
Is it true that this algorithm will sometimes find a path that is not necessarily the shortest if you choose a bad heuristic? For example, if I had a network with these nodes (Euclidean coordinates): X: (0,5), S: (0,4), A: (0,3), B: (0,2), C: (0,1), E: (0,0), and these (undirected) edges: X-S: 1, S-A: 1, A-B: 1, B-C: 1, C-E: 1, X-E: 1. If I use Euclidean distance as my heuristic, don't I end up with the path S-A-B-C-E with cost of 4, even though S-X-E has a cost of 2?
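This example can be run directly. Below is a minimal A* sketch in Python on exactly the nodes and edges listed above; the function name `a_star` and the dict-of-dicts graph format are assumptions for illustration, not the video's implementation. Note that the Euclidean distance from X to E is 5 while the edge X-E costs 1, so here the heuristic overestimates, and that is what allows the longer path to win.

```python
import heapq
import math

def a_star(graph, start, goal, h):
    """Return (cost, path) for the first route to `goal` popped by A*."""
    queue = [(h(start), 0, start, [start])]   # entries: (f, g, node, path)
    best_g = {start: 0}
    while queue:
        f, g, node, path = heapq.heappop(queue)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to node was found later
        for neighbour, weight in graph[node].items():
            new_g = g + weight
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                heapq.heappush(
                    queue,
                    (new_g + h(neighbour), new_g, neighbour, path + [neighbour]),
                )
    return None

# Coordinates and undirected edges copied from the comment above.
coords = {"X": (0, 5), "S": (0, 4), "A": (0, 3), "B": (0, 2), "C": (0, 1), "E": (0, 0)}
edges = [("X", "S", 1), ("S", "A", 1), ("A", "B", 1),
         ("B", "C", 1), ("C", "E", 1), ("X", "E", 1)]
graph = {name: {} for name in coords}
for u, v, w in edges:
    graph[u][v] = w
    graph[v][u] = w

def euclidean(node):
    return math.dist(coords[node], coords["E"])

def zero(node):
    return 0

with_euclidean = a_star(graph, "S", "E", euclidean)  # (4, ['S', 'A', 'B', 'C', 'E'])
with_zero = a_star(graph, "S", "E", zero)            # (2, ['S', 'X', 'E'])
```

So yes: with this heuristic the search returns the cost-4 path, while a heuristic of 0 (plain Dijkstra behaviour) finds S-X-E with cost 2.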
The problem that arose in my head is that, when you finish, there could be a node with a distance much shorter than the one currently for E, but its combined heuristic stops it from getting to the top; call this node X. There could be a very short path from X to E, say it is weighted 1. In the example of Google Maps, this could be a plane. Although X seems like it is not viable, it actually is. How could you fix this in your implementation (other than just using Dijkstra)?
How can you add up green numbers and black numbers? They are in different units, aren't they? Black numbers are in inches, and green numbers are in some units of difficulty to move between the two nodes. How do you convert one into the other in order to take the sum? [as any mathematician would say: you can't add apples and oranges]
Books on the shelf... _Security Engineering, 2nd Edition._ Ross Anderson; _Secrets and Lies._ Bruce Schneier; _The Elements of Statistical Learning._ Trevor Hastie, Robert Tibshirani, Jerome Friedman; _C++ The Complete Reference, 4th Edition._ Herb Schildt; _Cryptography and Network Security: Principles and Practice, 2nd Edition._ William Stallings; _Computers and Intractability; a guide to the theory of NP-Completeness._ David S. Johnson, Michael Garey; _Computer Security, 3rd Edition._ Dieter Gollmann; _Hacking: The Art of Exploitation._ Jon Erickson; _Database Systems: A Practical Approach to Design, Implementation, and Management, 5th Edition._ Carolyn E. Begg, Thomas M. Connolly; _The Manga Guide to Databases._ Mana Takahashi, Shoko Azuma; /* Yes! Really! */ _A Brief Guide to Cloud Computing._ Christopher Barnatt; _Pro WPF in C# 2010._ Matthew MacDonald; /* Ooh! Companion ebook available! */ /* * Whew! For any simple task, take your initial runtime estimate and double it. */
@sumitmomin5753 IMO, it is because he's writing pseudocode, so instead of using one particular data structure type (Vector, List, Map, etc.) and confusing anyone with "technical" programming terms, he's just saying "data structure"!
I wouldn't call that annoying, because from what I've seen, the people not as talented can get seriously confused if the teacher makes a mistake, so in pointing it out, you're probably doing at least some of them a favor.
Extremely well done run-through! Dr. Pound is right: A* is incredibly fast; so much so that we use it generously in path-finding (in gameplay engineering). That's a subroutine that multiple NPC instances are executing, 60 times a second, along with all the other stuff (that's a LOT more intensive).
I think something important to note, which was only very briefly suggested, is that if your distance-to-goal heuristic never overestimates you will always find the shortest path, but if it can overestimate then the path you get may not be the shortest (which for some problems may be suitable). If you underestimate too much then the benefits of A* diminish and you'll explore more and more of the graph. Additionally, Dijkstra is a special case of A* where the distance-to-goal is always underestimated as 0.
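The point about a heuristic of 0 can be checked on a toy problem. The sketch below (Python; the grid setup and the names `grid_search`, `manhattan` and `zero` are assumptions for illustration) runs the same search twice on an open 10 x 10 grid with unit step costs, once with the Manhattan distance and once with a heuristic that is always 0, and counts how many nodes each run takes off the queue.

```python
import heapq

def grid_search(size, start, goal, h):
    """A* on an open size x size grid with unit step costs.

    Returns (path_cost, nodes_expanded).
    """
    queue = [(h(start, goal), 0, start)]      # entries: (f, g, node)
    best_g = {start: 0}
    closed = set()
    expanded = 0
    while queue:
        f, g, node = heapq.heappop(queue)
        if node in closed:
            continue  # duplicate entry for an already expanded node
        closed.add(node)
        expanded += 1
        if node == goal:
            return g, expanded
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                new_g = g + 1
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    heapq.heappush(queue, (new_g + h(nxt, goal), new_g, nxt))
    return None, expanded

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def zero(a, b):
    return 0  # "always underestimate as 0": behaves like Dijkstra

cost_astar, expanded_astar = grid_search(10, (0, 0), (9, 0), manhattan)
cost_dijkstra, expanded_dijkstra = grid_search(10, (0, 0), (9, 0), zero)
```

Both runs return the same cost of 9, but the Manhattan run expands only the 10 cells along the bottom row, while the zero-heuristic run works outward through every cell closer to the start than the goal is.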
Actually we saw an example of that in the video. We finished so fast in the end because the final distance was actually shorter than we expected only a step before. The heuristic, being an overestimating one, wouldn't have guaranteed finding the optimal path if there had been a shorter one in the right branch, but it let us finish very fast.
A* has its uses. You can set the edge weights of the ones you want your algorithm to avoid to positive infinity or something if you want to be sure. Also, you pretty much only use the Euclidean-distance-based heuristic when you have a 2 dimensional map alongside the nodes on it. So there can't be a realistic situation where the path with the bigger heuristic is actually shorter. If you are measuring weight on a different parameter (like how many shops the town has, and that's your criterion, not difficulties on the road itself) then you should use another heuristic function or another algorithm altogether :)
This is not to be confused with the Sagittarius A* search algorithm, used often in astronomical science. *That* method simply involves shoving everything together in one big pile so whatever you need is nearby.
I had an advanced algorithm exam 2 weeks ago and this algorithm was part of the test, I passed but never understood the algorithm. Until now. Nice video
@aurelia8028 Yes, he should have passed. Most of the time those exams just test your memory. At that time he was only able to reproduce his college's explanation of the algorithm; after this video he's able to explain it in his own words (and maybe even implement it).
One question I had was, how do we know we can stop when E is removed from the priority queue? The answer is that, as long as the heuristic never overestimates, every element still in the priority queue has a combined cost (distance so far plus heuristic) at least as large as E's, so no path through any of them can reach E more cheaply. So basically, once E is removed from the priority queue, we know there is a path from S to E, and following each node's recorded predecessor back from E to the start node S gives the shortest path.
The use of physical cards really helped make this explanation of the algorithm really clear. I was really struggling to follow purely written explanations, pseudocode, and actual code, because while I can code, I don't have a formal CS background.
You not having a formal CS background has nothing to do with struggling with algos like this. That is simply b/c you aren't used to solving those types of problems, and 99% of universities do not prepare students adequately in DSA either, so most of them are struggling too.
He said "for A* to work really well you have to have a consistent metric and you have to not overestimate how far you've got to go". But it will not work at all if you have an overestimating metric.
You don't have to invent something to be brilliant. Just being able to understand, accurately recall, and be able to explain this material in a way that enables other people - especially people who don't have a formal background in this material - to understand it is brilliant in and of itself.
I really appreciate Dr Mike taking the time out to not only host these videos but also make all the materials necessary for them. Being a professor must be definitely a busy job and all this must definitely take quite some effort. Appreciated!
I know this is an old video, but hopefully someone sees this. At the end, you suggested that there could be other, perhaps better heuristics than Euclidean distance for A*. Could you give a few examples of other such heuristics?
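A few commonly used alternatives for grid-style maps, sketched in Python. Which one is appropriate depends on the moves allowed: Manhattan distance suits 4-way movement with unit steps, Chebyshev distance suits 8-way movement where a diagonal costs the same as a straight step, and octile distance suits 8-way movement where a diagonal costs sqrt(2). The function names and the (x, y) tuple format are assumptions for illustration.

```python
import math

def manhattan(a, b):
    """Sum of horizontal and vertical offsets (4-way grid movement)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def chebyshev(a, b):
    """Larger of the two offsets (8-way movement, diagonal cost 1)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def octile(a, b):
    """8-way movement where a diagonal step costs sqrt(2)."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)

here, there = (0, 0), (3, 4)
print(manhattan(here, there))   # 7
print(chebyshev(here, there))   # 4
```

For the same pair of points the straight-line distance is 5, and the octile value falls between that and the Manhattan value.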
Let me nitpick just a little... You're correct that A* terminates when the destination node gets to the front of the queue *if* the heuristic is guaranteed to be a lower bound on the actual path length. But in this video, the physical distance doesn't actually correlate with the path lengths, and so you cannot actually exclude the possibility that the shortest path to E might go through C, or some other not-yet-examined node. But nonetheless, I loved the video, and it was a great explanation of the algorithm. Thanks!
My guess for satnav would be:
    if (distance > CITY_DISTANCE) {
        path = path_to_closest_city + load(path_between_cities) + path_from_city_to_dest;
        // Bend Path from Triangle-Wave towards Sinusoid
        // Done by looking for better directions onto the main road in the correct direction
        Bend(path, path_between_cities);
    }
    // Adjust path based on current traffic & closed roads
    ApplyTraffic(path);
DISCLAIMER: I also don't work for a satnav-company.
His calculation for "D" in A* was off. D was S + B + D (0 + 2 + 4) or 6, and it had a heuristic of 8, so that is 14. He wrote down 12 in black. Not a major deal breaker here obviously, but just pointing it out b/c that's what us programmers do :). Thanks so much for the video!
I'd just like to note that you, Dr. Pound, are the most likeable Computer Science professor I've ever come across. This is coming from a student of one of Germany's top MINT universities.
As a master's student in computer science at one of the best universities (as far as natural science degree programmes go), I'd be curious to hear which university you're talking about? :) It would be news to me that "MINT" universities are the best in computer science. But hey, go ahead :)
@Clashkh22 hahahaha and you say "one of the Germany's Top MINT Universities", wtf dude, I've never heard of it. Good universities are TU München, Humboldt-Universität zu Berlin, or TU Berlin. What kind of distance-learning uni is that, dude
I borrowed a friend's older TomTom sat-nav some years ago (2010-13 ish) and told it to find the best path between Dallas and Tacoma. Several minutes later it responded that "the path has dirt roads, would you like to avoid dirt roads?" I selected yes, and it went back to calculate for several more minutes, then said "destination is on a dirt road no path found" and reset with no results. A broken algorithm for sure; I don't know why they would even release it to the public. It wouldn't take the preference before calculating, so it had to calculate twice, interpreted "avoid" as an absolute command of no dirt at all, and then discarded all the calculations. My home is 100 meters onto a dirt road, so some dirt is unavoidable; I just wanted it to minimize unpaved routes so it wouldn't route me down a 50 mile mountain service road, which it attempted to do on several occasions. Like the plant nursery that, like my home, was just off the end of the pavement: the TomTom routed me from the other side over 5 miles of winding dirt road, because it was shorter physically and had no speed data, so it was assumed the fastest route (and in this case I was only traveling/calculating about 40 miles).