If you're wondering how he got the 4 values on lines 6 and 8 at 7:00, use these formulas: LAT +/- (180/pi)*(1.05 miles / 3958.71 miles) and LON +/- (180/pi)*(1.05 miles / 3958.71 miles) / cos(pi * LAT / 180)
Also, do note that these values are not universal. It depends on where your point of interest actually is located. Specifically how far it is from the equator.
I had once used haversine formula for one of my freelance projects and didn't realize how much performance I left on the table after watching this video. Please keep making this awesome content!
I knew the trick with the bounding box. What I didn't know was that MySQL has that nice function. So I always did that calculation in my applications. Still learning new things every day. Keeps the job exciting. ;-)
Excellently explained 🎉 Great videos with clear explanations. Do you offer consulting? I'm running into a time series issue with timezones and I'd be willing to book a consultation to get an understanding of how to best approach the problem.
Nice video Aaron, but: You should use “Restaurant” condition from the very beginning, because in that case initial query will take less then ~2.2s. So it its a bit misleading and can lower value of otherwise very informative video. #feedback #keepGoing
How do you guarantee that the bounding box condition is determined by the engine, before the costly distance calculation is even performed? Do the order of the WHERE conditions that matters?
i have an applicatipion that save your geolocation position when you check in... isn't best to use decimal for EXAT location? and how much long should be this numbers?
This video was great! I actually will try to implement this because even though the geolocation functions I do have are for Admin only, it would be great to release to the wild and have users have access to it with speed.
I'm guessing you will want to do the time calculation outside of the database, as suggested. That calculation is probably expensive, so the fewer times you have to run that, the better... Hey presto! Another reason to use a bounding box! Decide what is the max distance you could travel in 1 hour from your location in each direction. Add & subtract from your centre, and that becomes a new bounding box.
Why are you storing your "point" in 2 different fields? Just use the POINT column type and add (spatial) index on that. Then buffer on your geometries of interest, and select those that intersect with the buffer. Done.
excelent video, "bounding rectangle" is an excellent way to speed up such kind of queries (personally I just "assumed" that earth is "flat", so euclidian distance ftw :-P - of course, it's not correct, but it was about closest locations, not all location in a range, so ... trading some accuracy to performance) what I didn't understand from your video is why do you need some extra distance offset, shouldn't be the rectangle just tangent to the circle in all four points? (or maybe I make the same mistake thinking 2D, but the earth is 3D?)
Good video. You might also want to look into Geohash where you can represent every coordinate with a string. You can then index that column with a string and use like % query.
Cool video! I like the minimal and focused presentation style and the usage of the simple UI. It looks very focused and doesn't distract from what you are trying to do. I've never had a chance to use planetscale but I will keep on watching your videos.
fantastic video but if we use a composite index on "x" , "y" and change the query to be where x>x1 and yy2 that should work and make full use of the composite index on (x,y). Again thx for the fantastic video
@@aarondfrancis I just meant you had such a nice number of km (3.1415...) between two first points and you promptly got rid of it. but I'm just joking - apologies if that wasn't clear. Great video in general - not many people know you can deal with spatial data on the db side at all, not to mention how to do it efficiently. Thanks mate and keep it up
Could someone please explain why he said he wouldn't use MySQL for driving directions? @11:30 Because that's exactly what I wanted my app to do and is the reason I clicked on this video 😂
This is able to calculate the distance between two points in a straight line, which doesn't always correspond with roads or driving time. If two points are one mile apart, but there is a mountain or lake between them, that will dramatically change the travel time! You'd need a navigation API for that. Something that can tell you where roads are.
I wonder if you would be able to gain some performance if you use a spatial index and compound index by doing a sub-query/join 🤔 Or having the sub-query would negate any potential gains from using a spatial index.
Great video, clear explanations - this gets me looking forward to learn more about MySQL geospatial capabilities. I wanted to ask what MySQL client are you using - it looks like it may run queries instantly as you type, almost like hmr- is that so?
@PlanetScale, just as you created your compund type_lon index, I wondered: "Could we create a compound index with truncated values of our latitude and longitude?" It would be like indexing all points by with map grid square they fall into. Then our 'bounding box' redundant query becomes 'the point falls into one of the 4/8/16 grid squares' A bit more work perhaps in query design to calculate what the valid grid squares would be, but perhaps that would allow for the index reducing down to just the bounding box contents more immediately, rather than still reading all of our long rectangle?
Answering my own question: yes! I work with UK postcode data occasionally, and adding an generated column and then indexing it improved the speed for finding postcodes nearby by a further MASSIVE factor: alter table postcodes add column gridsquare varchar(15) GENERATED ALWAYS AS (CONCAT(TRUNCATE(latitude, 2), ":", TRUNCATE(longitude, 2))) STORED; alter table postcodes add index gridsquare_index (gridsquare); My performance improvement is hard to work out as it got vanishingly small: Find all postcodes within 10km of one point: 1452 postcodes returned - using no bounding box: 0.4 seconds - using bounding box and index: 0.1 seconds - using gridsquares:
Yes! You can absolutely do that. I originally wrote an article with that method, but it didn't make the video: aaronfrancis.com/2021/efficient-distance-querying-in-my-sql Perhaps the next one 🤔
@@PlanetScale Thank you! Ooh, I didn't realise you had an article on it too! And compliment right back at you - it's a really well explained article! And, yes, that's so clear that by tranching one of the variables, we can make a compound index that is SOOO much better! (For others who haven't read the article - and btw, REALLY do! - the idea is to create a new column rounding each decimal to the nearest 0.001, thereby allowing you to query for specific starts of the decimal like 'it must be 30.254 or 30.255' as a way of indexing bounding boxes and getting another performance boost! It's really neat and Aaron explains it very nicely!) Also, I didn't know that MySQL `IN` was so much more optimised than `BETWEEN` - perhaps another weird one for humans vs computers! Is there any benefit then to making the index ENTIRELY from tranched values - would it be quicker? Or is the point of an index that we don't want to have so many rows with the same index and so reduce its power/versatility for other query patterns? (I feel like I have only started scratching the surface of the power of indexes recently, so I'm intrigued by how to get more out of them!)
I think your longitude wouldn't matter - so just set a maximum latitude if you are looking around the north pole, and a minimum latitude if looking around the south pole? Though I'm imagining the centre point is at the poles, but if you are 0.5miles south of the North Pole... not a clue! Yes, time to get researching!
To find the closest point you can probably just use pythagoras right? Unless you need to find points on the other half of the earth, this will work just fine.
The earth is a sphere, so no matter in which direction you travel a 100 miles, you always follow the same curvature. So, it doesn't matter if you just use Pythagoras's theorem instead right? You only need a calculated distance when you want to select it or compare it to some arbitrary value. Otherwise I think it's safe to use a heuristic.
If your "points" are in longitude-latitude where units are degrees, then you'll get the "distance" in degrees as well. The distance per degree latitude changes as a function of longitude, so accurate conversion isn't simple. That's why the haversine formula exists.
@@quillaja I understand that the haversine formula calculates the accurate distance between two points, but I think that when only comparing the distance, you can use a heuristic like pythaghoras
Honestly, just to be kinda safe? I don't think the bounding box calculation I used is done on a sphere, but rather on a plane. So to take into account the haversine deal, I just bumped it up a bit. The worst possible thing would be for my redundant condition not to be redundant after all, but rather filter out some things that should be there.