
How Do Regular Expressions Really Work? 

Low Byte Productions
Subscribers: 76K
Views: 28K

In this video we're going to build a basic regular expression engine from scratch, in order to illustrate the underlying mechanisms that make them tick. First with a naive attempt, and then implementing the nuanced "backtracking" feature.
00:00 - Intro
00:53 - Constraints
02:40 - Parsing
10:45 - Writing the regex engine
17:00 - Testing the engine
18:40 - Fundamental flaw
20:20 - Implementing backtracking
=[ 🔗 Links 🔗 ]=
⭐️ Patreon: / lowleveljavascript
💌 Updates to your inbox: tinyletter.com/lowleveljavasc...
🗣 Discord: / discord
💻 Github Repo: github.com/LowLevelJavaScript...

Published: 5 Aug 2024

Comments: 44
@wlockuz4467 2 years ago
Wow this is mind blowing! I have read a lot of theory about Regex but I have never seen a practical example so well done! You earned a sub!
@LowByteProductions 2 years ago
Thanks WloCkuz - much appreciated!
@fabiodan30 3 years ago
Thanks for spreading the good word on parsers and virtual machines. It's important work, what you're doing here, and more people should do it. Liked and subscribed!
@kylefong2888 3 years ago
YOU'RE ABSOLUTELY SICK!!! I geek out about stuff like this, this is so cool. Thank you for taking the time to share your knowledge!
@huizhang7413 1 year ago
Holy shit, I spent 2 weeks learning the principles of compilers. I learned tons of terminology, regular expressions, NFA, DFA, and how to convert one into another. But when it came to writing a regular expression engine, I still didn't know where to start. But this video taught me in just 20 minutes!!! A big thank you to you.👍
@ShaunakDe 3 years ago
Thanks for doing this. It was very instructive to watch.
@benfaerber4956 2 years ago
Wow, I'm so happy to have found a channel like this about my favorite language! I'm writing an APL clone in JavaScript and will have to use your videos for research!
@LowByteProductions 2 years ago
Thanks Ben - hope they come in handy!
@LachlanMiller 2 years ago
This was really good, thanks.
@AmanGupta_Dev 3 years ago
After watching your videos, I am pretty confident that all those curious questions and doubts I had will be solved once and for all!
@LowByteProductions 3 years ago
I certainly hope so!
@poof65 3 years ago
Very interesting, even if I didn't watch it at the best moment. End of the day, on my couch, I was starting to fall asleep by the end of the video 😅
@LowByteProductions 3 years ago
I commend your effort given the circumstances 😁
@ratchet1freak 3 years ago
A better way than backtracking is to instead have an array of current states and go character by character. Then, when a choice is to be made on a state, you duplicate that state and simulate both choices. This lets you avoid the most common pitfall of regex implementations. If a state fails to match the character, you drop it from the state array. The regex matches when you get to the end of the string with a state that is at the end of the regex.
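The state-set idea in this comment can be sketched roughly as follows. This is a minimal illustration, not code from the video: the element shape (`char`, `quantifier`) and the function names `addState` and `matchWithStateSet` are assumptions made up for this example, and only the `?` and `*` quantifiers are modelled.

```javascript
// Hypothetical pattern element: { char: 'a' | null, quantifier: 'one' | 'optional' | 'star' }
// A null char stands for the wildcard. A "state" is simply an index into the element list.

// Add a state plus every state reachable from it without consuming a character
// (optional and star elements may be skipped).
function addState(states, elements, index) {
  if (states.has(index)) return;
  states.add(index);
  if (index < elements.length && elements[index].quantifier !== 'one') {
    addState(states, elements, index + 1);
  }
}

// Returns true when the whole input matches the whole pattern.
function matchWithStateSet(elements, input) {
  let current = new Set();
  addState(current, elements, 0);

  for (const ch of input) {
    const next = new Set();
    for (const index of current) {
      if (index === elements.length) continue; // pattern finished, but input remains
      const el = elements[index];
      const matches = el.char === null || el.char === ch;
      if (!matches) continue; // this state fails and is dropped
      // A star may match again (stay on the same element); anything else moves on.
      addState(next, elements, el.quantifier === 'star' ? index : index + 1);
    }
    if (next.size === 0) return false; // every state has failed
    current = next;
  }
  // Match if some state has reached the end of the pattern.
  return current.has(elements.length);
}

// a*ab : a greedy matcher without backtracking lets a* swallow every 'a' and then fails.
const aStarAB = [
  { char: 'a', quantifier: 'star' },
  { char: 'a', quantifier: 'one' },
  { char: 'b', quantifier: 'one' },
];
console.log(matchWithStateSet(aStarAB, 'aaab')); // true
console.log(matchWithStateSet(aStarAB, 'b'));    // false
```

Because every live state advances together, each input character is examined once per state, and no position in the input is ever revisited.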
@imsherry7225 3 years ago
You Are A Great Person ❤️✌️
@LowByteProductions 3 years ago
Thanks Sherry ☺️
@imsherry7225 3 years ago
@LowByteProductions My pleasure
@sshh6285 2 years ago
subscribed, awesome, unique content
@RenanBorges 3 years ago
amazing!!!
@LowByteProductions 3 years ago
Haha thanks buddy!
@yapdog 1 year ago
Wildcard (.) matches everything but newline (\n). At least in most engines. Great video, very timely for me 😁👍
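For reference, this is how the point above plays out in JavaScript's built-in `RegExp`: without the `s` (dotAll) flag, `.` does not match line terminators. The helper `wildcardMatches` is a hypothetical name showing how a toy engine might mirror that rule; it is not taken from the video.

```javascript
// Built-in behaviour: "." skips line terminators unless the "s" flag is set.
const text = 'one\ntwo';
const dotWithoutFlag = /one.two/.test(text);  // false
const dotWithDotAll = /one.two/s.test(text);  // true

// Hypothetical wildcard check for a hand-written engine, mirroring the
// characters JavaScript treats as line terminators.
const LINE_TERMINATORS = new Set(['\n', '\r', '\u2028', '\u2029']);
function wildcardMatches(ch) {
  return ch !== undefined && !LINE_TERMINATORS.has(ch);
}

console.log(dotWithoutFlag, dotWithDotAll);               // false true
console.log(wildcardMatches('x'), wildcardMatches('\n')); // true false
```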
@dentjoener 2 years ago
For repetition (including all of the variants) I usually use a generic repeat pattern with min and max, and for unbounded versions I set max to some incredibly high value (MAX_SAFE_INTEGER or something). This allows me to do the following transformations: ? -> repeat(0,1), * -> repeat(0,MAX), + -> repeat(1,MAX), etc.
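The transformation described in this comment might look something like the sketch below. The function name `quantifierToRepeat` and the `{ min, max }` result shape are assumptions for illustration; the handling of `{n}`, `{n,}` and `{n,m}` is an extension of the comment's "etc." rather than something it spells out.

```javascript
// Stand-in for "unbounded", as suggested in the comment.
const MAX = Number.MAX_SAFE_INTEGER;

// Convert a quantifier token into a generic { min, max } repeat description.
function quantifierToRepeat(quantifier) {
  switch (quantifier) {
    case '?': return { min: 0, max: 1 };
    case '*': return { min: 0, max: MAX };
    case '+': return { min: 1, max: MAX };
    default: {
      // Counted forms: {n}, {n,} and {n,m}
      const parts = /^\{(\d+)(,(\d*))?\}$/.exec(quantifier);
      if (!parts) throw new Error(`Unknown quantifier: ${quantifier}`);
      const min = Number(parts[1]);
      let max;
      if (parts[2] === undefined) max = min;   // {n}
      else if (parts[3] === '') max = MAX;     // {n,}
      else max = Number(parts[3]);             // {n,m}
      return { min, max };
    }
  }
}

console.log(quantifierToRepeat('?'));     // { min: 0, max: 1 }
console.log(quantifierToRepeat('{2,5}')); // { min: 2, max: 5 }
```

With every quantifier reduced to one shape, the matching loop only needs a single repetition case instead of one per symbol.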
@AnonymousAccount514 10 months ago
Mind Blown
@anonymoussloth6687 2 years ago
Can you tell me where I can learn to create a more complicated one? I have studied finite automata and compiler design, so I am not new to regex and DFAs, but I want to learn how to implement one from scratch.
@JSRFFD2 3 years ago
This was a very good video, thank you! I wonder if backtracking could be handled more elegantly with recursion. I also wonder if backtracking or recursion would be better suited to handle something like a|b
@LowByteProductions 3 years ago
For your first point, yes I think you could use recursion, although a bit of alteration would be needed. In that case, you're using the call stack itself as the data structure that keeps track of your state. The call stack does, however, have a limited size (it varies from environment to environment), and JS unfortunately doesn't have tail calls. For your second point, alternations need a bit of extra modelling (in parsing), but are essentially just a new kind of element type. So in the stateMatchesStringAtIndex function you could add a new case that iterates through the possibilities, trying each one until it finds one that works.
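Both points in this reply can be illustrated with a small recursive matcher. This is a sketch under assumptions, not the video's implementation: the node shapes (`char`, `any`, `star`, `alt`) and the function `matchRecursive` are invented for the example, and it does not reproduce the video's `stateMatchesStringAtIndex` function.

```javascript
// Hypothetical node shapes:
//   { type: 'char', value }      one literal character
//   { type: 'any' }              wildcard
//   { type: 'star', node }       zero or more of a single-character node
//   { type: 'alt', options }     alternation; each option is a list of nodes
// Returns true when the nodes match the input from pos to the end of the input.
function matchRecursive(nodes, input, pos = 0) {
  if (nodes.length === 0) return pos === input.length;
  const [head, ...rest] = nodes;

  switch (head.type) {
    case 'char':
      return input[pos] === head.value && matchRecursive(rest, input, pos + 1);
    case 'any':
      return pos < input.length && matchRecursive(rest, input, pos + 1);
    case 'star':
      // Greedy: first try one more repetition followed by the same star.
      // If that path fails, returning from the call is the "backtrack",
      // and the star is skipped instead. Assumes head.node always
      // consumes a character, otherwise this would recurse forever.
      return matchRecursive([head.node, head, ...rest], input, pos)
        || matchRecursive(rest, input, pos);
    case 'alt':
      // Try each possibility in turn until one works.
      return head.options.some(
        (option) => matchRecursive([...option, ...rest], input, pos)
      );
    default:
      throw new Error(`Unknown node type: ${head.type}`);
  }
}

// (a|b)c*
const aOrBThenCs = [
  { type: 'alt', options: [[{ type: 'char', value: 'a' }], [{ type: 'char', value: 'b' }]] },
  { type: 'star', node: { type: 'char', value: 'c' } },
];
console.log(matchRecursive(aOrBThenCs, 'accc')); // true
console.log(matchRecursive(aOrBThenCs, 'ca'));   // false
```

The recursion depth grows with the length of the input, which is the call stack limitation mentioned in the reply; an explicit stack avoids it at the cost of more bookkeeping.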
@mluevanos 2 years ago
Thanks for bringing the advanced JavaScript topics to YT.
@BryanChance 1 year ago
I learned about regex when I was trying to learn Perl. Oh yes, the syntax was absolutely mind-boggling to me at the time. (I'm talking about Perl's syntax.) Just kidding, but it's close. LOL. After about 8 months, it finally clicked; the regex syntax made sense. Anyhow, Perl and regex are perfect together. I think learning Perl and regular expressions broke my brain. LOL. Then I found out about sed, awk, and all the other text utility commands like cut, sort, uniq, tr, etc. I couldn't imagine the complexity of writing a regex engine. But I'll find out by watching the video! Thank you for sharing your work.
@LowByteProductions 1 year ago
I learned Perl a long time ago - probably before I had any real idea what I was doing 😁 I look back very fondly on it, and remember it being amazing for regex and text processing in general.
@eugenegordo3130 3 years ago
Thanks for the video! Could you please tell me which font you're using in the editor?
@LowByteProductions 3 years ago
I believe it's Inconsolata
@eugenegordo3130 3 years ago
Thanks!
@MrLiquitorleaveit 2 years ago
I'm feeling like I'm a bazillion hours away from producing code like this. Watched the whole video though, kind of fascinating how it all adds up in the end.
@LowByteProductions 2 years ago
Hey - you'll surprise yourself with how much less than a bazillion hours it can be if you focus on the right stuff. Anytime you're reading/watching/trying to understand something that makes you a bit uncomfortable because it's not immediately clear - that's actually a great sign! Follow it up with a trip to Wikipedia or a Google search. Keep exposing yourself to tough stuff. Very quickly concepts will fall into place, and you'll be able to look into even tougher stuff. Honestly it's one of the things I wish I had been more conscious about earlier in my career.
@laujimmy9282 6 months ago
@LowByteProductions this comment is really helpful for many of us ❤ thx
@user-io4sr7vg1v 8 months ago
Why are you using arrays of arrays?
@maxdemian6312 10 months ago
I always read regex as reghex
@calvinlucian387 3 years ago
Nope. Still can't work with regex. 😪 Great tutorial tho ❤
@LowByteProductions 3 years ago
Give super-expressive a try (github.com/francisrstokes/super-expressive). Might help you better wrap your mind around the idea!
@poof65 3 years ago
Regex is love, Regex is life 😍
@calvinlucian387 3 years ago
@LowByteProductions Thanks! That's a gem of a library! ✌
@koenderbb5191 3 years ago
Be honest, are you Dutch?
@LowByteProductions 3 years ago
No, unfortunately I'm from England. But my daughter is Dutch.
@fylink 3 years ago
Your algorithm misses nested quantifiers. For example, imagine the expression /([A-Z].*){3}#/ matched against the string 'AxsaBsaxsaCsaxas#'. Your algorithm will fail, and it would need dramatic modifications to cope with that.
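To make the nested-quantifier case concrete, here is one possible way to model it: a `repeat` node that wraps a whole list of nodes, matched recursively. This is an illustrative sketch only, and it says nothing about how the video's engine would actually need to change. The node shapes and the name `matchNested` are assumptions, and the matcher is anchored at both ends, so it corresponds to `^([A-Z].*){3}#$`.

```javascript
// Hypothetical node shapes:
//   { type: 'char', value }              one literal character
//   { type: 'any' }                      wildcard
//   { type: 'range', from, to }          one character between from and to
//   { type: 'repeat', nodes, min, max }  a group repeated min..max times
// Assumes a repeated group always consumes at least one character.
function matchNested(nodes, input, pos = 0) {
  if (nodes.length === 0) return pos === input.length;
  const [head, ...rest] = nodes;

  switch (head.type) {
    case 'char':
      return input[pos] === head.value && matchNested(rest, input, pos + 1);
    case 'any':
      return pos < input.length && matchNested(rest, input, pos + 1);
    case 'range':
      return pos < input.length
        && input[pos] >= head.from && input[pos] <= head.to
        && matchNested(rest, input, pos + 1);
    case 'repeat': {
      if (head.max > 0) {
        // Greedy: match the group once, then the same repeat with one
        // fewer repetition required and allowed.
        const again = { ...head, min: Math.max(0, head.min - 1), max: head.max - 1 };
        if (matchNested([...head.nodes, again, ...rest], input, pos)) return true;
      }
      // Stopping is only allowed once the minimum has been reached.
      return head.min === 0 && matchNested(rest, input, pos);
    }
    default:
      throw new Error(`Unknown node type: ${head.type}`);
  }
}

// ([A-Z].*){3}#
const nestedPattern = [
  {
    type: 'repeat', min: 3, max: 3,
    nodes: [
      { type: 'range', from: 'A', to: 'Z' },
      { type: 'repeat', min: 0, max: Infinity, nodes: [{ type: 'any' }] },
    ],
  },
  { type: 'char', value: '#' },
];
console.log(matchNested(nestedPattern, 'AxsaBsaxsaCsaxas#'));  // true
console.log(/([A-Z].*){3}#/.test('AxsaBsaxsaCsaxas#'));        // true (built-in RegExp)
```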