So instead of the state machine being implemented as a state variable + a switch statement, it is implemented as functions returning other functions. If state A transitions to state B, it is encoded as function A returning function B.
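Roughly, the shape is something like this (a toy sketch of the pattern, not the talk's actual lexer; the stateA/stateB names and the letters/digits example are made up):

package main

import "fmt"

// A state is a function that does some work and returns the next state.
// nil means "stop".
type stateFn func(*lexer) stateFn

type lexer struct {
	input string
	pos   int
}

// run replaces the usual loop over a state variable + switch:
// just keep calling whatever function the previous state handed back.
func (l *lexer) run() {
	for state := stateFn(stateA); state != nil; {
		state = state(l)
	}
}

// stateA consumes letters; on seeing a digit it "transitions" by returning stateB.
func stateA(l *lexer) stateFn {
	for l.pos < len(l.input) {
		c := l.input[l.pos]
		if c >= '0' && c <= '9' {
			return stateB
		}
		fmt.Printf("A: letter %q\n", c)
		l.pos++
	}
	return nil
}

// stateB consumes digits; on seeing anything else it hands control back to stateA.
func stateB(l *lexer) stateFn {
	for l.pos < len(l.input) {
		c := l.input[l.pos]
		if c < '0' || c > '9' {
			return stateA
		}
		fmt.Printf("B: digit %q\n", c)
		l.pos++
	}
	return nil
}

func main() {
	(&lexer{input: "ab12cd"}).run()
}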
"Is anybody familiar with the work of Michael Jackson?" (sees a lot of hands) "Good." (goes right into talking about a fairly obscure CS researcher) This man is the pinnacle of nerd. I'm pretty sure the audience thought he was about to make a joke about Thriller.
It was a joke, and I'm so sad no one laughed! I sure did when I heard him offhandedly quip, "As you know, Michael Jackson developed Jackson Structured Programming."
Guys, where can I find a Go parser for SQL files? So far I've only found a single-statement parser and the one in InfluxDB, but that's not what I need.
You have to call the run() method on the lexer you get back before you'll get anything from the channel. I suppose that could be kicked off inside the lex() function itself, but that's not how it was shown in the talk.
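For what it's worth, here's a minimal, self-contained sketch of what starting run() inside lex() could look like (the types, fields, and the word-splitting "state machine" are assumptions for illustration, not the talk's exact code):

package main

import "fmt"

type item struct {
	typ string
	val string
}

type lexer struct {
	input string
	items chan item
}

// run is a stand-in for the real state machine: it emits one item per word,
// then closes the channel so the consumer's range loop terminates.
func (l *lexer) run() {
	start := 0
	for i := 0; i <= len(l.input); i++ {
		if i == len(l.input) || l.input[i] == ' ' {
			if i > start {
				l.items <- item{typ: "word", val: l.input[start:i]}
			}
			start = i + 1
		}
	}
	close(l.items)
}

// lex starts run() in a goroutine itself, so the caller only ranges over the channel.
func lex(input string) chan item {
	l := &lexer{input: input, items: make(chan item)}
	go l.run()
	return l.items
}

func main() {
	for it := range lex("lexing with a goroutine") {
		fmt.Println(it.typ, it.val)
	}
}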
My main objection is that you can't write it this way if you're matching Unicode patterns. For example, what would the accept() string be for accepting upper-case letters, lower-case letters, etc.? Nevertheless, the final scanner was extremely clean, and it must have been really fun to write.
You can simply modify accept() to take a boolean function instead of a string. That would allow accept() to take functions like unicode.IsLetter() and unicode.IsLower() as arguments, which are Unicode-aware.
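Something along these lines (a sketch; the acceptFunc/acceptRunFunc names are made up, and the next()/backup() bookkeeping only loosely follows the approach in the talk):

package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

const eof = -1

type lexer struct {
	input string
	pos   int
	width int
}

// next returns the next rune in the input, decoded from UTF-8.
func (l *lexer) next() rune {
	if l.pos >= len(l.input) {
		l.width = 0
		return eof
	}
	r, w := utf8.DecodeRuneInString(l.input[l.pos:])
	l.pos += w
	l.width = w
	return r
}

// backup steps back one rune.
func (l *lexer) backup() { l.pos -= l.width }

// acceptFunc consumes the next rune if the predicate accepts it.
func (l *lexer) acceptFunc(pred func(rune) bool) bool {
	r := l.next()
	if r != eof && pred(r) {
		return true
	}
	l.backup()
	return false
}

// acceptRunFunc consumes a run of runes satisfying the predicate.
func (l *lexer) acceptRunFunc(pred func(rune) bool) {
	for l.acceptFunc(pred) {
	}
}

func main() {
	l := &lexer{input: "Größe42"}
	l.acceptRunFunc(unicode.IsLetter) // Unicode-aware: ö and ß are accepted too
	fmt.Println("letters:", l.input[:l.pos])
}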
I'm only 20 minutes in, but I'm failing to see the "beauty" in this approach. The logic is spread out over the codebase, and code should be optimized for (human) reading. A single switch statement where the state is tracked explicitly reads much more easily.
I have a feeling this is another case of a new paradigm starting to take off (the actor model): everyone wants to write everything in it, and it all seems beautiful until we actually work with it and begin to learn why the new idea isn't always appropriate for every task.
This is an exquisite example of clean code. Lexical analysis code is often hairy, but as Rob shows, it doesn't have to be. Go's first-class functions and slices really shine here.
09:26 "We could use a tool. Lex is a pretty famous one. Initially written by Mike Lesk and then redone when he was an intern by someone called Eric Schmidt." ... who at the time of this talk was his Boss (and the CEO of Google)
Am I the only one who doesn't love how stateful this solution is? State can lead to tons of bugs; it might be a little nicer if each state function also returned the lexer's next state instead of mutating it in place.
@henriquedante Unicode is actually not a problem - Go natively supports UTF-8 as the encoding of its string type. For example, around 23:00-24:00, where you see "switch r = l.next() {...", r is the next rune from the input - i.e. the next Unicode code point, assembled through UTF-8 decoding. If you want to accept only upper case, just call the relevant test function from package unicode on the value of r :-)
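A quick illustration of that (nothing from the talk, just standard Go: ranging over a string decodes UTF-8 into runes, and package unicode classifies them):

package main

import (
	"fmt"
	"unicode"
)

func main() {
	// Each iteration yields the next rune (Unicode code point), not a byte.
	for _, r := range "Größe π" {
		fmt.Printf("%q  upper=%v  letter=%v\n", r, unicode.IsUpper(r), unicode.IsLetter(r))
	}
}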
I think we did something like that as an exercise, except each state function would directly call its successors via tail recursion. That would have been in OCaml, IIRC…
He's a very good speaker, and I do enjoy his talks. Even if you're not interested in the topic going in, you'll become interested in it. You come away with insights you didn't even know you were missing before watching the talk.