How Look Up Tables (LUTs) make your code Smaller, Faster, and Better (example in C)

Подписаться 160 тыс.

Просмотров 65 тыс.

50% 1

Patreon ➤ / jacobsorber
Courses ➤ jacobsorber.thinkific.com
Website ➤ www.jacobsorber.com
---
How Lookup Tables make your code Smaller, Faster, and Better (example in C) // What's a LUT? It's a data structure - really just an array used a certain way. The idea is simple, but often overlooked. This video tutorial provides a concrete example in C that I hope gets you on your way to producing faster, cleaner, easier-to-maintain code with Lookup Tables!
***
Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming, threads, processes, operating systems, embedded systems and others. My goal is to help you get under-the-hood and better understand how computers work and how you can use them to become stronger students and more capable professional developers.
About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
More about me and what I do:
www.jacobsorber.com
people.cs.clemson.edu/~jsorber/
persist.cs.clemson.edu/
To Support the Channel:
+ like, subscribe, spread the word
+ contribute via Patreon --- [ / jacobsorber ]
Source code is also available to Patreon supporters. --- [jsorber-youtube-source.heroku...]

Опубликовано:

8 июл 2024

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 201

@JohnHollowell 3 года назад

LUTs are also often used in photography/videography that can color grade a source to a preferred output. This is super useful as the conversion is a simple lookup (rather than a computation) for every pixel in every frame of the input

@JacobSorber 3 года назад

Definitely. I probably should have mentioned that I used at least one LUT in the making of the video. 😀

@RobertFletcherOBE 3 года назад

Yeah I''d checkout OCIO if thats your thang.

@ranjithmkumar 3 года назад

3 years back, I used LUTs for Temperature calculation of NTC temprature sensor which avoids floating point calculations in Microcontrollers without FPU.

@Kefford666 2 года назад

Did you have a converted value for every possible sensor measurement?

@ranjithmkumar 2 года назад

@@Kefford666 Not every Resistance vs Temperature values. I had values for unit difference in celsius temperature. I calculated the slope for resistance values in between 2 resistance values.

@lorensims4846 3 года назад

I was recently reading an article about how the old (first?) computer game "Space War!" was implemented in a very old issue of Byte magazine. Turns out they used a look up table for the values for the effect of gravity around the central star. This removed the need for time-intensive calculations on those old computers.

@strictnonconformist7369 3 года назад

I used lookup tables on the Apple 2 series for such things where I could, as well as the PC later for 3d graphics without FPU, before the days of PC GPUs. Far better method to use.

@Iamwolf134 2 года назад

LUTs also allow for iterating on different still images, character sheets and the like.

@dickheadrecs Год назад

probably like the actual moon landing

@ancapftw9113 3 года назад

I used that without knowing what one was. I wanted to generate a solar system and then look up data for the planets, moons, asteroids etc. later. Saved me from having to generate the planet every time.

@timtreichel3161 3 года назад

This was a nice example. LUTs are also often used to compute functions such as sin, cos, ..., especially when running on slower hardware such as MCUs. I would like to mention tho, that when using LUTs for such purposes as computing functions, the hardware can make a huge difference. A modern desktop CPU can performe many operations in the time of a simple array lookup because reading data from memory is "slow". Therefore I would always recomend profiling when optimizing for performance, in some cases LUTs may be slower.

@IlariVallivaara 3 года назад

This is a great comment. Many people suggest optimization tricks from 90s or something that do not apply anymore on modern hardware. Therefore it is of utmost importance to profile your code on the intended hardware. One example I stumbled upon recently was approximating trigonometric functions with LUTs vs. few-line polynomial approximations. On the desktop environment, the polynomial approximation was much faster, even though the average intuition could easily say that a no-computation look-up should be the fastest possible.

@timtreichel3161 3 года назад

@@IlariVallivaara Thank you so much for your reply. I never personally had a case where this is relevant so it's nice to see that I'm not saying comlete nonsense.

@strictnonconformist7369 3 года назад

@@timtreichel3161 furthermore, computing floating point/double precision arithmetic on today’s CPUs (desktop, servers, not most microcontrollers) can readily be run in parallel with integer instructions, and it’s just another dependency for a sequence of instructions overall, especially for hyperthreading with an OoO CPU that speculatively executes enough instructions. As long as you don’t execute too many floating point instructions too regularly in a tight loop, you’ll have those computations without the costs of the L1 data cache being filled up, without the L1 instruction cache being filled up, and the overall memory cache on chip being filled: everything you store in cache memory for lookup forces other data and instructions out of the fastest memory tiers and there is a performance penalty. Before CPUs had FPUs, if they had enough RAM for a sufficiently large LUT it absolutely was a big performance win, because even though memory constraints were far more extreme, CPU constraints were far worse. Even on the x86 CPUs with FPUs, for the trig functions, if you could use less accuracy (drawing a circle on a screen only needs to be so accurate) you’d still have better performance with a LUT because the FPU took too many cycles, combined with no OoO execution to hide the latency.

@dmitripogosian5084 Год назад

It all depends what you compute. If you computation is relatively complex, look-up tables and subsequent interpolation is a time-honored technique.

@student99bg 9 месяцев назад

That was my first thought when I saw the name of this video. I thought about lookup tables from math and how it is quite possible that instead of calculating Taylor series for sin, cos, tangens you instead pre calculate values to arbitrary precision once, put them in a lookup table and then use them whenever you want.

@andreyermak3780 Год назад

Its funny how I ended up in this video trying to understand how a LUT works in a FPGA Architecture. And surprisingly, I understood it. Thanks!

@chrisknestrick374 2 года назад

As a note, instead of sizeof(char*) you can use sizeof(messages[0]), so you don’t have to worry about messing up the type. It also gives you the ability to write a handy dandy helper macro that can be reused anywhere in your code, regardless of type: #define CountOf(array) (sizeof(array)/sizeof(array[0]))

@danielsantrikaphundo4517 Год назад

Yes, clever

@sebastiangudino9377 Год назад

It's work pointing out (For people new to this stuff) this only works for arrays creates a compile time. Not for pointers creates with malloc at runtime. Since, if you ask the c compiler for the size of a heap alicates pointer, you will just get the size of the pointer. When working with memory in the heap you should always keep track of your data size. And give it as an argument to functions which you define which need that information. Another solution that you could use in some cases is using "Terminated arrays". Strings for example are a "Null-Terminates Char Array". That's how strlen can get the size of a string, even if it is allocated on the heap. You could use this same principle in arrays you create which have a definable notion of a null terminator (A value that should not appear on the array under normal circunstances anyway). But having to iterate when whole array to get it's Leng is clearly pretty slow

@trimethoxy4637 3 года назад

love format of your videos, not too short, not too long, just enough to answer questions and cover topic, but not too detailed and without improper mumbling. thank you dr. Sorber!

@JacobSorber 3 года назад

You're welcome. Glad I like them.

@MuzixMaker 3 года назад

Used frequently to generate waveforms via D/A, e.g digitally controlled oscillators.

@adamtuft 3 года назад

I used a similar idea in a recent project where I had an array of integers that were opaque handles to some resources, and I needed to look them up programmatically. The values of the handles were arbitrary and not known until run-time but the number of handles was known at compile-time. This meant I could define an enum (using macro functions) whose labels could be used as an index into the array of handles, meaning I could do something like messages[enum_large_party] instead of messages[6]

@TheExcellentVideoChannel 2 года назад

I've done this before to synthesise a very fast unique set but didn't realise it was called a LUT. Watching your video made me think of using it to create an event map rather than using a case statement. You're right, once you start thinking about it you start to see the multitude of opportunities for its use. Thanks, nice work.

@NKCSS 2 года назад

8:18 if you put in a MIN or clamp (lowerbound already set at 0, so not really needed) function, you could also just change it to: printf("%s ", messages[MIN(people, MAX_PEOPLE)]); Under water you'd have a function call of course that's similar to the IF logic you have right now, but most people should be familiar with MIN/MAX functions that it's preferred I would say.

@junbird 3 года назад

Used something like this when I was working on a simple game and I needed to keep track of all the types of items that the player would be able to find. I had an array of struct Item (let's call it items) which was intuitively indexed through the enum items type. If I needed to retrieve informations about the rock, which was one possible class of items of the game, I would just read items[ROCK]. All rocks of the games would be linked to that element of the array, so all rocks shared the same basic traits (which sounds like OOP, but isn't). Also, the player's inventory was just an array of enum items, that would contain the values of the items that contained. This is basically what you would do in a relational database, but in main memory (structs are schemas, arrays are tables. If you wanted to emulate how relational database really work, you could use BSTs instead of arrays, but that would depend on your needs, I guess).

@chrisknestrick374 2 года назад

Really useful technique - I actually used it multiple times this week at work :)

@HansBezemer 2 года назад

So true. I wrote entire compilers and interpreters on tables (hand written parser). If you know what you're doing access is amazingly fast - not to mention the ease of maintenance. Shame function call overhead is so high in C.

@ajtan2607 2 года назад

I once tried to implement a simple version of a game combat mechanic. One of the factors used in calculating the damage was the "type advantage". It's basically selecting a damage multiplier based on the attack type and the target type. My first instinct was to go for `if-else` or `switch`. I soon realized that this is not going to end well after looking at a diagram that shows over 300 different combinations. Until it hit me. It was a big facepalm moment. The diagram presented the data in a tabular format, so why don't I present the damage multiplier flags in a table as well? I already had an enumeration of types, so I just used them as index values for accessing my table of type advantage which is just simply a matrix of integers. Lookup tables truly saved me from writing hundreds of lines of code. It even made the algorithm significantly faster and it was not even a priority.

@LuckyMrAsHat 3 месяца назад

It has been a while. Have you finished making your game?

@ajtan2607 3 месяца назад

@@LuckyMrAsHat Has it been at least 2 years already? Dang. Anyways, I wasn't exactly making a game. I was only experimenting with some game mechanics. Thank you for asking.

@farizrahman6133 3 года назад

Thanks for the information

@danielsantrikaphundo4517 Год назад

Interesting, thank you!

@skaruts 3 года назад

I started using more LUTs because of Lua with LuaJIT, because LuaJIT can only compile conditional branches under some circumstances. But they're useful in a lot of ways. In Lua they're also useful for making fast table objects with properties that map to the table indices (as in, e.g. a vector where you can get the *x* value both as *v[1]* or *v.x,* for performance or convenience, respectively), or even for read-only properties.

@darkknight8207 3 года назад

your videos are very helpful.

@JacobSorber 3 года назад

Thanks. Glad you think so!

@roshandsouza2329 3 года назад

Thanks, nice video.

@fivefive2433 3 года назад

Nicely explained thank you. One edit I'd add to save program memory is to add a second lut which maps to indexes of unrepeated array of strings, so probably it'll be something like 0, 1, 2, 3, 3, 4, 4, 4, 5 etc, which will save some bytes, just thought of sharing an idea

@hetsmiecht1029 3 года назад

Since a string is just a pointer to chars on the heap, can't you just make the pointers in the single lut point to the same place in memory? If so, how would you syntactically write that? Would this work: const char* [] lut = {"first+third", "second", lut[0]}; Or do you need to instantiate a string first: const char* tmp = "first+third"; const char* [] lut = {tmp, "second", tmp};

@shawon265 3 года назад

@@hetsmiecht1029 I think this will produce an error. You're usinh lut[0] in definition of lut.

@ajtan2607 2 года назад

In C and C++, string literals having identical values may not share the same address. Whether or not this is the case is implementation-defined. To solve this issue, you can declare an array of unique messages then just assign those to your LUT. const char* umsg[] = { "abc", "def", "ghi" }; const char* message[] = { umsg[0], umsg[1], umsg[2], umsg[2] };

@gael6252 2 года назад

@@hetsmiecht1029 But then you wouldn't be able to index the array using the number variable, that's when hash maps come into the game and solve that memory leak problem for us.

@AndrewTSq 3 года назад

This was used in like 99% of all demos back in the days, when the cpus was not that fast and could not do FP.

@jakub7321 3 года назад

Hello, what font did you use in VSCode for this video? Thanks.

@karanshedge514 3 года назад

Hi i love your videos, How do you deploy to your friend or Client Machine exe file your C or Cpp code exe files with other dependency (Like dll ) and link them to main exe file so our code know where to find dll or other files if they didn't exist on system???

@mow3890 3 года назад

Thanks 🙏

@vhm14u2c 3 года назад

In past I made a LUT based upon using number letter combinations of a phone #, shown all results of number/letter combos, Only thing I didn’t make afterwards was a lookup words database for common or most used results, as it would have been a bit more ‘brainy’ in the code.

@ryandyc 2 года назад

Would i call it in assembly using dc.b/w?

@breakprismatshell6270 2 года назад

I use std::map to select the right tests/functions to take care of a command line argument. Not sure if this counts as LUT as there is a string comparison involved, but I also do not see value in mapping the strings to an enum first.

@muhammadprawirayuda3510 Год назад

nice thanks

@fw3mbedded598 Год назад

cool I had done the same during a DOT MATRIX display project .. but did not know the official name LUT ... I only thought LUT as a acronym used in FPGA world .. thanks for another wonderful video

@BalaMurugan-bh7if 3 года назад

another good one! please provide an vedio about difference between DRAM,TCAM & SRAM type..

@ludoviclagouardette7020 3 года назад

I would have used min instead of an explicit if/else

@corsicanbread7276 3 года назад

Hello Jacob, at some point could you make a video about web sockets in C? Great video!

@Maranello988 3 года назад

Great news, so waiting for trie's series :)

@lewiscole5193 3 года назад

Ummm ... okay? Maybe I'm missing something here but you do realize that tries are often times implemented as a collection of nodes that just happen to be LUTs containing pointers to other nodes, right?

@JacobSorber 3 года назад

Not long now. Thanks for your patience.

@rustycherkas8229 2 года назад

With time to kill, I recently wrote an Enigma emulator whose 'heart' was (bidirectional) nested LUTs for each value (character) substitution, including offsetting for each of the 3 or 4 rotors positions... Tip: to emulate a rotor's "wrap around" (because they're circular), I avoided lots of modulo tweaking by having my rotor LUTs contain 3 concatenated copies of the alphabet cipher substitution... Incoming values entered the 2nd range, and output values may come from the 1st, 2nd or 3rd range, but were automatically 'normalised' to A-Z (0-25, actually) to enter the 2nd range of the next element. C is fun...

@hbibamrani9890 2 года назад

Yesterday i reinvented look up tables. Today i see they've been invented long time ago.

@BlackSkyUploadTube 3 года назад

I think something wrong. Because of last 3 cases, they are unsigned inteager ranges but aren't mapped to representive value like 4 to 3, 5 to 4, or 35 to 5. If am I wrong, correct me please.

@paullin178 Год назад

Lut is very useful in image Processing for pixerwise operation

@mkvalor 3 года назад

An obvious quiestion a new programmer might ask is: "Why didn't he handle the case where the number of messages is less than zero?" If he had accepted user input, that would have been in play. However, with the numbers baked into the program (as they are here), the type of the variable makes this unnecessary. The type size_t is always unsigned, so the compiler would have halted with an error if a negative number had been used.

@anthonykaparounakis135 3 года назад

Good catch!

@MDarkus3 3 года назад

I first encounter LUT in a QSPI flash driver to build the hexadecimal sequence that will be sent to memory device: addr/cmd/dummy-cycle/...

@sebastianmadrigalrodriguez4266 3 года назад

You can do some branchless programming to get rid of the if statements in the lookup table message function call!

@syedyousufhussain2725 3 года назад

At 2:21 after typing semicolon your cursor came out of that curly brackets did you used any short cut key to get out of that or u just used the down arrow key

@JacobSorber 3 года назад

Down arrow. Sorry, it isn't something fancier.

@devmnnu5334 3 года назад

can we use math's min for last if else statement : min(NUM_MESSAGES - 1, people) . will it make it slower

@SimGunther 3 года назад

Try it out on quick-bench and see if affects anything

@mehmetdemir-lf2vm 3 года назад

1. i use the macro below for finding array length, which is very easy to use, rather than typing the whole thing again and again: #define lengthofarray(x) (sizeof(x)/sizeof(*(x))) 2. 8:37 rather than using if statement, the statement below makes more sense: printf("%s ",messages[people

@Kefford666 2 года назад

I may use this for a CANopen module I’m making. I could receive the COB ID then use a look up table to call the appropriate function to handle that request

@paullemmermann5993 3 года назад

Hey Jacob, could you do a video about another build system, like automake and autoconf or meson

@leocarvalho8051 2 года назад

Or cmake

@benzpinto 3 года назад

👍 nice

@abbasaliyev1751 3 года назад

i love how minecraft optifine mod uses lookup table for sin function

@hsaidinsan6345 3 года назад

What static code analyzer you use for c ? Can you make video about this topic pls

@serhatarslan4138 8 месяцев назад

Static code analysis is related to compile time errors. These analysis tools detect logical errors in code. It aims to increase the code quality. Static code analysis can help detect certain types of compile-time errors, such as syntax errors, type mismatches, and missing references, which can prevent code from compiling.

@mikewashington4188 3 года назад

Finally, business world developers gets their due. A LUT is somewhat like a reference table (RT).

@NKCSS 2 года назад

8:00 What I like to do is make it really clear what you're trying to do, and add a const int MAX_PEOPLE = NUM_MESSAGES -1; Then you get: if (people > MAX_PEOPLE) { //... Which is VERY clear to read, and you can trace it back to being linked to the number of messages if you want to. This removes the need to do the -1 on line 21 as well.

@philrod1 2 года назад

This ^^ ... and inside the if statement put people = MAX_PEOPLE and do away with the else

@UTJK. 3 года назад

Wow! I always used this technique without knowing it has a specific name...

@mstech-gamingandmore1827 3 года назад

Coudn't you also do messages[min(people, MAX_MESSAGES-1)]?

@framepointer 3 года назад

Could you please make a video explaining the union data type and the differences between unions and structs?

@gregoryfenn1462 3 года назад

A union is like a struct except that the data memory overlaps in unions. So the size of the Union is the size of the largest member. A struct has members in sequence and non-overlapping; so it’s size is the sum of the sizes of its members. Syntactical, they are both very similar. The main difference is that in a union modifying an attribute will modify all other attributes, whereas this is of course not true for structs.

@unchayndspersonalaccount7690 3 года назад

I've got a question... In C, does declaring a static local variable inside a while/for loop impact the loop's speed? Just trying to make my code nicer and put the static declarations inside their relevant blocks, if that doesn't cause a performance problem.

@vishwajithk8811 3 года назад

typically statics reside till the process exits. So if performance in your words is proportional to memory, then ans is nearly no; if performace in your words is time or delay then it depends on compiler and optimization level - use godbolt.org and write 2 functions and compare assembly of those 2 ASMs - one function with statics block and other with no statics block

@rustycherkas8229 2 года назад

Probably hurts performance if your var does not need to be static. Other automatic vars nearby will live (short term) on the stack. Inappropriate use of 'static' may result in more cache-misses. Also, static makes vars "immortal". Others reading your code will scratch their heads trying to figure out why the var's value needs to live between iterations of the loop, only to be clobbered.

@jpsalis 3 года назад

does your IDE just show

@JacobSorber 3 года назад

Yeah, it's a font ligature. I like it, though it's definitely confused a few people.

@schmidtlach 3 года назад

Can I use lookup tables to call functions? Thank you in advance for the information!

@JacobSorber 3 года назад

Yes, you can. Just store function pointers in the LUT.

@schmidtlach 3 года назад

@@JacobSorber Thank you

@NeZversSounds 3 года назад

In game development most popular case would be Tile Map.

@RobertFletcherOBE 3 года назад

would it be a bit neater to say int index = min(0, max(people, length_of_array)) also is there anywhere where common c idioms are recorded. It seems like there is allot of knowledge in danger of being lost at the moment as the c experts get older and older.

@rupeshkalantre7128 3 года назад

We can remove the last if-else block : people = min({people, NUMMESSAGES - 1, 0}); printf (messages[people]);

@SimGunther 3 года назад

For modern x86, you can use a conditional move (cmovCC). However, if you want to be more general, you could do something like: // Mininum number algorithm for 32 bit signed ints a and b int minimum; { int diff = a - b; int sign = diff >> 31; minimum = b + (diff & sign); } Typically, the first branch would need to be the most common branch for the code to run fastest in a processor with branch predicting (which again is all modern x86 and out-of-order execution processors), which is why the branched example in the video is easy enough to understand and refactor.

@lewiscole5193 3 года назад

@@SimGunther > Typically, the first branch would need to > be the most common branch for the code > to run fastest [...] That assumes that the inputs to the routine exhibit some sort of temporal regularity which may well not be the case (e.g. the routine takes in random numbers). And while you may think that a branched example is easy enough to understand, I happen to think that LUTs are easy enough to understand as well.

@noodlish 3 года назад

Is the single character ≤ valid in C? I've only ever seen the two characters

@RubiksSolver25 3 года назад

It’s very likely that it’s because of the theme that it appears that way. So when he types in

@noodlish 3 года назад

@@RubiksSolver25 Yeah that makes sense. It's just that in the video, he compiles the code straight away with no errors.

@TheVertical92 3 года назад

@@noodlish its just a font ligature from his editor. The actual text is

@user-he4ef9br7z 3 года назад

It's the font. Fonts like Jetbrains Mono change

@JacobSorber 3 года назад

Yeah, it's a font ligature. Sorry for the confusion.

@JohnHollowell 3 года назад

Are LUTs not the best tool when you have large gaps of the same response? e.g. if the numbers were 0, 1, 2,

@JacobSorber 3 года назад

Yeah, in that case, you definitely start running into a time-space tradeoff. And, for a lot of programs the performance difference isn't meaningful. So, most of the time, I would go with the option that gives you code that's easier to read and maintain.

@rustycherkas8229 2 года назад

In such a case, one can try to find a 'data conditioning' function. Consider the ordinal suffixes: 1st, 2nd, 3rd, 4th.... The following bit of code 'conditions' an integer DayOfMonth (1-31) to access into a tiny table of the appropriate 2 letter string decoration: if( i > 20 ) i %= 10; // Change 21-99 to 1-10. if( i > 3 ) i = 0; // All others end with "th" // 0 1 2 3 puts( &"th\0st\0nd\0rd"[ i * 3 ] );

@dickheadrecs Год назад

there used to be printed books for look up tables for all sorts of math problems

@4zv4l38 2 года назад

what about printf("%s ", msg[n%MSG_LENGTH]); to not have an if/else statement ? anyway great video !

@edwardsmith1237 3 года назад

I used the following as a LUT "in two directions" to write a small python utility to convert numbers from one base to another. Here's the snippet: # Constants HEX_DIGITS = ("0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F") def to_base_10(n_str, base): digits = n_str.upper() n = 0 for i in range(len(digits)): d = digits[i] val = HEX_DIGITS.index(d) n = n * base + val return str(n) def from_base_10(quotient, to_base): digits_out = "" while quotient > 0: remainder = quotient % to_base quotient = quotient // to_base digits_out = HEX_DIGITS[remainder] + digits_out return digits_out

@slippydouglas 3 года назад

Yeah, this isn’t a LUT. A LUT uses a sizeable table to replace math and logic computations that would normally be done. The only constant list/array/table you have here is a small Base-16-Number-to-ASCII translation. That’s a basic encoding, not a LUT. And actual simple LUT for this would be pre-converted numbers from 0-1000, falling back to the fairly standard algorithm you have here for numbers greater than 1000. That said, don’t do that; the algorithm here is so simple you won’t gain any speed; you’ll probably lose some instead. LUTs are actually useful when a mathematical algorithm would have to do expensive calculations otherwise; e.g. square roots, sin/cos/tan, etc. My advice: always write a true algorithmic solution 1st, then your LUT 2nd so that: 1. you can populate values in the LUT from the algorithm, 2. you can handle out-of-bounds cases not within the LUT, and 3. most importantly, so you can speed test the LUT vs. the algorithm to see if you’re actually gaining any speed- a bottleneck in modern computing is often RAM/cache access, not the speed of computation itself. Good luck, and best of luck with your studies/learnings.

@strictnonconformist7369 3 года назад

@@slippydouglas a LUT has no minimum or maximum size except for memory constraints: claiming otherwise shows you’re inexperienced. If it is a table, for a LUT, depending on context, it could even be 0 elements if created dynamically from some data set, so if you wish to consider 0 a minimum, ok, there IS a minimum valid size, that of 0. How big or the type of object that is indexed doesn’t relate to whether or not it counts as a LUT, the fact it has an integral form of index into it, does. That integral type can be a single bit, for all it matters, because every little bit counts. What is shown here in Python is much the same sort of solution you’d employ in the BASIC interpreters that existed when I started, that had no bitwise operators.

@slippydouglas 3 года назад

@@strictnonconformist7369 Wow, burn, straight from the get-go. Good look. But for real, I’m sorry, you’re incorrect. A LUT is software development technique, not a data structure. What you’re using to accomplish is what defines it. So yes, there isn’t hard limits because all software design patterns don’t have specific programming language level constraints, but a LUT that’s too small/basic to be useful to accomplish the point of a LUT - replacing expensive computations - isn’t a LUT. By your definition, all enums and all arrays would also be LUTs. That’s not true. And indexable data structure is not necessarily a LUT; indexability is only one component of a LUT. So take your own advice and get a bit more experienced before talking trash. You could go get a couple decades of experience across dozens of programming languages at dozens f companies like me, but just a few years out in the real world alongside professional devs should be good enough.

@mockingbird3809 3 года назад

Nice font

@sudarshanv9797 3 года назад

Can you make a video in hardware description languages?

@BenjaminWheeler0510 3 года назад

check out nandland on yt

@TheEdgardisaac 3 года назад

CRC16 calculation for JBUS / MODBUS

@RAINE____ 3 года назад

I work on DNA and I use lookup tables to convert {A,C,G,T} into {00,01,10,11}

@dev_among_men 3 года назад

Have you compared switch as this case it hould be faster IMO you can also use a flat map for semantics

@RAINE____ 3 года назад

@@dev_among_men That's interesting. Why do you think it would be faster? Using a lookup table, every base character could be converted in just one operation without having to check what it is?

@dev_among_men 3 года назад

@@RAINE____ I was thinking of hashmaps and overhead of hashing func

@RAINE____ 3 года назад

@@dev_among_men Oh yes. Well once we've compressed 32 bases into a 64-bit number, we usually then hash for whatever analysis we're carrying out.

@rustycherkas8229 2 года назад

You could save yourself the memory fetch by using a hash function. Here's one for the four amino acid identifiers: int val = (letter ^ letter>>1)>>1&3; Give it 'A' and get back 0; give it 'C' and get back 1; 'G' yields 2; and 'T' yields 3... (If, like me, you like things dense, here's the same thing: (aa ^ aa/2)/2&3

@xeridea 2 года назад

I would argue it is harder to see what is going on in this example since you need to count the lines to know the index. Could be fixed with comments. Overall I think isn't a great example, but the concept is very good, especially for compute heavy functions.

@zxuiji 2 года назад

You could've removed the jump by something like: ``` const size_t max = NUM_MESSAGES - 1; size_t index = (people * !(people / max)) + (max * !!(people / max)); puts(messages[index]); ``` *edit:* meant to say jump instead of comparison, fixed

@sugiyartosugiyarto1587 9 месяцев назад

Sir make vidio Lut 2d...thanks

@joemacdonnagh6750 3 года назад

You did a great job of libcurl , could you do SDL2 sometime?

@JacobSorber 3 года назад

Maybe. I'll put it on the list and see what I can do. Thanks.

@KangJangkrik 3 года назад

REQUEST: how to get know which line causes SEGMENTATION FAULT? Thanks!

@JacobSorber 3 года назад

Check out my video on segmentation faults. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-I55kWLXqhDI.html

@wedusk 3 года назад

I have used LUTs before to avoid having to calculate primes (among other things).

@BenjaminWheeler0510 3 года назад

Yes. A cool thing with LUTs you can do dynamically is called memoization (great for recursive functions or otherwise). Look it up, it's amazing

@wedusk 3 года назад

@@BenjaminWheeler0510 Absolutely essential for dynamic programming. That's basically what I was talking about. Cheers!

@DipietroGuido 3 года назад

Most optimal or near optimal Rubik's cube solvers (if not all) use several LUTs

@glowiever 3 года назад

I remember back when trigonometry function call was considered "expensive" and so we stored the calculation into a lookup table.

@JacobSorber 3 года назад

Trig functions still are fairly expensive on some microcontrollers, especially those without floating-point units.

@lewiscole5193 3 года назад

@@JacobSorber If I can trade memory for speed and could live with relatively low precision, then I'd probably just use a LUT for dealing with a trigonometric function. But if I could spare the time and/or needed more precision or perhaps more multiple trig functions but didn't have a hardware multiply and/or floating point, I'd probably use the CORDIC method instead. Perhaps you'd care to add CORDIC to the list of things you might want to cover one of these days "Real Soon Now", Professor Sorber.

@rob876 2 года назад

@@lewiscole5193 Cordic in combination with an LUT is what I'd imagine is used?

@lewiscole5193 2 года назад

@@rob876 If you're thinking that you use a look up table to come up with an initial approximate value for some function and then use CORDIC to iterate to a closer value, then no, that's not what I'm suggesting. CORDIC (AKA Volder's algorithm) can do all kinds of good stuff (trig functions as well as hyperbolic functions and division) and relies on adding and shifting of values with some function specific constants and is wonderful when you have a relatively small word size, limited memory, and no hardware multiplication (not just no floating point, but no integer multiplication either) which pretty much describes many micros that you might see in embedded applications. Just stuff a bunch of constants for the functions you want into various tables and let the CORDIC algorithm do its thing with them. And if you need some evidence that this is actually something useful to do, then I would point out that CORDIC is what calculator use (or used to use) for trig functions, et. al. If you have a larger word size and lots of memory, then life becomes easier, but even with hardware acceleration, trig functions take time as Professor Sorber noted. (Even a division is painful, so we're not just talking about trig functions here.) But lookup is "cheap" and fast. With a 16-bit word, a single table lookup can get you a value that's good to 4-digits (decimal) and if you have a 36-bit word, you can get a value that's good to 10-digits (decimal), all for the cost of the memory to hold a table with as many entries as you want/need. There are plenty of articles on what makes CORDIC work, but I've found that you have to find the "right" description for it to make sense to you. But there are (were) plenty of articles laying around about using CORDIC in gate arrays to produce results that should at least suggest that it's still relevant today.

@rob876 2 года назад

@@lewiscole5193 Thank you for your reply. I'll need to revisit the cordic algorithm.

@aceofkestrels 3 года назад

Btw in Java at least, a switch statement actually uses a LUT internally

@sm0kyJoe 3 года назад

Are you secretly spying on me? Just a few days ago i had to utilize lookup tables to implement less compute heavy oscillators. I might have to perform some benchmarks to make sure that the inaccuracy introduced by that (which is small enough to not matter) is outweighed by the performance gain.

@JacobSorber 3 года назад

Only occasionally on Tuesdays. Glad the video was relevant.

@sam-kx3ty 3 года назад

I think you're more of a computer scientist than an electrical engineer but nice content.

@JacobSorber 3 года назад

Yeah, I think so, too.

@sebasfavaron 3 года назад

Nice video! If you really wanted to avoid the if/else statement you could call messages[max(people)] (implementing max or using it from somewhere)

@duartelucas8129 3 года назад

Shit, great content!

@JacobSorber 3 года назад

Thanks! Glad you enjoyed it.

@SriHarshaChilakapati 3 года назад

Nice video! Although I kinda feel that lookup tables are sort of premature optimization in this specific example. You're reducing time complexity, but then space complexity increases, right? Apart from this, I kinda love these. The first time I saw these were in a base64 library. Thanks for reminding of these again.

@lewiscole5193 3 года назад

Professor Sorber's example is obviously contrived, but I would say that generally speaking, you are correct that you're trading space for speed. His example of using a LUT to look up a trigonometric value of some sort is a good example. Even with small 8-bit machines, it could be a useful trade off to look up a sine value, say, versus trying to compute one especially if you need to use that value right away (as in a motion control application) even if that means burning some memory to get there from here. IOW, you have to look at what trade offs make the most sense when you're designing the application.

@JacobSorber 3 года назад

Yes, in some cases, a LUT will use more space. With LUT-based trig function implementations, you have a tradeoff between space and precision, and there are hybrid solutions that use a LUT with some computation (like linear interpolation between LUT values). When basically replacing nested ifs or switch statements, it probably doesn't add much memory - maybe a little, but it's more like you're trading code size for data size (both use memory).

@SriHarshaChilakapati 3 года назад

@@JacobSorber Ah, I missed the last part and forgot that both instructions in .text section and variables in .data section need memory. However, this is memory on disk we're talking about which is lying inside the executable binary. What will happen in runtime? Everything in data section will simply be copied into RAM right? Also, I'm curious what actually happens when memory paging happens, please do a video on that too. Btw, sorry if my question isn't proper. My C and OS concepts were rusty as it's been 6 years since I touched C code. I'm an Android developer and work mostly in Java these days.

@lewiscole5193 3 года назад

@@JacobSorber > When basically replacing nested ifs or > switch statements, it probably doesn't > add much memory - maybe a little, but > it's more like you're trading code size for > data size (both use memory). While the amount of space may well be roughly the same between an implementation that uses ifs/switches compared to a LUT, I would argue that generally speaking the amount of time that it takes to find the correct entry is not. I think that a far better example that demonstrates this compared to the one you (Professor Sorber) gave in the video is the case of implementing some sort of interpreter, say like some microprocessor simulator where you "decode" the opcode by calling a handler routine. You can of course use a switch statement, but going through billions and billions of cases compared to simply looking up the handler routine using the opcode as an index clearly shows the advantage of a LUT over a case when it comes to speed, IMHO. And while I might not mind using a bunch of ifs and switches in a command interpreter, I myself would prefer to use LUTs to identify tokens and recognize expressions. I for one am glad that things like isupper, et. al. involve the use of a LUT rather than a bunch of tests.

@lewiscole5193 3 года назад

@@SriHarshaChilakapati Well, I should probably just let Professor Sorber address this, but since he seems to be out right now, let me take a swack at it. > Ah, I missed the last part and forgot > that both instructions in .text section > and variables in .data section need > memory. Correct. > However, this is memory on disk we're > talking about which is lying inside the > executable binary. In order for any program to be able to do anything, its instructions and the data that it works on needs to be in memory. Always. So when you want your program to be executed, the OS loads the instructions (i.e. the .text section) as well as the data (i.e. the static .data section) into memory and possibly sets up memory and register for dynamic data manipulation (i.e. the .bss section) as well as possibly a stack or two and then transfer control to the starting entry point for the program. > What will happen in runtime? Everything > in data section will simply be copied into > RAM right? See above. > Also, I'm curious what actually happens > when memory paging happens, please > do a video on that too. On a machine that has the right paging hardware, the OS can effectively just transfer control to where the first instruction in the program is supposed to be without actually loading anything into memory. The hardware will recognize that the page of memory (sometimes called a page frame) that should contain the 1st instructions doesn't actually contain the instructions/data that it's supposed (to thanks to the page table entries that the OS has set up for the program which indicates nothing is present), and will generate a page fault interrupt. (Note that often times people like to distinguish between the case where an instruction encounters a condition where the instruction can potentially be restarted and the case where the instruction encounters a condition where the instruction can't be restarted. The former conditions are referred to as "faults" by some rather than "interrupts", with the word "interrupt" being used only for the latter set of conditions. I'm not one of those people. If a signal causes a processor to transfer control, that signal is an "interrupt" to me.) The page fault will cause the CPU to transfer control to an OS page fault interrupt handler which will eventually result in some OS code being execute that will look at the executable image on "disk" and load in the page of instructions/data that are needed. The OS will update its page table to reflect the fact that a page frame has been filled with some data and after possibly some more piddling around, the OS will hopefully transfer control back to the program, this time with the required instructions/data now in place, and life should be wonderful until the next unfortunate event occurs. This is called demand paging as pages are loaded as they are referenced (demanded). > Btw, sorry if my question isn't proper. My > C and OS concepts were rusty as it's been > 6 years since I touched C code. I'm an > Android developer and work mostly in > Java these days. Professor Sorber hasn't done anything on paging specifically.

@directx872 3 года назад

Faster, better, stronger.

@40oz82 2 года назад

"the code is long and ugly" when i left the video

@mumarj 3 года назад

Good example but I don't see why this would be cleaner than object/struct/dict

@benjaminshinar9509 3 года назад

I'm waiting for the trie video! I don't actually like using LUTs. it feels like i'm doing something by hand that's already been done much better elsewhere. I know that maps are usually implemented with a tree, but i feel like a map is a lot more intuitive. and sure, i can use a global lut to store the results of expensive computations, but that's something that can be done in compile time (cpp) or as a global init function, and i'm still running into the issue of having to control the index and size of the lut. (plus, it just means that i tried to recreate a the jump table). and if I say 'just store the top 20 most common / likely options for this expensive computation', then i'm trying to re-invent the cache... there's the option with enums, but even in this case I prefer to use a switch statement, just so i can change the order of the enums and without worrying.

@lewiscole5193 3 года назад

> it feels like i'm doing something by hand > that's already been done much better > elsewhere. Okay, I'll bite ... what's wrong with pre-computing some thing if you can do it? > I know that maps are usually implemented > with a tree, but i feel like a map is a lot more > intuitive. There's a lot to be said for intuitiveness, but then again, I suspect that what a lot of people find "intuitive" is that which they've used before. IOW, once you start using LUTs, they will become "intuitive" as well. Professor Sorber is trying to introduce you to something that you might find "intuitive" if you let it. > and sure, i can use a global lut to store the results > of expensive computations, but that's something > that can be done in compile time (cpp) or as a > global init function, [...] You keep saying this like it's a down side when I don't see it that way at all. > and i'm still running into the issue of having to > control the index and size of the lut. (plus, it just > means that i tried to recreate a the jump table). Professor Sorber's example was obviously contrived in an attempt to present a simple example of how a LUT can be used and so give you just a taste of what can be done with one. That doesn't mean that one can't find "problems" with the example, but that also doesn't mean that the "problems" found by an obviously neophyte programmer are really deal breakers when it comes to their use. For example, you're apparently assuming that a LUT must be used to look up a branch target address which need not be the case as Professor Sorber's verbal example of looking up a trigonometric value clearly indicates. And controlling the size of the LUT index need not be a problem at all if you can make the LUT size a power of two in which case all you might have to do is to mask the LUT index to come up with a valid index. > and if I say 'just store the top 20 most > common / likely options for this expensive computation', > then i'm trying to re-invent the cache... And what makes you think that a direct mapped cache isn't a LUT? > there's the option with enums, but even in > this case I prefer to use a switch statement, > just so i can change the order of the enums > and without worrying. You seem to be bound and determined to try to come up with a reason to not use a LUT. And if you can come up with a way to avoid using them, then that's your choice (or that of your employer who's paying you to write the code), but I find it amusing that you're trying to do so while effectively asking for a video on tries where the nodes of a trie can often times be viewed as LUTs. Seriously, if you were tasked with writing a program that simulates some processor, say a 6502, do you really think that testing the op-code against every possible value to look up an op-code handler routine instead of using a LUT to directly look up the handler routine is A Good Idea?

@rogo7330 3 года назад

It's not very accurate comparison. It would be if you added "return" or "brake" inside each "if" body. But it is still interesting, thanks!

@JohnHollowell 3 года назад

the else if should take care of skipping all following conditionals without the need for a return or break

@rogo7330 3 года назад

@@JohnHollowell yep, didn't notice them, thanks!

@TheTrueCBaer 3 года назад

FPGAs have a a sh*t-load of LUTs for programmable logic

@Bodyja 3 года назад

void print_party_info(size_t people) { // this could be people = min(people, NUM_MESSAGES-1); but yeah more includes... people = people > NUM_MESSAGES - 1 ? NUM_MESSAGES - 1 : people; printf("%s ", messages[people]); }

@r.pizzamonkey7379 3 года назад

Well I mean, a switch statement is just a LUT for blocks of code.

@hetsmiecht1029 3 года назад

Not really, I think? The point of an LUT is to avoid having to do a bunch of comparisons and conditional jumps, and a switch statement to my knowledge doesn't avoid that. You could make an lut of function pointers though. That way you only need to make one jump.

@r.pizzamonkey7379 3 года назад

@@hetsmiecht1029 Actually it does. C (and by extension C++) implements switch statements as look up tables internally. A look-up table with function pointers is basically just a switch statement with extra steps. Interesting sidenote, that's why you need to append an explicit break statement at the end of each section if you want it not to continue; by default you're just telling the compiler where to put those addresses in the internal LUT, because it's not a function it doesn't actually push a return value onto the stack, you need to use a break statement, which is really just a fancier "goto".

@HansBezemer 2 года назад

@@hetsmiecht1029 Nice in theory, however every VM maker can tell you that function call overhead is so high, that you lose a lot of speed. In practice, switch() statements are much faster if don't call functions, but resolve the logic there and then. An even faster method is using GCC extended "goto", but obviously that only works in GCC.

@bryanwelsh6383 3 года назад

On modern computers your computations would have to be pretty hefty before look up tables make any sense... going to memory is slow as hell. Grouping related lookups makes sense because then you're less likely to get a cache miss, but if you can swap lookups for computation thats probably going to be a win the majority of the time, right? Unless you're pretty confident the computation is costly and/or is going to be in l1 cache, you're probably better off not replacing the computations with lookup tables. I get that's maybe different for embedded systems but the video seems a little too keen on lookup tables, and you don't mention any of the potential drawbacks. If you switch your code to using lookup tables in place of computations and you're not hitting the cache the performance penalty will be considerable, and considering the audience are mostly beginners they likely won't realise why, or even that it's happened at all. The only way lookup tables could be an always win situation is if you were doing metaprogramming stuff and it was being inserted as literals into your executable, or am I wrong here?

@JacobSorber 3 года назад

The point of the video was not that LUTs are always the way to go, but rather that they are an option that many beginners are unaware of, often leading them to complicated branch-heavy code. The performance question is, as you point out, more nuanced. But, remember that branching code typically also includes memory look-ups. It's definitely going to depend on your application, your code, and its workload, but I have often observed the type of behavior that Rick reported on in this post (www.linkedin.com/posts/activity-6803302729423884288-yR-L).

@bryanwelsh6383 3 года назад

@@JacobSorber Branch misprediction may be expensive and perhaps lookup tables can be one way to avoid them, but I still found the tone of the video a little misleading. Perhaps a video on branch prediction would be beneficial.