An Nvidia H100 GPU on Lambda Labs is just $2/hr; I have been using it for the past few months, unlike the $12.29/hr on AWS shown in the slide. I get it, it's still not cheap, but it's worth mentioning here.
You are right, we reported the AWS price there as it's the most popular option, and it was not practical to show the pricing of every vendor. But yes, you can get them cheaper elsewhere, for example from Lambda. Thanks for pointing it out!
@rankun203 They are available only in specific regions; mine is in Utah, and I don't think they have expanded beyond that. Plus, there is no persistent storage available in this region, meaning if you shut down your instance, all data is lost.
At 51:30 he says don't repeat the same prompt in the training data. What if I am fine-tuning the model on a single task but with thousands of different inputs for the same prompt?
It will cause overfitting. It would be similar to training an image classifier with 1,000 pictures of roses and only one lily, then asking it to predict both classes with good accuracy. You want the data to be well distributed across your problem space.
Cool video. If I want to fine-tune it on a single specific task (keyword extraction), should I first train an instruction-tuned model and then train that on my specific task? Or mix the datasets together?
It really depends on the dataset. Ludwig also has an early stopping mechanism where you can specify the number of evaluation rounds (or steps) without improvement before stopping, so you could set epochs to a relatively large number and have the early stopping take care of not wasting compute time.
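Here is a minimal sketch of what that could look like with Ludwig's Python API; the `early_stop` and `epochs` parameters are from the Ludwig trainer config, while the feature names and dataset path are hypothetical placeholders, not from the video:

```python
# Sketch: generous epoch budget + early stopping in Ludwig's trainer config.
from ludwig.api import LudwigModel

config = {
    "input_features": [{"name": "prompt", "type": "text"}],       # placeholder feature
    "output_features": [{"name": "completion", "type": "text"}],  # placeholder feature
    "trainer": {
        "epochs": 100,    # large upper bound; early stopping usually ends training sooner
        "early_stop": 5,  # stop after 5 consecutive evaluation rounds without improvement
    },
}

model = LudwigModel(config)
model.train(dataset="my_dataset.csv")  # placeholder dataset path
```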
When I run the code in Perform Inference, I frequently receive `ValueError: If `eos_token_id` is defined, make sure that `pad_token_id` is defined.` What should I do?
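A common workaround, assuming the inference code calls Hugging Face transformers `generate()` under the hood (a sketch, not the exact notebook code; the model id is a placeholder), is to define the pad token explicitly, typically by reusing the EOS token:

```python
# Sketch: avoid the pad_token_id ValueError by defining a pad token before generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-model"  # placeholder: use the checkpoint from the tutorial
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Many causal LMs ship without a pad token; reuse the EOS token for padding.
tokenizer.pad_token = tokenizer.eos_token

inputs = tokenizer("Extract keywords: ...", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    pad_token_id=tokenizer.eos_token_id,  # explicitly defined, so the ValueError goes away
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```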