Very nice tutorial, but I get warnings saying that BucketIterator, Field and TabularDataset are being deprecated. Also, I can't scale BucketIterator to TPUs or multi-GPU setups. Are there any better alternatives?
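For anyone wondering what a library-independent replacement for the bucketing part could look like, here is a minimal sketch in plain Python. It is not torchtext's implementation: the function name `bucket_batches` and the `bucket_mult` parameter are made up for illustration. It shuffles the example indices, sorts them by length inside large pools, and cuts each pool into batches, so each batch holds sequences of similar length and needs little padding.

```python
import random


def bucket_batches(lengths, batch_size, bucket_mult=100, seed=0):
    """Return a shuffled list of batches; each batch is a list of example indices
    whose sequence lengths are close to each other (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(len(lengths)))
    rng.shuffle(indices)

    pool_size = batch_size * bucket_mult  # assumption: sort within pools of this size
    batches = []
    for start in range(0, len(indices), pool_size):
        # Sort one pool by length so neighbouring examples have similar lengths.
        pool = sorted(indices[start:start + pool_size], key=lambda i: lengths[i])
        for i in range(0, len(pool), batch_size):
            batches.append(pool[i:i + batch_size])

    rng.shuffle(batches)  # shuffle batch order so training does not see lengths in order
    return batches


if __name__ == "__main__":
    example_lengths = [5, 1, 9, 2, 8, 3]
    for batch in bucket_batches(example_lengths, batch_size=2):
        print(batch, [example_lengths[i] for i in batch])
```

Because the result is just lists of indices, it can be passed as the `batch_sampler` argument of `torch.utils.data.DataLoader`. Whether that is enough for a particular TPU or multi-GPU setup is something you would have to check for your own case.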
Can you do a video on the updated Torchtext 0.9.0? I think they revamped much of this, and the new features look pretty awesome, with subword tokenization implemented (e.g. 'sub', '_word').
Please create a video on semantic segmentation using a PyTorch CNN. The dataset should contain cancer images plus ground-truth images, and the trained model should report its best IoU and accuracy.
Great video, I really learned a lot, thanks. When I run BucketIterator, I get the error 'int' object is not subscriptable. I checked my code but still have no idea where the fault is.
Great video! I wanted to ask how to use TabularDataset to split into train, validation and test sets. Should I use something like this? train_data, valid_data = TabularDataset.splits(path="data", train="train.json", test="valid.json", format="json", fields=fields) test_data = TabularDataset.splits(path="data", test="test.json", format="json", fields=fields)
Yes, that looks fine to me. From my understanding, and I believe this is how I showed it in the video, you need separate json/csv files, and then in TabularDataset you specify those files using train, test and possibly also validation.
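Since each split has to live in its own file, one way to prepare them from a single JSON-lines file is sketched below. This is only an illustration in plain Python: the function name `split_json_lines`, the output file names and the 80/10/10 ratio are assumptions, not something from the video.

```python
import random
from pathlib import Path


def split_json_lines(source, out_dir, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle the lines of a JSON-lines file and write train/valid/test files.

    Returns the number of rows written per file (hypothetical helper).
    """
    text = Path(source).read_text(encoding="utf-8")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    random.Random(seed).shuffle(lines)

    n_train = int(len(lines) * ratios[0])
    n_valid = int(len(lines) * ratios[1])
    parts = {
        "train.json": lines[:n_train],
        "valid.json": lines[n_train:n_train + n_valid],
        "test.json": lines[n_train + n_valid:],  # whatever is left goes to test
    }

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in parts.items():
        (out / name).write_text("\n".join(rows) + "\n", encoding="utf-8")
    return {name: len(rows) for name, rows in parts.items()}
```

In the legacy torchtext API, `TabularDataset.splits` also accepts a `validation` argument next to `train` and `test`, so the three resulting files can be named in a single call that returns one dataset per file given.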
Great tutorial, but sadly I think it's already outdated. Torchtext has deprecated 'Field' and some other classes, and printing the keys and values of the dict at 10:27 doesn't give proper representations of the objects anymore. They probably broke something while updating the code.
Will look into it more but I think you might be right about this unfortunately... I just really hope all Seq2Seq tutorials that build on this will still be able to run :/
@AladdinPersson Sorry, false alarm. The mistake in my case was that I used TabularDataset.splits() when I had just a train.csv and no test file. There is no need for the splits method in that case. I was just mindlessly copying from the video. My bad. The deprecation thing is just a warning as of now. The deprecated stuff will go into torchtext.legacy, so I guess your code will still work in 0.8 with some import changes.
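The import change mentioned above could be handled with a small fallback like the sketch below. It assumes the deprecated classes end up under `torchtext.legacy.data`, as the comment suggests, and otherwise stay in `torchtext.data`. The helper name `load_legacy_data_module` is made up for illustration, and the function returns None when neither location provides `Field`.

```python
import importlib


def load_legacy_data_module():
    """Return the first importable module that still exposes Field, or None."""
    # Assumption: newer releases keep the old classes in torchtext.legacy.data,
    # older releases keep them in torchtext.data.
    for name in ("torchtext.legacy.data", "torchtext.data"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # this location does not exist in the installed version
        if hasattr(module, "Field"):
            return module
    return None


data = load_legacy_data_module()
if data is None:
    print("No torchtext with the legacy Field API is installed.")
else:
    print("Using", data.__name__)
```

With this in place, the rest of the code would refer to `data.Field`, `data.TabularDataset` and `data.BucketIterator` rather than importing them from a fixed location.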
Yeah I saw that torchtext is going to update, I will wait a little bit to make sure they aren't going to make any additional changes and then update my previous videos to the new use of the API
I have a question: how can I save the train or test data of the IMDB dataset (train_data, test_data = datasets.IMDB.splits(TEXT, LABEL)) as a CSV file? Every time I load the data it takes 15 minutes. Thanks bro.
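One possible approach, sketched with only the standard library: write each example out as a (text, label) row once, then load the CSV in later runs. The helper name `save_examples_to_csv` and the file name in the usage note are hypothetical, and the sketch assumes each example is available as a pair of a token list and a label.

```python
import csv


def save_examples_to_csv(examples, path):
    """Write (tokens, label) pairs to a two-column CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", "label"])
        for tokens, label in examples:
            # Tokens are joined with spaces, so the original spacing is not preserved.
            writer.writerow([" ".join(tokens), label])
```

Assuming each example in train_data exposes `text` and `label` attributes, a call could look like save_examples_to_csv(((ex.text, ex.label) for ex in train_data), "imdb_train.csv"). The resulting file could then be read back with TabularDataset using format="csv".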
Very nice tutorial! While I was looking at torchtext, I came across the libraries torchnlp and allennlp. I couldn't really tell what the differences between them were. Have you worked with them?
Hi, I am trying to iterate over the training data and it's working fine, but for the test data it shows an error. Could you please help me resolve it? I would really appreciate it. Thanks