Topic Modeling with Python

PyTexas

Published: 13 Sep 2024

Comments: 31

@yongkangchia1993 4 years ago

One of the best presentations ever on LDA topic modelling. Simple but effective illustrations. Thank you!

@juxtux 8 years ago

Thank you so much Christine! Your explanations are clear and you capture the essence of LDA. You just saved me from my final project in my Adv. Machine Learning class at Duke U., thank you again!

@justchilllin7061 8 years ago

+Jorge Monardes Hope you do well in Katherine Heller's class!

@juxtux 8 years ago

Lin Xiao I think I survived XD

@hingar 2 years ago

Excellent presentation. Thank you

@caiqueandrade6259 5 years ago

Excellent content. Thank you Christine!

@hanaahammad6063 6 years ago

Excellent. Thanks Christine

@muzafarrasool55 8 years ago

Your explanations are nice and illustrative. Thank you so much Christine! I am interested in comparing text documents in a topic space in order to remove redundancy (if any) from a collection of documents. Could you please provide a link to a data set that would be of some help?
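One way to read "comparing documents in a topic space" is to compare per-document topic distributions with cosine similarity and flag pairs that are nearly identical. The sketch below assumes the topic vectors have already been produced by some topic model; the vectors, the document names, and the 0.95 threshold are all made up for illustration and do not come from the talk.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def near_duplicates(topic_vectors, threshold=0.95):
    """Return pairs of document ids whose topic vectors are very similar."""
    return [
        (a, b)
        for a, b in combinations(sorted(topic_vectors), 2)
        if cosine(topic_vectors[a], topic_vectors[b]) >= threshold
    ]


# Hypothetical topic distributions over three topics (assumed, not real data).
doc_topics = {
    "doc_a": [0.80, 0.15, 0.05],
    "doc_b": [0.78, 0.17, 0.05],
    "doc_c": [0.05, 0.10, 0.85],
}

print(near_duplicates(doc_topics))
```

Two documents with similar topic mixtures are not necessarily redundant, so flagged pairs are probably best treated as candidates for a closer check rather than as confirmed duplicates.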

@syedyasin2594 7 years ago

The Latent Metonymical Analysis and Indexing (LMai) algorithm, invented in 2006-2007, does much of this in an unsupervised way (with zero guidance). The algorithm decomposes each document to decide a topic and then clusters the relevant topics. Description: Latent Metonymical Analysis and Indexing (LMai) is a novel concept for advanced or unsupervised machine learning that uses a statistical approach to identify the relationships between the words in a given set of documents (unstructured data). This approach does not necessarily need training data to decide which related words belong together; it is able to do the classification by itself. All that is needed is to give the algorithm a set of natural documents. The method is elegant enough to classify the relationships automatically, without any human guidance during the process, as shown in FIGS. 6 and 7.

@kidanemehari9092 7 years ago

Thank you, Christine. It would be nice if you could put the link to the slides in the comment section.

@manjeetnagi 7 years ago

You can find it here: chdoig.github.io/pytexas2015-topic-modeling/#/

@Machin396 6 years ago

Thanks for the talk!

@swatichauhan4328 8 years ago

very useful. thanks a lot. nice explanation

@Dr_Ali.Aljboury 6 years ago

Thank you for your review, it's really interesting. I'm in IT too, and I have used everything you talked about in my work: machine learning, unsupervised methods, and even Gaussian SVM. Please go ahead, you did very well. Regards

@rahul_bali 6 years ago

The Talk 1:55

@chiranjit7798 8 years ago

Amazing!!! Thanks

@ozkaa 7 years ago

So helpful, thank you

@punpompur 7 years ago

What about the difference between linear discriminant analysis and latent Dirichlet allocation? It was mentioned that latent Dirichlet allocation is used for unsupervised classification. Does that mean it will never work for supervised purposes, or just that it is not useful for supervised classification?

@davidfortini3205 6 years ago

Unsupervised learning does not use labels, so if you want to use it for supervised learning you can take the TF-IDF matrix (as features) and use any classifier on it (SVM, ANN, Random Forest)
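A rough sketch of the idea in the reply above: build TF-IDF vectors from labelled documents and train a classifier on them. To stay dependency-free, this example swaps in a simple nearest-centroid classifier rather than the SVM, ANN, or Random Forest mentioned in the comment; the tiny training set, the labels, and all function names are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (deliberately naive)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """Return an L2-normalised sparse TF-IDF vector (a dict) for one document."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def fit_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each class to get one centroid per label."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for t, v in tfidf_vector(doc, idf).items():
            sums[label][t] += v
    return {lab: {t: v / counts[lab] for t, v in vec.items()} for lab, vec in sums.items()}


def predict(doc, idf, centroids):
    """Pick the label whose centroid has the largest dot product with the document."""
    vec = tfidf_vector(doc, idf)

    def score(label):
        return sum(v * centroids[label].get(t, 0.0) for t, v in vec.items())

    return max(centroids, key=score)


# Made-up labelled examples; the labels are what make this supervised.
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the election results were announced",
    "parliament passed the new law",
]
train_labels = ["sport", "sport", "politics", "politics"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)
print(predict("a goal in the final match", idf, centroids))
```

The same TF-IDF features could be handed to a stronger classifier from a machine learning library; the nearest-centroid step is only here to keep the example short.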

@ahmedmohammed-xo7rr 8 years ago

Nice video, thanks a lot. Can we apply LSA to the 20 Newsgroups dataset?

@Dr_Ali.Aljboury 6 years ago

ahmed mohammed You can use both LDA and LSA on the 20 Newsgroups dataset; it works. I read a paper that did exactly that.

@BrianFaure1 7 years ago

Great video but 3:34 is just so damn funny

@ChristantoMaulanaAdityanugraha 4 years ago

Hi, any recommendation on how I can use stopwords for the Indonesian language? I've been looking everywhere, but what I find is still a work in progress.
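If no ready-made list fits, one option is to supply your own stopword set and filter tokens against it before building the topic model. This is a minimal sketch: the seven words below are common Indonesian function words, but the list is deliberately tiny and hypothetical, not a complete or authoritative resource.

```python
# Hypothetical, deliberately tiny stopword set for illustration only.
INDONESIAN_STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu"}


def remove_stopwords(text, stopwords):
    """Lowercase, split on whitespace, and drop any token in the stopword set."""
    return [token for token in text.lower().split() if token not in stopwords]


tokens = remove_stopwords("Buku ini dari perpustakaan di Jakarta", INDONESIAN_STOPWORDS)
print(tokens)  # ['buku', 'perpustakaan', 'jakarta']
```

A practical list could be started from the most frequent words in your own corpus and then pruned by hand.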

@amarimuthu 7 years ago

I tried to follow the tutorial for doing topic modelling with the gensim Python library. The input file seems very large, at 12.9 GB. Is that right? (dumps.wikimedia.org/enwiki/latest/)