#2 Introduction to Corpus Linguistics - Types of Corpora

Yassine Iabdounane

2.1K subscribers

18K views

Published: 29 Oct 2024

Comments: 61

@YassineIabdounane 4 years ago

Sample Corpora 02:25
Corpora for Comparison 05:13
General Corpora 09:50
Specialized Corpora 10:58
Annotated Corpora 11:35
Unannotated Corpora 17:11
Learner Corpora 17:50

@ghmarioumaima5391 4 years ago

Hello! I want to email u but I couldn't find ur email. Would u please write it for me? Thanks.

@YassineIabdounane 4 years ago

@ghmarioumaima5391 Hi Oumaima! Sorry about that. There you go: yassine.iabdounane@gmail.com

@ikramullah3484 7 months ago

I am new to the field of Corpus Linguistics. I am learning so many things from your videos. Thank you for sharing such informative videos.

@miromarita3631 10 months ago

God bless you, sir. I'm so grateful for learning this gorgeous lesson. Thank you so much 🥰❤️

@itsjustme5176 11 months ago

Thank you so much, clearly explained. I'm doing my master's degree in Spain and corpus linguistics is a new concept to me.

@ferroumsamir6531 2 years ago

You are helping me a lot in my Master's degree in NLP. Thank you man! Keep up the good work.

@YassineIabdounane 2 years ago

thanks for the nice words man! best of luck with your Master's degree!

@0101799 5 months ago

Thank you for giving us insightful and organized lessons about corpus linguistics!

@0101799 5 months ago

Also, it was very cute of you to show "Helsinki" from La Casa de Papel!!!!

@Mercury-t1t 1 year ago

Plz sir keep sharing your knowledge with us ❤

@naveedkhattak7775 4 years ago

For the first time, I have understood the things related to CL. Thank you

@YassineIabdounane 4 years ago

I'm very happy to hear that! All the best

@saralassri964 4 years ago

I am so proud of you!

@YassineIabdounane 4 years ago

Thank you so much my dear!

@Mimo-n7o 14 days ago

Hey, is COCA a monitor corpus or a diachronic corpus please?

@MOCCLIVE 4 years ago

It's awesome to learn the different types of corpora.

@YassineIabdounane 4 years ago

Thank you for watching!

@humairajabeen2573 1 year ago

Aoa, sir, how can I contact you about my PhD research in linguistics using corpus linguistics? Thanks

@runnihuang1161 4 years ago

Thank you so much! Amazing course!

@YassineIabdounane 4 years ago

My pleasure! I'm glad you liked it!

@FzFz-pn9gb 10 months ago

What are the types of register? And please explain register and genres.

@bashairj3156 4 years ago

Thannnkk you so much! Thank you Yassine!

@YassineIabdounane 4 years ago

My pleasure!

@sweetASrere 4 years ago

Great videos Yassine! Thank you

@YassineIabdounane 4 years ago

Thank you Reina! I'm glad you find them useful!

@Enjoy.your_life34. 3 years ago

@YassineIabdounane What's an example of a specialized corpus?

@YassineIabdounane 3 years ago

@Enjoy.your_life34. A specialized corpus includes texts of a particular type. An example would be the Michigan Corpus of Academic Spoken English (MICASE).

@GhumroQadir 2 years ago

Very useful videos. I loved them.

@YassineIabdounane 2 years ago

Thanks man! Happy to know that :)

@Lwahranya 1 year ago

Thank you very much!

@sittiesohaylagubat289 3 years ago

I have a question, sir: how would I use corpus linguistics for this topic, the singularization of "they"? Hope you answer my question. Thank you.

@YassineIabdounane 3 years ago

It depends on what you want to study exactly. If you are interested in its historical development, I would suggest using a historical corpus of English and seeing how the use of 'they' changes over the years.
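
A rough sketch of that idea in Python, assuming you already have the corpus split into time periods and tokenized. The function names and the toy data are made up for illustration; they are not part of any corpus tool. Frequencies are normalized per million tokens so that periods of different sizes can be compared.

```python
def frequency_per_million(word, tokens):
    """Relative frequency of `word` per million tokens (0.0 for an empty list)."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token.lower() == word.lower())
    return hits / len(tokens) * 1_000_000


def frequency_by_period(word, corpus_by_period):
    """Map each period label to the normalized frequency of `word`."""
    return {
        period: frequency_per_million(word, tokens)
        for period, tokens in sorted(corpus_by_period.items())
    }


# Hypothetical toy data; a real study would load texts from a historical corpus.
toy_corpus = {
    "1800s": "they were here and they left".split(),
    "1900s": "someone left their coat so they will return soon".split(),
}
trend = frequency_by_period("they", toy_corpus)
for period, per_million in trend.items():
    print(period, round(per_million))
```

Keep in mind that counting alone does not separate singular 'they' from plural 'they'. You would still need to read the concordance lines and classify each hit.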

@sittiesohaylagubat289 3 years ago

Thank you so much for responding to my concern, sir. I have a research study titled THE SINGULARIZATION "THEY" IN AN UNDERGRADUATE THESIS. In the methodology written in our matrix, we will use a corpus instrument instead. So, in your opinion, which corpus exactly should we use for our research? I'm not that familiar with corpora yet, and there are a lot of questions in my mind about them. Thank you again for responding.

@prodibsa769 4 years ago

Bro, can you make a video on how to search for binomial word pairs in a certain corpus, like COCA?

@YassineIabdounane 4 years ago

To look for binomials in COCA, simply use this expression: * _n* and * _n*

That's about it bro :)

PS: please delete the spaces between * and _ when you use the expression. I added them because a character between two * is printed in bold here in the comments, like this: *_n* and *_n*
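
If you work offline with your own POS-tagged text, the same "noun and noun" pattern can be sketched in Python. This is only an illustration: `find_binomials` is a made-up helper, and it assumes the text is already a list of (word, tag) pairs whose noun tags start with "n", like the tags matched by the expression above.

```python
def find_binomials(tagged_tokens):
    """Return (noun, noun) pairs joined by 'and' in a list of (word, tag) pairs."""
    pairs = []
    # Slide a three-token window over the text: noun + "and" + noun.
    windows = zip(tagged_tokens, tagged_tokens[1:], tagged_tokens[2:])
    for (w1, t1), (w2, _), (w3, t3) in windows:
        is_noun_1 = t1.lower().startswith("n")
        is_noun_2 = t3.lower().startswith("n")
        if is_noun_1 and w2.lower() == "and" and is_noun_2:
            pairs.append((w1, w3))
    return pairs


# Hypothetical hand-tagged sample (the tags here are assumptions for the example).
sample = [
    ("salt", "nn1"), ("and", "cc"), ("pepper", "nn1"),
    ("taste", "vv0"), ("good", "jj"),
]
binomials = find_binomials(sample)
print(binomials)
```

The window approach only catches adjacent "noun and noun" sequences, which is also what the search expression does.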

@quincyjones7951 2 years ago

Excellent, was very helpful - thanks!

@YassineIabdounane 2 years ago

my pleasure! Happy you find it helpful :)

@asmaamagdy6456 11 days ago

Keep going

@sabrinamalik2972 4 years ago

Can you please elaborate on statistical significance and significance tests with examples? And also type-token ratio, please...

@YassineIabdounane 4 years ago

On statistical significance and significance testing:

Say that you have two corpora: one contains texts produced by men, and the other contains texts produced by women. You would like to see whether men use the word ‘wonderful’ more than women do. You compare the frequencies and find that men used the word 128 times while women used it only 110 times. So it seems that men do indeed use ‘wonderful’ more than women do. Nevertheless, there are a number of things to consider, corpus size for example!

Here’s the question: is the observed difference large enough to claim that, in general, men use that word more than women do? Or is it just a matter of chance that has nothing to do with men's and women’s speech? To determine whether the difference is statistically significant and not due to chance, we need to use significance tests. One example would be the chi-square test.

What the chi-square test does is compare the actual observed frequencies (128 and 110 in our case) with the expected frequencies (the ones that we would expect if no factor other than chance had been involved). The closer these two sets of values are to each other, the greater the probability that the observed frequencies are influenced by chance alone, hence the difference would not be significant.

If you want to read more about it, I recommend this: www.lancaster.ac.uk/fss/courses/ling/corpus/Corpus3/3SIG.HTM

Here’s more on expected frequencies and the chi-square test: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-ZUGKFoHUHQI.html&t

On type/token ratio:

Type/token ratio is a measure of lexical richness. In essence, it gives you an idea of how many distinct words (types) are used in a text relative to the total number of words (tokens). It is calculated by dividing the total number of types by the total number of tokens. The closer the score is to 1, the richer the text (the more distinct words are used); the further it is from 1, the more repetition you have in the text.
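
Both calculations can be sketched in a few lines of Python. Note the assumptions: the corpus sizes of 1,000,000 tokens each are invented, because only the two word counts are given above, and the function names are made up for this illustration. The tokenizer simply lowercases and splits on whitespace. 3.841 is the standard chi-square critical value for 1 degree of freedom at p = 0.05.

```python
def chi_square_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-square for a word's frequency in two corpora (2x2 table)."""
    # Rows: corpus A, corpus B. Columns: the word, all other tokens.
    observed = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if only chance were involved.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def type_token_ratio(text):
    """Distinct words (types) divided by total words (tokens)."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


CRITICAL_VALUE_05 = 3.841  # df = 1, p = 0.05

# 128 vs 110 hits; both corpus sizes are hypothetical.
chi2 = chi_square_2x2(128, 1_000_000, 110, 1_000_000)
print(f"chi-square = {chi2:.3f}, significant: {chi2 > CRITICAL_VALUE_05}")

ttr = type_token_ratio("the cat sat on the mat")  # 5 types / 6 tokens
print(f"type/token ratio = {ttr:.2f}")
```

With these assumed sizes the statistic stays below the critical value, so the 128 vs 110 difference would not count as significant. With different corpus sizes the result could change, which is why corpus size matters. Also note that type/token ratio drops as texts get longer, so it is best compared across texts of similar length.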

@sabrinamalik2972 4 years ago

Thank you so much.

@mairasabdrahman3861 2 years ago

Tq for the information

@nomansaeed2076 3 years ago

good

@radzuwan85 3 years ago

Thanks!

@shifaais3129 3 years ago

God bless you!

@YassineIabdounane 3 years ago

Thank you very much! God bless you too :)

@sheikhmuhammadnawaz1998 4 years ago

Informative

@YassineIabdounane 4 years ago

Thank you!

@andreanicole6548 2 years ago

I love the winnie the pooh "repertoire" meme hehe

@YassineIabdounane 2 years ago

makes you feel so fancy doesn't it? lol

@sabrinamalik2972 4 years ago

Please explain Reference corpus.

@YassineIabdounane 4 years ago

Hi Sabrina! A reference corpus is a corpus that you choose as a standard of comparison with the corpus you're working with. It is usually more general and representative of the source language as a whole, and it is large enough to represent all relevant varieties of a language and its features.

Here's how it is useful. Say you are working with a corpus of biology texts, and you want to display a list of keywords that are particularly characteristic of the type of discourse or language contained within that biology corpus. In this case, you'd need to compare this 'specialized corpus' with a more general 'reference corpus' so as to see the list of words that are particular to 'biology'.
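
Here is a minimal Python sketch of that comparison. It is a simplification: it ranks words by the ratio of their relative frequency in the specialized corpus to their relative frequency in the reference corpus, with a crude add-one adjustment so that words missing from the reference do not cause a division by zero. Corpus tools typically use a proper statistic such as log-likelihood or chi-square instead. The `keywords` function and both toy corpora are invented for the example.

```python
from collections import Counter


def keywords(target_tokens, reference_tokens, top_n=5):
    """Rank target-corpus words by how over-represented they are vs. the reference."""
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    target_size = len(target_tokens)
    reference_size = len(reference_tokens)
    scores = {}
    for word, freq in target.items():
        target_rel = freq / target_size
        # Add-one adjustment: unseen reference words get a small nonzero frequency.
        reference_rel = (reference[word] + 1) / (reference_size + 1)
        scores[word] = target_rel / reference_rel
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy corpora standing in for a biology corpus and a general corpus.
biology = "the cell divides and the cell membrane protects the cell".split()
general = "the dog and the cat sat on the mat and the dog slept".split()

top_keywords = keywords(biology, general, top_n=4)
print(top_keywords)
```

Frequent function words like 'the' score low because they are just as common in the reference corpus, while 'cell' rises to the top.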

@sabrinamalik2972 4 years ago

Excellent. Thank you so much.

@sabrinamalik2972 4 years ago

Are a reference corpus and a monitor corpus the same or different? Because when I searched for examples, the Bank of English was used as an example of both.

@YassineIabdounane 4 years ago

Not all monitor corpora can be used as reference corpora. A monitor corpus is one which grows in size over time. Still, the data that make up the corpus may not be general enough for it to be used as a reference. For instance, a monitor corpus of newspaper data is certainly not a general corpus, or one to be viewed as 'a standard' for comparison.

@dieths7776 4 years ago

Can I ask? What is the purpose of a corpus?

@YassineIabdounane 4 years ago

A corpus is intended to be a representative sample of authentic language use. There are various types of corpora, as you can see, so specific research purposes vary depending on the type of corpus chosen. But the general aim, I would say, is to study how a language is used authentically in a given context (either generally, or across different regions, time periods, domains, etc.)