On this channel, I provide tutorials for working with Python in a digital humanities project. I design my videos and tutorials for humanists who have no coding experience. I am a medieval historian by trade, but I create my videos with all humanists in mind. If you want to interact with the videos in more dynamic ways, check out my website, www.PythonHumanities.com. On that site, I host live coding exercises and quizzes. It is still a work in progress and will be complete during the Summer of 2020. I post 1-10 videos per week, so check back frequently.
My output shows one huge cluster and many tiny clusters. What does that mean? Note: the first cluster clearly represents the topic. All the other clusters contain irrelevant words, generally with a frequency of 1. Should I remove all those tokens from the corpus?
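In case it helps, a minimal sketch of dropping frequency-1 tokens before re-clustering, assuming the corpus is a list of token lists (the variable names and sample data are just illustrative):

from collections import Counter

# corpus: list of documents, each a list of token strings (toy example)
corpus = [["medieval", "history", "history"], ["medieval", "charter"]]

# Count how often each token appears across the whole corpus
freq = Counter(token for doc in corpus for token in doc)

# Keep only tokens that occur more than once
filtered = [[token for token in doc if freq[token] > 1] for doc in corpus]
print(filtered)  # [['medieval', 'history', 'history'], ['medieval']]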
train_spacy() update for spaCy 3 as follows (note: in spaCy 3, nlp.add_pipe("ner") creates and returns the component, so the old create_pipe call is no longer needed):

import random
import spacy
from spacy.training import Example

def train_spacy(data, iterations):
    TRAIN_DATA = data
    nlp = spacy.blank("en")  # blank fresh English model
    if "ner" not in nlp.pipe_names:  # if the model does not have an ner pipeline
        ner = nlp.add_pipe("ner", last=True)  # in spaCy 3, add_pipe returns the component
    else:
        ner = nlp.get_pipe("ner")
    for _, annotations in TRAIN_DATA:  # each element in TRAIN_DATA is a tuple of (text, {"entities": []})
        for ent in annotations.get("entities"):  # each entity is (start, end, label), e.g. (0, 20, "PERSON")
            ner.add_label(ent[2])  # ent[2] could be PERSON, ORG, etc.
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
    # with nlp.disable_pipes(*other_pipes):  # deprecated line
    with nlp.select_pipes(disable=other_pipes):
        optimizer = nlp.begin_training()  # deprecated alias of nlp.initialize() in spaCy 3
        for itn in range(iterations):
            print("Starting iteration", itn)
            # Randomly shuffle the training data every iteration, a common practice
            # in machine learning so the model doesn't just memorize the order
            random.shuffle(TRAIN_DATA)
            losses = {}
            for text, annotations in TRAIN_DATA:
                # spaCy 2 call, commented out:
                # nlp.update(
                #     [text],
                #     [annotations],
                #     drop=0.2,  # dropout helps prevent overfitting
                #     sgd=optimizer,
                #     losses=losses
                # )
                # spaCy 3 call:
                doc = nlp.make_doc(text)
                example = Example.from_dict(doc, annotations)
                nlp.update([example], sgd=optimizer, losses=losses, drop=0.2)
            print(losses)
    return nlp  # return the trained model
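A quick usage sketch, assuming training data in the usual spaCy format (the sample sentence and output path are just illustrative):

TRAIN_DATA = [
    ("Harold Godwinson died at Hastings in 1066.", {"entities": [(0, 16, "PERSON")]}),
]
nlp = train_spacy(TRAIN_DATA, 10)
nlp.to_disk("ner_model")  # save the trained model for later use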
If you're using the above render method in a Jupyter notebook, it will return an empty HTML string. The concrete problem is that displacy.render auto-detects that you're in a Jupyter notebook and displays the output directly instead of returning HTML. Important note: to explicitly enable or disable "Jupyter mode", you can use the jupyter keyword argument, e.g. to return raw HTML in a notebook, or to force Jupyter rendering if auto-detection fails: html = displacy.render(sentence, style="dep", jupyter=False) :)
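A small follow-up sketch in case you want to keep the markup around, assuming sentence is an already-parsed Doc (the file name is just an example):

from spacy import displacy

html = displacy.render(sentence, style="dep", jupyter=False)
with open("parse.html", "w", encoding="utf-8") as f:
    f.write(html)  # open this file in a browser to see the dependency tree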
Now, what if you want a default entry, so that if the user hits Enter nothing changes?

for i, entry in enumerate(temp):
    if i == int(entry_option):
        field1 = entry["field1"]
        field2 = entry["field2"]
        temp_data = {}
        # input() returns "" on a bare Enter, which is falsy, so `or`
        # falls back to the current value and nothing changes
        temp_data["field1"] = str(input(f"current {field1}: ") or field1)
        temp_data["field2"] = str(input(f"current {field2}: ") or field2)
        new_data.append(temp_data)
Thanks for the video. I was having issues with the "getSkewAngle" function, but I found an easy workaround: I changed the last line "return -1.0 * angle" to just "return angle". Hope this helps anyone else with this problem.
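For context, a rough sketch of the kind of minAreaRect-based skew detection being discussed (not the exact tutorial code); the angle convention of cv2.minAreaRect changed around OpenCV 4.5, which is likely why the sign flip helps on some versions and hurts on others:

import cv2
import numpy as np

def get_skew_angle_sketch(image):
    # Grayscale, blur, and invert-threshold so the text becomes foreground
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (9, 9), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

    # The minimum-area rectangle around all foreground pixels gives the text rotation
    coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]

    # Normalize: older OpenCV returns angles in [-90, 0), newer in [0, 90)
    if angle > 45:
        angle -= 90
    return angle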
Interesting! I have been working with whisperX and whisper-timestamped on my MacBook so far and wasn't aware of MLX. Thanks for sharing! But since you emphasize the word-level timestamps: with standard whisper those are known to be very inaccurate (i.e. pretty much unusable; whisper is simply not trained to predict timestamps). So, are you suggesting that timestamps in whisper-mlx are better?
Regarding your example with Auschwitz: how exactly did it learn that Auschwitz belongs to the concentration_camp type? Is it because your example sentence happened to say exactly that, or is it just a coincidence?
Simply put: LLMs can't access your company's information directly. RAG lets you share relevant documents with the LLM so it can answer your questions using your own data.
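A minimal sketch of the idea, with a toy word-overlap retriever and a hypothetical call_llm() stand-in for whatever model you use (nothing here is a specific library's API; real systems use embeddings and a vector store):

documents = [
    "Our refund policy allows returns within 30 days.",
    "Support hours are 9am-5pm, Monday to Friday.",
]

def retrieve(question, docs, k=1):
    # Score each document by how many question words it shares
    words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return scored[:k]

def answer(question):
    # Stuff the retrieved documents into the prompt so the model answers from your data
    context = "\n".join(retrieve(question, documents))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)  # call_llm is a hypothetical stand-in for your LLM call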
@@python-programming Okay, so it means downloading and installing the LLM on my device and then using it? And can all the open-source LLMs be downloaded?
@@RamandeepSingh_04 Yes, but Small Language Models will typically run smoothly on newer phones or PCs with lower-end GPUs; Large LMs need systems with more VRAM, or likely the soon-to-come AI chipsets. Personally, my M1 Mac really struggles to run a smaller Dolphin-Llama model, and I think I'd need to upgrade to M3 silicon or newer. I have 16GB of VRAM in my PC GPU, so it might run better there.
It will, but the latency may be an issue. I'm not aware of anything that can do real-time NER the way you can get transcriptions in near real time.
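A rough sketch of one way to approximate it: buffer the streaming transcript and run spaCy's NER on each chunk once a sentence completes (the transcript_chunks generator is hypothetical):

import spacy

nlp = spacy.load("en_core_web_sm")

buffer = ""
for chunk in transcript_chunks:  # hypothetical generator yielding transcript text
    buffer += chunk
    # Only run NER at sentence boundaries to keep latency manageable
    if buffer.rstrip().endswith((".", "?", "!")):
        doc = nlp(buffer)
        for ent in doc.ents:
            print(ent.text, ent.label_)
        buffer = ""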
Hi, congratulations, your videos are very successful. However, I have a slightly more specific problem. I'm trying to read the writing on tire treads, but OCR doesn't work. I can improve OCR performance somewhat with various preprocessing steps, but those steps are not adaptive and don't produce the effect I expect. What is your advice for a problem like the writing on a tire surface, which suffers from low contrast and can't be solved by preprocessing alone?
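Not the questioner's pipeline, but one adaptive-contrast idea worth trying on embossed, low-contrast surfaces like tires is CLAHE (local histogram equalization) before an adaptive threshold; the file name and parameter values here are just starting points:

import cv2

image = cv2.imread("tire.jpg", cv2.IMREAD_GRAYSCALE)  # path is illustrative

# CLAHE boosts contrast locally, which helps when lighting varies across the tire
clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
enhanced = clahe.apply(image)

# Adaptive thresholding adjusts per neighborhood instead of using one global cutoff
binary = cv2.adaptiveThreshold(enhanced, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 31, 10)
cv2.imwrite("tire_preprocessed.png", binary)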
Thanks a lot, but when I draw the network, the sizes of the most frequent nodes are big while the other nodes disappear. The code is the same; I don't understand my mistake.
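In case it's the scaling: when node sizes come straight from raw frequencies, rare nodes can shrink to invisibility. A small sketch of rescaling sizes into a fixed range with networkx (the graph and frequencies are toy data):

import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph([("king", "queen"), ("king", "knight"), ("queen", "bishop")])
freq = {"king": 100, "queen": 40, "knight": 2, "bishop": 1}  # toy frequencies

# Map raw frequencies into a visible range, e.g. 100-1000, so no node vanishes
lo, hi = min(freq.values()), max(freq.values())
sizes = [100 + 900 * (freq[n] - lo) / (hi - lo) for n in G.nodes()]

nx.draw(G, node_size=sizes, with_labels=True)
plt.show()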