Python Tutorials for Digital Humanities
On this channel, I provide tutorials for working with Python in a digital humanities project. I design my videos and tutorials for humanists who have no coding experience. I am a medieval historian by trade, but I create my videos with all humanists in mind. If you want to interact with the videos in more dynamic ways, check out my website, www.PythonHumanities.com. On that site, I host live coding exercises and quizzes. It is still a work in progress and will be complete during the Summer of 2020. I post 1-10 videos per week, so check back frequently.

✅Be my Patron: www.patreon.com/WJBMattingly
How to use GPT Builder from OpenAI in ChatGPT
5:44
7 months ago
Comments
@BillVoisine 15 hours ago
Thank you!!
@pravinmhaske 16 hours ago
My output shows one huge cluster and many tiny clusters. What does that mean? Note: the first cluster clearly represents the topic; all the other clusters contain irrelevant words, generally with a frequency of 1. Should I remove those tokens from the corpus?
@sid40000 23 hours ago
Great solution! Thank you!
@Bob_dunno 2 days ago
Very useful, thank you
@Abubakar-hu8wu 2 days ago
Which IDE are you using?
@visualsbysri 2 days ago
Does it work for an open-domain question answering system?
@willyjauregui6541 3 days ago
Is RAG the same as LangChain?
@python-programming 2 days ago
You can build a RAG system with langchain but they are two different things. RAG is a workflow while LangChain is a framework.
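For anyone wondering what the "workflow" part looks like in code, here is a deliberately tiny, library-free sketch of the retrieve-then-generate idea. The documents and the word-overlap retriever below are made up for illustration; a real setup would use embeddings and an actual LLM call, e.g. via LangChain.

# toy RAG sketch: retrieve the most relevant document, then build a prompt around it
documents = [
    "Our refund policy allows returns within 30 days.",
    "Support is available Monday to Friday, 9am to 5pm.",
]

def retrieve(question, docs):
    # placeholder retrieval: score documents by word overlap with the question
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question):
    context = retrieve(question, documents)
    # in a real system this prompt is what gets sent to the LLM
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is the refund policy for returned items?"))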
@-Fazy 6 days ago
Can the library you use in the course be used to solve CAPTCHA images?
@hamzaaljamaai8790 6 days ago
Thank you, sir. I am eager to apply the knowledge I gain from this course!
@adityakhopade2137 10 days ago
Hi, please make a video on deploying the same app on Streamlit, because I am getting an error when importing onnxruntime.
@python-programming 8 days ago
Sure!
@adityakhopade2137 8 days ago
@python-programming The error is solved, bro; I just used numpy<2
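For anyone hitting the same onnxruntime import error on Streamlit, the pin above normally goes into the app's requirements file; a minimal sketch (the exact package list depends on your app):

# requirements.txt for the Streamlit deployment
numpy<2
onnxruntime
streamlit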
@K-si8mo 10 days ago
Hey, how did you comment out all those lines at once, please?
@AnusriM-w2i 10 days ago
Is there a way to obfuscate or hide source code on PyPI?
@DigitalicaEG 12 days ago
Would it be possible to have the user directly edit the table itself, like Excel cells?
@NavyaVedachala 14 days ago
Are there any resources for fine-tuning GLiNER? The GLiNER repo is giving me bugs when I attempt to fine-tune.
@python-programming 13 days ago
I built a library called gliner-finetune (github.com/wjbmattingly/gliner-finetune); it may help.
@hamzaehsankhan 14 days ago
train_spacy() updated for spaCy 3 as follows:

import random
import spacy
from spacy.training import Example

def train_spacy(data, iterations):
    TRAIN_DATA = data
    nlp = spacy.blank("en")  # blank, fresh English model
    if "ner" not in nlp.pipe_names:  # if the model does not have an NER pipeline
        # in spaCy 3, add_pipe takes the component name and returns the component
        ner = nlp.add_pipe("ner", last=True)
    else:
        ner = nlp.get_pipe("ner")
    for _, annotations in TRAIN_DATA:  # each element of TRAIN_DATA is (text, {"entities": [...]})
        for ent in annotations.get("entities"):  # each entity is (start, end, label), e.g. (0, 20, "PERSON")
            ner.add_label(ent[2])  # ent[2] could be PERSON, ORG, etc.
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
    # with nlp.disable_pipes(*other_pipes):  # deprecated spaCy 2 form
    with nlp.select_pipes(disable=other_pipes):
        optimizer = nlp.begin_training()  # deprecated alias of nlp.initialize() in spaCy 3, but still works
        for itn in range(iterations):
            print("Starting iteration", itn)
            # shuffle the training data every iteration (standard practice,
            # so the model does not just memorise the order of the examples)
            random.shuffle(TRAIN_DATA)
            losses = {}
            for text, annotations in TRAIN_DATA:
                # spaCy 2 used nlp.update([text], [annotations], ...);
                # spaCy 3 expects Example objects instead
                doc = nlp.make_doc(text)
                example = Example.from_dict(doc, annotations)
                nlp.update([example], sgd=optimizer, losses=losses, drop=0.2)
            print(losses)
    return nlp  # return the trained model
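For reference, a minimal usage sketch of the function above (the training sentence and the output path are made up for illustration):

TRAIN_DATA = [
    ("Harriet Tubman was born in Maryland.",
     {"entities": [(0, 14, "PERSON"), (27, 35, "GPE")]}),
]
trained_nlp = train_spacy(TRAIN_DATA, 10)
trained_nlp.to_disk("ner_model")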
@CodeMaster-w5g 15 days ago
Great explanation
@Shivu365 15 days ago
Your voice is not clear to me, bro. A waste of my time... Sorry...
@Lnd2345 16 days ago
Thanks. You can just write and/or instead of the ampersand. Etc.
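A caveat worth adding, since it depends on what the code is doing: in plain Python conditions, and/or work on booleans, but for element-wise masks in pandas or NumPy you do need & and | (with parentheses). A small illustration, not tied to the video's code:

x = 7
# plain Python booleans: and / or read naturally and short-circuit
if x > 5 and x < 10:
    print("in range")
# & also works here because both sides are single booleans,
# but it is a bitwise operator and is what pandas/NumPy masks require
if (x > 5) & (x < 10):
    print("also in range")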
@entertainmenttv774 16 days ago
Please zoom🙏
@hamzaehsankhan 17 days ago
If you're using the above render method in a Jupyter notebook, it will return an empty HTML string. The concrete problem is that displacy.render auto-detects that you're in a Jupyter notebook and displays the output directly instead of returning HTML. Important note: to explicitly enable or disable "Jupyter mode", you can use the jupyter keyword argument, e.g. to return raw HTML in a notebook, or to force Jupyter rendering if auto-detection fails: html = displacy.render(sentence, style="dep", jupyter=False) :)
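If you go the jupyter=False route, a small sketch of saving the returned markup to a file (assuming en_core_web_sm is installed; the sentence here is just an example):

import spacy
from spacy import displacy

nlp = spacy.load("en_core_web_sm")
sentence = nlp("The quick brown fox jumps over the lazy dog.")
html = displacy.render(sentence, style="dep", jupyter=False)  # raw HTML instead of inline rendering
with open("parse.html", "w", encoding="utf-8") as f:
    f.write(html)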
@aliucbursa 17 days ago
This is an amazing video series on the JSON subject. It helped a lot with my projects.
@hamzaehsankhan 17 days ago
textacy.extract.token_matches should be used instead of textacy.extract.matches. Great tutorial, thanks!
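A rough sketch of the newer call, assuming a recent textacy release (the sentence and the adjective-plus-noun pattern are made up; adjust to whatever the tutorial was matching):

import spacy
from textacy import extract

nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog.")
# one pattern: an adjective followed by a noun, in spaCy Matcher style
for span in extract.token_matches(doc, [{"POS": "ADJ"}, {"POS": "NOUN"}]):
    print(span.text)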
@hamzaehsankhan 17 days ago
My spaCy model is all over the place. Nevertheless, I will keep following. Great tutorial.
@AhmedRmdan 20 days ago
Great series!
@aayushsinha7439 21 days ago
Thanks for such a simplified explanation, helped me with my ongoing project a lot!
@vidiohs 21 days ago
🎉
@grumpyguy7656 21 days ago
Now what if you want a default entry, so that if the user hits Enter nothing changes:

for entry in temp:
    if i == int(entry_option):
        field1 = entry["field1"]
        field2 = entry["field2"]
        temp_data = {}
        temp_data[field1] = str(input(f"current {field1}") or f"{field1}")
        temp_data[field2] = str(input(f"current {field2}") or f"{field2}")
        new_data.append(temp_data)
    i = i + 1
@kenchang3456 22 days ago
I like the idea, but I do better looking at code. Thanks.
@leutrimiTBA 24 days ago
Rescaling is not added.
@dcpotomac20850 25 days ago
Wow, I got more from this 60-second explanation than from all the lengthy, confusing videos I've watched on RAG.
@PatinetaMr 25 days ago
What happened to the webpage? Many links do not work anymore.
@JasonVerro 25 days ago
Thanks for the video. I was having issues with the "getSkewAngle" function. I found an easy workaround though: I changed the last line "return -1.0 * angle" to just "return angle". Hope this helps anyone else with this problem.
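For anyone wiring this up, a rough sketch of the rotation step that usually follows getSkewAngle (the file names are placeholders, getSkewAngle is the tutorial's own function, and whether you need the sign flip depends on your OpenCV version, as noted above):

import cv2

def rotate_image(image, angle):
    # rotate around the image centre by the given angle, in degrees
    h, w = image.shape[:2]
    M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
    return cv2.warpAffine(image, M, (w, h),
                          flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

image = cv2.imread("page.png")   # placeholder input image
angle = getSkewAngle(image)      # the tutorial's function; try angle vs. -1.0 * angle
cv2.imwrite("page_deskewed.png", rotate_image(image, angle))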
@critical-chris 26 days ago
Interesting! I have been working with whisperX and whisper-timestamped on my MacBook so far and wasn't aware of MLX. Thanks for sharing! But since you emphasize the word level timestamps: with standard whisper those are known to be very inaccurate (i.e. pretty much unusable - whisper is simply not trained to predict timestamps). So, are you suggesting that timestamps in whisper-mlx are better?
@critical-chris 26 days ago
Regarding your example with Auschwitz: how exactly did it learn that Auschwitz belongs to the concentration_camp type? Is it because your example sentence happened to say exactly that, or is it just a coincidence?
@samrahnice 29 days ago
Such a nice video, but it is so dim and small that I can't even see it. :( Any solution to fix this problem?
@alextasarov1341 1 month ago
So it's a fancy way of implementing long-term memory. Cool.
@AlexandreMarr-uq8pw 1 month ago
I don't need to see how it works; I need the demonstration.
@Rudra0x01 1 month ago
In the next video, please explain how to create a dataset for RAG.
@encianhoratiu5301 1 month ago
Where can I find the initial dataset?
@24035709 1 month ago
Simply put: LLMs can't access your company's information directly. RAG lets you share relevant documents with the LLM so it can answer your questions using your own data.
@python-programming 1 month ago
precisely!
@RamandeepSingh_04 17 days ago
But is it safe to connect our database to LLMs? Can the information get leaked?
@python-programming 17 days ago
@RamandeepSingh_04 It is if you use open-source LLMs that can be hosted locally.
@RamandeepSingh_04 17 days ago
@python-programming Okay, so that means downloading and installing the LLM on my device and then using it? And can all open-source LLMs be downloaded?
@stateportSound_wav 13 days ago
@RamandeepSingh_04 Yes, but small language models will typically run smoothly on newer phones or PCs with lower-end GPUs; large language models need systems with more VRAM, or likely the soon-to-come AI chipsets. Personally, my M1 Mac really struggles to run a smaller Dolphin-Llama model, and I think I'd need to upgrade to M3 silicon or newer. I have 16GB of VRAM in my PC's GPU, so it might run better there.
@excusemenoexcusemeno1671 1 month ago
I didn't understand.
@Noelh86579 1 month ago
Does it work on a live streaming call?
@python-programming 1 month ago
It will, but the latency may be an issue. I'm not sure of anything that can do real-time NER the way in which you can get transcriptions in near real-time.
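As a rough illustration of the transcription-then-NER idea (the get_transcript_chunks generator is a made-up stand-in for whatever live transcription feed you use, and en_core_web_sm must be installed):

import spacy

nlp = spacy.load("en_core_web_sm")

def get_transcript_chunks():
    # hypothetical stand-in for chunks arriving from a live transcription service
    yield "The meeting with Angela Merkel starts at noon in Berlin."
    yield "Afterwards we fly back to New York."

# nlp.pipe streams the chunks, which keeps per-chunk latency reasonably low
for doc in nlp.pipe(get_transcript_chunks()):
    print([(ent.text, ent.label_) for ent in doc.ents])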
@user-ys7st8kk5e 1 month ago
You saved my life... thanks!
@superfreiheit1 1 month ago
Great video, but please use a white background. It is horrible to read on a black background.
@huseyincukur4688 1 month ago
Hi, congratulations, your videos are very good. However, I have a slightly more specific problem. I'm trying to read the writing on tire treads, but OCR doesn't work. I can improve OCR performance somewhat with various pre-processing steps, but those steps are not adaptive and do not produce the effect I expect. What is your advice for a low-contrast problem, such as the writing on a tire surface, that cannot be solved by pre-processing?
@yhd0808 1 month ago
Thank you so much for sharing this
@flosrv3194 1 month ago
No way to install this thing; errors pop up from everywhere, and when I resolve them, three others appear. Unusable crap.
@tarik1895 1 month ago
Hello, thanks for the video. Dumb question: if it is BERT-based, does it have the same limitation in terms of text size?
@luLu-pu6mf 1 month ago
Thanks a lot, but when I print the network, the sizes of the most frequent nodes are big, but the other nodes disappear... The code is the same... I don't understand my mistake.