PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
We aim to be an accessible, community-driven conference, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
I might actually do it soon, since there'd be at least two users I know of (more than zero), but it will probably be very difficult to grow a community around it.
But if I make a diagram, how do I know it's the correct diagram without having run experiments? You have your diagram first, then do your modeling and analysis! I suppose you have to start with some hypothesis, though.
Sounds very good - and exactly what I needed at this point in developing a solution for document management. Thank you for the software and for the lecture! I will try it straight away.
*cudf.pandas: Accelerating Pandas with GPUs for Faster Data Processing*

* *0:00* Introduction: Ashwin Srinath, Senior Software Engineer at NVIDIA, introduces cudf.pandas, a tool that lets you run pandas code on GPUs without code changes, achieving significant speedups.
* *0:10* Motivation: Pandas is popular but can be slow because it is single-threaded and is not built as a query engine. Alternatives like cuDF exist, but they often require code changes and have different APIs.
* *2:08* cuDF Overview: cuDF, built on CUDA and C++, is a GPU-based DataFrame library offering a pandas-like API and substantial performance gains (10-100x faster than pandas). It currently supports 60-75% of the pandas API.
* *3:25* Reasons to Stick with Pandas: Despite the alternatives, pandas remains valuable for its flexibility, ease of collaboration, vast ecosystem of dependent libraries, and ongoing performance improvements.
* *5:15* cudf.pandas Approach: cudf.pandas aims to combine the benefits of pandas with GPU acceleration, letting users keep the familiar pandas API while leveraging the speed of GPUs.
* *5:34* How It Works: cudf.pandas acts as a proxy for pandas, intercepting pandas calls and attempting to execute them on the GPU via cuDF. If an operation isn't supported on the GPU, it falls back seamlessly to CPU execution using pandas.
* *7:54* Demo (Part 1 - Basic Operations): The demo shows how to load the cudf.pandas extension in a Jupyter notebook. Several examples demonstrate performance gains for groupby, string, and merge operations, but also highlight cases where GPU acceleration doesn't provide a speedup (e.g., `count` on axis=1).
* *10:53* Proxy Pattern Explained: cudf.pandas uses a proxy pattern in which proxy functions and types intercept pandas calls, attempting GPU execution first and falling back to the CPU if necessary.
* *11:18* Demo (Part 2 - Performance Optimization): The demo focuses on optimizing time-series operations. The cudf.pandas profiler reveals that `index.between_time` is a CPU bottleneck; rewriting the code to use GPU-supported datetime properties significantly reduces the execution time.
* *15:11* Optimization Benefits: Code optimized for GPU execution often also runs faster on the CPU, showing that writing GPU-friendly code can pay off even without a GPU.
* *15:49* Demo (Part 3 - Third-Party Library Acceleration): The demo shows how cudf.pandas can accelerate third-party libraries that rely on pandas. Using LangChain as an example, an LLM-powered agent that uses pandas for data analysis benefits from GPU acceleration, significantly reducing query execution time.
* *19:00* Recap: cudf.pandas offers GPU acceleration for pandas with no code changes; optimizing code for GPU execution is crucial for maximum performance; third-party libraries can leverage GPUs through cudf.pandas.
* *19:28* How It Works (Technical Details): cudf.pandas relies on the proxy pattern and customizes the Python import mechanism to deliver proxy modules, ensuring seamless integration with existing pandas code.
* *20:50* Comparison with Other Approaches: The talk briefly discusses the limitations of duck typing and the potential of the DataFrame Standard API for interoperability between DataFrame libraries.
* *22:42* FAQs: The presentation concludes with FAQs covering performance expectations, pandas API support, compatibility with third-party libraries, and handling data larger than GPU memory.
* *24:03* Getting Started: Instructions for installation and access to the demo materials are provided.
* *24:34* Q&A: A brief Q&A session addresses audience questions about CPU performance gains, multi-node scaling, Docker images, UDF support, and the availability of the GitHub repository.

I used gemini-1.5-pro-exp-0801 on rocketrecap dot com to summarize the transcript.
Cost (if I didn't use the free tier): $0.09 Input tokens: 23132 Output tokens: 856
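The proxy-with-fallback mechanism the summary describes at 5:34 and 10:53 can be sketched in a few lines of plain Python. This is purely illustrative, not cudf.pandas internals; every name below is invented for the example:

```python
# Minimal sketch of the "try the GPU, fall back to the CPU" proxy pattern
# described in the talk. Not cudf.pandas internals; all names are invented.

def make_proxy(gpu_func, cpu_func):
    """Return a callable that tries the GPU path first and silently
    falls back to the CPU path when the operation is unsupported."""
    def proxy(*args, **kwargs):
        try:
            return gpu_func(*args, **kwargs)
        except NotImplementedError:
            return cpu_func(*args, **kwargs)  # seamless CPU fallback
    return proxy

# Toy stand-ins: pretend the "GPU" path only supports non-negative input.
def gpu_double(x):
    if x < 0:
        raise NotImplementedError("not supported on the GPU")
    return x * 2

def cpu_double(x):
    return x * 2

double = make_proxy(gpu_double, cpu_double)
```

The caller only ever sees `double`; whether a given call ran on the "GPU" or "CPU" path is invisible, which is the property that lets cudf.pandas accelerate existing code without changes.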
Hi Dimitry, I am trying to figure out how to get a "minimum heart rate" from a bunch of samples of "sedentary heart rate", say about 10 per hour. The minimum heart rate would express the true minimum underlying the (noisy) samples (as opposed to just the actual sample minimum). I thought about using extreme value analysis, but after this explanation, that doesn't seem correct. What would you suggest?
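One possible direction for the question above, offered only as a sketch: estimate the underlying minimum with a low quantile of the pooled samples rather than the raw minimum, since a single noisy reading can drag the raw minimum down. The 5th-percentile choice and the sample values are assumptions for illustration, not advice from the video:

```python
import numpy as np

def robust_min(samples, q=5):
    """Estimate the underlying minimum as the q-th percentile of the
    samples, which is less sensitive to one-off noisy readings than min()."""
    return float(np.percentile(samples, q))

# Hypothetical hour of sedentary heart-rate readings (about 10 per hour).
hour_of_readings = [62, 65, 58, 61, 90, 59, 60, 63, 57, 64]
estimate = robust_min(hour_of_readings)
```

Compared with `min(hour_of_readings)`, the quantile estimate moves only slightly when one reading is an outlier, which is the behavior the question seems to be after.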
Thank you Vincent for sharing the link to this video of yours mentioning the contextual helper in the JupyterLab notebook. Plus, your demo showing that reflection is a good idea was an extra goodie on top.