SwissText2024: Jesse Berent (Google) on "Connecting Digital Ink with Large Vision/Language Models" 

Swiss Text
233 views

Recording of Jesse Berent's keynote at the 9th SwissText Conference in Chur, Switzerland on "Connecting Digital Ink with Large Vision/Language Models".
About SwissText: The Swiss Text Analytics Conference (SwissText) is an annual conference in Switzerland that brings together experts from industry and academia in the fields of Natural Language Processing (NLP), Computational Linguistics and Text Analytics. LINK: www.swisstext.org
Digital note-taking and hand-drawn input are gaining popularity, offering a durable, editable, and easily indexable way of storing notes in the vectorized form known as digital ink. At the same time, the adoption of tablets with touchscreens and styluses is increasing, and a key feature of these devices is interpreting handwritten or drawn input. This talk explores the intersection of handwriting recognition and modern AI by focusing on two new approaches. The first part delves into the application of large vision-language models (VLMs) to online handwriting recognition using new representations and tokenizers. This approach, which is compatible with off-the-shelf models and methods, offers a promising avenue for integrating online handwriting recognition into existing multi-modal models. The second part will focus on converting images of handwriting (pen-and-paper notes) into digital ink with VLMs. This capability bridges the gap between traditional and digital note-taking, making it easier to bring handwritten content into digital AI-assisted workflows. The presentation will conclude with a discussion of the broader implications of these advancements for the future of handwriting recognition and human-computer interaction.

Category: Science

Published: 2 Jul 2024
