
SaraEye: the world's first ChatGPT with a sense of sight!

SaraAI
1.8K views

Introducing the world's first ChatGPT with a sense of sight!
Be amazed by our ChatGPT-powered voice assistant that not only hears but also sees and asks questions based on what it sees.
Is this the future of ChatGPT?
www.SaraAI.com
#AI #artificialintelligence #computer #vision #computervision #gpt4 #gpt #chatgpt #chatgpt4 #openai #google #texttospeech #speechtotext #voicerecognition #opencv

Published: 5 Oct 2024
Comments: 11
@2000ReRyRo 4 months ago
Why the weird delays in responding by the human? Why does the machine seem more natural than the human?
@monsieur3d985 10 months ago
Extremely interesting. What do you transmit to ChatGPT so that it can interact through vision? I guess you give it some indication of what your SaraKit "sees", is that right?
@ArturMajtczak 10 months ago
Exactly right. The cameras observe the environment, and in the background, a separate program identifies objects, people, motion, etc. This information is invisibly sent as prompts to ChatGPT, which then responds as you can see in the video.
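The mechanism described above (detections injected invisibly into the prompt) can be sketched roughly as follows. This is not SaraAI's actual code; the function and the detection format are hypothetical, chosen only to illustrate the idea:

```python
def build_vision_prompt(detections, user_utterance):
    """Prepend a hidden scene summary to the user's spoken words.

    `detections` is a list of labels from a local object detector
    (hypothetical format). The scene note is what gets sent to ChatGPT
    alongside the utterance; the user never sees or says it.
    """
    scene = ", ".join(sorted(set(detections))) or "nothing notable"
    return f"[Scene: the camera currently sees {scene}.]\n{user_utterance}"
```

For example, `build_vision_prompt(["cup", "person"], "What am I holding?")` yields a prompt whose first line tells the model what is in view, which is enough for it to "respond as you can see in the video".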
@monsieur3d985 10 months ago
@@ArturMajtczak This reminds me of SHRDLU, the program Terry Winograd developed at MIT in 1968 (built on around fifty nouns, verbs, and adjectives in a 3D world of blocks). I guess this analysis takes a lot of computing power, and that you do it on an external server from the pairs of images sent. I guess the Raspberry Pi and your SaraKit board are used only to position the motors, process the image and sound, and communicate with the server, is that right? Your approach is interesting. Do you think the GPT-4 Vision update uses a similar principle and communicates through prompts with the conversational system? (That system quickly reaches its limits, it seems to me.)
@ArturMajtczak 10 months ago
@@monsieur3d985 Sending images to a server and waiting for a response is indeed too slow and costly, so the image analysis is actually done on the Raspberry Pi itself, using a simple trained model. While this model might not recognize everything, it certainly has broad and sufficient capabilities. Image recognition isn't performed in real-time at 25 frames per second - that's not necessary at this stage. We just analyze changes in the background image, which takes about 100 to 600 ms. As I mentioned, this process runs in a separate thread and is efficient enough for our purposes. In terms of the GPT-4 Vision update, while it might use a similar principle of communicating with the conversational system through prompts, our approach focuses on local processing to avoid the delays and costs associated with server-based processing. This method, although it has its limitations, is quite effective for our current needs.
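The "analyze changes in the background image" gate described above can be approximated with simple frame differencing. A minimal sketch, assuming grayscale frames as NumPy arrays (the exact method SaraEye uses is not disclosed, so treat the thresholds and approach as illustrative):

```python
import numpy as np

def scene_changed(prev_frame, frame, threshold=25, min_fraction=0.02):
    """Cheap change gate between two grayscale frames.

    The heavier object-recognition model runs only when this returns True,
    which is why recognition never needs to keep up with 25 frames/second.
    """
    # Widen to int16 so the subtraction cannot wrap around on uint8 input.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    # Fire when enough pixels moved beyond the per-pixel threshold.
    return np.count_nonzero(diff > threshold) / diff.size >= min_fraction
```

Running this in a separate thread, as the author describes, keeps the 100–600 ms recognition step off the audio/conversation path.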
@kam7847 1 year ago
Cool, where can I buy it?
@ArturMajtczak 1 year ago
We are still developing the project; it is not available for sale yet.
@2000ReRyRo 11 months ago
Kind of ironic that you call this a "natural" conversation. It is so UNnatural that I'm not sure which is the robot -- the little black thing sitting on the desk or the bigger white thing sitting on the chair.
@ArturMajtczak 10 months ago
@@2000ReRyRo heh heh
@yogi9704 5 months ago
What's the point of the camera, other than for asking the questions?
@ArturMajtczak 5 months ago
1. You don't need a wake word like "Alexa" or "OK Google"; just look at SaraEye, and it sees you and knows you are speaking to it. It's more natural, the way people communicate: when we are in a group and speak while looking at someone, that person knows we are speaking to them. 2. By looking at the device, and more importantly, with SaraEye looking back at you, a unique bond is formed that is hard to achieve by talking to a "speaker" like Alexa. :)
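The wake-on-gaze idea in point 1 amounts to a debounce over per-frame face detections: only start listening once a face has been looking at the camera for several consecutive frames, so a passing glance does not trigger it. A small sketch of that logic (the detector and frame format are assumptions, not SaraEye's actual code):

```python
def should_listen(face_frames, required=5):
    """Return True once a face is seen in `required` consecutive frames.

    `face_frames` is a list of booleans, one per camera frame, produced by
    a hypothetical local face detector. Requiring a short streak replaces
    the wake word: sustained eye contact means "I'm talking to you".
    """
    streak = 0
    for seen in face_frames:
        streak = streak + 1 if seen else 0
        if streak >= required:
            return True
    return False
```

In a real pipeline this would run on the live detector output; the point is only that gaze, held briefly, substitutes for saying "Alexa".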