
Can Open Source AI Agents Beat Perplexity AI? Testing Codestral, GPT4o, and Mixtral 

Data Centric
10K subscribers · 2.9K views

Science

Published: Aug 29, 2024

Comments: 29
@john_blues · 2 months ago
My man just casually says, "I recently replicated what a billion dollar company did...". And I'm sharing how I did it with you. This is why I love the internet.
@pessini · 2 months ago
Been following your series, great job man! Thanks for that.
@MrSuntask · 2 months ago
Wow. One of the best videos I have seen in a long time 😊🎉
@SeeFoodDie · 2 months ago
Would love to see how Claude Opus compares with a web search tool, especially since Perplexity has it as an option. Thanks for all the hard work on this.
@ZacMagee · 2 months ago
Love your detailed content, fantastic work. Very inspiring
@SeeFoodDie · 2 months ago
Fantastic analysis. Thank you
@supercurioTube · 2 months ago
It was a very long video but I watched it all anyway. Your commentary was entertaining and insightful and it was fun to follow along. I'm building my own LLM app and although it's more "pipeline" than "agents", how it works out as well as the development process are similar. At the moment I'm adding state snapshots in order to replay from any step. Hopefully that will make iterating quicker and more deterministic.
@Data-Centric · 2 months ago
I've just built this same WebSearch Agent in LangGraph (video is out now). State snapshots are definitely a great idea; they're one of the key abstractions in LangGraph.
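The snapshot/replay idea discussed here can be sketched framework-free. The class below is a minimal illustration (not LangGraph's actual API, and the step functions are hypothetical): it deep-copies the pipeline state after every step so any step can be re-run deterministically later.

```python
import copy

class SnapshotPipeline:
    """Toy pipeline that snapshots state after each step for later replay."""

    def __init__(self, steps):
        self.steps = steps      # list of functions: state -> state
        self.snapshots = []     # deep copies of state, one per completed step

    def run(self, state):
        for step in self.steps:
            state = step(state)
            self.snapshots.append(copy.deepcopy(state))
        return state

    def replay_from(self, step_index):
        """Re-run from `step_index`, restoring the snapshot taken just before it.
        Assumes the initial state was an empty dict when step_index == 0."""
        state = copy.deepcopy(self.snapshots[step_index - 1]) if step_index > 0 else {}
        for step in self.steps[step_index:]:
            state = step(state)
        return state
```

Because each snapshot is a deep copy, replaying a step cannot be corrupted by later mutations, which is what makes iteration more deterministic.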
@supercurioTube · 2 months ago
@Data-Centric That video is in my Watch Later 🙏 I'm looking forward to hearing what you have to say about the pros/cons of building on top of a framework vs writing the logic yourself. TBH, writing code for a simple LLM app is really not difficult for a seasoned developer, so I haven't found the appeal yet. Maybe there's an explosion in complexity with more advanced agent systems. The built-in snapshot/replay functionality is a nice-to-have for sure, as it becomes harder to implement with many agents, especially with concurrency. I'm thinking about making videos on lessons learned from building my project, following your inspiration. Thanks for that!
@pessini · 2 months ago
To add a note on the questions: the reason the models struggle with the Aruba question is the debate around the Caribbean islands, which are sometimes considered part of North America due to their political history even though they sit on South America's geological plate and continental shelf. I'm from Brazil, and in geography classes here Aruba is not counted among South America's countries.
@Data-Centric · 2 months ago
I suspected there might be some ambiguity here in relation to Aruba's continent but wasn't sure, thanks for confirming this!
@caseyhoward8261 · 2 months ago
You just made my build a lot easier! ❤️
@jasonyoo182 · 2 months ago
Great video! Thanks for sharing.
@bsarel · 2 months ago
Nice! But what's the Perplexity reference, the free one or the pro one? Also, it would be nice to have a comparison of your solution and Perplexity's solution using the same models. I've been using Perplexity with Opus for a while and switched to GPT-4o, and I think GPT-4o yields better results overall.
@Data-Centric · 2 months ago
It was benchmarked against the free version.
@locodo12 · 2 months ago
It would be interesting to see the results of Yi-1.5-34B-Chat-16K and the upcoming Qwen-2(?b).
@ManjaroBlack · 2 months ago
I couldn't get Qwen2 to cooperate when trying it out.
@ManjaroBlack · 2 months ago
I configured the agent to use SearXNG instead of Serper. Llama3 8b really leaves a lot to be desired though.
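Swapping Serper for SearXNG usually means hitting a self-hosted instance's JSON API. A minimal sketch of building such a request (the base URL is hypothetical, and `format=json` must be enabled in the instance's `settings.yml` or the request is rejected):

```python
from urllib.parse import urlencode, urljoin

def build_searxng_url(base_url: str, query: str, language: str = "en") -> str:
    """Build a search URL against a self-hosted SearXNG instance's JSON API."""
    params = {"q": query, "format": "json", "language": language}
    return urljoin(base_url, "/search") + "?" + urlencode(params)

# Example, assuming a local instance on port 8888:
url = build_searxng_url("http://localhost:8888", "open source AI agents")
# Fetch with e.g. requests.get(url).json(); organic hits come back under
# the "results" key, each with "title", "url", and "content" fields.
```

Keeping the URL construction in one function makes it easy to drop this in wherever the agent previously called Serper.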
@spotnuru83 · 2 months ago
Hi John, thank you so much for sharing your experiments and knowledge here; I really appreciate it. I want to build multiple agents that each do a different kind of work: one agent fetching data from the internet as you are doing, others interacting with web applications for things like automated testing, another calling APIs and returning the response, and another generating code. I have many in mind, but I'm stuck on how to achieve this: when a question is asked, the system first has to identify which agent to pick, and then that agent performs the specific action and returns the response. I want everything open source, without any paid GPT or Gemini tiers, and I want it local so that the data we send doesn't end up sitting in the cloud. I'd be grateful for your guidance on building this.
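The routing step described here is commonly a dispatcher that classifies the question and hands it to the matching agent. A toy keyword-based sketch follows; in practice a local open-source LLM could do the classification instead, and the agent names and handlers below are made up purely for illustration:

```python
def route(question, agents):
    """Dispatch a question to the first agent whose keywords match;
    fall back to the 'default' agent. `agents` maps name -> (keywords, handler)."""
    q = question.lower()
    for name, (keywords, handler) in agents.items():
        if keywords and any(k in q for k in keywords):
            return name, handler(question)
    return "default", agents["default"][1](question)

# Hypothetical registry of specialised agents:
AGENTS = {
    "web_search": (["latest", "news", "search the web"], lambda q: f"web: {q}"),
    "ui_tester":  (["test the app", "click", "form"],    lambda q: f"ui: {q}"),
    "api_caller": (["call the api", "endpoint"],         lambda q: f"api: {q}"),
    "code_gen":   (["write code", "function", "script"], lambda q: f"code: {q}"),
    "default":    ([],                                   lambda q: f"direct: {q}"),
}
```

The handlers here just return strings; in a real build each would wrap a locally hosted model call, which keeps everything open source and on-premise as requested.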
@viyye · 1 month ago
Shouldn't the challenge be to make the less advanced local LLM work as well as the advanced ones through prompting?
@prometheuswillsurvive3700 · 1 month ago
How do you run Llama3-TenyxChat-22B in Ollama? What's the way to do it? Ollama can't find and download it. Same for TenyxChat-70B.
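One possible workaround when a model isn't in the Ollama library: download a community GGUF build (for example from Hugging Face, if one exists for this model) and register it locally via a Modelfile. The GGUF filename below is hypothetical; check what quantized builds are actually published:

```shell
# Point a Modelfile at the downloaded GGUF weights (filename is hypothetical):
cat > Modelfile <<'EOF'
FROM ./llama3-tenyxchat-22b.Q4_K_M.gguf
EOF

# Register it with Ollama under a local name, then run it:
ollama create tenyxchat-22b -f Modelfile
ollama run tenyxchat-22b
```

This only works if someone has published GGUF weights for the model; `ollama pull` itself only finds models in the Ollama library.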
@HassanAllaham · 2 months ago
Thank you for the excellent content. 🌹🌹🌹 I took a look at the code and I have a question: how do you guarantee that the cumulative text of the loop that goes back to the model does not exceed the model's context window size? If it exceeds that limit, it will make the model less effective. Am I right? If so, how can we solve this issue?
@andynguyen8847 · 2 months ago
I ran into this exact problem with my method, which is kind of similar to his. To get around exceeding the memory context, I monitor its length on each iteration, and when it's near the limit I prompt the LLM to extract the queries, answers, and citations/links while keeping them short, clear, and concise. I then use that as my new memory context; it normally ends up around 1/3 of the old length. I've gotten Llama3 70B to answer the last question a few times, though not consistently, so prompting definitely can improve the outcome by a lot.
@Data-Centric · 2 months ago
Thanks for having a look at the code. In short, there is nothing at present to deal with the issue of context length. However, if I were to address it, I would consider the following options: 1. Use a sliding window of a fixed context length. 2. Have a long-term memory store for previous agents, something like a vector store perhaps; you could then retrieve context on the fly as and when required. This is obviously more complex.
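The first of those options, a sliding window of fixed context length, can be sketched in a few lines. This character-based version is only a stand-in for real token counting, and pinning the first message (e.g. a system prompt) is an assumption:

```python
def trim_context(messages, max_chars, keep_first=1):
    """Sliding-window trim: always keep the first `keep_first` messages
    (e.g. the system prompt), then keep the most recent messages that
    still fit within `max_chars` in total."""
    head = messages[:keep_first]
    budget = max_chars - sum(len(m) for m in head)
    tail = []
    for m in reversed(messages[keep_first:]):
        if len(m) > budget:
            break                # oldest messages fall out of the window
        tail.append(m)
        budget -= len(m)
    return head + tail[::-1]     # restore chronological order
```

Swapping `len(m)` for a tokenizer's token count would make the budget match the model's actual context window.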
@wowzerxx526 · 2 months ago
i love you
@8eck · 2 months ago
The power of Perplexity is its millions of crawled websites. Anyone can create a Perplexity alternative, but only a few can scale a crawler like that, let alone build such a generic crawler at that scale.
@Data-Centric · 2 months ago
Absolutely, doing it at scale is a different ball game entirely. Perplexity is impressive from an engineering perspective.
@8eck · 2 months ago
@Data-Centric Add to this the embedding-generation pipelines for millions of texts, with all the other techniques on top. I'd love to see their internal architecture someday.