Only for now, but not in the near future: it will solve entire problems by itself. For example, a team of artists working together to create a photorealistic 3D image based on a plan would, at some point, have been considered an entire problem. But now top models can produce photorealistic outputs in seconds, where the picture is indistinguishable from an actual photograph. So it replaces the hours of work of an artist or an entire group of artists, and finishes faster than they could.
@@businessmanager7670 No, that's just your two cents' worth of prediction, and it's probably only applicable to very simple use cases. The first problem is how you input what you really want (that part is ironically commonly called "code"), and then the result also has to match that, which is far from accurate today; it also needs to be coherent with the full picture, which is another challenge. Same for your artistic example, which may not suit the standards the team is aiming for. Producing an output doesn't mean producing an output you like, unless you're talking about simple use cases and aren't picky at all.
@@heroe1486 No, it can be applicable in complex cases too, depending on how much better AI systems get. We have only seen improvements and increases in intelligence, not the other way around, so my prediction is based on evidence. As for your second point: sure, when subjectivity is involved, you won't necessarily like the output the AI produces, even if it accomplishes the task you intended. But that's exactly the same as when another smart human makes something for me: his product may accomplish my task, yet I may not necessarily like it or the way it works. LOL
As an AI engineer with experience running businesses, I feel the need to comment. I think saying "you should differentiate by building a more complex solution" is not necessarily good advice. You will often find people selling very naive, simple software and printing money because they have really good distribution. You can also find an amazing piece of software with no clients because of a very poor distribution and go-to-market strategy. The best scenario is when you have differentiation in your software AND good distribution, which is what you're doing by creating content on your YouTube channel about your AI solution ;)
Given that OpenAI just killed a bunch of startups and the dreams of people hoping to make a quick buck, this advice is very timely. Well said, excellent advice that really hits home.
I've heard this wording so much, but they didn't kill anything. Those startups were not sound from the very start. Basing your product on a single API from another company that does most of the work, with no exit plan, is incredibly risky for a business.
@@brainmaxxing1 Agreed, it's basically just repackaging and selling at a higher price. Even outside the software market that's incredibly risky, especially if you only have one product from one supplier!
Yeah, this wording misrepresents what’s been happening here. OpenAI absolutely should continue to improve and iterate upon ChatGPT. The fact that an entire startup can be ‘killed’ by the addition of a feature should be the tell here
In my experience, 90% of what customers are asking for is "put a chatbot on my website". They'd rather drop a few cents per interaction and throw the problem at the OpenAI APIs than pay a dev to set up and maintain a chain, do fine-tuning, train a model, etc. If it ever scales up, then the value add would be optimization. But in the case of building a product business around AI, I totally agree with you.
Well, that's a good example of a model which is not worth training yourself: chat. It's fast enough, probably cheap enough given the rate at which a user can ask questions, and definitely worth it (it's cost-effective vs. paying a human agent).
I built my AI API on top of vector databases with zero-shot classification to determine which AI model to use for a specific task. You can pass label variables such as ["Greetings", "Coding task", "Search query"]. This can reduce the cost of running your AI chatbot service. Plus, if you design a smaller local model that knows when it doesn't know the answer, you can outsource to a different model via zero-shot classification.
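A minimal sketch of that routing idea. Everything here is hypothetical (the labels, model names, and scoring are made up for illustration); in a real setup `classify()` would call an actual zero-shot classification model instead of the keyword stub used to keep this runnable:

```python
# Hypothetical router: picks a downstream model based on zero-shot label scores.
# classify() is a stub standing in for a real zero-shot classifier.

LABELS = ["Greetings", "Coding task", "Search query"]

# Which (hypothetical) model handles each label.
MODEL_FOR_LABEL = {
    "Greetings": "small-local-model",      # cheap, handles chit-chat
    "Coding task": "large-code-model",     # expensive, only used when needed
    "Search query": "retrieval-pipeline",  # no LLM call at all
}

def classify(text: str) -> dict:
    """Stub: returns a score per label via keyword matching.
    A real implementation would use a zero-shot classification model."""
    keywords = {
        "Greetings": ["hello", "thanks", "good morning"],
        "Coding task": ["function", "bug", "python", "code"],
        "Search query": ["find", "search", "look up"],
    }
    lower = text.lower()
    return {
        label: sum(word in lower for word in words)
        for label, words in keywords.items()
    }

def route(text: str) -> str:
    """Return the name of the model that should handle this input."""
    scores = classify(text)
    best_label = max(scores, key=scores.get)
    return MODEL_FOR_LABEL[best_label]

print(route("hello there, thanks!"))                       # cheap local model
print(route("why does this python function have a bug?"))  # code model
```

The point of the pattern is that the expensive model is only invoked for the label that actually needs it; everything else stays on the cheap path.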
I love that you encourage being conservative about AI usage. I feel like way too many projects have AI just because it's trending, without considering the utility vs computational cost / determinism.
This is always the way with trending technologies. I wonder how many developers out there still wade through hell every single day because someone decided their tiny startup needed to be powered by microservices for no good reason.
As someone who has been writing software for 3 decades, I agree with your approach. You can’t just throw AI at a problem. Automation is more than just AI.
Love your tone, speed, clarity, style of presentation and most importantly, confidence in your voice because you have done it yourself and did a great job. Kudos to you Steve !!
Already read the blog post this morning and found it very good! So good that, when I saw the video in my YouTube feed, I clicked on it to be sure not to miss any info! Great work 🔥
FYI, there is a failure of direct retrieval with GPT-4 using the new OpenAI Assistants API. GPT tokenizes text and creates its own vector embeddings based on its specific training data. New terms and sequences may not connect well to the pretrained knowledge in GPT's weight tensors; there was no semantic similarity between the new API terms and GPT's existing vector space. This is a fundamental issue with retrieval-augmented generation (RAG) systems: external knowledge is never truly integrated into the model's learned weights. Adding more vector stores cannot solve this core problem.
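That "not integrated into the weights" point is visible in how RAG pipelines are typically structured: retrieved text only rides along in the prompt. A toy sketch (the 3-dimensional embeddings and store contents here are made up for illustration; a real system would use an embedding model and a proper vector store):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vector store: (embedding, text) pairs. The vectors are invented;
# a real store would hold embeddings produced by an embedding model.
STORE = [
    ((1.0, 0.1, 0.0), "The Assistants API supports tool calls."),
    ((0.0, 1.0, 0.2), "Cats sleep most of the day."),
]

def retrieve(query_vec, k=1):
    """Return the k stored texts most similar to the query embedding."""
    ranked = sorted(STORE, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(query_vec, question):
    # Key point: the retrieved knowledge is just concatenated into the prompt.
    # The model's weights never change, so if the query embedding doesn't land
    # near the right document, the model never sees that knowledge at all.
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(build_prompt((0.9, 0.2, 0.0), "What does the Assistants API support?"))
```

If the new terminology embeds far from the relevant documents, `retrieve()` simply returns the wrong context, and no amount of extra vector stores changes that failure mode.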
Very good point. I hadn't really thought of indexing like that before. It's kind of like a memory cache that stores extra info on top of the LLM, but it's not permanent and needs to be added to every request sent to the model.
Was having the same thoughts. I was considering some AI projects, and when searching on Google to see if they had already been done, there were always plenty of results that were basically just an input form and a submit button that feed the GPT-3/4 API and return a basic response. It's all about the whole pipeline to get something useful and original. Thanks for the real-world showcase.
Excellent video. One issue you haven't mentioned is explainability (even though it's related to debuggability). Very important and becomes nearly impossible with a big black box model.
Hey man great video, but when editing and/or cutting out filler words can you cut more of the dead space as well? The dialogue ends up with weird pauses every 10 or so syllables
Great video! Adding your own value is what differentiates a good business from a poor one! How did you train the image recognition model to the point that it's accurate without the very pricey GPU power usually needed? Does a more specific problem mean you need less data to train it?
Nice video! In my project I use only one LLM, but I do a lot of embedding beforehand. What are your thoughts on this approach? It's worth noting that my input/output is entirely text-based, usually one paragraph each way.
"Use the location of the image as output data, and the screenshot of the image as the input data" 🧠. Great video as always. Would you be able to expand a bit on what your LLM do btw? I'm still not sure what happens between the "initial code" and the "customized code", I haven't tried Mitosis, but I'm assuming it generates good enough code? 🤔 Mitosis itself is pretty interesting, the fact that you transform your generic code into multiple frameworks 👌✨
Thanks for the info. I'm thinking they're not charging you based on how much data the model was trained on, since that doesn't affect inference cost. But I'm not sure; maybe they amortize it into the training cost.
00:01 Building AI products that are unique, valuable, and fast requires a different approach than what everyone else is doing.
01:31 Large language models are costly and slow for most use cases.
03:09 LLMs have limitations in terms of customization and performance.
04:45 AI products are not built with a single super-smart model, but with a toolchain of specialized models connected with normal code.
06:26 Build AI solutions with a counterintuitive approach and start by exploring the problem space using normal programming practices.
08:06 Automating the generation of data for training AI models using Puppeteer.
09:44 Using specialized AI models and layering techniques, Figma is able to produce high-quality code for responsive websites with a wide variety of options.
11:21 Use AI for as little as possible in AI products.
Crafted by Merlin AI.
I could also see a benefit to creating a "toolchain" with only certain parts using AI: the ability to override the AI if it makes a mistake. I think one of the biggest flaws of AI is that while it can produce incredible results, it is never "guaranteed" to work in the same way that a lot of normal algorithms can be guaranteed to work through rigorous logic. This is why AI is best suited for subjective applications where you really need to understand the user's intent in a way not possible with normal code. By using AI only at specific points, it opens the way for a user to more easily tweak the results to their liking.

For example, in the website builder product, if a full-stack AI produced a website which combined images in an awkward way the user didn't intend, the only recourse would be to either go in and fully edit the code to fix the issue, or to edit the Figma model and hope the AI interpreted it better the next time. But with the AI localized to the specific selection of images, it would probably be possible to have a manual "merge elements into image" function which would specifically override the AI's decision about which elements combine to form an image, and then plug this result into the rest of the pipeline, which can create this image in the resulting code.

I've been saying for a while that while AI is really useful as a tool, it is not always the most accurate, and any AI product needs to consider what AI is good at, what it might get wrong, and how best to implement it in a way that allows the user to check/fix the result of the AI if needed.
That's what I'm trying to tell everybody, but people just seem too invested in it, from professors to students to random business people. If AI takes over, there will be no quality, especially in software.
One of the most insightful talks with examples that I have seen on AI and productization. This aligns with the books and talks I have consumed over the years, most notably Andrew Ng's lectures and textbook. For all the reasons precisely outlined in the talk, it's a first principle to see any system as composed of specialized components, a philosophy Unix itself demonstrates. Thank you for all your findings and for sharing!
I take the same approach. Your tool handles some responsive layouts well, but others are really bad. I would suggest that instead of trying to do responsiveness directly, you first generate a wireframe from the component, do the responsive work there, and convert back. Wireframes are easier to make responsive, and there are a lot of examples out there. Also, converting to a wireframe is, I think, an easy task; you can find open-source plugins for auto-wireframing, or skeleton plugins, and start from there. Thanks for sharing your thoughts.
Awesome. I think exactly like you about how to use AI to develop innovative products. Can you give a little more detail on which model you used as a base and the steps to train it?
To make sure that I understand your point: you mean to avoid simply connecting to an LLM such as OpenAI's, and instead build a unique, customized product by continuing to train the machine until the product reaches a fine-tuned level, right?
That was amazing, thank you! Definitely going to be checking out all your other videos! Keep it up! Cheers! Might need some advice tbh... but we shall see... thanks again!
Not to rain on your parade, but that autonomous driving example is factually wrong: Tesla FSD v12 specifically IS a one-model-to-rule-them-all. They did start from multiple specialized models for each task, as you stated, but they have eliminated more and more heuristics, to the point that FSD v12 reportedly had something like 95% of all glue code removed. Elon famously said that the latest model is literally photons in, car controls out, which obviously applies to training as well and doesn't work with a bunch of smaller models. Now, Tesla is a bit of a different beast compared to a small company building its first AI product, so I definitely agree with the swarm/team way of working for those use cases.
On the choice of self-driving cars as an example, I want to say: it's true that these systems are built this way at the moment, as compositions of small, highly specialized AI models. But that is no indication that it's the right, or even a good, way to build them. The best analogy I have is building a chess engine. Do you build an AI that becomes super good at moving rooks, and another that is an expert at moving pawns, and another for queens, and so on, and then glue them together with conventional code? Of course not. An end-to-end system that plays any piece using one large network will always beat the ensemble of expert models.
I have to say this is such an effective way to market your product. Offering legitimate advice and not just trying to beg people to be interested in it. Or maybe it’s not even your intention to market it because I don’t know if people who would watch these videos are in the UI / UX space but either way it comes across as genuine and helpful!
WOW what a video... 🦾 You just solved 3 of my current problems with one video: 1. How to train your own web code creator? 2. Do I have to use OpenAI or shall I build and train a custom model and flow? 3. How can I generate the code from an example website?
This cleared up so much confusion! I'm going to refer to this as one of the best videos on building with AI. I have an incredible passion for generative AI. Your words are gems.
This is absolutely amazing, and you're 💯 correct. I am building a product myself, and I have recently learned that, let alone custom-trained models, even GOFAI (good old-fashioned AI) or expert systems can often get a job done in a fraction of the time. So yes, this video just gave me more confidence in my own approach. I'll absolutely try to build what I can from scratch before reaching for the API call. All the potential pitfalls you highlighted, such as performance and cost, are absolutely essential to consider in my case, since I am building for a third-world market and a real-time use case. Thanks a lot. You're awesome to share this. This was the easiest subscribe of the year for me. Keep it coming, my man. You rock.
Given the events that have occurred in the last few days, I'm now very wary of using OpenAI going forward. With Sam Altman's departure the company seems like it could fail miserably, and Microsoft has just brought him in to build in-house tech.
Very smart solution, how you used the HTML to locate the elements' positions and overlay them on the screenshot. Your approach of not over-relying on AI is something I strongly share. I am developing a SaaS whose current codebase is 100% conventional code, but I know AI could improve it massively through image classification. I am really curious how you deploy the AI once you've trained it. Is this done through an API route?