ML Engineer

Hello Folks 👋
My name is Sascha and I am excited you are interested in the content I provide.
I work as a Senior Machine Learning Engineer at DoiT International, the 2020 and 2022 Google Cloud Global Partner of the Year.
I have had the honor of working with more than 60 different customers, helping them apply ML and put their ML solutions into production.

My videos cover best practices I learned working with many very talented teams around the world. I am really looking forward to sharing this with you.

If you like my content please subscribe :)
Comments
@Simba-qm5qs 21 days ago
Is it necessary to go with Cloud Build? Can I just create a GitHub Action to achieve this?
@ml-engineer 20 days ago
Hi Simba, two things are necessary with those custom containers: 1. You need to build the container. This can be done locally, with Cloud Build, but also with GitHub Actions; you are fully flexible here. 2. The Docker image needs to be uploaded to Google Cloud Container Registry / Artifact Registry.
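A minimal sketch of that flow, assuming a Python serving container; the base image, file names, and registry path are placeholders, not the setup from the video:

```dockerfile
# Hypothetical serving container; adjust base image and entrypoint to your model server.
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "server.py"]

# Build and push (locally, via Cloud Build, or from a GitHub Action):
#   docker build -t us-central1-docker.pkg.dev/PROJECT_ID/my-repo/my-model .
#   docker push us-central1-docker.pkg.dev/PROJECT_ID/my-repo/my-model
```

The Artifact Registry repository (`my-repo` here) has to exist before the push.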
@Simba-qm5qs 20 days ago
@@ml-engineer many thanks 🙏
@AdhvaithG 23 days ago
@ML Engineer once the experiment is done, how do I choose a particular run and do the model registration and deployment?
@ml-engineer 23 days ago
@@AdhvaithG Hi, to find the best experiment run you can use either the UI or the API, for example by comparing your experiment metrics. The Experiments SDK also allows you to store artifacts for each run. Those artifacts can be your model file, which you can then import as a model into the Model Registry.
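The comparison step boils down to picking the run with the best metric. In practice the run data would come from the Experiments SDK (roughly `aiplatform.get_experiment_df()`; check your SDK version), but here it is a plain list so the selection logic is visible; run names, metric keys, and URIs are made up:

```python
# Pick the best run by a metric, then (hypothetically) import its model artifact.
runs = [
    {"run_name": "run-1", "metric.accuracy": 0.91, "model_uri": "gs://bucket/run-1/model"},
    {"run_name": "run-2", "metric.accuracy": 0.94, "model_uri": "gs://bucket/run-2/model"},
    {"run_name": "run-3", "metric.accuracy": 0.89, "model_uri": "gs://bucket/run-3/model"},
]

best = max(runs, key=lambda r: r["metric.accuracy"])
print(best["run_name"])  # run-2

# With the real SDK, importing to the Model Registry would roughly be (untested sketch):
# from google.cloud import aiplatform
# aiplatform.Model.upload(serving_container_image_uri=..., artifact_uri=best["model_uri"])
```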
@AdhvaithG 23 days ago
@@ml-engineer thank you so much for your prompt help ❤️🙏
@markosmuche4193 27 days ago
Your lectures are invaluable, sir. Thanks.
@ml-engineer 25 days ago
Thank you 🫡
@AdhvaithG 2 months ago
Hi Sascha.... my name is Srini and I wanted to take a moment to express my appreciation for your videos. They have been incredibly helpful as I learn about GCP, and your clear explanations make complex topics much more accessible. Currently, I am trying to replicate this example in my GCP project. To run the code as it is, I need to push a model to my GCP bucket. Specifically, I am facing issues with the following command: ['cp', '-r', 'gs://doit-vertex-demo/models/sentiment', '.'] Could you please share the model code or provide guidance on how to push it to my project bucket? As a beginner in GCP, your support would be invaluable in helping me build the image successfully. Thank you so much for your amazing content and assistance!
@ml-engineer A month ago
Hi Adhvaith, for some reason your comments do not show up in the YouTube commenting editor. Therefore I missed answering for 4 weeks. The model I used was trained as part of another video and article: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-GM-tibVly_A.html You can follow it and you will end up with the same model. Let me know if that works.
@AdhvaithG 3 months ago
@ml-engineer can you please share the pipeline demo notebook?
@prakashsaragadam 3 months ago
Why was it running inside the Training pipeline? Do all custom container pipelines run under a training pipeline?
@ml-engineer 3 months ago
Hi, the custom containers run under Custom Jobs within Vertex AI Training. It's not a training pipeline.
@SahlEbrahim 3 months ago
That sitemap.xml file is not found. What should I do now?
@khanna-vijay 3 months ago
Excellent explanation, @ML Engineer. Thank you for the effort.
@ml-engineer 3 months ago
Glad you liked it!
@christopheryoungbeck8837 3 months ago
I'm a junior engineer intern for a startup called Radical AI and I am doing exactly this process right now lol. You do a better job of explaining everything than my seniors.
@ml-engineer 3 months ago
Appreciate your feedback, Christopher.
@SahlEbrahim 2 months ago
Hi, did you run this code? I can't access the XML file as it is not found. How did you run this code?
@Pirake123 3 months ago
This is better than the GCP videos, amazing, thank you!!
@ml-engineer 3 months ago
Thank you, Pirake
@aoezdTchibo 3 months ago
Reminds me very much of MLflow Experiments :)
@ml-engineer 3 months ago
Indeed, the usage (API, SDK) is very similar.
@RedoineZITOUNI 4 months ago
Hi Sascha, thank you for this video! I have one question: does a custom model which inherits from sklearn.base.BaseEstimator fit the pre-built container use case? Or do I have to deploy a custom container with my custom model?
@RedoineZITOUNI 4 months ago
Hi Sascha, thanks for your video :) Just one question: does a custom model written by inheriting sklearn.base.BaseEstimator fit this use case?
@ml-engineer 3 months ago
Yes, since it's a Docker container it works with anything, no limitations.
@HailayKidu 4 months ago
Nice, is the data in XML format?
@ml-engineer 3 months ago
The data format is flexible; it's not limited to XML.
@eduardomontero2901 5 months ago
Is there any way to log a ROC curve and custom tables?
@ml-engineer 3 months ago
As custom artifacts, but you cannot visualize them in the Vertex AI Experiments UI. The only way to use them is to fetch them and render them yourself, for example in a notebook.
@rajathrshettigar9808 5 months ago
Can we set up alerts for a Vertex AI endpoint to monitor resource utilization (CPU and memory) and the number of instances during autoscaling?
@ml-engineer 3 months ago
Yes, those metrics are logged and you can use them to set up monitoring and alerts.
@MOHAMMADAUSAF 6 months ago
Hey, awesome starter, just a question: given I have an index created with a bucket, if I were to add new files to the same bucket, will the index reflect the new data files, either by itself or by triggering? Or simply put, how can I add new data from a bucket to an existing index without rebuilding the entire index again, something equivalent to Pinecone or Weaviate upsert functionality? The docs aren't helping me here.
@rinkugangishetty5713 6 months ago
I have content of nearly 100 pages. Each page has nearly 4,000 characters. What chunk size should I choose and what retrieval method can I use for optimized answers?
@ml-engineer 6 months ago
The chunk size depends on the model you are planning to use, but generally I highly recommend having text overlap between your chunks. Also, finding the right chunk size can be treated like a hyperparameter: too-large chunks might add too much noise to the context, too-narrow chunks may miss information.
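The overlapping-chunks idea can be sketched as a sliding window; the sizes here are purely illustrative, not a recommendation:

```python
# Minimal sliding-window chunker with overlap between consecutive chunks.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    step = chunk_size - overlap  # each new chunk starts `overlap` chars before the previous one ended
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

text = "".join(str(i % 10) for i in range(1200))  # dummy document
chunks = chunk_text(text)
print(len(chunks))  # 3 chunks: [0:500], [400:900], [800:1200]
```

Because the last 100 characters of one chunk repeat as the first 100 of the next, a sentence cut at a chunk boundary still appears intact in one of the two chunks.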
@user-sz1by8hg8k 6 months ago
Great video! Can you do that inside a CustomContainerTrainingJob?
@enricocompagno3513 7 months ago
Nice video! What does one need to change to run batch prediction with a custom container?
@ml-engineer 7 months ago
No need to change anything; you can use the same container. Google spins up distributed infrastructure to run predictions in parallel.
@evanklim-t A month ago
@@ml-engineer Hi! I have created a custom prediction container with help from your video. It functions as expected for online predictions. However, for batch predictions, it seems to only perform health checks. My logs repeatedly show "GET /health HTTP/1.1" 200 and nothing else after that. Is there a code change I need to make in order to handle batch predictions, or could the problem be coming from somewhere else? Thank you!
@ml-engineer A month ago
What have you defined as the batch output, BigQuery or Cloud Storage? Check there; in case of errors you should get an error file. Logging with batch prediction jobs unfortunately does not work out of the box. I wrote an article about that topic: medium.com/google-cloud/add-cloud-logging-to-your-vertex-ai-batch-prediction-jobs-9bd7f6db1be2
@rbd2024 7 months ago
What is the difference between Vertex AI Experiments and hyperparameter tuning?
@ml-engineer 6 months ago
Hyperparameter tuning can scale your training and find the best-performing parameters, while Experiments is for tracking experiments. Therefore there is a small overlap.
@zbynekba 7 months ago
Sascha, you can significantly enhance the intelligibility of your presentation by improving the audio quality. The distracting sound reflections from your office walls make listening stressful. The easiest no-cost remedy is close-miking, such as using a headset microphone for recording. Alternatively, if you prefer speaking to a distant microphone during recording, you could consider some acoustic treatment for your office space.
@kanavdua4587 8 months ago
Hi Sascha. I have been facing an error for the last 3 days. Please help me resolve it.
@ml-engineer 8 months ago
Hi, what kind of error?
@kanavdua4587 8 months ago
I am not able to write it as a comment. I don't know why.
@kanavdua4587 8 months ago
The DAG failed because some tasks failed. The failed tasks are: [concat].; Job (project_id = practice-training, job_id = 125471868915286016) is failed due to the above error.; Failed to handle the job: {project_number = 385236764312, job_id = 125471868915286016}
@ml-engineer 8 months ago
@@kanavdua4587 You can check what happened in the logs for each step/component in your pipeline.
@kanavdua4587 8 months ago
@@ml-engineer Please can you guide me a little 🙏🏻🙏🏻 My component is: @component() def concat(a: str, b: str) -> str: Logging.info(f"concatenating '{a}' and '{b}' results in '{a+b}' ") return a+b I am a beginner. I don't have any knowledge. Please help.
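For reference, the likely bug in the snippet above is the capitalized `Logging.info`: Python's standard library module is lowercase `logging`, so `Logging` raises a NameError at runtime, which would fail the `concat` task. A sketch of the corrected function body (the `@component()` decorator from KFP is omitted here so this runs standalone; in the pipeline it would be applied exactly as in the original):

```python
import logging

def concat(a: str, b: str) -> str:
    # lowercase `logging` from the standard library; `Logging.info` raises a NameError
    logging.info(f"concatenating '{a}' and '{b}' results in '{a+b}'")
    return a + b

print(concat("foo", "bar"))  # foobar
```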
@abdulsami6117 8 months ago
Really Helpful Video!
@ml-engineer 8 months ago
Thank you
@Tech_Inside. 8 months ago
Hey sir, I want to know: if I have company documents stored locally, how can we use them and load the data? And one more thing, does it provide the answer exactly as mentioned in the PDF or documents, or does it perform some kind of text generation on the output?
@ml-engineer 8 months ago
You are fully flexible about where you store your data; it could be local, on a Cloud Storage bucket, or on a website, all possible. The RAG approach takes your question and looks for relevant documents. Those documents are then passed, together with your initial question, to the LLM, and the LLM answers the question based on this context. So yes, it is generating text, and you can tweak that output further with prompt engineering. By the way, Google released a new feature that requires less implementation effort: medium.com/google-cloud/vertex-ai-grounding-large-language-models-8335f838990f It is less flexible but works for many use cases.
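The retrieve-then-generate flow described above can be sketched with a toy keyword retriever standing in for the vector search; the documents, file names, and the final LLM call are all made up for illustration:

```python
# Toy RAG flow: retrieve the most relevant document, then assemble the LLM prompt.
docs = {
    "policy.pdf": "Employees receive 30 vacation days per year.",
    "handbook.pdf": "The office is located in Berlin.",
}

def retrieve(question: str) -> str:
    # Stand-in for vector search: score docs by word overlap with the question.
    q_words = set(question.lower().split())
    return max(docs.values(), key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    context = retrieve(question)
    return f"Answer based only on this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How many vacation days do employees receive?")
# `prompt` would now be sent to the LLM (e.g. the PaLM API) to generate the answer.
```

The LLM only ever sees the retrieved context plus the question, which is why the answer stays grounded in the documents rather than the model's training data.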
@carthagely122 8 months ago
Thank you for your work
@ml-engineer 8 months ago
This could also be interesting for you: Google just released a more or less managed RAG / grounding solution: medium.com/google-cloud/vertex-ai-grounding-large-language-models-8335f838990f
@bivasbisht1244 8 months ago
Thank you for the explanation, really liked it!! I was wondering: if we use DPR (Dense Passage Retrieval) on our own data and want to evaluate its performance, like precision, recall, and F1 score, and we have a small reference dataset which can serve as ground truth, can we do that? I am also confused by the fact that since DPR is trained only on wiki data, as far as I know, will it be meaningful to measure the efficiency of DPR retrieval when I follow this RAG approach?
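If a small ground-truth set exists, retrieval metrics can indeed be computed directly from the retrieved and relevant document IDs per query; a minimal sketch (document IDs are placeholders):

```python
# Precision, recall, and F1 for one query, given retrieved and relevant doc IDs.
def retrieval_f1(retrieved: set[str], relevant: set[str]) -> tuple[float, float, float]:
    hits = len(retrieved & relevant)  # relevant docs that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f1 = retrieval_f1({"d1", "d2", "d3"}, {"d1", "d4"})
print(p, r, f1)  # precision 1/3, recall 1/2, F1 0.4
```

Averaging these over all ground-truth queries gives a usable estimate of how well the retriever performs on your own data, regardless of what it was pretrained on.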
@jon200y 8 months ago
Great video!
@ml-engineer 8 months ago
Thank you
@sridattamarati 9 months ago
Is PaLM 2 not open source?
@Smart-ls6xi 9 months ago
Hello, I have a question. If I am working with a team, is it one person with a Vertex AI account who will be charged? Or will each user, though sharing the same project, be charged on their own account?
@user-ti3xs1xx6l 9 months ago
Can you give an example of creating an endpoint with a GPU?
@ml-engineer 9 months ago
I go into it in my article medium.com/google-cloud/serving-machine-learning-models-with-google-vertex-ai-5d9644ededa3 and here is also a code example that uses a serving container with a GPU: github.com/SaschaHeyer/image-similarity-search
It comes down to the base image, which needs to support GPUs. This could be, for example, the TensorFlow GPU image or PyTorch, whatever you prefer, e.g. gcr.io/deeplearning-platform-release/pytorch-gpu
Regarding the deployment, you only need to add an accelerator:
!gcloud ai endpoints deploy-model 7365738345634201600 \
  --project=sascha-playground-doit \
  --region=us-central1 \
  --model=8881690002430361600 \
  --traffic-split=0=100 \
  --machine-type="n1-standard-16" \
  --accelerator=type="nvidia-tesla-t4,count=1" \
  --display-name=image-similarity-embedding
@user-xs1ek4tq5m 9 months ago
Hi, I am getting the exception below when I try to send a request with 2,000 queries. Each query has one datapoint.
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
  status = StatusCode.UNKNOWN
  details = "Stream removed"
  debug_error_string = "UNKNOWN:Error received from peer {created_time:"2023-11-10T17:52:57.98338902-05:00", grpc_status:2, grpc_message:"Stream removed"}"
When I test with 300 queries, it works the first several tries and then halts with the same error.
query = aiplatform_v1.FindNeighborsRequest.Query(datapoint=dp1, neighbor_count=6)
What could be the reason?
@emilioortega5992 9 months ago
Can you share the Google Colab script?
@ml-engineer 9 months ago
Hi Emil, sure. I also have a deep-dive article, video, and code available:
article: medium.com/google-cloud/google-vertex-ai-the-easiest-way-to-run-ml-pipelines-3a41c5ed153
video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-gtVHw5YCRhE.html
code: colab.research.google.com/drive/1x2EGrZ1WdgBVfsihB5ihxnnMfL2mJJZo?usp=sharing
@AyushMandloi 9 months ago
What is the need for endpoints? When will you be uploading more videos?
@ml-engineer 9 months ago
Hi Ayush, what do you mean with your endpoint question? I am recording 4 new videos about generative AI on Google Cloud at the moment; they will be released in the next weeks.
@campbellslund 10 months ago
Hi! Thanks for making this walkthrough, it was super helpful as a beginner. I was able to follow all the steps you detailed, however, when I try running the final product it produces the same context every time - regardless of the question I prompt. Do you have any idea why that might be? Thanks in advance!
@ml-engineer 9 months ago
Hi, have a look at the embeddings to see if they are actually different. If not, there is an issue during embedding creation.
@sheetaljoshi6740 10 months ago
Can we run another pipeline from within a pipeline synchronously?
@sheetaljoshi6740 10 months ago
Can we call a pipeline within a pipeline?
@ml-engineer 10 months ago
Yes, absolutely, using the Vertex AI SDK or API from within the other pipeline.
@DSwithBhaiya 10 months ago
Hi, can you please tell me: do we have to pay for the Vertex API?
@ml-engineer 10 months ago
Yes, there is nothing for free on this planet except my videos =D cloud.google.com/vertex-ai/pricing
@tyronehou3553 10 months ago
Great tutorial! Can you update algorithm parameters like leafNodeEmbeddingCount and leafNodesToSearchPercent on the fly? I tried using the gcloud update index command, but nothing changes when I describe the index afterward, even when the operation is complete.
@TarsiJMaria 10 months ago
Hey Sascha, I've been playing around a lot more, but I've run into accuracy issues that I wanted to solve using MMR (max marginal relevance) search. It looks like the Vertex AI vector store (in LangChain) doesn't support this, at least not the Node.js version, and if I'm not mistaken it's the same in Python. Do you know what the best approach would be? As a workaround I'm overriding the default similarity search and filtering the results before passing them as context.
@TarsiJMaria 10 months ago
@@ml-engineer The problem is that I'm planning on having a service where a user can add any kind of document to, for example, a Google Drive and then embed its content. So it can range from docs to PDFs to presentations. My current workaround works, where I filter out results lower than a specific score, but that's something you would want to solve on the vector store side instead of on my server side. Is there no way to set a score threshold when matching results?
@user-fp8hw2cr9c 10 months ago
Hey, thanks for sharing this great video. My question is: what would happen if the answer to my query is in multiple documents, like more general questions related to all the documents?
@ml-engineer 10 months ago
If it is in multiple documents, the retrieval process will return multiple documents. If it is in all documents you will run into issues for multiple reasons:
1. The matching index / vector database is built to return the top X matching documents. You can increase this value, but if the answer is in all documents there is no need to find matching documents anymore, because it's in all of them anyway.
2. The matching documents are used as context when running our prompt. The context size depends on the model; for typical models like Google's PaLM or OpenAI's GPT it is around 8,000 tokens. There are also larger versions of 32,000 tokens up to 100,000 tokens, but those come at a higher cost. In the end you need to evaluate whether the number of documents fits your context.
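That budgeting step, fitting the top matches into the model's context window, can be sketched as follows; the document names, token counts, and the 8,000-token limit are illustrative:

```python
# Pack retrieved documents (best match first) until the model's context budget is spent.
def pack_context(matches: list[tuple[str, int]], budget: int = 8000) -> list[str]:
    packed, used = [], 0
    for doc, tokens in matches:  # matches are assumed sorted by similarity, best first
        if used + tokens > budget:
            break  # the next document would overflow the context window
        packed.append(doc)
        used += tokens
    return packed

matches = [("doc-a", 3000), ("doc-b", 4000), ("doc-c", 2000)]
print(pack_context(matches))  # ['doc-a', 'doc-b'] (doc-c would exceed the 8000-token budget)
```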
@user-du9tp4xi6o 11 months ago
Just noticed this channel. Great content, with code walkthroughs. Appreciate your effort!! I have a question, @ml-engineer: is it possible to question-answer separate documents with only one index for the bucket? While retrieving or questioning from vector search, I want to specify which document/datapoint_id I want to query from. Currently, when I add data points for multiple documents to the same index, the retrieval response for a query match is based globally on all the documents, instead of the required one. P.S.: I am using the MatchingEngine utility maintained by Google.
@ml-engineer 11 months ago
Yes, you can use the filtering that Matching Engine offers: cloud.google.com/vertex-ai/docs/vector-search/filtering With that, the vector matching only operates on the documents that are part of the filtering results.
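The allow/deny semantics behind that filtering can be illustrated with plain Python; the datapoint IDs, the "doc" namespace, and the file names here are all made up, and in the real service this is configured via `restricts` on the index datapoints as described in the linked docs:

```python
# Toy model of namespace filtering: keep only datapoints whose restricts allow the query token.
datapoints = [
    {"id": "dp-1", "restricts": {"doc": {"allow": {"report.pdf"}, "deny": set()}}},
    {"id": "dp-2", "restricts": {"doc": {"allow": {"manual.pdf"}, "deny": set()}}},
]

def matches_filter(dp: dict, namespace: str, token: str) -> bool:
    rule = dp["restricts"].get(namespace, {"allow": set(), "deny": set()})
    return token in rule["allow"] and token not in rule["deny"]

hits = [dp["id"] for dp in datapoints if matches_filter(dp, "doc", "report.pdf")]
print(hits)  # ['dp-1']
```

Vector similarity is then computed only over the datapoints that survive this filter, which is what restricts the answers to one document.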
@user-du9tp4xi6o 11 months ago
@@ml-engineer Thanks for the response. I believe we are supposed to add allow and deny lists while adding datapoint_ids to the index, and when we retrieve the nearest neighbours, the "restricts" metadata is also returned. Then we can either filter directly or pass the document chunk to the LLM with the restricts metadata (the former is done in Google's MatchingUtility -> similarity_search()). But there is still a chance of getting datapoint_ids of other documents saved in the index, instead of the one I want to query from. I was looking for something where the vector search engine automatically filters the datapoints based on the input query, or the GCP bucket where my chunks and datapoints are stored.
@ml-engineer 11 months ago
Exactly, you add those deny and allow lists when adding the documents to the index. After that you can filter based on the query at runtime. Can you describe this in more detail: "But there is still a chance of getting datapoint_ids of other documents saved in the index, instead of the one I want to query from." Do I understand correctly that you are asking how you can automatically get the documents from Cloud Storage, based on the documents retrieved from the Matching Engine, back into the LLM's context?
@TarsiJMaria 11 months ago
Hey, thanks for the great content! I had 2 questions. With this setup, what needs to happen if you want to add new data to the vector store? First we chunk the new document, create new embeddings, and upload them to the GCS bucket; is that all, or does something need to happen with the Matching Engine / index? Other question: do you know if the LangChain JavaScript library has any limitations in this use case?
@ml-engineer 11 months ago
@@TarsiJMaria yes, they are. If you already have it in Python you can at least save yourself a bit of time.
@user-tv2yt4eb1x 11 months ago
The build fails for the service account with the error "...access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist)"
@user-os8je7hq1v 11 months ago
Hi, please guide me on how to run the same in the Vertex AI environment.
@rezamohajerpoor8092 11 months ago
Hi Sascha, thanks for the great video. GCR is deprecated in GCP; how should we modify the cloudbuild.yaml to comply with the new requirements of Artifact Registry?
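A minimal cloudbuild.yaml targeting Artifact Registry instead of GCR might look like this; the region, repository, and image names are placeholders (not the config from the video), and the Artifact Registry repository must be created beforehand with `gcloud artifacts repositories create`:

```yaml
# Hypothetical Cloud Build config pushing to Artifact Registry instead of gcr.io.
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-model:latest', '.']
images:
  - 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-model:latest'
```

The only substantive change from a GCR setup is the image path: `REGION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE` replaces `gcr.io/PROJECT/IMAGE`.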
@shivayavashilaxmipriyas9401 11 months ago
Great video!! Does the LLM get trained here? This is my main doubt. Or is it just used as an engine for answering based on the embeddings and similarity?
@o_o610 11 months ago
Thank you so much for the video! Do you know if Vertex AI Pipelines handles pipeline versioning or keeps a history of the evolution of the pipeline?
@ml-engineer 11 months ago
I always recommend putting your pipeline code into git. This way you have the complete pipeline version history available over time. Is that what you meant by versioning?