Serving Machine Learning models with Google Vertex AI 

ML Engineer
2K subscribers
9K views

Published: 27 Aug 2024

Comments: 33
@markosmuche4193 27 days ago
Your lectures are invaluable sir. Thanks.
@ml-engineer 25 days ago
Thank you 🫡
@linkawaken 1 year ago
Hi Sascha! I think a great differentiator between your videos and the official ones is your ability to be the official "unofficial guide" to ML engineering on GCP, i.e. the most trusted source with less fluff and more practice. Please continue mentioning tradeoffs, limitations, etc. without reservation, as the official videos are not as quick to make these clear upfront. Some content suggestions: I would love to see a deep dive on Custom Prediction Routines that demos more sophisticated use of the *preprocessing* of features, including how to manage dependencies appropriately (e.g. `setup.py`?) - I didn't see this covered in the labs / blog / notebook examples. Perhaps also an overall strategic view of Vertex AI - how to think of all the components as a set of Legos that can be pieced together. This isn't emphasized enough out there.
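To make the CPR suggestion concrete: a minimal predictor with feature preprocessing could look like the sketch below, assuming scikit-learn artifacts saved as model.joblib and preprocessor.joblib (hypothetical names). Dependencies are typically handled with a requirements.txt passed to LocalModel.build_cpr_model via requirements_path rather than a setup.py.

```python
# A minimal sketch of a Custom Prediction Routine predictor with
# preprocessing; artifact file names are hypothetical.
import joblib
from google.cloud.aiplatform.prediction.predictor import Predictor
from google.cloud.aiplatform.utils import prediction_utils


class PreprocessingPredictor(Predictor):
    def load(self, artifacts_uri: str) -> None:
        # Download the model artifact directory from GCS into the CWD.
        prediction_utils.download_model_artifacts(artifacts_uri)
        self._preprocessor = joblib.load("preprocessor.joblib")
        self._model = joblib.load("model.joblib")

    def preprocess(self, prediction_input: dict):
        # Apply the same fitted transformations that training used.
        return self._preprocessor.transform(prediction_input["instances"])

    def predict(self, instances):
        return self._model.predict(instances)

    def postprocess(self, prediction_results) -> dict:
        return {"predictions": prediction_results.tolist()}
```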
@ml-engineer 1 year ago
Thank you so much for this amazing feedback. I love your content suggestions around custom prediction routines and an overall overview of Vertex AI.
@linkawaken 1 year ago
@ml-engineer In terms of Custom Prediction Routines, what's not obvious and would benefit from at least a code example is how to address preprocessing on the training side: CPR facilitates preprocessing for the prediction container (serving), but what about putting the training preprocessing into a container so the whole thing can be automated, e.g. in a Pipeline? A best practice would seem to be writing the preprocessing once and making it available to both the training and serving containers, but CPR only addresses the latter, at least in the docs. It is almost as if the expectation with CPR is that one does their training in a notebook, but I am pretty sure that's not technically necessary.
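One hedged way to get that write-once behaviour is to keep the feature logic in a small module that both the training container and the CPR predictor import; module and feature names below are hypothetical.

```python
# preprocessing.py -- shared by the training job and the serving container,
# so the transform logic cannot drift between the two.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["age", "income"]      # hypothetical feature names
CATEGORICAL = ["country"]


def build_preprocessor() -> ColumnTransformer:
    """Single definition of the feature pipeline for train and serve."""
    return ColumnTransformer(
        [
            ("num", StandardScaler(), NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ]
    )
```

The training step fits build_preprocessor() and joblib-dumps the result into the model artifact directory; the predictor's load() reads the same file back, so the approach works whether training runs in a notebook, a custom job, or a pipeline step.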
@linkawaken 1 year ago
@ml-engineer Another note: you mentioned baking the model into the container as a best practice, although I suppose a tradeoff is that you have to rebuild the container each time the model is retrained? That seems worth weighing depending on the expected amount of retraining vs. autoscaling, and maybe other things I didn't consider. For instance, consider a Kubeflow pipeline that retrains a small model prior to making the latest batch of predictions, doing this every time.
@ml-engineer 11 months ago
Yes, it depends on the model size. If it is a small model, a few MB, there is no need to embed the model into the container. That said, the extra effort is just one line in your cloudbuild.yaml, so even for a small model the additional implementation effort is minimal.
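As a sketch of the tradeoff being discussed: a serving container can prefer a model baked into the image and fall back to the artifact location Vertex AI injects via the AIP_STORAGE_URI environment variable; all paths and file names here are hypothetical.

```python
import os

import joblib
from google.cloud import storage

BAKED_IN_PATH = "/app/model.joblib"  # copied into the image at build time


def load_model():
    if os.path.exists(BAKED_IN_PATH):
        # Baked-in model: no startup download, but every retrain means
        # rebuilding and redeploying the image.
        return joblib.load(BAKED_IN_PATH)

    # Fallback: fetch from the artifact URI Vertex AI sets at deploy time.
    uri = os.environ["AIP_STORAGE_URI"]  # e.g. gs://my-bucket/model-dir
    bucket_name, _, prefix = uri.removeprefix("gs://").partition("/")
    blob = storage.Client().bucket(bucket_name).blob(f"{prefix}/model.joblib")
    blob.download_to_filename("/tmp/model.joblib")
    return joblib.load("/tmp/model.joblib")
```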
@user-it1sf6ml9u 1 year ago
Hi Sascha, thank you very much for the video, it was very useful for our team! Can I ask a question about the pricing you mentioned? Does the price from the video only cover the custom container functionality, or is it the total price for any AI solution built with a trained Vertex model? From my understanding, in order to use a Vertex model (for image recognition, for example) you have to deploy it to a Google endpoint, which will cost you $2 per hour, i.e. around $1,500 per month. Am I right?
@ml-engineer 1 year ago
Hi Михаил, the costs I mentioned in the video are only for the custom container. If you use other ML products on GCP you have different pricing. I assume you are referring to the AutoML capabilities of Vertex AI? Those have a dedicated pricing model: cloud.google.com/vertex-ai/pricing#automl_models. For example, AutoML Object Detection costs around $2.002 USD per node hour, which adds up to roughly your calculated $1,500, that's correct.
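For anyone checking the numbers, the monthly figure is just one node running around the clock (assuming a 30-day month at the quoted rate):

```python
node_hour_rate = 2.002       # USD per node hour, rate quoted above
hours_per_month = 24 * 30    # 720 hours in a 30-day month
print(node_hour_rate * hours_per_month)  # ~1441 USD, i.e. roughly $1,500
```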
@bristobalpy 1 year ago
Can you make a video implementing the third option, using custom prediction routines?
@ml-engineer 1 year ago
Hi Christóbal, yes, you're not the first one asking for a custom prediction routine video. I'll start preparing it in the next couple of days.
@chetanmunugala8457 5 months ago
@ml-engineer Were you ever able to figure this out? I'm frustrated that I can't take the exported model from Vertex AI and run it locally.
@ml-engineer 5 months ago
@chetanmunugala8457 What exported model? Are you referring to an AutoML model?
@RedoineZITOUNI 4 months ago
Hi Sascha, thanks for your video :) Just one question: does a custom model written by inheriting sklearn.base.BaseEstimator fit this use case?
@ml-engineer 3 months ago
Yes. Since it's a Docker container, it works with anything, no limitations.
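For illustration, any picklable estimator can be served this way, e.g. a toy BaseEstimator subclass like the following (entirely hypothetical):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class ThresholdClassifier(BaseEstimator, ClassifierMixin):
    """Toy rule: positive if the first feature exceeds a threshold."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def fit(self, X, y=None):
        # Nothing to learn in this toy example; a real estimator would
        # set fitted attributes here.
        return self

    def predict(self, X):
        return (np.asarray(X)[:, 0] > self.threshold).astype(int)
```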
@user-tv2yt4eb1x 11 months ago
The build fails for the service account with an error "...access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist)"
@ml-engineer 11 months ago
Did you check that the service account really has the right permission and that the bucket exists? Also check that you are referencing the correct project.
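A quick way to verify the first two points is to run a check like this with the service account's credentials (bucket and project names hypothetical); test_iam_permissions returns the subset of the listed permissions the caller actually holds:

```python
from google.cloud import storage

client = storage.Client(project="my-project")
bucket = client.bucket("my-model-bucket")

print(bucket.exists())  # False -> wrong bucket name or project
print(bucket.test_iam_permissions(["storage.objects.list", "storage.objects.get"]))
```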
@Simba-qm5qs 21 days ago
Is it necessary to go for Cloud Build? Can I just create a GitHub Action to achieve this?
@ml-engineer 20 days ago
Hi Simba, two things are necessary with these custom containers:
1. You need to build the container. This can be done locally, with Cloud Build, but also with GitHub Actions; you are fully flexible here.
2. The Docker image needs to be uploaded to Google Cloud Container Registry / Artifact Registry.
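Once the image is pushed, registering it as a Vertex AI model is the same no matter which tool built it; a sketch with the Python SDK, where project, repository, and route paths are hypothetical:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model.upload(
    display_name="custom-container-model",
    serving_container_image_uri=(
        "us-central1-docker.pkg.dev/my-project/my-repo/serving:latest"
    ),
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
)
```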
@Simba-qm5qs 20 days ago
@ml-engineer many thanks 🙏
@enricocompagno3513 7 months ago
Nice video! What does one need to change to run batch prediction with a custom container?
@ml-engineer 7 months ago
No need to change anything; you can use the same container. Google spins up distributed infrastructure to run the predictions in parallel.
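As a sketch, launching a batch prediction job against the same uploaded model via the Python SDK (bucket paths and model ID hypothetical):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("1234567890123456789")  # hypothetical model ID
batch_job = model.batch_predict(
    job_display_name="custom-container-batch",
    gcs_source="gs://my-bucket/batch-input.jsonl",
    gcs_destination_prefix="gs://my-bucket/batch-output/",
    machine_type="n1-standard-4",
)
```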
@evanklim-t 1 month ago
@ml-engineer Hi! I have created a custom prediction container with help from your video. It functions as expected for online predictions. However, for batch predictions, it seems to only perform health checks. My logs repeatedly show "GET /health HTTP/1.1" 200 and nothing else after that. Is there a code change I need to make in order to handle batch predictions, or could the problem be coming from somewhere else? Thank you!
@ml-engineer 1 month ago
What have you defined as the batch output, BigQuery or Cloud Storage? Check there; in case of errors you should get an error file. Logging with batch prediction jobs unfortunately does not work out of the box. I wrote an article about that topic: medium.com/google-cloud/add-cloud-logging-to-your-vertex-ai-batch-prediction-jobs-9bd7f6db1be2
@user-ti3xs1xx6l 9 months ago
Can you give an example of creating an endpoint with GPU?
@ml-engineer 9 months ago
I go into it in my article medium.com/google-cloud/serving-machine-learning-models-with-google-vertex-ai-5d9644ededa3, and here is also a code example that uses a serving container with a GPU: github.com/SaschaHeyer/image-similarity-search. It comes down to the base image, which needs to support GPUs. This could be, for example, the TensorFlow GPU image or PyTorch, whatever you prefer, e.g. gcr.io/deeplearning-platform-release/pytorch-gpu. Regarding the deployment, you only need to add an accelerator:

```
!gcloud ai endpoints deploy-model 7365738345634201600 \
  --project=sascha-playground-doit \
  --region=us-central1 \
  --model=8881690002430361600 \
  --traffic-split=0=100 \
  --machine-type="n1-standard-16" \
  --accelerator=type="nvidia-tesla-t4",count=1 \
  --display-name=image-similarity-embedding
```
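The equivalent deployment with the Python SDK, for anyone not using gcloud (a sketch reusing the IDs from the command above):

```python
from google.cloud import aiplatform

aiplatform.init(project="sascha-playground-doit", location="us-central1")

model = aiplatform.Model("8881690002430361600")
endpoint = model.deploy(
    machine_type="n1-standard-16",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
    deployed_model_display_name="image-similarity-embedding",
)
```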
@realsushi_official1116 1 year ago
Hello, great video. I had been overlooking the model's schemata step, so now I'm figuring out that my parsing of the request & response was incorrect. Are any technical details available on this? Is it a feature from AutoML? Another solution would be to export the schemata from openapi.json to YAML and provide it at model upload. I haven't tried that yet though.
@ml-engineer 1 year ago
Thank you. Are you referring to the format that the request and response need to follow?
@realsushi_official1116 1 year ago
@ml-engineer Yeah, so based on a standard application, the route functions needed to be modified along with the models. I wasn't rigorous enough to notice it from the start; in fact the changes are very simple.
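For anyone else debugging this, the envelope a Vertex AI endpoint sends to and expects back from a custom container is the documented instances/predictions format; the schemata mentioned above describe the items inside those lists (field values hypothetical):

```python
# Request body POSTed to the container's predict route.
request_body = {
    "instances": [{"age": 42, "income": 55000, "country": "DE"}],
    "parameters": {"probability": True},
}

# Response body the container must return.
response_body = {
    "predictions": [0.87],
}
```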