
Visual Studio Code Extension for Databricks 

Dustin Vannoy
Subscribers: 2.5K
Views: 13K

In this video I show how I installed, configured, and tested out the VS Code extension for Azure Databricks. This provides a way to develop PySpark code in your Visual Studio Code IDE and run the code on a Databricks cluster. It works well with Databricks Git Repos so you can keep your team in sync whether they work in VS Code or in Notebooks on the Databricks workspace.
IMPORTANT UPDATE to how I explained this in the video:
With the updated version of the extension (0.3.0+), the repo used for syncing from local is no longer an existing Databricks repo. This avoids overwriting work done in the workspace. Instead, the extension creates a Databricks repo that is used only for syncing the code you develop locally.
Please see the documentation for more details. Specifically, the warning in the documentation is, "After you set the repository and then begin synchronizing, any existing files in your remote workspace repo that have the same filenames in your local code project will have their contents forcibly overwritten. This is because the Databricks extension for Visual Studio Code treats the files in your local code project as the “single source of truth” for both your local code project and its connected remote repo within your Azure Databricks workspace."
learn.microsoft.com/en-us/azu...
* All thoughts and opinions are my own *
Databricks blog: www.databricks.com/blog/2023/...
Download from Marketplace: marketplace.visualstudio.com/...
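
The overwrite warning quoted above can be pictured with a small toy model. This is only a sketch of the "local is the single source of truth" rule, not the extension's actual sync code: the function name `sync_local_to_remote` and the file names are hypothetical, and leaving remote-only files untouched is an assumption that the quoted warning does not confirm.

```python
def sync_local_to_remote(local_files: dict[str, str], remote_files: dict[str, str]) -> dict[str, str]:
    """Return the remote repo contents after a one-way sync (toy model only)."""
    # Assumption: files that exist only in the remote repo are left alone.
    synced = dict(remote_files)
    # Local content wins for every filename present in both places.
    synced.update(local_files)
    return synced


# Hypothetical file names and contents, for illustration only.
local = {"etl.py": "print('local version')"}
remote = {"etl.py": "print('edited in workspace')", "notes.py": "# remote only"}

result = sync_local_to_remote(local, remote)
print(result["etl.py"])
```

In this model the workspace edit to `etl.py` is lost after the sync, which is the situation the dedicated sync repo in 0.3.0+ is meant to avoid.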

Science

Published: Feb 20, 2023

Comments: 18
@alex_no_handle 1 year ago
Awesome content, Dustin. It was really helpful for me as a junior data engineer. It's good to know the many ways to run or explore it. Downloading right now.
@jaikermontoya8891 1 year ago
Thank you! So helpful.
@RajanieshKaushikk 1 year ago
Very nice explanation and video.
@dhananjaypunekar5853 23 days ago
Thanks for the explanation! Is there any way to view exported DBC files in VS Code?
@usmanrahat2913 8 days ago
How do you enable IntelliSense?
@9966989500 1 year ago
This is the best one. I am fed up with running notebooks from the browser. I need to explore this more.
@ibrahimkhaleelullahshaik3506 8 months ago
I'm unable to install this Databricks extension. It says "corruptziperror: central command signature not found". Any help on this?
@jasonweng2659 1 year ago
Great video! Is there any way to run a specific cell of a notebook so that we don't have to rerun everything over and over? Thanks!
@DustinVannoy 1 year ago
The extension doesn't support that in the versions I have tested.
@jyothsnareddy9689 1 year ago
Hi, thank you so much. This helped me a lot! But can this code be deployed and run on a cluster?
@DustinVannoy 1 year ago
When you kick off a run from the VS Code extension, it runs against the interactive cluster you set, and you can run it as a workflow so it's easy to see from the workspace. DBX is still an option for configuring jobs (with job compute) and deploying from local.
@CodeCraft-ve8bo 7 months ago
Can we use it for AWS Databricks as well?
@DustinVannoy 7 months ago
Yes, it works with AWS.
@brandonvolesky9867 1 year ago
How does one debug when running from Databricks using dbx?
@DustinVannoy 1 year ago
Databricks Connect v2 just became public preview, and that is my preference for debugging code running on Databricks from a local IDE. I have some other items to focus on but may come back with a better explanation of the roles the VS Code extension, DBX, and Databricks Connect play in development. They complement each other well.
@UltimaWeaponz 1 year ago
@DustinVannoy Hi, my team is currently considering these options. Specifically, we are looking for the ability to debug on the Databricks cluster while inside VS Code. The extension is a limitation because of its dependency on Unity Catalog. Did you end up releasing this video? It would be interesting to see your take on the three tools.
@gardnmi 1 year ago
If you're building packages then dbx is the obvious choice, and without notebook support I'm not sure who this extension is intended for. Also, it's a bit surprising how featureless it is compared to the other Databricks extension offered by paiqo.
@DustinVannoy 1 year ago
Thanks for sharing your thoughts and calling out other options people can try. I see value in using this extension to run PySpark code that is not in notebooks in interactive mode. Packaging up the code and installing it each time was a pain, which is where databricks-connect was helpful. Once all the tests check out, committing and deploying would be a final step, ideally. And notebooks can be run; they are just run as workflows. I am not sure if there is a good way to render the notebooks the way VS Code renders .ipynb files, since I usually do notebook development in the browser and Python/PySpark development in an IDE.