Yesterday I needed to run a local test on my machine and ended up trying AirflowCTL. The installation was quick and painless, and I even had time to build a medallion architecture using Spark.
Thank u so much for this dude, thought I'd have to do all the Docker config stuff just for my simple script that extracts from an API weekly and loads the parquet files into Redshift
This is the quickest Airflow installation I have EVER seen. It makes me a bit giddy, can't wait to try it out. You mentioned it's not great for production scale, how's that?
Hahahah, def try it out! Mainly because it's pretty lightweight and only supports the local executor, so trying to run a lot of DAGs this way will quickly fall apart.
On Windows 10 I'm getting these errors:
ERROR: Could not find a version that satisfies the requirement airlowctl (from versions: none)
ERROR: No matching distribution found for airlowctl
Any solution?
If the error you posted was copied directly from your terminal, it could be because you misspelled "airflowctl": the error shows "airlowctl", and that missing "f" would cause exactly this failure.
Docker gives you a more production-like environment and is more stable when deployed for long periods of time. If you're planning to host your project, I'd definitely recommend using a Docker image!
Can you help? Whenever I try to initialize a new project I get this error:
ScannerError: while scanning a double-quoted scalar
  in "Documents\airflow\test1\settings.yaml", line 10, column 16
expected escape sequence of 8 hexadecimal numbers, but found 's'
  in "Documents\airflow\test1\settings.yaml", line 10, column 21
I have looked at the YAML file and it seems fine. I even activated the venv and it worked okay. This is really bugging me.
Would you mind showing me the file? Try with a fresh YAML file; sometimes characters can get encoded wrong, and the file can look fine while still causing issues.
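One other possibility worth checking (a guess from the error text, not something confirmed here): in a double-quoted YAML scalar, a backslash starts an escape sequence, so a Windows path like "C:\Users\..." makes the parser read \U as a unicode escape that expects 8 hex digits, which matches the "found 's'" in your message, and the columns even line up with a value starting "C:\Users. A minimal repro with PyYAML, using a made-up path:

```python
import yaml

# In a double-quoted YAML scalar, backslash starts an escape sequence:
# "\U" must be followed by 8 hex digits, so "C:\Users\..." fails with
# "expected escape sequence of 8 hexadecimal numbers, but found 's'".
try:
    yaml.safe_load('path: "C:\\Users\\me\\airflow"')
except yaml.YAMLError as err:
    print(err)

# Fix 1: single quotes, where backslashes are literal characters.
print(yaml.safe_load("path: 'C:\\Users\\me\\airflow'"))

# Fix 2: forward slashes, which Windows paths also accept.
print(yaml.safe_load('path: "C:/Users/me/airflow"'))
```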
Oh yeah definitely! Since I don't think S3 supports unzipping a file within a bucket, you'd want to create a Python task in Airflow that uses the boto3 library to ingest the file, unzip it in a buffer (or on some attached/on-disk storage if you're using a custom XCom backend that's big enough), and then upload the unzipped files into the second bucket. Check out the example below for the Python task part of it! If you're still stuck, just let me know and I can help figure out what code you need! medium.com/@johnpaulhayes/how-extract-a-huge-zip-file-in-an-amazon-s3-bucket-by-using-aws-lambda-and-python-e32c6cf58f06
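Something like this is the shape of it (a minimal sketch; the bucket names and key are placeholders, and it follows the in-memory buffer approach from the linked article rather than any built-in operator):

```python
import io
import zipfile

import boto3
from airflow.decorators import task


@task
def unzip_between_buckets(
    src_bucket: str = "my-source-bucket",   # placeholder, use your bucket
    src_key: str = "incoming/archive.zip",  # placeholder, use your key
    dest_bucket: str = "my-dest-bucket",    # placeholder, use your bucket
) -> None:
    s3 = boto3.client("s3")

    # Pull the whole zip into an in-memory buffer; fine for small
    # archives, switch to a temp file on disk for big ones.
    buffer = io.BytesIO()
    s3.download_fileobj(src_bucket, src_key, buffer)

    # Walk the archive and upload each member to the second bucket.
    with zipfile.ZipFile(buffer) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            s3.put_object(
                Bucket=dest_bucket,
                Key=name,
                Body=archive.read(name),
            )
```

Note that archive.read(name) loads each member fully into memory, which is why the buffer approach works best on archives in the tens-of-MB range.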
@thedataguygeorge wow thank you so much! I was able to do it with buffer memory. 🫡😃 However, is there any limitation with the buffer? I will mainly be processing files around 20-30 MB. That's okay, right?