Yesterday I needed to run a local test on my machine and ended up trying AirflowCTL. The installation was quick and painless, and I even had time to build a medallion architecture using Spark.
Thank u so much for this dude, thought I'd have to do all the Docker config stuff just for my simple script that extracts from an API weekly and loads the parquet files into Redshift
This is the quickest Airflow installation I have EVER seen. It makes me a bit giddy, can't wait to try it out. You mentioned it's not great for production scale, how's that?
Hahahah, def try it out! Mainly because it's pretty lightweight and only supports the local executor, so trying to run a lot of DAGs this way will quickly fall apart.
On Windows 10 I'm getting these errors:
ERROR: Could not find a version that satisfies the requirement airlowctl (from versions: none)
ERROR: No matching distribution found for airlowctl
Any solution?
If the error you posted was copied directly from your terminal, it could be because you misspelled "airflowctl": the error shows "airlowctl", and that missing "f" would cause exactly this failure.
Docker gives you a more production-like environment and is more stable when deployed for long periods of time. If you're planning to host your project, I'd definitely recommend using a Docker image!
Can you help? Whenever I try to initialize a new project I get this error:
ScannerError: while scanning a double-quoted scalar
  in "Documents\airflow\test1\settings.yaml", line 10, column 16
expected escape sequence of 8 hexadecimal numbers, but found 's'
  in "Documents\airflow\test1\settings.yaml", line 10, column 21
I have looked at the YAML file and it seems fine. I even activated the venv and it worked okay. This is really bugging me.
Would you mind showing me the file? Try with a fresh YAML file; sometimes characters can get encoded wrong, and the file can look fine while still causing issues.
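One other possibility worth checking (a guess from the error text, not something confirmed here): in a double-quoted YAML scalar, a backslash starts an escape sequence, so a Windows path like "C:\Users\..." makes the parser read \U as a unicode escape that expects 8 hex digits, which matches the "found 's'" in your message, and the columns even line up with a value starting "C:\Users. A minimal repro with PyYAML, using a made-up path:

```python
import yaml

# In a double-quoted YAML scalar, backslash starts an escape sequence:
# "\U" must be followed by 8 hex digits, so "C:\Users\..." fails with
# "expected escape sequence of 8 hexadecimal numbers, but found 's'".
try:
    yaml.safe_load('path: "C:\\Users\\me\\airflow"')
except yaml.YAMLError as err:
    print(err)

# Fix 1: single quotes, where backslashes are literal characters.
print(yaml.safe_load("path: 'C:\\Users\\me\\airflow'"))

# Fix 2: forward slashes, which Windows paths also accept.
print(yaml.safe_load('path: "C:/Users/me/airflow"'))
```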
Oh yeah definitely! Since I don't think S3 supports unzipping a file within a bucket, you'd want to create a Python task in Airflow that uses the boto3 library to ingest the file, unzip it in a buffer (or on some attached/on-disk storage if you're using a custom XCom backend that's big enough), and then upload the unzipped files into the second bucket. Check out the example below for the Python task part of it! If you're still stuck, just let me know and I can help figure out what code you need! medium.com/@johnpaulhayes/how-extract-a-huge-zip-file-in-an-amazon-s3-bucket-by-using-aws-lambda-and-python-e32c6cf58f06
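Something like this is the shape of it (a minimal sketch; the bucket names and key are placeholders, and it follows the in-memory buffer approach from the linked article rather than any built-in operator):

```python
import io
import zipfile

import boto3
from airflow.decorators import task


@task
def unzip_between_buckets(
    src_bucket: str = "my-source-bucket",   # placeholder, use your bucket
    src_key: str = "incoming/archive.zip",  # placeholder, use your key
    dest_bucket: str = "my-dest-bucket",    # placeholder, use your bucket
) -> None:
    s3 = boto3.client("s3")

    # Pull the whole zip into an in-memory buffer; fine for small
    # archives, switch to a temp file on disk for big ones.
    buffer = io.BytesIO()
    s3.download_fileobj(src_bucket, src_key, buffer)

    # Walk the archive and upload each member to the second bucket.
    with zipfile.ZipFile(buffer) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            s3.put_object(
                Bucket=dest_bucket,
                Key=name,
                Body=archive.read(name),
            )
```

Note that archive.read(name) loads each member fully into memory, which is why the buffer approach works best on archives in the tens-of-MB range.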
@thedataguygeorge wow thank you so much! I was able to do it with buffer memory. 🫡😃 However, is there any limitation with the buffer? I will mainly be processing files around 20-30 MB. That's okay, right?