The lab is "Copy BigQuery tables across regions", and I worked through it because I wanted to study Cloud Composer.
Task 1. Create a Cloud Composer environment
Steps performed:
Click the Show Advanced Configuration dropdown and set the Airflow database zone to us-east4-b.
Task 2. Create Cloud Storage buckets
A Cloud Storage bucket is created automatically when the Composer environment is created; the two buckets below are created separately (a Python sketch for creating them programmatically follows the list).
Create a bucket in US
Create a bucket in EU
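The lab creates these buckets in the Cloud console, but the same thing can be done from code. Below is a minimal sketch using the google-cloud-storage client; the bucket names are assumptions built from a project-unique ID, as in the variables table later in this post.

```python
from google.cloud import storage

# Hypothetical bucket names; the lab derives them from a project-unique ID.
UNIQUE_ID = "my-unique-id"

client = storage.Client()

# One bucket in the US multi-region and one in the EU multi-region,
# matching the two buckets listed above.
client.create_bucket(f"{UNIQUE_ID}-us", location="US")
client.create_bucket(f"{UNIQUE_ID}-eu", location="EU")
```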
Task 3. Create the BigQuery destination dataset
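In the lab the destination dataset is created in the BigQuery console. As a hedged alternative, the sketch below does the same with the google-cloud-bigquery client; the dataset ID and location are assumptions for illustration only.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical dataset ID and location; substitute the values the lab gives you.
dataset = bigquery.Dataset(f"{client.project}.destination_dataset")
dataset.location = "US"

client.create_dataset(dataset, exists_ok=True)
```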
Task 4. Airflow and core concepts, a brief introduction
Airflow is a platform to programmatically author, schedule and monitor workflows.
Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies.
Core concepts
DAG - A Directed Acyclic Graph is a collection of tasks, organized to reflect their relationships and dependencies.
Operator - The description of a single task; it is usually atomic. For example, the BashOperator is used to execute a bash command (see the minimal DAG sketch after this list).
Task - A parameterised instance of an Operator; a node in the DAG.
Task Instance - A specific run of a task; characterized as: a DAG, a Task, and a point in time. It has an indicative state: running, success, failed, skipped, ...
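To make these terms concrete, here is a minimal sketch of a DAG with two BashOperator tasks and a single dependency. The DAG id, schedule, and commands are made up for illustration.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# The DAG groups the tasks and their dependencies.
with DAG(
    dag_id="example_concepts",
    start_date=datetime(2024, 11, 1),
    schedule_interval=None,  # run only when triggered manually
    catchup=False,
) as dag:
    # Each operator instance below is a task (a node in the DAG).
    say_hello = BashOperator(task_id="say_hello", bash_command="echo hello")
    say_done = BashOperator(task_id="say_done", bash_command="echo done")

    # The scheduler runs say_done only after say_hello succeeds.
    say_hello >> say_done
```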
Task 5. Define the workflow
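The workflow the lab deploys is bq_copy_across_locations.py, which relies on a third-party hook and operator uploaded as plugins in Task 9, so the sketch below is not the lab's code. It is only a rough, hedged outline of the same three-step idea (export each table to Cloud Storage, copy the files across regions, load them into the destination dataset) using the standard Google provider operators; all table, bucket, and dataset names are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.bigquery_to_gcs import BigQueryToGCSOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.providers.google.cloud.transfers.gcs_to_gcs import GCSToGCSOperator

with DAG(
    dag_id="bq_copy_across_locations_sketch",
    start_date=datetime(2024, 11, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # 1. Export the source table to the bucket in the source region.
    export = BigQueryToGCSOperator(
        task_id="export_to_gcs",
        source_project_dataset_table="source_dataset.example_table",
        destination_cloud_storage_uris=["gs://source-bucket/example_table-*.avro"],
        export_format="AVRO",
    )

    # 2. Copy the exported files to the bucket in the destination region.
    copy = GCSToGCSOperator(
        task_id="copy_across_regions",
        source_bucket="source-bucket",
        source_object="example_table-*.avro",
        destination_bucket="dest-bucket",
    )

    # 3. Load the copied files into the destination dataset.
    load = GCSToBigQueryOperator(
        task_id="import_to_bq",
        bucket="dest-bucket",
        source_objects=["example_table-*.avro"],
        destination_project_dataset_table="destination_dataset.example_table",
        source_format="AVRO",
        write_disposition="WRITE_TRUNCATE",
    )

    export >> copy >> load
```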
Task 6. View environment information
Check the Composer environment's status.
Since the environment was only just created, only a short span of the monitoring chart is filled in.
Connect to Cloud Shell.
Creating a virtual environment
1. Install virtualenv, then create and activate a virtual environment:
sudo apt-get install -y virtualenv
python3 -m venv venv
source venv/bin/activate
Task 7. Create a variable for the DAGs Cloud Storage bucket
DAGS_BUCKET=<your DAGs bucket name>
Task 9. Upload the DAG and dependencies to Cloud Storage
1. Copy the sample files to your Cloud Shell home directory:
cd ~
gcloud storage cp -r gs://spls/gsp283/python-docs-samples .
2. Upload a copy of the third-party hook and operator to the plugins folder of your Composer DAGs Cloud Storage bucket:
gcloud storage cp -r python-docs-samples/third_party/apache-airflow/plugins/* gs://$DAGS_BUCKET/plugins
3. Upload the DAG and its table-list CSV to the dags folder of the bucket:
gcloud storage cp python-docs-samples/composer/workflows/bq_copy_across_locations.py gs://$DAGS_BUCKET/dags
gcloud storage cp python-docs-samples/composer/workflows/bq_copy_eu_to_us_sample.csv gs://$DAGS_BUCKET/dags
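For reference, the same uploads can also be done from Python rather than gcloud. A minimal sketch with the google-cloud-storage client; the bucket name is a placeholder for your DAGs bucket from Task 7.

```python
from google.cloud import storage

# Placeholder; use your Composer environment's DAGs bucket name.
DAGS_BUCKET = "your-dags-bucket-name"

client = storage.Client()
bucket = client.bucket(DAGS_BUCKET)

# Upload the DAG and its table-list CSV into the dags/ folder.
for local_path, blob_name in [
    ("python-docs-samples/composer/workflows/bq_copy_across_locations.py",
     "dags/bq_copy_across_locations.py"),
    ("python-docs-samples/composer/workflows/bq_copy_eu_to_us_sample.csv",
     "dags/bq_copy_eu_to_us_sample.csv"),
]:
    bucket.blob(blob_name).upload_from_filename(local_path)
```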
Task 10. Explore the Airflow UI
The DAG expects these Airflow variables:

Key | Value | Details
---|---|---
table_list_file_path | /home/airflow/gcs/dags/bq_copy_eu_to_us_sample.csv | CSV file listing the source and target tables, including the dataset
gcs_source_bucket | {UNIQUE ID}-us | Cloud Storage bucket used for exporting BigQuery tables from the source
gcs_dest_bucket | {UNIQUE ID}-eu | Cloud Storage bucket used for importing BigQuery tables at the destination
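These variables can be created in the Airflow web UI under Admin > Variables. Inside a DAG they are read with the Variable model; a minimal hedged sketch, with the variable names taken from the table above and the default values made up:

```python
from airflow.models import Variable

# Variable names come from the table above; defaults are placeholders so the
# snippet also runs outside the Composer environment.
table_list_file_path = Variable.get(
    "table_list_file_path",
    default_var="/home/airflow/gcs/dags/bq_copy_eu_to_us_sample.csv",
)
source_bucket = Variable.get("gcs_source_bucket", default_var="my-unique-id-us")
dest_bucket = Variable.get("gcs_dest_bucket", default_var="my-unique-id-eu")

print(table_list_file_path, source_bucket, dest_bucket)
```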
+ This is the Airflow console screen.
I need to study Composer and Airflow more; I still know far too little.
Reference: https://www.cloudskillsboost.google/focuses/3528?parent=catalog