
Question 14: Interview question for data engineers  

pysparkpulse

In this video I discuss a question that was asked in an MNC interview for a data engineer role, meant to check whether the candidate has worked with JSON files.
You are tasked with processing a JSON file containing information about sales transactions. Each transaction record consists of the transaction ID, the customer ID, the product ID, the quantity sold, and the timestamp of the transaction. Your goal is to analyze this data using PySpark and perform the following tasks:
Calculate the total sales revenue generated from each product.
Identify the top-selling product.
Determine the total number of transactions for each customer.
Find the customer who made the most transactions.
Sample JSON:
[
{"transaction_id": 1, "customer_id": 101, "product_id": 1, "quantity": 2, "timestamp": "2024-01-01 08:00:00"},
{"transaction_id": 2, "customer_id": 102, "product_id": 2, "quantity": 1, "timestamp": "2024-01-01 08:30:00"},
{"transaction_id": 3, "customer_id": 103, "product_id": 1, "quantity": 3, "timestamp": "2024-01-01 09:00:00"},
{"transaction_id": 4, "customer_id": 101, "product_id": 3, "quantity": 1, "timestamp": "2024-01-01 10:00:00"},
{"transaction_id": 5, "customer_id": 102, "product_id": 1, "quantity": 2, "timestamp": "2024-01-01 10:30:00"},
{"transaction_id": 6, "customer_id": 103, "product_id": 2, "quantity": 2, "timestamp": "2024-01-01 11:00:00"}
]
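To create the DataFrame (the multiline option is needed because the file is a single JSON array spread over several lines, not one JSON object per line):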
sales_df = spark.read.option("multiline",True).json("dbfs:/FileStore/transaction.json")
#pyspark #mnc #dataengineer #azure #databricks #interview #questions #bigdata #bigdataquestions #json

Published: 13 Sep 2024

Comments: 2
@rawat7203, 6 months ago
Thank you, sir
@pysparkpulse, 6 months ago
Thank you for your appreciation 😊