#spark #interviewquestions #dataengineers #pyspark #sparksql
Question: Count the number of movies in each genre.
df = spark.createDataFrame([
    ('The Shawshank Redemption', ['Drama', 'Crime']),
    ('The Godfather', ['Drama', 'Crime']),
    ('Pulp Fiction', ['Drama', 'Crime', 'Thriller']),
    ('The Dark Knight', ['Drama', 'Crime', 'Thriller', 'Action']),
], ["name", "genres"])
Check out:
/ poojatripathi0697
9 Jun 2024