Data Engineer PySpark Data Bricks Session Day 5 

Code with Kristi

Start a Spark Session: Set up the PySpark environment.
Create a List: Define the list with three elements.
Parallelize the List: Distribute the list across the cluster nodes.
Convert to DataFrame: Convert the distributed RDD to a DataFrame.
Perform Processing: Show the contents and perform any desired operations.
This video explains how to write a first program in PySpark.
Video Link: lnkd.in/gmE_dAcG
LinkedIn Profile of author:
/ sachin-saxena-graphic-...
Code Source Link:
lnkd.in/g67a4kY3
Explanation of the Code:
1. Spark Session: The SparkSession is created to provide an entry point for Spark functionality.
2. List Creation: A list of three elements is defined.
3. Parallelize: The list is parallelized with numSlices=3, which splits the three-element list into three partitions of one element each. Partitions are the unit Spark schedules across executors, so on a three-node cluster each partition can be processed on a different node, although that placement is not guaranteed.
4. Convert to DataFrame: Each element of the RDD is mapped to a one-element tuple so that the RDD can be converted into a DataFrame with a single column named "element".
5. Display DataFrame: The contents of the DataFrame are printed using df.show(), which displays each element as a separate row.
6. Count: The total number of elements is counted and printed.
7. Further Processing: An optional step filters the DataFrame for elements containing "1" and displays the result.
8. Stop Spark Session: Finally, the Spark session is stopped to release resources.
3:54 Databricks source
6:00 Show the number of students in the file
16:00 map and flatMap in PySpark
29:00 GroupBy in PySpark
30:00 Show the total marks achieved by Female and Male students
32:00 Show the total number of students that have passed and failed
33:10 Filter the data: 50+ marks are required to pass the course
40:00 Show the total number of students enrolled per course
51:00 Show the total marks that students have achieved per course
52:00 Show the average marks that students have achieved per course
55:00 Show the minimum and maximum marks achieved per course
57:00 Show the average age of male and female students

Published: 24 Oct 2024
