
Tutorial 1: PySpark With Python - PySpark Introduction and Installation

Views: 312K

Apache Spark is written in the Scala programming language. To support Python, the Apache Spark community released PySpark, a tool that lets you work with RDDs from Python as well. This is made possible by a library called Py4J.
⭐ Kite is a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite for a few months and I love it! www.kite.com/get-kite/?
Subscribe to my vlogging channel
ru-vid.com/show-UCjWY5hREA6FFYrthD0rZNIw
Please donate if you want to support the channel through GPay UPID,
Gpay: krishnaik06@okicici
Telegram link: t.me/joinchat/N77M7xRvYUd403DgfE4TWw
Please join my channel as a member to get additional benefits such as Data Science materials, live streams for members, and much more
ru-vid.com/show-UCNU_lfiiWBdtULKOw6X0Digjoin
Connect with me here:
Twitter: Krishnaik06
Facebook: krishnaik06
instagram: krishnaik06

Published: 4 May 2021

Comments: 359
@rlmclaughlinmusic 3 years ago
Everything about this series is perfect. The pace, the information, and the clarity of the descriptions are as good as it gets. I've watched about 4-5 PySpark tutorials from various instructors, and they don't even come close to the greatness of these videos. Thank you for providing such top-notch content and using a no-nonsense approach. I thoroughly enjoyed these and learned a lot.
@lananajera1081 1 year ago
I am 9 minutes into the first video and let me tell you, it is already better than the last 10 I have tried. It's great for real beginners like me and challenging enough too. Thank you for posting these!!
@arjunsai08 2 months ago
Krish, I am a big fan of yours. You are an amazing teacher and have taught me numerous concepts in Data Science. Thanks a lot for the social service you do!!
@AInamedMia 3 years ago
We can like these videos even before we see them because we know they are bound to be extremely useful.
@mbmathematicsacademic7038 1 month ago
Amazing 😂 One thing about your channel is that I get confused whenever I get here: I wanted to learn Feature Engineering for the day, and here I am enjoying PySpark.
@islamicinterestofficial 2 years ago
Please make a video on how to install PySpark. We installed it, but it does not import in Jupyter Notebook. In the terminal, it imports fine.
@amanmehrotra44 3 years ago
Sir, we have only one heart; how many times will you win it! Once again, hats off to your efforts in uplifting the entire data science community across the globe.
@aryanraj768 6 months ago
The kind of stuff that he taught is already there in the docs, which are readable by anyone in the world.
@sreekanthn1023 2 years ago
Hi Sir, when I try to import SparkSession and SparkContext, it throws an error. The error is: module java.base does not export sun.nio.ch to unnamed module. Could you please resolve this? Thank you.
@ryandraanditto3665 2 years ago
Same with me. Can anybody help us?
@rajeshkumarmandal8422 3 years ago
Thanks for this, but I am getting an error while running Spark: "Exception: Java gateway process exited before sending its port number". Can you tell me how to resolve it?
@sahilshetty8640 3 years ago
Hi, I faced the same issue and found the solution. All you have to do is download JDK version 8, add it to your PATH, and make sure you uninstall any other versions of Java from your system. Let me know if you need any further help. Good luck!
@anuvratshukla7061 3 years ago
@sahilshetty8640 How do I set the path after downloading the JDK?
@pankajdhut46 1 year ago
@sahilshetty8640 I did set the path, but it still shows the same error.
@chinmayagokhale6341 8 months ago
How do I resolve this error?
@chinmayagokhale6341 8 months ago
@sahilshetty8640 How do I resolve this error?
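Since several replies in this thread ask how to tell whether the path is set correctly, here is a minimal diagnostic sketch that uses only the Python standard library. The helper name `find_java` is made up for illustration. It assumes that Spark's launcher looks at `JAVA_HOME` first and then falls back to `java` on `PATH`. It only reports what it finds and does not guarantee that the "Java gateway" error goes away.

```python
import os
import shutil


def find_java(env=None):
    """Return the path of the `java` executable that would likely be used, or None.

    `find_java` is an illustrative helper, not part of PySpark.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        # Assumption: the launcher prefers JAVA_HOME/bin/java when JAVA_HOME is set.
        exe = "java.exe" if os.name == "nt" else "java"
        candidate = os.path.join(java_home, "bin", exe)
        if os.path.isfile(candidate):
            return candidate
    # Otherwise fall back to the first `java` found on PATH.
    return shutil.which("java", path=env.get("PATH"))


java = find_java()
print(java if java else "No Java found: install a JDK and set JAVA_HOME or PATH")
```

Run it in the same notebook kernel that raises the error. A terminal can see a different `PATH` than the kernel, which is one possible reason why setting the path appears to have no effect.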
@ujjwalgoel6359 8 months ago
After wasting 2 hours on YouTube, I at last found someone explaining from scratch exactly what I was looking for.
@Nishanthts 3 years ago
Thanks for this. Kindly provide the complete playlist.
@RedShipsofSpainAgain 3 years ago
Timestamps:
6:45 Create a new environment and install Spark via pip install
7:13 Importing pyspark
9:34 Import SparkSession
9:47 Create SparkSession
...
@parammani4717 2 years ago
Hi, I am watching this video for the first time. Where is he creating the new environment? Is this a cloud platform?
@salmansiddiqui8893 3 years ago
I am getting the error below after running spark=SparkSession.builder.appName('Practise').getOrCreate():
Py4JError: org.apache.spark.api.python.PythonUtils.isEncryptionEnabled does not exist in the JVM
@arjunsubramaniyan1675 3 years ago
Much-awaited playlist!!
@SynonAnon-vi1ql 7 months ago
Hi Krish! Great tutorial! Thanks for this! One (probably stupid) question, as I'm a novice here: how did you enable the auto-suggest functionality in your Jupyter notebook? Mine doesn't work. Could you please help? Thank you!
@prashanthpaul2713 3 years ago
So glad that you started this new series, Krish! Looking forward to new videos in this series. Any idea when you will be uploading? :)
@ankushv2642 9 months ago
Can you tell me how he got that Jupyter screen where he is installing PySpark?
@rhevathivijay2913 3 years ago
Really, when I was searching in your encyclopedia playlist, I missed this. Thank you for uploading, sir.
@sachinkapoor2424 3 years ago
Sir, we have only one heart; how many times will you win it 🙏
@singhjagbir1210 1 year ago
I am stuck while creating the Spark session, getting this error: PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number. Please help.
@optimistic_guy313 2 years ago
I am having some problems with thinking. Can you share how you tackle thinking and think fast?
@kapilbisht1119 2 years ago
Hi Krish, after installing Spark, when I create a Spark session I get the error below.
RuntimeError Traceback (most recent call last)
Cell In [11], line 1
----> 1 spark = SparkSession.builder.appName('Practice').getOrCreate()
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\sql\session.py:269, in SparkSession.Builder.getOrCreate(self)
    267 sparkConf.set(key, value)
    268 # This SparkContext may be an existing one.
--> 269 sc = SparkContext.getOrCreate(sparkConf)
    270 # Do not update `SparkConf` for existing `SparkContext`, as it's shared
    271 # by all sessions.
    272 session = SparkSession(sc, options=self._options)
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\context.py:483, in SparkContext.getOrCreate(cls, conf)
    481 with SparkContext._lock:
    482 if SparkContext._active_spark_context is None:
--> 483 SparkContext(conf=conf or SparkConf())
    484 assert SparkContext._active_spark_context is not None
    485 return SparkContext._active_spark_context
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\context.py:195, in SparkContext.__init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls, udf_profiler_cls)
    189 if gateway is not None and gateway.gateway_parameters.auth_token is None:
    190 raise ValueError(
    191 "You are trying to pass an insecure Py4j gateway to Spark. This"
    192 " is not allowed as it is a security risk."
    193 )
--> 195 SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    196 try:
    197 self._do_init(
    198 master,
    199 appName,
    (...)
    208 udf_profiler_cls,
    209 )
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\context.py:417, in SparkContext._ensure_initialized(cls, instance, gateway, conf)
    415 with SparkContext._lock:
    416 if not SparkContext._gateway:
--> 417 SparkContext._gateway = gateway or launch_gateway(conf)
    418 SparkContext._jvm = SparkContext._gateway.jvm
    420 if instance:
File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\pyspark\java_gateway.py:106, in launch_gateway(conf, popen_kwargs)
    103 time.sleep(0.1)
    105 if not os.path.isfile(conn_info_file):
--> 106 raise RuntimeError("Java gateway process exited before sending its port number")
    108 with open(conn_info_file, "rb") as info:
    109 gateway_port = read_int(info)
RuntimeError: Java gateway process exited before sending its port number
@amanahmed6057 1 year ago
Bro, don't use this; use Google Colab.
@PritiModi-o8o 1 year ago
Hello sir, I am not able to create a PySpark session. While creating the session I get the following error: Py4JError: org.apache.spark.api.python.PythonUtils.getPythonAuthSocketTimeout does not exist in the JVM. Can you give me a solution to this problem?
@AbhijitPaulYT 2 months ago
It's 2024, sir, and your video content is still unmatched. My bad luck is that the moment I joined your iNeuron course, you parted ways with it, but my only reason for joining the course was to learn from you! SAD :(
@deveshsharma8407 4 months ago
Sir, the last two lines of code are not working on my system. It shows: AttributeError: 'NoneType' object has no attribute 'printSchema'. Everything else is all right; I even restarted the kernel.
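One common way to end up with a `NoneType` here (an assumption, since the commenter's code is not shown) is chaining `.show()` onto the line that assigns the DataFrame. PySpark's `DataFrame.show()` prints the rows and returns `None`, so the variable ends up holding `None`. The sketch below reproduces the pattern with a hypothetical stand-in class named `FakeFrame`, so it runs without Spark installed.

```python
class FakeFrame:
    """Hypothetical stand-in for a DataFrame; not part of PySpark."""

    def show(self):
        print("+---+")  # prints a table as a side effect...
        return None     # ...and returns nothing, like DataFrame.show()

    def printSchema(self):
        print("root")


df = FakeFrame().show()  # df is None because show() was chained onto the assignment
print(type(df))          # <class 'NoneType'>

df = FakeFrame()         # keep the frame itself in the variable
df.show()                # call show() on a separate line
df.printSchema()         # works, because df is still the frame
```

If the real notebook has a line such as `df_pyspark = spark.read.csv(...).show()`, splitting it into an assignment followed by a separate `df_pyspark.show()` call is the equivalent change.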
@sakthikumaranvg2668 1 year ago
I am getting an error while creating the Spark session: PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number
@shashikantchikhle9128 1 year ago
Please advise: RuntimeError: Java gateway process exited before sending its port number
@shashanktiwari133 7 months ago
Can you share the resolution for this error? I am facing the same issue.
@sanketsingh6881 5 months ago
@shashanktiwari133 Any luck on this issue?
@awaizmansoor3127 2 months ago
You should have the latest Java JDK and Python installed on your PC first.
@suhass6628 3 years ago
Most awaited!!!!!!! It was music to my ears when he said MLlib 0:40
@alihaiderabdi9939 3 years ago
Sir, I had been waiting for a new playlist for a long time, and here it is!!!!
@sushmagoel7854 3 years ago
The command "!pip install pyspark" ran successfully, but I got the following error after the command "import pyspark": "ModuleNotFoundError: No module named 'pyspark'". I had created a new environment in Anaconda and installed pyspark in it. The error was resolved by running the "pip install pyspark" command.
@manuelmeekattukulam 2 years ago
This worked for me. Thanks!
@ektaaggarwal3471 2 years ago
Thanks Sushma! I had been encountering the same error for the last 2 days and was about to give up learning PySpark. Your comment has saved my learning :)
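A likely explanation for the behaviour described above (an assumption, not something the thread confirms) is that `!pip` in a notebook can resolve to a different Python environment than the one the kernel runs in. The sketch below builds a pip command tied to the running interpreter. The helper name `pip_install_command` is made up for illustration.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


print(sys.executable)                  # which Python the notebook kernel uses
print(pip_install_command("pyspark"))  # pass this list to subprocess.run(...)
```

In recent Jupyter versions, the `%pip install pyspark` magic serves the same purpose: it installs into the kernel's own environment.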
@dheerendrasinghbhadauria9798 2 years ago
I am getting an error: "Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.lang.NoClassDefFoundError: Could not initialize class org.apache.spark.storage.StorageUtils$"
@amberkataria9408 1 year ago
The Spark session command spark = SparkSession.builder.appName('Practiceee').getOrCreate() is taking forever. I am not able to run any further code because it keeps running. What is the solution for this?
@MBayat-l4e 6 months ago
Hi Krish, thanks for your videos. I don't know why I get NoneType after correcting the header for PySpark, and it does not show me the schema.
@namanvyas9433 3 years ago
Thanks man, I just wanted to start with PySpark.
@MukeshThakur-qp5ft 1 year ago
When I try to create the Spark session I get this error: "RuntimeError: Java gateway process exited before sending its port number". Please help me resolve this.
@ViratKohli-gh6ic 2 years ago
The intro soundtrack is awesome, bro... and so is the content.
@vaibhavtiwari1084 2 years ago
I didn't realise when those 16 minutes ended... interactive and smooth!!
@pyclassy 3 years ago
Hi Krish, I am getting a Py4J error. Can you upload the requirements.txt file along with the Python version so that I can get started?
@shivamkashyap559 2 years ago
Hi, I am getting an error while running the commands: raise RuntimeError("Java gateway process exited before sending its port number") RuntimeError: Java gateway process exited before sending its port number. No answers found. Can you please help?
@dileepk1740 1 year ago
Hi Krish, I have created a new environment for PySpark. "!pip install pyspark" and "import pyspark" are successful, but "import pandas as pd" gives the error: No module named 'pandas'. What do I need to do?
@jatinsharma9101 2 years ago
Hi Krish, I was getting "RuntimeError: Java gateway process exited before sending its port number" in SparkSession.builder.appName('practice').getOrCreate(). Please help me.
@VP_SOTWMC 3 years ago
When I add the SparkSession code, I get the error below. Exception: Java gateway process exited before sending its port number. How do I fix this?
@awaizmansoor3127 2 months ago
You should have the latest version of the Java JDK installed on your PC.
@hardikvegad3508 3 years ago
It's been ages...... I had waited for this from you krishhhhhhh😭😭😭😭😭🤩...Thank you💥
@darshanayenkar 1 year ago
I am getting this error: [WinError 10061] No connection could be made because the target machine actively refused it. Can you please help me solve it?
@areebakhtar9841 2 years ago
Hi, I am getting the following error while executing spark = SparkSession.builder.appName('learning').getOrCreate(): RuntimeError: Java gateway process exited before sending its port number
@asawanted 3 years ago
Sir, I am having an issue when calling SparkSession.builder on my local machine. The cell runs forever and nothing happens. I created a new environment and repeated the process, but the cell still gets stuck and doesn't proceed. Sir, please reply.
@balachandar3587 3 years ago
You need to install JDK 8 (uninstall any other version in use). After that, restart your laptop. This should fix the problem.
@asawanted 3 years ago
How is the JDK related to Python and Jupyter?
@balachandar3587 3 years ago
You need Java to execute Spark.
@ganeshkalbhor3928 1 year ago
Hi @krish, I am getting 'RuntimeError: Java gateway process exited before sending its port number' while starting the Spark session. Could you please help me resolve this?
@bryandiaz__ 1 year ago
Hello, I keep getting this error and can't move past it. Could you kindly suggest something, please?
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyspark\java_gateway.py:106, in launch_gateway(conf, popen_kwargs)
    103 time.sleep(0.1)
    105 if not os.path.isfile(conn_info_file):
--> 106 raise RuntimeError("Java gateway process exited before sending its port number")
    108 with open(conn_info_file, "rb") as info:
    109 gateway_port = read_int(info)
RuntimeError: Java gateway process exited before sending its port number
@AakarshanJha 1 year ago
"Java gateway process exited before sending its port number": I am getting this error while setting up the Spark session builder. Can anyone help me out with this?
@damodharratnamthappeta2022 3 years ago
Much-awaited playlist
@preetamjawaria441 28 days ago
What can be done when spark=SparkSession.builder.appName('Practise').getOrCreate() keeps running and never finishes executing?
@AbhishekTiwari-xw7ux 2 years ago
AnalysisException: Path does not exist: file:/C:/Users/abhi/test.csv. How do I solve this issue? I even keep my file in the same location.
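For "Path does not exist" errors like the one above, it can help to resolve the file against the notebook's working directory before handing it to Spark. The sketch below uses only `pathlib`. The helper name `resolve_data_file` is hypothetical, and the assumption is that the file is meant to sit next to the notebook.

```python
from pathlib import Path


def resolve_data_file(name, base_dir=None):
    """Return the absolute path of `name`, or raise with a helpful message."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = (base / name).resolve()
    if not path.is_file():
        # Show both the path that was tried and where the kernel is running.
        raise FileNotFoundError(
            f"{path} not found; working directory is {Path.cwd()}"
        )
    return path
```

A call such as `spark.read.csv(str(resolve_data_file("test.csv")), header=True)` would then either read the file or fail early with the directory the kernel is actually running in.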
@annikakumar 3 months ago
type(df_pyspark) always shows NoneType for me. Kindly help me rectify the error.
@yogeshpathak5777 1 year ago
I am trying to run the code in Jupyter but always get errors. I don't know how to access a local file in Jupyter.
@ManoharJishu 1 year ago
I am facing this error while creating the SparkSession: RuntimeError: Java gateway process exited before sending its port number. Any suggestions on this?
@jeevankumar9063 2 years ago
I am getting "Py4JJavaError". How do I proceed?
@xendu-d9v 2 years ago
Same
@ananyanayak7509 3 years ago
Hello Sir, I got the error "Exception: Java gateway process exited before sending its port number" while executing line number 5. How can I resolve it?
@Abhilash3824 3 years ago
Was eagerly waiting for this playlist. Thank you so much Krish! 🙂
@ankitsaxena565 3 months ago
Hi Sir, this playlist is enough for learning PySpark
@rashmikadre8900 3 years ago
Omg!! I have literally been waiting for this!! Krish, you are the man!!!
@ajaysaikiran2196 3 years ago
Most awaited playlist
@لاالہالااللہ-ع8و6ز
I am stuck here: spark=SparkSession.builder.appName("Practice").getOrCreat(). It shows the error: AttributeError: 'Builder' object has no attribute 'getOrCreat'
@sandeepnelwade 2 years ago
Hi Krish, I got an error when creating the SparkSession. How can I connect with you?
@eswaragopal335 3 years ago
Most awaited video from you... Thanks for starting this session
@yadavanubhav005 2 years ago
Hi Krish, any idea why my code, which is the same as yours, is not getting executed? I installed Jupyter Notebook using Anaconda. I wish I could have pasted the screenshot here.
@deveshkumar3504 3 years ago
I desperately needed this course! Thanks a lot!
@akhilverma9773 2 years ago
When I run spark = SparkSession.builder.appName('Practise').getOrCreate(), I get the error "Java gateway process exited before sending its port number".
@AlDamara-x8j 1 year ago
Thanks for this video. For learning purposes on my own computer, do I need to install Apache Spark (spark-3.4.1-bin-hadoop3.tgz) to be able to run Spark scripts/notebooks, or can I just pip install pyspark in my Python environment?
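On the question above: as far as I know, the `pyspark` package from pip ships the Spark runtime needed for local mode, so a separate Spark download is usually not required for learning. A Java installation is still needed. A small check of both prerequisites, using only the standard library (the helper name `is_importable` is made up for illustration):

```python
import importlib.util
import shutil


def is_importable(module_name):
    """True if `module_name` can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


print("pyspark importable:", is_importable("pyspark"))
print("java on PATH:", shutil.which("java") is not None)
```

If the first line prints False, the package is missing from this environment. If the second prints False, Java is not on `PATH`, though it may still be reachable through `JAVA_HOME`.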
@sanjeevkumarsingh4939 1 year ago
Hi Krish, thanks for these amazing videos. I am getting the error "RuntimeError: Java gateway process exited before sending its port number" when creating the session in Jupyter.
@girishreddyedula2667 1 year ago
Was this resolved? If yes, please tell me how.
@mrraju9986 2 years ago
When I was creating a PySpark session, it threw an error like this: Java gateway process exited before sending its port number
@nlokesh1986 2 years ago
Sir, how are you getting the automatic suggestions in Jupyter Notebook? Please help me so that I can do the same on my system. Thanks a lot.
@sanjaybohr1058 3 years ago
How do I resolve this: "Exception: Java gateway process exited before sending its port number"?
@ruthvikrajam.v4303 2 years ago
PySpark works only with Java 8 and not with the latest Java release, i.e. Java 17.
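Which Java versions work depends on the Spark release. Newer Spark releases list newer Java versions as supported, so the "Java 8 only" rule above is best read as specific to the commenter's setup and checked against the documentation for the installed release. To compare against those docs, the major version can be extracted from the `java -version` banner. A minimal sketch with a made-up helper name, covering the legacy `1.8.0_291` scheme and the modern `17.0.2` scheme:

```python
import re


def java_major_version(banner):
    """Extract the major Java version from `java -version` output.

    Handles the legacy "1.8.0_291" scheme (major = 8) and the modern
    "17.0.2" scheme (major = 17). Returns None if nothing matches.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    if first == "1" and second is not None:
        return int(second)  # legacy scheme: 1.8 means Java 8
    return int(first)


print(java_major_version('java version "1.8.0_291"'))             # 8
print(java_major_version('openjdk version "17.0.2" 2022-01-18'))  # 17
```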
@fluffybinibunny 3 years ago
Hello Krish, I am getting the error "Java gateway process exited before sending its port number" when running spark = SparkSession.builder.appName("Practice").getOrCreate(). I have installed the JDK as well, and it still gives the same error. Please let me know how to resolve it, as I have searched the net and am not finding much help. Thanks.
@karan9671 3 years ago
Install JDK 8 and define JAVA_HOME under user variables.
@saisankar25 3 years ago
After installing PySpark, when I run the code I get the error "Java gateway process exited before sending its port number". I have set the path of Java in the environment variables and still get the same error. If you could assist, that would be great, so that we can start trying out the other videos.
@krishnaik06 3 years ago
Install findspark
@aviranawat 3 years ago
@Shubhangi Sakarkar Install JDK 8 and do this:
import os
# Raw string so the backslashes in the Windows path are kept literally
os.environ["JAVA_HOME"] = r"C:\Java\jdk1.8.0_291"
# os.pathsep is ";" on Windows and ":" on macOS/Linux
os.environ["PATH"] = os.path.join(os.environ["JAVA_HOME"], "bin") + os.pathsep + os.environ["PATH"]
@UttkarshJainmeb 3 years ago
@aviranawat Will these values also work on a Mac?
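On the macOS question: the idea carries over, but the JDK directory and the `PATH` separator differ between operating systems. The sketch below wraps the same two assignments in a hypothetical helper, `with_java_home`, that uses `os.pathsep` and `os.path.join` so it behaves the same on Windows, macOS, and Linux. The install paths in the comments are examples only.

```python
import os


def with_java_home(java_home, env):
    """Return a copy of `env` with JAVA_HOME set and its bin dir first on PATH."""
    updated = dict(env)
    updated["JAVA_HOME"] = java_home
    bin_dir = os.path.join(java_home, "bin")
    old_path = updated.get("PATH", "")
    # os.pathsep is ";" on Windows and ":" on macOS/Linux.
    updated["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return updated


# Example locations only; replace with the real JDK directory.
# Windows: r"C:\Java\jdk1.8.0_291"
# macOS:   the directory printed by /usr/libexec/java_home
# os.environ.update(with_java_home(r"C:\Java\jdk1.8.0_291", os.environ))
```

The update has to run before the first `SparkSession` is created, because the JVM is launched at that point.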
@abhishekpawar1207 2 years ago
Do I have to install Linux with VirtualBox in order to work with PySpark?
@bhaskararya5901 1 year ago
My PySpark session has been running for the last 2 hours. What should I do? I tried other methods, like updating pip, etc. Did anyone face the same problem? Any solution is appreciated.
@maigan007 7 months ago
Bro thank you! I swear other videos made it so complicated!
@harshvardhansingh3862 2 months ago
I am getting the "PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number." error while creating the Spark session. I tried many ways to fix it but still get the same problem.
@awaizmansoor3127 2 months ago
You should have the Java JDK installed on your PC.
@thearyavartmedia 15 days ago
@awaizmansoor3127 Yes, I have. Then what should I do? I have also configured the paths. Still getting the error.
@vallimuthaiyah5098 3 years ago
Can you please let us know the advantages of using a PySpark DataFrame over a pandas DataFrame?
@subhajitdey4483 1 year ago
When I try it in Jupyter it shows: "RuntimeError: Java gateway process exited before sending its port number". What should I do now? I have tried it in Python IDLE also, but get the same error. Help if anyone has a solution.
@SaiTeja-ob6zg 1 year ago
Same problem. Did you find a solution?
@premsaikarampudi3944 2 years ago
Hi @krish Naik, when I import pyspark, I get the error "Kernel died". Can you suggest what to do?
@rohansrivastwa827 2 years ago
It is not working for me either... I am not able to install pyspark using the command !pip install pyspark
@premsaikarampudi3944 2 years ago
@rohansrivastwa827 Hey, try reinstalling Anaconda. It worked for me.
@manibaddireddy5477 2 years ago
Hello sir, I am facing "RuntimeError: Java gateway process exited before sending its port number". Please let me know what the issue is.
@ganeshkalbhor3928 1 year ago
Hi @mani, I am also facing the same issue. Have you found a solution? Please let me know. Thanks in advance.
@m2editz816 3 years ago
I really appreciate your videos. One thing that is missing is that your tutorial starts directly with the Python implementation. If you create a video on how to configure Spark on a system and connect it with Python, that would be a great help.
@awaizmansoor3127 2 months ago
Can't agree more
@zoharbatterywala1974 2 years ago
Can you please make a single video merging all the individual videos? We have internet problems at our place (the ISP's router is placed in a commercial area), so downloading one video would help me. PLEASE
@lokeshkambam9416 2 years ago
The SparkSession is taking a lot of time to create. Is there any solution for this?
@ansonnn_ 3 years ago
Have been searching for good PySpark tutorials and this turned up 👍 Thanks!
@balramthakur9951 3 years ago
Sir, after installing PySpark, when I run the code I get the error "Java gateway process exited before sending its port number". I have set the path of Java in the environment variables and still get the same error.
@anandjha6863 3 years ago
I am getting the same error... if you find a solution, please reply.
@balramthakur9951 3 years ago
@anandjha6863 Not yet
@avinashkar2260 3 years ago
@yadav k Hi, even after installing JDK 8 I am facing the same error. Please suggest.
@avinashkar2260 3 years ago
Solved with a restart after the JDK 8 installation. Thanks
@i_amanrajput 2 years ago
After installing JDK 8, set the path to where your Java is installed:
import os
# Raw string so the backslashes in the Windows path are kept literally
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_321"
# os.pathsep is ";" on Windows and ":" on macOS/Linux
os.environ["PATH"] = os.path.join(os.environ["JAVA_HOME"], "bin") + os.pathsep + os.environ["PATH"]
@reenasheoran893 3 years ago
Hey Krish, does nullable=True mean it's not a primary key, since we are working with a SQL DataFrame?
@nigamaveena4211 3 years ago
What's the difference between Python and PySpark, sir? The work done is the same in both cases, so why do we have to use PySpark? Are there any advantages? (A small doubt)
@abhishekgopinath4608 3 years ago
It's used for huge datasets spanning GBs of data... Python + Spark = PySpark. Spark is used to deal with huge datasets in a distributed environment: storing the data on different systems, processing it there, and then using map-reduce to bring it all together.
@nigamaveena4211 3 years ago
@abhishekgopinath4608 Thank you :)
@GladstonLeon 3 years ago
Spark is a distributed environment where we use Python or Scala for programming. Python is also used in ML and DL without Spark, i.e. in a non-distributed environment.
@nigamaveena4211 3 years ago
@GladstonLeon Thank you, sir :)
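The "process it where it is stored, then bring it all together" idea from the replies above can be illustrated in plain Python, without Spark. This is a toy sketch of map-reduce, not Spark's actual implementation: the two lists stand in for partitions that would live on different machines, and the function names are made up.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words within one partition of the data."""
    return Counter(word for line in lines for word in line.split())


def merge_counts(left, right):
    """Reduce step: combine the partial counts of two partitions."""
    return left + right


# Two hypothetical partitions, as if stored on two different machines.
partitions = [
    ["spark is fast", "python is simple"],
    ["spark runs python"],
]
partials = [map_partition(p) for p in partitions]    # would run in parallel
total = reduce(merge_counts, partials, Counter())
print(total["spark"], total["python"], total["is"])  # 2 2 2
```

In Spark the map step runs on the machines that hold each partition, and only the small partial results travel over the network. That is where the advantage over single-machine Python comes from once the data no longer fits on one computer.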
@mandarbirwadkar 3 years ago
Hi Krish, I am getting an error while accessing a file from the D drive. I kept the CSV on the D drive.
@ayaansk99 2 years ago
The session builder is taking a lot of time to execute and still has not finished in Jupyter Notebook.
@raghuls9010 3 years ago
I get Spark output like this, and after that I am unable to read the dataset.
@muhammadsalmanhassan7544 3 years ago
We can divide the dataset into multiple chunks in pandas and train the model on them. Is this good practice or bad practice?
@rajashekarchandrashekar3898 2 years ago
Hey, I am trying to follow this and I am not getting the autocomplete feature. Is there any library I need to install?
@xendu-d9v 2 years ago
Your computer is slow then
@xendu-d9v 2 years ago
Write something + TAB
@akashchauhan8436 3 years ago
How do I create a time series in PySpark? Say, for example, I have a column named start_date with the format (YYYY-MM) for some event, but it's not continuous, i.e. I have 2015-01, 2015-04, 2015-07. How do I fill in the missing dates between them and assign 0 to the other columns in PySpark? It was easy in pandas, where I could just set this column as the index and then resample the DataFrame.
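The gap-filling logic from the question above can be sketched in plain Python first. The values in `observed` are invented for illustration, and the helper names are made up. In PySpark the same idea is usually expressed by building the full list of months as its own DataFrame (Spark SQL's `sequence` and `explode` functions can generate it) and left-joining the observed rows onto it.

```python
def month_range(start, end):
    """List every "YYYY-MM" label from `start` to `end`, inclusive."""
    year, month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    months = []
    while (year, month) <= (end_year, end_month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def fill_missing(observed, fill_value=0):
    """Return (month, value) pairs with gaps filled by `fill_value`."""
    # Zero-padded "YYYY-MM" strings sort chronologically, so min/max work.
    labels = month_range(min(observed), max(observed))
    return [(m, observed.get(m, fill_value)) for m in labels]


# Hypothetical values for the months mentioned in the comment.
observed = {"2015-01": 5, "2015-04": 3, "2015-07": 8}
print(fill_missing(observed))
```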
@biswanandanpattanayak6083 3 years ago
It's a very important playlist. One query about clustering, which I faced in an interview: how can you know which cluster is good?
@life_sway 2 years ago
Bro, how did you install Jupyter? How did you install Python?
@girishreddyedula2667 1 year ago
Hi Krish, I am facing this error: RuntimeError: Java gateway process exited before sending its port number @krishnaik
@MultiHarsh5 3 years ago
Sir, why did you choose Python instead of Scala? Is Scala not preferred in industry?
@krishnaik06 3 years ago
You can use Scala also...
@swaraj2235 3 years ago
Very useful. Thanks, Krish.