Sunday, May 3, 2020

How to Set Up PySpark on Windows

Before you implement or test anything in Spark, you need a working Spark environment. In this post, I will show you how to set up Spark on Windows.

The steps are simple. As the title says, the objective is to set up PySpark on Windows, and there is no special prerequisite beyond Java, which I assume you already have installed. Just follow the steps below to get the setup ready.

Step 1-

  • Download a recent stable release of Spark. I used "spark-2.4.5-bin-hadoop2.7.tgz". Because this package is pre-built for Hadoop, it also comes with Scala 2.11.
  • Extract the .tgz file into your C:\ directory. In my case I have WinRAR installed, which can extract .tgz files easily.
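If you do not have WinRAR, the extraction can also be done with Python's standard library. This is only a minimal sketch: the function name `extract_spark_archive` is my own, and the download location in the usage comment is a hypothetical path that you would replace with yours.

```python
import tarfile
from pathlib import Path, PurePosixPath


def extract_spark_archive(archive_path, dest_dir):
    """Extract a .tgz archive into dest_dir and return its top-level folder names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Remember the first path component of every member, so the caller
        # learns which folder(s) the archive created.
        top_level = set()
        for member in archive.getmembers():
            parts = PurePosixPath(member.name).parts
            if parts:
                top_level.add(parts[0])
        archive.extractall(dest)
    return sorted(top_level)


# Hypothetical usage (adjust the download path to your machine):
# extract_spark_archive(r"C:\Users\me\Downloads\spark-2.4.5-bin-hadoop2.7.tgz", "C:\\")
```

The returned folder name is the one to use for the environment variables later on.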
         

Step 2-

  • Place winutils.exe (it is needed in Step 4) in the Spark bin folder:
          C:\spark-2.4.0-bin-hadoop2.7\bin

         

Step 3-
  • Now set up the environment variables for Spark.
  • Go to "Advanced System Settings" and set the variables below. The folder names are from my machine, so use the ones that match the versions you installed.
  • JAVA_HOME="C:\Program Files\Java\jdk1.8.0_181"
  • HADOOP_HOME="C:\spark-2.4.0-bin-hadoop2.7"
  • SPARK_HOME="C:\spark-2.4.0-bin-hadoop2.7"
  • Also add the bin folder of each of them to the PATH system variable.
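After setting the variables, it helps to confirm that a new process can actually see them. The following is a rough sketch rather than an official tool: `check_spark_env` is a name I made up, and it only checks that each variable is set, points to an existing folder, and has its bin folder on PATH.

```python
import os
from pathlib import Path

REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME")


def check_spark_env(env=None):
    """Return a list of problems found; an empty list means the setup looks fine."""
    env = os.environ if env is None else env
    problems = []
    # Normalise PATH entries so case and trailing separators do not matter.
    path_entries = {
        os.path.normcase(os.path.normpath(entry))
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    }
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip('"')
        if not value:
            problems.append(f"{name} is not set")
            continue
        if not Path(value).is_dir():
            problems.append(f"{name} points to a missing folder: {value}")
        bin_dir = os.path.normcase(os.path.normpath(os.path.join(value, "bin")))
        if bin_dir not in path_entries:
            problems.append(f"the bin folder of {name} is not on PATH")
    return problems


if __name__ == "__main__":
    for line in check_spark_env() or ["All three variables look fine."]:
        print(line)
```

Run it from a newly opened Command Prompt, because windows that were already open do not pick up changed environment variables.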
        


Step 4-

At this point the setup is done. However, there is one more optional step, which I found on another blog, that avoids some errors when you work with Spark and Hive.

Optional: Some tweaks to avoid future errors

  • Create the folder C:\tmp\hive
  • Open Command Prompt (CMD) as an administrator
  • Give full permissions on this temporary Hive directory using the command below
           winutils.exe chmod -R 777 C:\tmp\hive
  • Check the permissions that were set
          winutils.exe ls -F C:\tmp\hive
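The same two winutils calls can be scripted if you prefer. This is a sketch under a few assumptions: winutils.exe is reachable through PATH, the script is started from an administrator prompt, and the function names are mine. The Hive folder is passed in as an argument, so use the drive on which you actually created it.

```python
import os
import shutil
import subprocess


def winutils_commands(hive_dir, winutils="winutils.exe"):
    """Build the two winutils calls from the steps above as argument lists."""
    return [
        [winutils, "chmod", "-R", "777", hive_dir],
        [winutils, "ls", "-F", hive_dir],
    ]


def grant_hive_permissions(hive_dir):
    """Create hive_dir, run both commands, and return the output of `ls -F`."""
    exe = shutil.which("winutils.exe")
    if exe is None:
        raise FileNotFoundError("winutils.exe was not found on PATH")
    os.makedirs(hive_dir, exist_ok=True)
    output = ""
    for command in winutils_commands(hive_dir, exe):
        # check=True raises CalledProcessError if winutils reports a failure.
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        output = result.stdout
    return output


# Usage on Windows, from an administrator prompt:
# print(grant_hive_permissions(r"C:\tmp\hive"))
```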

         


Step 5- Check the installation

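  • Open a new Command Prompt and run the command "spark-shell". This starts the Scala shell; the "pyspark" command in the same bin folder starts the Python shell, provided Python is installed.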
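Since the goal is PySpark, you may also want to check the installation from a plain Python script. The sketch below assumes Python is installed (the steps above do not cover that) and relies on the python folder and the Py4J zip that ship inside the pre-built Spark download. The function names are my own.

```python
import glob
import os
import sys


def pyspark_import_paths(spark_home):
    """Return the entries under spark_home that make `import pyspark` work."""
    python_dir = os.path.join(spark_home, "python")
    # The pre-built download bundles Py4J as a zip file in python\lib.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def run_smoke_test(spark_home):
    """Start a local SparkSession, count a five-row DataFrame, then stop Spark."""
    sys.path[:0] = pyspark_import_paths(spark_home)
    from pyspark.sql import SparkSession  # imported late: needs the paths above

    spark = SparkSession.builder.master("local[*]").appName("smoke-test").getOrCreate()
    try:
        return spark.range(5).count()
    finally:
        spark.stop()
```

Calling `run_smoke_test(os.environ["SPARK_HOME"])` is expected to return 5 when Java, Spark, and the environment variables are wired up correctly.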
         
Congratulations! You are all set and can now start coding.

