About 10,000,000 results
Open links in new tab
  1. pyspark - How to use AND or OR condition in when in Spark

    107 pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on …

  2. PySpark: multiple conditions in when clause - Stack Overflow

    Jun 8, 2016 · Very helpful observation when in pyspark multiple conditions can be built using & (for and) and | (for or). Note:In pyspark t is important to enclose every expressions within …

  3. Rename more than one column using withColumnRenamed

    Since pyspark 3.4.0, you can use the withColumnsRenamed() method to rename multiple columns at once. It takes as an input a map of existing column names and the corresponding …

  4. Filtering a Pyspark DataFrame with SQL-like IN clause

    Mar 8, 2016 · Filtering a Pyspark DataFrame with SQL-like IN clause Asked 9 years, 9 months ago Modified 3 years, 8 months ago Viewed 123k times

  5. Pyspark replace strings in Spark dataframe column

    Pyspark replace strings in Spark dataframe column Asked 9 years, 7 months ago Modified 1 year ago Viewed 315k times

  6. How to change dataframe column names in PySpark?

    I come from pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column names to something useful using the simple command: …

  7. Pyspark: Parse a column of json strings - Stack Overflow

    I have a pyspark dataframe consisting of one column, called json, where each row is a unicode string of json. I'd like to parse each row and return a new dataframe where each row is the …

  8. spark dataframe drop duplicates and keep first - Stack Overflow

    Aug 1, 2016 · 2 I just did something perhaps similar to what you guys need, using drop_duplicates pyspark. Situation is this. I have 2 dataframes (coming from 2 files) which are exactly same …

  9. Pyspark: display a spark data frame in a table format

    Pyspark: display a spark data frame in a table format Asked 9 years, 3 months ago Modified 2 years, 4 months ago Viewed 413k times

  10. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · The selected correct answer does not address the question, and the other answers are all wrong for pyspark. There is no "!=" operator equivalent in pyspark for this …