Dividing a Spark DataFrame

In PySpark, the randomSplit() function divides a DataFrame into multiple smaller DataFrames according to a list of weights; the pandas-on-Spark API additionally offers DataFrame.div(other) for elementwise division. Common questions on the topic include:

- A DataFrame with about 38,313 rows needs to be split in half for an A/B-testing use case, with each half stored separately.
- Given df = spark.createDataFrame([(1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12), (5, 10, 15)], ["A", "B", "C"]), divide each column by values taken from another DataFrame's column.
- Use a Window function to get the count of each group of the id column, then divide the original values by that count.
- A DataFrame of 70,000 rows must be split into separate DataFrames, each capped at a maximum row count, for example because the receiving API accepts at most 50,000 rows per request.
- Divide a Spark DataFrame into chunks using row values as separators, or into chunks of an equal number of rows.
- A DataFrame has one column whose values are concatenated with a delimiter, and it needs to be split into many columns, potentially 1,000 to 2,000 of them.

Sketches for several of these scenarios are shown below.
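A minimal sketch of the A/B-testing split using randomSplit(). The column name user_id, the seed, and the output paths are placeholders, and randomSplit() only approximates the requested 50/50 proportions rather than producing exactly equal halves.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Stand-in for the ~38,313-row source DataFrame.
df = spark.range(38313).withColumnRenamed("id", "user_id")

# Weights are relative, so [0.5, 0.5] and [1.0, 1.0] behave the same.
# Each half will be close to, but not exactly, 50% of the rows.
group_a, group_b = df.randomSplit([0.5, 0.5], seed=42)

# Hypothetical output locations for the two test groups.
group_a.write.mode("overwrite").parquet("/tmp/ab_test/group_a")
group_b.write.mode("overwrite").parquet("/tmp/ab_test/group_b")
```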
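One way to cap chunk size (for example, for a 50,000-row API limit) is to assign a global row number and bucket rows by it. This sketch assumes a record_id column to order by and a hypothetical send_to_api() consumer; note that a window without partitioning pulls all rows onto one partition, which is fine for tens of thousands of rows but not for very large data.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Stand-in for the ~70,000-row DataFrame.
df = spark.range(70000).withColumnRenamed("id", "record_id")

max_rows = 50000  # the API's per-request limit

# Assign a global row number, then bucket rows into chunks of at most max_rows.
w = Window.orderBy("record_id")
chunked = df.withColumn("chunk", F.floor((F.row_number().over(w) - 1) / max_rows))

n_chunks = chunked.agg(F.max("chunk")).first()[0] + 1
for i in range(n_chunks):
    part = chunked.filter(F.col("chunk") == i).drop("chunk")
    # send_to_api(part)  # hypothetical call that posts this chunk
```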
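For the delimiter-concatenated column, split() turns the string into an array whose elements can then be projected into separate columns. The pipe delimiter, the column names, and the hard-coded element count are assumptions; in a real 1,000-2,000 column case the count would be derived from the data rather than written by hand.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical example: one column whose values are pipe-delimited.
df = spark.createDataFrame([("a|b|c",), ("d|e|f",)], ["raw"])

# Split once into an array column, then select each element as its own column.
arr = df.withColumn("parts", F.split(F.col("raw"), r"\|"))
n_cols = 3
wide = arr.select(
    *[F.col("parts").getItem(i).alias(f"col_{i}") for i in range(n_cols)]
)
wide.show()
```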
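For the per-group division, a Window partitioned by id attaches each group's row count to every row of that group, and the count can then serve as the divisor. The value column and the sample data here are made up for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(1, 10.0), (1, 30.0), (2, 5.0), (2, 15.0), (2, 40.0)],
    ["id", "value"],
)

# Count of rows within each id group, available on every row of the group.
w = Window.partitionBy("id")
result = (
    df.withColumn("group_count", F.count("*").over(w))
      .withColumn("value_divided", F.col("value") / F.col("group_count"))
)
result.show()
```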
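For dividing each column of one DataFrame by values held in another DataFrame's column, a simple approach when the divisor table is small is to collect it to the driver and apply the divisors column by column. The layout of the divisor DataFrame (one row per column name) is an assumption about how the second DataFrame is organized.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(1, 2, 3), (2, 4, 6), (3, 6, 9), (4, 8, 12), (5, 10, 15)],
    ["A", "B", "C"],
)

# Hypothetical divisor table: one row per column name with the value to divide by.
divisors = spark.createDataFrame([("A", 2.0), ("B", 4.0), ("C", 3.0)], ["col", "d"])

# The divisor table is tiny, so collecting it to the driver is cheap.
d = {row["col"]: row["d"] for row in divisors.collect()}
scaled = df.select(*[(F.col(c) / F.lit(d[c])).alias(c) for c in df.columns])
scaled.show()
```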
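The div(other) method mentioned above belongs to the pandas-on-Spark API (pyspark.pandas, available in Spark 3.2+); a minimal usage example with made-up data:

```python
import pyspark.pandas as ps

# Elementwise division with the pandas-on-Spark DataFrame/Series div() method.
psdf = ps.DataFrame({"A": [1, 2, 3], "B": [4, 8, 12]})
print(psdf.div(2))                # divide every element by a scalar
print(psdf["B"].div(psdf["A"]))   # divide one column by another
```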