How to Add a Constant Column in a Spark DataFrame?

Posted on 2025-04-17

Adding Constant Columns in Spark DataFrames

A constant column holds the same value in every row. Spark offers several ways to add one to a DataFrame: literal expressions built with lit and related functions, the typedLit function, and user-defined functions.

lit and Other Functions (Spark 1.3+)

In Spark 1.3 and later, the lit function creates a literal Column. DataFrame.withColumn expects a Column as its second argument, so a bare value such as 10 must be wrapped in lit before it can be added as a constant column:

from pyspark.sql.functions import lit

df.withColumn('new_column', lit(10))

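For more complex columns, functions like array, struct, and create_map (the PySpark counterpart of Scala's map, available since Spark 2.0) can be used to build the desired column values:

from pyspark.sql.functions import array, create_map, lit, struct

df.withColumn("some_array", array(lit(1), lit(2), lit(3)))
df.withColumn("some_struct", struct(lit("foo"), lit(1), lit(0.3)))
df.withColumn("some_map", create_map(lit("key1"), lit(1), lit("key2"), lit(2)))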

typedLit (Spark 2.2+)

Spark 2.2 introduced the typedLit function in the Scala API. It accepts Scala collections such as Seq and Map, as well as tuples, as constants:

import org.apache.spark.sql.functions.typedLit

df.withColumn("some_array", typedLit(Seq(1, 2, 3)))
df.withColumn("some_struct", typedLit(("foo", 1, 0.3)))
df.withColumn("some_map", typedLit(Map("key1" -> 1, "key2" -> 2)))

Using a UDF

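As an alternative to literal values, you can define a User Defined Function (UDF) that returns the same value for every row and use it to add the column. A Python UDF is evaluated row by row in a Python worker, so lit is usually the cheaper choice when the value is a plain constant:

from pyspark.sql.functions import udf, lit
from pyspark.sql.types import IntegerType

# The argument is ignored; the function returns 10 for every row.
def add_ten(_unused):
    return 10

add_ten_udf = udf(add_ten, IntegerType())
df.withColumn('new_column', add_ten_udf(lit(1.0)))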

Note:

The constant values can also be passed as arguments to UDFs or SQL functions using the same constructs.
