case class Column extends Logging with Product with Serializable
Represents a column or an expression in a DataFrame.
To create a Column object to refer to a column in a DataFrame, you can:
- Use the functions.col function.
- Use the DataFrame.col method.
- Use the shorthand for the DataFrame.apply method (<dataframe>("<col_name>")).
For example:
import com.snowflake.snowpark.functions.col
df.select(col("name"))
df.select(df.col("name"))
dfLeft.join(dfRight, dfLeft("name") === dfRight("name"))
This class also defines utility functions for constructing expressions with Columns.
The following examples demonstrate how to use Column objects in expressions:
df
  .filter(col("id") === 20)
  .filter((col("a") + col("b")) < 10)
  .select((col("b") * 10) as "c")
- Since
0.1.0
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- def %(other: Any): Column
Remainder. Alias for mod.
- Since
0.1.0
- def &&(other: Any): Column
And. Alias for and.
- Since
0.1.0
- def *(other: Any): Column
Multiply. Alias for multiply.
- Since
0.1.0
- def +(other: Any): Column
Plus. Alias for plus.
- Since
0.1.0
- def -(other: Any): Column
Minus. Alias for minus.
- Since
0.1.0
- def /(other: Any): Column
Divide. Alias for divide.
- Since
0.1.0
- def <(other: Any): Column
Less than. Alias for lt.
- Since
0.1.0
- def <=(other: Any): Column
Less than or equal to. Alias for leq.
- Since
0.1.0
- def <=>(other: Any): Column
Equal to. You can use this for comparisons against a null value. Alias for equal_null.
- Since
0.1.0
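As a rough illustration of how <=> differs from ===, the sketch below builds both comparisons. It assumes the Snowpark library is on the classpath; the column names "a" and "b" and the helper keepMatching are hypothetical. Under SQL semantics, a plain equality test evaluates to NULL when either side is NULL, so a filter drops that row, whereas the null-safe form treats two NULL values as equal.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.col

object NullSafeEqualityExample {
  // Plain equality: evaluates to NULL when either operand is NULL.
  val plainEq: Column = col("a") === col("b")

  // Null-safe equality: two NULL operands compare as equal.
  val nullSafeEq: Column = col("a") <=> col("b")

  // Hypothetical helper: keeps rows where "a" and "b" match,
  // including rows where both are NULL.
  def keepMatching(df: DataFrame): DataFrame = df.filter(nullSafeEq)
}
```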
- def =!=(other: Any): Column
Not equal to. Alias for not_equal.
- Since
0.1.0
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def ===(other: Any): Column
Equal to. Alias for equal_to. Use this instead of == to perform an equality check in an expression. For example:
lhs.filter(col("a") === 10).join(rhs, rhs("id") === lhs("id"))
- Since
0.1.0
- def >(other: Any): Column
Greater than. Alias for gt.
- Since
0.1.0
- def >=(other: Any): Column
Greater than or equal to. Alias for geq.
- Since
0.1.0
- def alias(alias: String): Column
Returns a new renamed Column. Alias for name.
- Since
0.1.0
- def and(other: Column): Column
And.
- Since
0.1.0
- def apply(idx: Int): Column
Returns the element (field) at the specified index in a column that contains semi-structured data.
The method applies case-sensitive matching to the names of the specified elements.
This is equivalent to using bracket notation in SQL (column[index]).
- If the column is an ARRAY value, this function extracts the VARIANT value of the array element at the specified index.
  - If the index points outside of the array boundaries or if an element does not exist at the specified index (e.g. if the array is sparsely populated), the method returns NULL.
- If the column is a VARIANT value, this function first checks if the VARIANT value contains an ARRAY value.
  - If the VARIANT value does not contain an ARRAY value, the method returns NULL.
  - Otherwise, the method works as described above.
For example:
import com.snowflake.snowpark.functions.col
df.select(col("src")(1)(0)("name")(0))
- idx
index of the subfield to be extracted
- Since
0.2.0
- def apply(field: String): Column
Returns the specified element (field) in a column that contains semi-structured data.
The method applies case-sensitive matching to the names of the specified elements.
This is equivalent to using bracket notation in SQL (column['element']).
- If the column is an OBJECT value, this function extracts the VARIANT value of the element with the specified name from the OBJECT value.
  - If the element is not found, the method returns NULL.
  - You must not specify an empty string for the element name.
- If the column is a VARIANT value, this function first checks if the VARIANT value contains an OBJECT value.
  - If the VARIANT value does not contain an OBJECT value, the method returns NULL.
  - Otherwise, the method works as described above.
For example:
import com.snowflake.snowpark.functions.col
df.select(col("src")("salesperson")("emails")(0))
- field
field name of the subfield to be extracted. You cannot specify a path.
- Since
0.2.0
- def as(alias: String): Column
Returns a new renamed Column. Alias for name.
- Since
0.1.0
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def asc: Column
Returns a Column expression with values sorted in ascending order.
- Since
0.1.0
- def asc_nulls_first: Column
Returns a Column expression with values sorted in ascending order (null values sorted before non-null values).
- Since
0.1.0
- def asc_nulls_last: Column
Returns a Column expression with values sorted in ascending order (null values sorted after non-null values).
- Since
0.1.0
- def between(lowerBound: Column, upperBound: Column): Column
Between lower bound and upper bound.
- Since
0.1.0
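A minimal sketch of between, assuming the Snowpark library is on the classpath. The column name "age", the bounds, and the helper keepWorkingAge are hypothetical. Both bounds are passed as Column objects, so literal values are wrapped with functions.lit; as with SQL BETWEEN, both bounds are included.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.{col, lit}

object BetweenExample {
  // True when "age" is at least 18 and at most 65 (both bounds included).
  val workingAge: Column = col("age").between(lit(18), lit(65))

  // Hypothetical helper: keeps only the rows that satisfy the range check.
  def keepWorkingAge(df: DataFrame): DataFrame = df.filter(workingAge)
}
```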
- def bitand(other: Column): Column
Bitwise and.
- Since
0.1.0
- def bitor(other: Column): Column
Bitwise or.
- Since
0.1.0
- def bitxor(other: Column): Column
Bitwise xor.
- Since
0.1.0
- def cast(to: String): Column
Casts the values in the Column to the specified data type.
Examples
val df = Seq(123, 456, 789).toDF("a")
df.select(col("a").cast("string").as("casted")).schema.toString
// res: String = "StructType[StructField(CASTED, String, Nullable = false)]"
- to
A string representing the target data type.
- returns
A new Column with values cast to the specified data type.
- Since
1.18.0
- Exceptions thrown
IllegalArgumentException If the provided string does not represent a valid data type.
- def cast(to: DataType): Column
Casts the values in the Column to the specified data type.
- Since
0.1.0
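The DataType overload might be used as in the following sketch, which assumes the Snowpark library is on the classpath and uses the type objects from com.snowflake.snowpark.types. The column names and aliases are hypothetical.

```scala
import com.snowflake.snowpark.Column
import com.snowflake.snowpark.functions.col
import com.snowflake.snowpark.types.{IntegerType, StringType}

object CastExample {
  // Cast column "a" to a string type and give the result a new name.
  val asText: Column = col("a").cast(StringType).as("a_text")

  // Cast column "b" to an integer type and give the result a new name.
  val asInt: Column = col("b").cast(IntegerType).as("b_int")
}
```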
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @HotSpotIntrinsicCandidate() @native()
- def collate(collateSpec: String): Column
Returns a copy of the original Column with the specified collationSpec property, rather than the original collation specification property.
For details, see the Snowflake documentation on collation specifications.
- Since
0.1.0
- def desc: Column
Returns a Column expression with values sorted in descending order.
- Since
0.1.0
- def desc_nulls_first: Column
Returns a Column expression with values sorted in descending order (null values sorted before non-null values).
- Since
0.1.0
- def desc_nulls_last: Column
Returns a Column expression with values sorted in descending order (null values sorted after non-null values).
- Since
0.1.0
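The sort-order methods (asc, desc, and their nulls_first and nulls_last variants) are typically passed to a sorting method such as DataFrame.sort. The sketch below assumes the Snowpark library is on the classpath; the column names "score" and "name" and the helper ranked are hypothetical.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.col

object SortOrderExample {
  // Highest scores first, with rows that have a NULL score placed last.
  val byScore: Column = col("score").desc_nulls_last

  // Ties on "score" are ordered by "name" in ascending order.
  val byName: Column = col("name").asc

  // Hypothetical helper: applies both sort keys to a DataFrame.
  def ranked(df: DataFrame): DataFrame = df.sort(byScore, byName)
}
```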
- def divide(other: Any): Column
Divide.
- Since
0.1.0
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equal_nan: Column
Is NaN.
- Since
0.1.0
- def equal_null(other: Any): Column
Equal to. You can use this for comparisons against a null value.
- Since
0.1.0
- def equal_to(other: Any): Column
Equal to. Same as ===.
- Since
0.1.0
- def geq(other: Any): Column
Greater than or equal to.
- Since
0.1.0
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def getName: Option[String]
Returns the column name (if the column has a name).
- Since
0.2.0
- def gt(other: Any): Column
Greater than.
- Since
0.1.0
- def in(df: DataFrame): Column
Returns a conditional expression that you can pass to the filter or where method to perform a WHERE ... IN query with a specified subquery.
The expression evaluates to true if the value in the column is one of the values in the column of the same name in a specified DataFrame.
For example, the following code returns a DataFrame that contains the rows where the column "a" of df2 contains one of the values from column "a" in df1. This is equivalent to SELECT * FROM table2 WHERE a IN (SELECT a FROM table1).
val df1 = session.table(table1)
val df2 = session.table(table2)
df2.filter(col("a").in(df1))
- Since
0.10.0
- def in(values: Seq[Any]): Column
Returns a conditional expression that you can pass to the filter or where method to perform the equivalent of a WHERE ... IN query with a specified list of values.
The expression evaluates to true if the value in the column is one of the values in a specified sequence.
For example, the following code returns a DataFrame that contains the rows where the column "a" contains the value 1, 2, or 3. This is equivalent to SELECT * FROM table WHERE a IN (1, 2, 3).
df.filter(df("a").in(Seq(1, 2, 3)))
- Since
0.10.0
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isNull: Column
Wrapper for is_null function.
- Since
1.10.0
- def is_not_null: Column
Is not null.
- Since
0.1.0
- def is_null: Column
Is null.
- Since
0.1.0
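The null checks return Column expressions, so they can be passed to filter like any other condition. A small sketch, assuming the Snowpark library is on the classpath; the column name "email" and the helper withEmail are hypothetical.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.col

object NullCheckExample {
  // True for rows where "email" holds a value.
  val hasEmail: Column = col("email").is_not_null

  // True for rows where "email" is NULL.
  val missingEmail: Column = col("email").is_null

  // Hypothetical helper: keeps only the rows that have an email.
  def withEmail(df: DataFrame): DataFrame = df.filter(hasEmail)
}
```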
- def leq(other: Any): Column
Less than or equal to.
- Since
0.1.0
- def like(pattern: Column): Column
Allows case-sensitive matching of strings based on comparison with a pattern.
For details, see the Snowflake documentation on LIKE.
- Since
0.1.0
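Because like takes its pattern as a Column, a literal pattern is wrapped with functions.lit. The sketch below assumes the Snowpark library is on the classpath; the column name "last_name", the pattern, and the helper keepSons are hypothetical. In SQL LIKE patterns, % matches any sequence of zero or more characters and _ matches exactly one character.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.{col, lit}

object LikeExample {
  // True when "last_name" ends with the lowercase characters "son".
  // The match is case-sensitive, so a value ending in "SON" does not match.
  val endsWithSon: Column = col("last_name").like(lit("%son"))

  // Hypothetical helper: keeps only the matching rows.
  def keepSons(df: DataFrame): DataFrame = df.filter(endsWithSon)
}
```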
- def log(): Logger
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logDebug(msg: String, throwable: Throwable): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logDebug(msg: String): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logError(msg: String, throwable: Throwable): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logError(msg: String): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logInfo(msg: String, throwable: Throwable): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logInfo(msg: String): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logTrace(msg: String, throwable: Throwable): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logTrace(msg: String): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logWarning(msg: String, throwable: Throwable): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def logWarning(msg: String): Unit
- Attributes
- protected[internal]
- Definition Classes
- Logging
- def lt(other: Any): Column
Less than.
- Since
0.1.0
- def minus(other: Any): Column
Minus.
- Since
0.1.0
- def mod(other: Any): Column
Remainder.
- Since
0.1.0
- def multiply(other: Any): Column
Multiply.
- Since
0.1.0
- def name(alias: String): Column
Returns a new renamed Column.
- Since
0.1.0
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def not_equal(other: Any): Column
Not equal to.
- Since
0.1.0
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @HotSpotIntrinsicCandidate() @native()
- def or(other: Column): Column
Or.
- Since
0.1.0
- def over(): Column
Returns a window frame, based on an empty WindowSpec expression.
- Since
0.1.0
- def over(window: WindowSpec): Column
Returns a window frame, based on the specified WindowSpec.
- Since
0.1.0
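One way over might be combined with an aggregate function is sketched below. It assumes the Snowpark library is on the classpath; the column names "region" and "amount", the alias, and the helper withRegionTotal are hypothetical. The WindowSpec is built with Window.partitionBy, so the sum is computed per region and repeated on every row of that region.

```scala
import com.snowflake.snowpark.{Column, DataFrame, Window}
import com.snowflake.snowpark.functions.{col, sum}

object WindowExample {
  // Total of "amount" across all rows that share the same "region".
  val regionTotal: Column =
    sum(col("amount"))
      .over(Window.partitionBy(col("region")))
      .as("region_total")

  // Hypothetical helper: returns each row alongside its regional total.
  def withRegionTotal(df: DataFrame): DataFrame =
    df.select(col("region"), col("amount"), regionTotal)
}
```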
- def plus(other: Any): Column
Plus.
- Since
0.1.0
- def productElementNames: Iterator[String]
- Definition Classes
- Product
- def regexp(pattern: Column): Column
Returns true if this Column matches the specified regular expression.
For details, see the Snowflake documentation on regular expressions.
- Since
0.1.0
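A minimal sketch of regexp, assuming the Snowpark library is on the classpath. The column name "code", the pattern, and the helper keepTickets are hypothetical. The pattern is passed as a Column, so a literal is wrapped with functions.lit. The Snowflake documentation describes REGEXP as anchoring the pattern at both ends, so the pattern here is written to describe the whole value; check that documentation before relying on this behavior.

```scala
import com.snowflake.snowpark.{Column, DataFrame}
import com.snowflake.snowpark.functions.{col, lit}

object RegexpExample {
  // Intended to match values such as "ABC-123":
  // three uppercase letters, a hyphen, then one or more digits.
  val looksLikeTicket: Column = col("code").regexp(lit("[A-Z]{3}-[0-9]+"))

  // Hypothetical helper: keeps only the rows whose "code" matches.
  def keepTickets(df: DataFrame): DataFrame = df.filter(looksLikeTicket)
}
```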
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
Returns a string representation of the expression corresponding to this Column instance.
- Definition Classes
- Column → AnyRef → Any
- Since
0.1.0
- def unary_!: Column
Unary not.
- Since
0.1.0
- def unary_-: Column
Unary minus.
- Since
0.1.0
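The unary operators are written as prefixes in Scala. The sketch below assumes the Snowpark library is on the classpath; all column names are hypothetical. The last expression also shows that Scala gives comparison operators higher precedence than &&, so the two comparisons are grouped before they are combined.

```scala
import com.snowflake.snowpark.Column
import com.snowflake.snowpark.functions.col

object UnaryOperatorExample {
  // Logical negation of a boolean column (calls unary_!).
  val notActive: Column = !col("active")

  // Arithmetic negation of a numeric column (calls unary_-).
  val negatedBalance: Column = -col("balance")

  // Negation of a compound condition: NOT (a > 1 AND b < 2).
  val outsideRange: Column = !(col("a") > 1 && col("b") < 2)
}
```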
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- def withExpr(newExpr: Expression): Column
- Attributes
- protected
- def withinGroup(cols: Seq[Column]): Column
Returns a Column expression that adds a WITHIN GROUP clause to sort the rows by the specified sequence of columns.
This method is supported on Column expressions returned by some of the aggregate functions, including functions.array_agg, LISTAGG(), PERCENTILE_CONT(), and PERCENTILE_DISC().
For example:
import com.snowflake.snowpark.Window
import com.snowflake.snowpark.functions._
import session.implicits._
// Create a DataFrame from a sequence.
val df = Seq((3, "v1"), (1, "v3"), (2, "v2")).toDF("a", "b")
// Create a DataFrame containing the values in "a" sorted by "b".
df.select(array_agg(col("a")).withinGroup(Seq(col("b"))))
// Create a DataFrame containing the values in "a" grouped by "b"
// and sorted by "a" in descending order.
df.select(
  array_agg(col("a"))
    .withinGroup(Seq(col("a").desc))
    .over(Window.partitionBy(col("b")))
)
For details, see the Snowflake documentation for the aggregate function that you are using (e.g. ARRAY_AGG).
- Since
0.6.0
- def withinGroup(first: Column, remaining: Column*): Column
Returns a Column expression that adds a WITHIN GROUP clause to sort the rows by the specified columns.
This method is supported on Column expressions returned by some of the aggregate functions, including functions.array_agg, LISTAGG(), PERCENTILE_CONT(), and PERCENTILE_DISC().
For example:
import com.snowflake.snowpark.Window
import com.snowflake.snowpark.functions._
import session.implicits._
// Create a DataFrame from a sequence.
val df = Seq((3, "v1"), (1, "v3"), (2, "v2")).toDF("a", "b")
// Create a DataFrame containing the values in "a" sorted by "b".
val dfArrayAgg = df.select(array_agg(col("a")).withinGroup(col("b")))
// Create a DataFrame containing the values in "a" grouped by "b"
// and sorted by "a" in descending order.
val dfArrayAggWindow = df.select(
  array_agg(col("a"))
    .withinGroup(col("a").desc)
    .over(Window.partitionBy(col("b")))
)
For details, see the Snowflake documentation for the aggregate function that you are using (e.g. ARRAY_AGG).
- Since
0.6.0
- def ||(other: Any): Column
Or. Alias for or.
- Since
0.1.0
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)