pyspark.pandas.DataFrame.values
property DataFrame.values
Return a NumPy representation of the DataFrame or the Series.
Warning
We recommend using DataFrame.to_numpy() or Series.to_numpy() instead.
Note
This method should only be used if the resulting NumPy ndarray is expected to be small, as all the data is loaded into the driver’s memory.
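One way to respect this limit is to cap the number of rows before converting. The following is a minimal sketch, assuming `psdf` is a pandas-on-Spark DataFrame or Series. The helper name `to_small_array` and its `max_rows` parameter are hypothetical and not part of the pyspark API. Only `head()` and `to_numpy()` are existing methods.

```python
def to_small_array(psdf, max_rows=1000):
    """Collect at most `max_rows` rows of `psdf` into a NumPy array.

    Hypothetical helper: `psdf` is assumed to be a pandas-on-Spark
    DataFrame or Series (anything exposing head() and to_numpy()).
    """
    if max_rows < 0:
        # A negative count would mean "all but the last n rows" in head(),
        # which defeats the purpose of bounding the result size.
        raise ValueError("max_rows must be non-negative")
    # head() truncates the data first, so to_numpy() only brings the
    # truncated result into the driver's memory.
    return psdf.head(max_rows).to_numpy()
```

For example, `to_small_array(df, max_rows=100)` would return an ndarray with no more than 100 rows. It also uses `to_numpy()`, the recommended alternative to `values`.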
Returns
    numpy.ndarray
Examples
A DataFrame where all columns are the same type (e.g., int64) results in an array of the same type.
>>> df = ps.DataFrame({'age':    [ 3,  29],
...                    'height': [94, 170],
...                    'weight': [31, 115]})
>>> df
   age  height  weight
0    3      94      31
1   29     170     115
>>> df.dtypes
age       int64
height    int64
weight    int64
dtype: object
>>> df.values
array([[  3,  94,  31],
       [ 29, 170, 115]])
A DataFrame with mixed-type columns (e.g., str/object, int64, float32) results in an ndarray of the broadest type that accommodates these mixed types (e.g., object).
>>> df2 = ps.DataFrame([('parrot',   24.0, 'second'),
...                     ('lion',     80.5, 'first'),
...                     ('monkey', np.nan, None)],
...                    columns=('name', 'max_speed', 'rank'))
>>> df2.dtypes
name          object
max_speed    float64
rank          object
dtype: object
>>> df2.values
array([['parrot', 24.0, 'second'],
       ['lion', 80.5, 'first'],
       ['monkey', nan, None]], dtype=object)
For Series,
>>> ps.Series([1, 2, 3]).values
array([1, 2, 3])
>>> ps.Series(list('aabc')).values
array(['a', 'a', 'b', 'c'], dtype=object)