pyspark.pandas.DataFrame.plot.hist

plot.hist(bins=10, **kwds)

Draw one histogram of the DataFrame’s columns.

A histogram is a representation of the distribution of data. This function calls plotting.backend.plot() on each series in the DataFrame, resulting in one histogram per column. This is useful when the DataFrame’s Series are on a similar scale.
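Why a similar scale matters can be sketched with NumPy alone. The snippet below is only an illustration, not the pandas-on-Spark implementation: it assumes that all columns are binned against one shared set of edges spanning the overall minimum and maximum, and the column data and variable names are made up for the example.

```python
import numpy as np

# Hypothetical data: two columns on a similar scale.
rng = np.random.default_rng(0)
columns = {
    "one": rng.integers(1, 7, 6000),   # values 1..6
    "two": rng.integers(2, 13, 6000),  # values 2..12
}

# Assumption: one set of edges covers the overall value range.
lo = min(values.min() for values in columns.values())
hi = max(values.max() for values in columns.values())
edges = np.linspace(lo, hi, 12 + 1)  # 12 bins need 13 edges

# One histogram (array of counts) per column, all on the same edges.
counts = {name: np.histogram(values, bins=edges)[0]
          for name, values in columns.items()}
```

If one column had values thousands of times larger than the other, shared edges like these would squeeze the smaller column into a single bin, which is why columns on a similar scale give the most readable result.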

Parameters

bins (integer or sequence, default 10): Number of histogram bins to be used. If an integer is given, bins + 1 bin edges are calculated and returned. If bins is a sequence, it gives bin edges, including left edge of first bin and right edge of last bin. In this case, bins are returned unmodified.
**kwds: All other plotting keyword arguments to be passed to plotting backend.
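The two forms of bins can be illustrated with NumPy. This is a minimal sketch that uses numpy.histogram as a stand-in to show the edge arithmetic; it is not the code path pandas-on-Spark takes, and the sample data is hypothetical.

```python
import numpy as np

# Hypothetical sample data, used only to illustrate bin-edge arithmetic.
data = np.array([1, 3, 2, 5, 4, 6, 2, 3])

# Integer bins: 10 bins produce 10 + 1 = 11 equally spaced edges.
counts_int, edges_int = np.histogram(data, bins=10)

# Sequence bins: the values are the edges themselves, from the left edge
# of the first bin to the right edge of the last bin.
explicit_edges = [1, 2, 4, 6]
counts_seq, edges_seq = np.histogram(data, bins=explicit_edges)
```

With an integer, the edges are derived from the data range; with a sequence, the caller controls the edges, so the bins may have unequal widths.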

Returns

plotly.graph_objs.Figure: Return a custom object when backend!=plotly. Return an ndarray when subplots=True (matplotlib-only).

Examples

Basic plot.

For Series:

>>> import pyspark.pandas as ps
>>> s = ps.Series([1, 3, 2])
>>> s.plot.hist()

For DataFrame:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(
...     np.random.randint(1, 7, 6000),
...     columns=['one'])
>>> df['two'] = df['one'] + np.random.randint(1, 7, 6000)
>>> df = ps.from_pandas(df)
>>> df.plot.hist(bins=12, alpha=0.5)