Business Statistics 6: Presenting Data with Charts - Qualitative Data

Contents

  • Qualitative data
    • Pie chart
    • Bar chart


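Qualitative data

  • Pie chart
    A circle divided into wedges, where the size of each wedge is proportional to the relative frequency (or percentage) of its category.
    Typically used for nominal and ordinal data.
    • Pros and cons
      Pros: intuitive; shows each category's share of the whole.
      Cons: unsuitable for many categories; cannot compare data across groups; some information is lost, such as the maximum value.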
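The wedge sizes described above come straight from relative frequencies. The following is a minimal sketch of that arithmetic; the `grades` sample is hypothetical and is not part of the article's data set.

```python
import pandas as pd

# Hypothetical categorical sample (an assumption for illustration only)
grades = pd.Series(["A", "B", "A", "C", "B", "A", "A", "C"], name="grade")

# Absolute frequency of each category
counts = grades.value_counts().sort_index()

# Relative frequency = count / total; this is the share each wedge represents
rel_freq = counts / counts.sum()

# Central angle of each wedge in degrees
angles = rel_freq * 360

summary = pd.DataFrame({"count": counts, "rel_freq": rel_freq, "angle_deg": angles})
print(summary)
```

Because the relative frequencies always sum to 1, the absolute counts cannot be read back from the chart alone, which is the information loss listed among the cons.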
# -*- coding: utf-8 -*-
# Python
# Build the data set
import math
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

def lis():
    # 50 random integers in [70, 130); the fixed seed makes every call return the same array
    np.random.seed(110)
    return np.random.randint(low=70, high=130, size=50)

def group():
    # Number of groups: smallest k with 2**k >= n, plus 1
    k = 1
    while math.pow(2, k) < len(lis()):
        k += 1
    return k + 1

def gap_group():
    # Class width: range divided by the number of groups, rounded up
    span = max(lis()) - min(lis())
    if span % group() == 0:
        return span // group()
    else:
        return span // group() + 1

def lis_group_str():
    # Class boundaries (lower and upper limits)
    return [min(lis()) + gap_group() * i for i in range(group() + 1)]

def flag2(df):
    # Label each row with its class as 'index-lower-upper'
    nodes = lis_group_str()
    for n, k in enumerate(nodes):
        try:
            df.loc[df['Random'] >= k, 'flag'] = str(n) + '-' + str(nodes[n]) + '-' + str(nodes[n+1])
        except IndexError:
            # The last boundary has no upper neighbour; values equal to it stay in the last class
            df.loc[df['Random'] >= k, 'flag'] = str(n-1) + '-' + str(nodes[n-1]) + '-' + str(nodes[n])
    return df

df = flag2(pd.DataFrame(lis(), columns=['Random']))
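
The same binning rule can also be written without looping over the boundaries. The sketch below is one possible vectorized version, not the article's own method; the function name `bin_labels` is hypothetical, and it assumes integer data whose values are not all identical (otherwise the class width would be zero).

```python
import numpy as np
import pandas as pd

def bin_labels(values):
    """Return (edges, labels) using the grouping rule from the code above."""
    values = np.asarray(values)
    n = len(values)
    # Number of groups: smallest k with 2**k >= n, plus one
    k = 1
    while 2 ** k < n:
        k += 1
    groups = k + 1
    lo, hi = int(values.min()), int(values.max())
    # Class width: range divided by the number of groups, rounded up
    width = -(-(hi - lo) // groups)
    edges = [lo + width * i for i in range(groups + 1)]
    # Bin index of each value; a value equal to the last edge goes into the last bin
    idx = np.minimum((values - lo) // width, groups - 1)
    labels = [f"{i}-{edges[i]}-{edges[i + 1]}" for i in idx]
    return edges, labels

np.random.seed(110)
data = np.random.randint(low=70, high=130, size=50)
edges, labels = bin_labels(data)
binned = pd.DataFrame({"Random": data, "flag": labels})
print(binned.groupby("flag").size())
```

Computing the bin index by integer division touches each value once, whereas the loop version overwrites the label of a row once for every boundary it passes.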

help(pd.DataFrame.plot)
df.plot(x=None, y=None, kind='line', ax=None, subplots=False, sharex=None, sharey=False, layout=None, figsize=None, use_index=True, title=None, grid=None, legend=True, style=None, logx=False, logy=False, loglog=False, xticks=None, yticks=None, xlim=None, ylim=None, rot=None, fontsize=None, colormap=None, table=False, yerr=None, xerr=None, secondary_y=False, sort_columns=False, **kwds)

help(plt.pie)
pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6, shadow=False, labeldistance=1.1, startangle=None, radius=None, counterclock=True, wedgeprops=None, textprops=None, center=(0, 0), frame=False, rotatelabels=False, hold=None, data=None)

# autopct, explode
df.groupby(by='flag', as_index=True, sort=True).count().plot(y='Random', kind='pie', autopct='%.1f%%', explode=(0, 0.1, 0, 0, 0, 0, 0), figsize=(10, 8), title='Pie Demo', shadow=True)
plt.legend(loc=3)       
plt.show()

[Figure 1: pie chart produced by the code above]

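  • Bar chart (horizontal, vertical, clustered, stacked)
    Uses the lengths of equal-width bars to represent the size of each measure.
    Suitable for nominal and ordinal data.
    Pros and cons:

    • Pros: intuitive; can compare several data sets.
    • Cons: only suitable for qualitative variables.

    Compared with a histogram: the bars of a bar chart may be separated by gaps, while the bars of a histogram must be contiguous; bar chart -> qualitative data, histogram -> quantitative data.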
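
To make the contrast with a histogram concrete, here is a small side-by-side sketch. Both samples (`colors` and `heights`) are hypothetical and exist only for illustration; they are not taken from the article.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical samples (assumptions for illustration only)
colors = pd.Series(["red", "blue", "red", "green", "blue", "red"])  # qualitative
rng = np.random.default_rng(0)
heights = rng.normal(loc=170, scale=8, size=200)                    # quantitative

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: one bar per category, drawn with gaps between the bars
counts = colors.value_counts().sort_index()
ax_bar.bar(counts.index, counts.values, width=0.6)
ax_bar.set_title("Bar chart (qualitative)")

# Histogram: adjacent equal-width bins covering a continuous numeric range
n, bin_edges, _ = ax_hist.hist(heights, bins=10, edgecolor="black")
ax_hist.set_title("Histogram (quantitative)")

fig.tight_layout()
```

Call `plt.show()` afterwards to display the figure. The order of the bars in the left panel is arbitrary and could be changed freely, while the bins in the right panel follow the numeric axis and cannot be reordered.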

def df2():
    dic = {'Things 1':{2001: 38, 2002: 40, 2003:43, 2004:42, 2005:46},
           'Things 2':{2001: 40, 2002: 40, 2003:36, 2004:44, 2005:52},
           'Things 3':{2001: 78, 2002: 73, 2003:77, 2004:43, 2005:40}}
    return pd.DataFrame(dic)

df1 = df.groupby(by='flag', as_index=True, sort=True).count()

fig = plt.figure(figsize=(10, 12), dpi=80)
ax1 = fig.add_subplot(2, 3, 1)
df1.plot(ax=ax1, kind='bar', y='Random', use_index=True, title='Bar Demo', align='center', fontsize=6)

ax2 = fig.add_subplot(2, 3, 2)
df1.plot(ax=ax2, kind='barh', y='Random', use_index=True, title='Bar Demo', align='center')

ax3 = fig.add_subplot(2, 3, 3)
df2().plot(ax=ax3, kind='bar', use_index=True, title='Clustered bar')

ax4 = fig.add_subplot(2, 3, 4)
df2().plot(ax=ax4, kind='barh', use_index=True, title='Clustered barh')

ax5 = fig.add_subplot(2, 3, 5)
df2().plot(ax=ax5, kind='bar', use_index=True, stacked=True, title='Stacked bar')

ax6 = fig.add_subplot(2, 3, 6)
df2().plot(ax=ax6, kind='barh', use_index=True, stacked=True, title='Stacked barh')

plt.show()

[Figure 2: bar charts produced by the code above]

  • Appendix 1. df.plot() parameters
  • Appendix 2. plt.bar
>>> help(plt.bar)
Help on function bar in module matplotlib.pyplot:

bar(*args, **kwargs)
    Make a bar plot.
    
    Call signatures::
    
       bar(x, height, *, align='center', **kwargs)
       bar(x, height, width, *, align='center', **kwargs)
       bar(x, height, width, bottom, *, align='center', **kwargs)
    
    The bars are positioned at *x* with the given *align* ment. Their
    dimensions are given by *width* and *height*. The vertical baseline
    is *bottom* (default 0).
    
    Each of *x*, *height*, *width*, and *bottom* may either be a scalar
    applying to all bars, or it may be a sequence of length N providing a
    separate value for each bar.
    
    
    Parameters
    ----------
    x : sequence of scalars
        The x coordinates of the bars. See also *align* for the
        alignment of the bars to the coordinates.
    
    height : scalar or sequence of scalars
        The height(s) of the bars.
    
    width : scalar or array-like, optional
        The width(s) of the bars (default: 0.8).
    
    bottom : scalar or array-like, optional
        The y coordinate(s) of the bars bases (default: 0).
    
    align : {'center', 'edge'}, optional, default: 'center'
        Alignment of the bars to the *x* coordinates:
    
        - 'center': Center the base on the *x* positions.
        - 'edge': Align the left edges of the bars with the *x* positions.
    
        To align the bars on the right edge pass a negative *width* and
        ``align='edge'``.
    
    Returns
    -------
    `.BarContainer`
        Container with all the bars and optionally errorbars.
    
    Other Parameters
    ----------------
    color : scalar or array-like, optional
        The colors of the bar faces.
    
    edgecolor : scalar or array-like, optional
        The colors of the bar edges.
    
    linewidth : scalar or array-like, optional
        Width of the bar edge(s). If 0, don't draw edges.
    
    tick_label : string or array-like, optional
        The tick labels of the bars.
        Default: None (Use default numeric labels.)
    
    xerr, yerr : scalar or array-like of shape(N,) or shape(2,N), optional
        If not *None*, add horizontal / vertical errorbars to the bar tips.
        The values are +/- sizes relative to the data:
    
        - scalar: symmetric +/- values for all bars
        - shape(N,): symmetric +/- values for each bar
        - shape(2,N): separate + and - values for each bar
    
        Default: None
    
    ecolor : scalar or array-like, optional, default: 'black'
        The line color of the errorbars.
    
    capsize : scalar, optional
       The length of the error bar caps in points.
       Default: None, which will take the value from
       :rc:`errorbar.capsize`.
    
    error_kw : dict, optional
        Dictionary of kwargs to be passed to the `~.Axes.errorbar`
        method. Values of *ecolor* or *capsize* defined here take
        precedence over the independent kwargs.
    
    log : bool, optional, default: False
        If *True*, set the y-axis to be log scale.
    
    orientation : {'vertical',  'horizontal'}, optional
        *This is for internal use only.* Please use `barh` for
        horizontal bar plots. Default: 'vertical'.
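
The `bottom` and `tick_label` parameters documented above are enough to build a stacked bar chart by hand. The following is a minimal sketch that reuses the 'Things 1' and 'Things 2' values from `df2()`; the variable names are chosen here for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

years = [2001, 2002, 2003, 2004, 2005]
things_1 = np.array([38, 40, 43, 42, 46])  # 'Things 1' values from df2()
things_2 = np.array([40, 40, 36, 44, 52])  # 'Things 2' values from df2()

fig, ax = plt.subplots(figsize=(6, 4))
x = np.arange(len(years))

# First layer sits on the default baseline (bottom=0)
lower = ax.bar(x, things_1, width=0.8, align="center",
               tick_label=[str(y) for y in years], label="Things 1")

# Second layer starts where the first one ends
upper = ax.bar(x, things_2, width=0.8, bottom=things_1, label="Things 2")

ax.set_title("Stacked bar via bottom")
ax.legend()
```

Call `plt.show()` afterwards to display the figure. This is roughly what `stacked=True` in `df.plot(kind='bar')` achieves, with the baseline of each layer set explicitly.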
    
