调用上述函数得到每个部门下影响最大的类目:
final=max_lm(result,df)
如果我们想把结果导出到excel表中,可以这样写:
writer=pd.ExcelWriter('output_today.xlsx')result.to_excel(writer, sheet_name='max_departments')final.to_excel(writer, sheet_name='max_itemcats')writer.save()
总结
准备好python开发环境,即 “代码编辑器+ipython”;
写代码前清晰定义实现最终目标的每个步骤;
写代码的过程中识别做事的模式,创造工具简化复用性操作;
利用for循环遍历列表。
最终代码
import pandas as pd import datetimetoday="%s"%(datetime.date.today())yesterday="%s"%(datetime.date.today()-datetime.timedelta(days=1))qiantian="%s"%(datetime.date.today()-datetime.timedelta(days=2))print("昨天和前天的日期分别是:")print(yesterday,qiantian)df=pd.read_excel('/Users/xiangzhendong/Downloads/income_data.xls')def max_change(df, rows): df_pivot=df.pivot_table('income',index=rows,columns='date',aggfunc='sum') df_pivot['change_amt']=df_pivot[yesterday]-df_pivot[qiantian] df_pivot['change_pct']=abs(df_pivot['change_amt']/df_pivot['change_amt'].sum()) return df_pivot[df_pivot['change_pct']>=0.1]result=max_change(df,'department')print(result)def max_lm(result,df): frames=[] for i in range(len(result)): result_lm=max_change(df[df['department']==result.index[i]],'itemcats') result_lm['department']=result.index[i]