Analyzing Your Web Browsing Behavior with Python: See What You Actually Do Online (2) - 木庄网络博客


import plotly.graph_objects as go

# Plot a bar chart ranking sites by visit frequency.
# url_simplification, convert_to_number and get_top_k_from_dict are
# project helpers that are not shown in this listing.
def plot_bar_website_count_rank(value, history_data):
  # Frequency dictionary: simplified URL -> number of visits
  dict_data = {}
  # Iterate over the history records
  for data in history_data:
    url = data[1]
    # Simplify the URL
    key = url_simplification(url)
    if key in dict_data:
      dict_data[key] += 1
    else:
      # The first occurrence counts as one visit
      dict_data[key] = 1
  # Keep the k most frequent entries
  k = convert_to_number(value)
  top_k_dict = get_top_k_from_dict(dict_data, k)

  figure = go.Figure(
    data=[
      go.Bar(
        x=list(top_k_dict.keys()),
        y=list(top_k_dict.values()),
        name='bar',
        marker=go.bar.Marker(
          color='rgb(55, 83, 109)'
        )
      )
    ],
    layout=go.Layout(
      showlegend=False,
      margin=go.layout.Margin(l=40, r=0, t=40, b=30),
      paper_bgcolor='rgba(0,0,0,0)',
      plot_bgcolor='rgba(0,0,0,0)',
      xaxis=dict(title='Site'),
      yaxis=dict(title='Visits')
    )
  )

  return figure

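The function works as follows:

First, iterate over history_data (the result of parsing the database file), take the URL from each record, and simplify it with url_simplification(url). Each simplified URL is then tallied in the dictionary.
Call get_top_k_from_dict(dict_data, k) to pick the k entries with the highest counts from dict_data.
Next, draw the bar chart with go.Bar(), where x and y are lists holding the categories and their corresponding values. xaxis and yaxis set the titles of the respective axes.
Finally, return a figure object so that it can be passed to the frontend.
The assets directory contains images and CSS, both of which are used for the frontend layout.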
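
The counting and top-k steps described above can also be sketched with the standard library alone. This is only an illustration, not the project's code: simplify_url is a hypothetical stand-in for url_simplification (here it assumes that "simplifying" means keeping only the host name), and top_k_sites is a hypothetical stand-in for the dictionary loop plus get_top_k_from_dict. As in the article's code, the URL is assumed to sit at index 1 of each record.

```python
from collections import Counter
from urllib.parse import urlparse


def simplify_url(url):
    """Hypothetical stand-in for url_simplification: keep only the host name."""
    return urlparse(url).netloc or url


def top_k_sites(history_data, k):
    """Count visits per simplified URL and return the k most frequent as a dict."""
    # Each record is assumed to hold the URL at index 1, as in the article's code
    counts = Counter(simplify_url(row[1]) for row in history_data)
    # most_common(k) returns (site, count) pairs sorted by descending count
    return dict(counts.most_common(k))


history = [(1, 'https://a.com/x'), (2, 'https://a.com/y'), (3, 'https://b.com/')]
print(top_k_sites(history, 1))  # {'a.com': 2}
```

Counter starts every key at zero and adds one per occurrence, so the first visit to a site is counted as 1.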

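5. Backend Deployment

The file involved in backend deployment is app_callback.py. It uses callbacks to update the frontend page layout.

First, let's look at the callback for the page visit frequency ranking: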

# Page visit frequency ranking
@app.callback(
  dash.dependencies.Output('graph_website_count_rank', 'figure'),
  [
    dash.dependencies.Input('input_website_count_rank', 'value'),
    dash.dependencies.Input('store_memory_history_data', 'data')
  ]
)
def update(value, store_memory_history_data):

  # The history file was received correctly
  if store_memory_history_data:
    history_data = store_memory_history_data['history_data']
    figure = plot_bar_website_count_rank(value, history_data)
    return figure
  else:
    # Skip updating the page
    raise dash.exceptions.PreventUpdate("cancel the callback")

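The function works as follows:

First, decide what the inputs are (the data that triggers the callback), what the output is (the data the callback produces), and what data needs to be carried along. dash.dependencies.Input marks data that triggers the callback: dash.dependencies.Input('input_website_count_rank', 'value') means the callback fires whenever the value of the component with id input_website_count_rank changes. Because store_memory_history_data is declared as an Input as well, a newly uploaded history file also triggers it. The return value of update(value, store_memory_history_data) is written to the figure property of the component with id graph_website_count_rank. Put simply, it replaces the chart that component displays.

Now for update(value, store_memory_history_data) itself. It first checks that store_memory_history_data is not empty, then reads history_data from it, then calls plot_bar_website_count_rank() from the app_plot.py file discussed earlier and returns the resulting figure object to the frontend. If the store is empty, it raises PreventUpdate so the page is left unchanged. At this point the frontend displays the page visit frequency ranking chart.
One more thing worth covering is the file upload process. Here is the code first: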

import base64
import random
import time
from os import makedirs
from os.path import exists

# Upload callback
@app.callback(
  dash.dependencies.Output('store_memory_history_data', 'data'),
  [
    dash.dependencies.Input('dcc_upload_file', 'contents')
  ]
)
def update(contents):
  if contents is not None:
    # contents arrives as '<content type>,<base64 payload>'; split on the first comma only
    content_type, content_string = contents.split(',', 1)
    # Base64-decode the file uploaded by the client
    decoded = base64.b64decode(content_string)
    # Append a suffix (random digits plus a timestamp) to the file name
    # so that uploads are unlikely to overwrite each other
    suffix = [str(random.randint(0, 100)) for i in range(10)]
    suffix = "".join(suffix)
    suffix = suffix + str(int(time.time()))
    # Final file name
    file_name = 'History_' + suffix
    # Create the directory that holds uploaded files
    if not exists('data'):
      makedirs('data')

    # Path of the file to write
    path = 'data' + '/' + file_name

    # Write the file to local disk
    with open(file=path, mode='wb+') as f:
      f.write(decoded)

    # Read the file from disk with sqlite
    # Get the browsing history data
    history_data = get_history_data(path)

    # Get the search keyword data
    search_word = get_search_word(path)

    # Check whether the data was read correctly
    date_time = time.strftime('%Y-%m-%d %H:%M:%S')
    if history_data != 'error':
      # Parsed successfully
      print('Received new data from a client, data valid, time: {}'.format(date_time))
      store_data = {'history_data': history_data, 'search_word': search_word}
      return store_data
    else:
      # Parsing failed
      print('Received new data from a client, data invalid, time: {}'.format(date_time))
      return None
  return None

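The function works as follows:

First, check that the uploaded contents is not empty, then base64-decode the file uploaded by the client. A suffix is appended to the file name so that uploads do not overwrite each other, and the file is then written to local disk.
Once the file is written, it is read with sqlite. If it is read correctly, the parsed data is returned; otherwise None is returned.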
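
The decode, save, and validate steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: save_upload and read_urls are hypothetical names, uuid is swapped in for the random-digits-plus-timestamp suffix, and the query assumes the uploaded file is a Chrome-style History database with a urls table containing url and visit_count columns (the project's own get_history_data is not shown here and may query something different).

```python
import base64
import sqlite3
import uuid
from contextlib import closing
from pathlib import Path


def save_upload(contents, directory='data'):
    """Decode a 'data:<mime>;base64,<payload>' string and save it under a unique name."""
    # Split on the first comma only: the header comes first, the payload after it
    _header, payload = contents.split(',', 1)
    decoded = base64.b64decode(payload)
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    # uuid4 gives a practically collision-free suffix (an alternative to random digits)
    path = folder / f'History_{uuid.uuid4().hex}'
    path.write_bytes(decoded)
    return path


def read_urls(path):
    """Return (url, visit_count) rows, or None if the file is not a usable database."""
    try:
        with closing(sqlite3.connect(str(path))) as conn:
            # Assumed schema: a 'urls' table with 'url' and 'visit_count' columns
            return conn.execute('SELECT url, visit_count FROM urls').fetchall()
    except sqlite3.Error:
        # Covers both "file is not a database" and "no such table"
        return None
```

Returning None for an unreadable file mirrors the article's callback, which stores nothing when parsing fails.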

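How to Run

Online demo: http://39.106.118.77:8090 (a modest server, please do not load-test it)

Running the program is simple. Just run the following commands:

# Change into the project directory
cd directory_name
# First uninstall the dependencies
pip uninstall -y -r requirement.txt
# Then reinstall the dependencies
pip install -r requirement.txt
# Start the program
python app.py
# Once it is running, open http://localhost:8090 in a browser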

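Additional Notes

The complete source code is hosted on GitHub and can be downloaded from there if you need it.

The project is updated continuously. Feel free to star it.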

