题目要求:
使用SQL进行数据分析
1、 SQL数据分析
考核条件如下:
(1) 环境中已经安装mysql5.7,用户名root,密码123456,开启数据库服务
在Linux命令行中输入以下命令:
service mysqld start
在做题之前,还需要做一些准备工作,即使用navicat连接竞赛方提供的mysql数据库,并按照题目要求导入数据表并对导入的表进行相关处理。
①使用navicat连接mysql
配置“常规”:
配置“SSH”:
配置完成后,点击“测试连接”,若d出“连接成功”提示信息则已经建立与服务器中mysql数据库的连接,点击“确定”。
②导入数据表并对导入的表进行相关处理
新建bigdata数据库:
新建一个查询,在查询中输入以下代码
create database bigdata;
use bigdata;
导入数据表:
此处要注意字段类型的设置
出现finished successfully提示,说明数据已经导入成功,点击“关闭”。
接下来按照题目要求,在刚才新建的查询中输入以下代码并执行
-- 添加日期时间datetime列:
alter TABLE UserBehavior add datetime datetime;
-- 数据转换:
update UserBehavior set datetime = FROM_UNIXTIME(timestamp);
-- 添加日期字段:
alter TABLE UserBehavior add date date;
-- 数据转换:
update UserBehavior set date = FROM_UNIXTIME(timestamp,'%y-%m-%d');
(2) 分析日浏览量pv,计算浏览最活跃的日期及其pv数,结果存入视图table1。(字段:date,pv)
create view table1
as
select date,count(*) as pv
from userbehavior
where type='pv'
group by date
order by pv desc
limit 1;
(3) 分析日访客量uv,计算访客最多的日期及其uv数,结果存入视图table2。(字段:date,uv ,注意用户去重)
create view table2
as
select date ,count(distinct user_id) as uv
from userbehavior
group by date
order by uv desc;
(4) 分析人均访问量,计算人均访问量最多的的日期及其人均访问数perpv,结果存入视图table3。(字段:date,perpv,计算方式参考“关键指标”)
create view table3
as
select date ,count(*)/count(distinct user_id) as perpv
from
where type = 'pv'
group by date
order by perpv desc;
(5) 计算全量数据购买转化率(购买/浏览),结果存入视图table4。(字段: pv,buy,buypv)
create view table4
as
select sum(case when type='pv' then 1 else 0 end) as pv,
sum(case when type='buy' then 1 else 0 end) as buy,
sum(case when type='buy' then 1 else 0 end) /sum(case when type='pv' then 1 else 0 end) as buypv
from UserBehavior;
(6) 计算全量数据加购转化率(加购/浏览),结果存入视图table5。(字段:pv,cart,cartpv)
create view table5
as
select sum(case when type='pv' then 1 else 0 end) as pv,
sum(case when type='cart' then 1 else 0 end) as cart,
sum(case when type='cart' then 1 else 0 end) /sum(case when type='pv' then 1 else 0 end) as cartpv
from UserBehavior;
(7) 计算全量数据收藏转化率(收藏/浏览),结果存入视图table6。(字段:pv,fav,favpv)
create view table6
as
select sum(case when type='pv' then 1 else 0 end) as pv,
sum(case when type='fav' then 1 else 0 end) as fav,
sum(case when type='fav' then 1 else 0 end) /sum(case when type='pv' then 1 else 0 end) as favpv
from UserBehavior;
(8) 查看全量数据中浏览次数最多的五个商品及浏览次数,结果写入视图table7。(字段:item_id,sum)
create view table7
as
select item_id,count(item_id) as sum
from UserBehavior
where type = 'pv'
group by item_id
order by sum desc
limit 5;
(9) 根据上述结果查看浏览次数最多的五个产品的购买转化率buypv,结果写入视图table8。(字段:item_id,pv,buy,buypv)
create view table8
as
select item_id,
sum(case when type='pv' then 1 else 0 end) as pv,
sum(case when type='buy' then 1 else 0 end) as buy,
sum(case when type='buy' then 1 else 0 end) /sum(case when type='pv' then 1 else 0 end) as buypv
from UserBehavior
group by item_id
order by pv desc limit 5;
(10) 找出复购次数最高的用户top5和及其复购次数times,结果写入视图table9。(字段:user_id,times)
create view table9
as
select user_id,count(user_id) as times
from UserBehavior
where type = 'buy'
group by user_id
order by times desc limit 5;
(11) 计算上述分析结果,针对复购top1用户,分析其购买商品购买次数times,结果写入视图table10。(字段:item_id,times)
create view table10
as
select item_id,count(item_id) as times
from UserBehavior
where user_id = 1003983 -- 此处id由第10题查询中得到
group by item_id
order by times desc;
以上就是第四届智警杯中国刑警学院校内选拔赛mysql部分的全部内容
另:附数据表的下载链接
链接:https://pan.baidu.com/s/11A8-QsnbSWas0Gvj-XkgnA?pwd=z1tk
提取码:z1tk
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)