什么叫抽取的更快?和什么比更快?你现在是怎么做的?
数据库性能是和很多因素有关的:
想要数据库响应的快,首先要有好的服务器。
如果数据库是在远程服务器上,还要有充足和流畅的带宽网络。
合理安排表的结构,建立索引。
针对你这个,800万条数据如果在一个表里,要有个整数型的ID作为主键,并做索引。如果数据是从不同的表里抽出来再组合起来的,表与表之间的链接键尽量用整数型并做索引。
然后生成10000个随机数,在ID里查找这1万个数字,取出对应的数据。
处理过程放到数据库端。
针对你这个,10000个随机数的生成函数用存储过程的形式存在服务器端。
MYSQL 取随机数2010年04月26日 星期一 09:48
mysql 取随机数
--对一个表取任意随机数
SELECT *
FROM TMP_XF_TEST
WHERE ID >= (SELECT FLOOR(RAND() * (SELECT MAX(ID) FROM TMP_XF_TEST)))
order by id LIMIT 1
--有条件性的取随机数
SELECT *
FROM TMP_XF_TEST
WHERE ID >= (SELECT FLOOR(RAND() *
((SELECT MAX(ID) FROM TMP_XF_TEST WHERE GID = 9) -
(SELECT MIN(ID) FROM TMP_XF_TEST WHERE GID = 9))) +
(SELECT MIN(ID) FROM TMP_XF_TEST WHERE GID = 9))
AND GID = 9
ORDER BY ID LIMIT 1
--gid上存在索引
或者
SELECT *
FROM TMP_XF_TEST AS t1 JOIN
(SELECT ROUND(RAND() * ((SELECT MAX(id) FROM TMP_XF_TEST WHERE GID = 9)-(SELECT MIN(id) FROM TMP_XF_TEST WHERE GID = 9))
+(SELECT MIN(id) FROM TMP_XF_TEST WHERE GID = 9)) AS id) AS t2
WHERE t1.id >= t2.id AND t1.GID = 9
ORDER BY t1.id LIMIT 1
#########
不要用下面的杯具写法
mysql>insert into tmp_xf_test(user_nick,gid,item_id,gmt_create,gmt_modified,memo)
->select user_nick,gid,item_id,gmt_create,gmt_modified,memo from tmp_xf_test
Query OK, 165888 rows affected (9.65 sec)
Records: 165888 Duplicates: 0 Warnings: 0
mysql>SELECT *
->FROM `tmp_xf_test`
->WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `tmp_xf_test` )
->ORDER BY id LIMIT 1
+-----+-----------+-----+---------+---------------------+---------------------+--------------------+
| id | user_nick | gid | item_id | gmt_create | gmt_modified| memo |
+-----+-----------+-----+---------+---------------------+---------------------+--------------------+
| 467 | 玄风 | 9 | 123 | 2010-04-26 14:56:39 | 2010-04-26 14:56:39 | 玄风测试使用的数据 |
+-----+-----------+-----+---------+---------------------+---------------------+--------------------+
1 row in set (51.12 sec)
mysql>explain SELECT *
->FROM `tmp_xf_test`
->WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `tmp_xf_test` )
->ORDER BY id LIMIT 1\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: tmp_xf_test
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: UNCACHEABLE SUBQUERY
table: tmp_xf_test
type: index
possible_keys: NULL
key: idx_tmp_xf_test_gid
key_len: 4
ref: NULL
rows: 331954
Extra: Using index
2 rows in set (0.01 sec)
---
mysql>SELECT * FROM `tmp_xf_test` t1 join
->(SELECT FLOOR( MAX(id) * RAND()) as id FROM `tmp_xf_test` ) as t2
->where t1.id >=t2.id
->ORDER BY t1.id LIMIT 1
+-------+-----------+-----+---------+---------------------+---------------------+--------------------+-------+
| id| user_nick | gid | item_id | gmt_create | gmt_modified| memo | id|
+-------+-----------+-----+---------+---------------------+---------------------+--------------------+-------+
| 40311 | 玄风 | 9 | 123 | 2010-04-28 15:47:19 | 2010-04-28 15:47:19 | 玄风测试使用的数据 | 40311 |
+-------+-----------+-----+---------+---------------------+---------------------+--------------------+-------+
1 row in set (0.14 sec)
##############
mysql>SELECT * FROM `tmp_xf_test`
->WHERE id >= (SELECT floor(RAND() * (SELECT MAX(id) FROM `tmp_xf_test`)))
->ORDER BY id LIMIT 1
+------+-----------+-----+---------+---------------------+---------------------+--------------------+
| id | user_nick | gid | item_id | gmt_create | gmt_modified| memo |
+------+-----------+-----+---------+---------------------+---------------------+--------------------+
| 1352 | 玄风 | 9 | 123 | 2010-04-28 15:47:19 | 2010-04-28 15:47:19 | 玄风测试使用的数据 |
+------+-----------+-----+---------+---------------------+---------------------+--------------------+
1 row in set (0.00 sec)
mysql>explain SELECT * FROM `tmp_xf_test`
->WHERE id >= (SELECT floor(RAND() * (SELECT MAX(id) FROM `tmp_xf_test`)))
->ORDER BY id LIMIT 1\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: tmp_xf_test
type: index
possible_keys: NULL
key: PRIMARY
key_len: 8
ref: NULL
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 3
select_type: SUBQUERY
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Select tables optimized away
2 rows in set, 1 warning (0.00 sec)
对应的另外一种杯具写法是:
SELECT *
FROM TMP_XF_TEST
WHERE ID >= (SELECT FLOOR(RAND() * (MAX(ID) - MIN(ID))) + MIN(ID) MID
FROM TMP_XF_TEST
WHERE GID = 9)
AND GID = 9 LIMIT 1
欢迎分享,转载请注明来源:内存溢出
评论列表(0条)