MySQLSQL极简教程 · MySQL · MyBatis · JPA 技术笔记 教程 总结

【MySQL】浅谈MySQL的LOAD DATA

2019-10-19  本文已影响0人  loannes
mysql_001.jpg

前言

好久没碰MySQL了,这次碰巧在研究superset的时候需要将一份csv格式的数据文件导入到数据库中。正好借此机会可以重温下MySQL。

数据来源
网盘密码 : g5xa

开发环境

准备工作

在开始之前需要对源数据做一次清洗:


在这之前先简单阅读下官方文档:MySQL Documentation

可以很方便的找到LOAD DATA的表达式:

LOAD DATA
    [LOW_PRIORITY | CONCURRENT] [LOCAL]
    INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var
        [, col_name_or_user_var] ...)]
    [SET col_name={expr | DEFAULT},
        [, col_name={expr | DEFAULT}] ...]

Get Started

我们知道导入数据首先得有表,所以在开始导入之前,我们先建个表。这个简单:

mysql> create table da_inv( 
id int not null auto_increment primary key, 
city varchar(20) not null, 
companyFullName varchar(255) not null, companyId varchar(100) not null, companyLabelList varchar(255) not null default '', 
companyShortName varchar(100) not null default '', 
companySize varchar(100) not null, businessZones varchar(255) not null, 
firstType varchar(100) not null, 
secondType varchar(100) not null, 
education varchar(100) not null, 
industruField varchar(100) not null,
positionId varchar(100) not null, positionAdvantage varchar(255) not null, positionName varchar(255) not null, positonLabels varchar(255) not null, 
salary varchar(100) not null, 
workYear varchar(100) not null, 
topSalary varchar(100) not null, 
bottomSalary varchar(100) not null, averageSalary varchar(100) not null);

ok...
那么问题来了,表我们建好了,数据也准备好了 。

带着这两个问题继续查看文档:

If LOCAL is not specified, the file must be located on the server host and is read directly by the server.

我们先照着上面这句的方式,把文件放在MySQL Server中:

mysql> select @@datadir;
+-----------------------+
| @@datadir             |
+-----------------------+
| /usr/local/var/mysql/ |
+-----------------------+
1 row in set (0.01 sec)

ok.. 把数据放到对应的数据库目录下

cd /usr/local/var/mysql/inv_data
➜  inv_data  mkdir data

为了区分,我在这里新建了个data的文件夹,把数据放到这里了。

mysql> use inv_data;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>
load data infile 'inv_data/data/DataAnalyst.csv' into table da_inv fields terminated by ',' lines terminated by '\r\n';
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement

报了个错误,emmm
MySQL还加了什么限制吗?看下文档:

mysql> select @@secure_file_priv;
+--------------------+
| @@secure_file_priv |
+--------------------+
| NULL |
+--------------------+
1 row in set (0.00 sec)
secure_file_priv may be set as follows: 
*   If empty, the variable has no effect. This is not a secure setting.

*   If set to the name of a directory, the server limits import and export operations to work only with files in that directory. The directory must exist; the server will not create it.

*   If set to `NULL`, the server disables import and export operations.

文档中解释到由于secure_file_privNULL,无法进行导入导出操作。我们需要把它改为指定路径。

通过修改my.cnf文件,找到mysqld设置secure-file-priv/usr/local/var/mysql/inv_data

mdfind -name "my.cnf"
/private/etc/my.cnf

找到mysqld,添加secure-file-priv = /usr/local/var/mysql/inv_data

[mysqld]
# skip-grant-tables
character-set-server=utf8
init_connect='SET NAMES utf8
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
character-set-server=utf8
init_connect='SET NAMES utf8'
secure-file-priv = /usr/local/var/mysql/inv_data

保存并重启mysql

mysql.server restart
Shutting down MySQL
..... SUCCESS!
Starting MySQL
.... SUCCESS!


mysql> load data infile 'inv_data/data/DataAnalyst.csv' into table da_inv fields terminated by ',' lines terminated by '\r\n';

Query OK, 6876 rows affected (0.80 sec)
Records: 6876  Deleted: 0  Skipped: 0  Warnings: 0

搞定!

对于LOAD DATA的工作原理和一些思考

先读文档:LOCAL INFILE Request

可以看到下面这张图表:


inline_umlgraph_49.png

基本上可以这么解读:

问题来了

客户端如果如果想要发送文件必须先知道这个文件的完整路径,但是在本例中我们只告诉了客户端inv_data/data/DataAnalyst.csv这样不完整的路径。所以需要服务端再返回给客户端该文件的详细地址才能把文件发送过去。就好比快递员接受的你的订单,但是你给他的地址并不齐全这个时候快递员就会打电话过来问你要具体的地址然后才能把快递寄到你家里。

MySQL Server的原理解释清楚了,回到一开始的问题:

如何发送本地文件到MySQL Server

If LOCAL is specified, the file is read by the client program on the client host and sent to the server. The file can be given as a full path name to specify its exact location. If given as a relative path name, the name is interpreted relative to the directory in which the client program was started.

load data local infile '/Users/weijiezhu/Desktop/DataAnalyst.txt' into table da_inv fields terminated by ',' lines terminated by '\r\n';
ERROR 1148 (42000): The used command is not allowed with this MySQL version

发现并不能顺利使用,反而告诉我这条命令被禁掉了。
可以看这篇文档 Security Issues with LOAD DATA LOCAL

For the mysql client, local data loading is disabled by default. To disable or enable it explicitly, use the --local-infile=0 or --local-infile[=1] option.

这里按照官方文档设置为1没用,不知道什么原因。改为ON就没问题了

SET GLOBAL local_infile = 'ON';
load data local infile '/Users/weijiezhu/Desktop/DataAnalyst.txt' into table da_inv fields terminated by ',' lines terminated by '\r\n';
Query OK, 0 rows affected, 6876 warnings (0.25 sec)
Records: 6876  Deleted: 0  Skipped: 6876  Warnings: 6876

ok 搞定了!

导入本地文件的总结

上一篇下一篇

猜你喜欢

热点阅读