使用sqlalchemy将csv文件加载到数据库中

使用sqlalchemy将csv文件加载到数据库中,第1张

使用sqlalchemy将csv文件加载到数据库中

由于SQLAlchemy的强大功能,我还在项目中使用了它。它的强大功能来自于与数据库“对话”的面向对象的方式,而不是硬编码难以管理的SQL语句。更不用说,它也快很多。

坦率地回答您的问题,是的!使用SQLAlchemy将数据从CSV存储到数据库中简直是小菜一碟。这是一个完整的工作示例(我使用了SQLAlchemy
1.0.6和Python 2.7.6):

from numpy import genfromtxtfrom time import timefrom datetime import datetimefrom sqlalchemy import Column, Integer, Float, Datefrom sqlalchemy.ext.declarative import declarative_basefrom sqlalchemy import create_enginefrom sqlalchemy.orm import sessionmakerdef Load_Data(file_name):    data = genfromtxt(file_name, delimiter=',', skip_header=1, converters={0: lambda s: str(s)})    return data.tolist()base = declarative_base()class Price_History(base):    #Tell SQLAlchemy what the table name is and if there's any table-specific arguments it should know about    __tablename__ = 'Price_History'    __table_args__ = {'sqlite_autoincrement': True}    #tell SQLAlchemy the name of column and its attributes:    id = Column(Integer, primary_key=True, nullable=False)     date = Column(Date)    opn = Column(Float)    hi = Column(Float)    lo = Column(Float)    close = Column(Float)    vol = Column(Float)if __name__ == "__main__":    t = time()    #Create the database    engine = create_engine('sqlite:///csv_test.db')    base.metadata.create_all(engine)    #Create the session    session = sessionmaker()    session.configure(bind=engine)    s = session()    try:        file_name = "t.csv" #sample CSV file used:  http://www.google.com/finance/historical?q=NYSE%3AT&ei=W4ikVam8LYWjmAGjhoHACw&output=csv        data = Load_Data(file_name)        for i in data: record = Price_History(**{     'date' : datetime.strptime(i[0], '%d-%b-%y').date(),     'opn' : i[1],     'hi' : i[2],     'lo' : i[3],     'close' : i[4],     'vol' : i[5] }) s.add(record) #Add all the records        s.commit() #Attempt to commit all the records    except:        s.rollback() #Rollback the changes on error    finally:        s.close() #Close the connection    print "Time elapsed: " + str(time() - t) + " s." #0.091s

(注意:这不一定是执行此 *** 作的“最佳”方法,但我认为这种格式对于初学者来说可读性很强;它也非常快:插入251条记录时为0.091秒!)

我认为,如果您逐行进行 *** 作,您会发现使用起来很轻松。注意缺少SQL语句-哎呀!我还随意使用numpy在两行中加载CSV内容,但是如果您愿意,也可以不使用它。

如果您想与传统方式进行比较,请参考以下完整示例:

import sqlite3import timefrom numpy import genfromtxtdef dict_factory(cursor, row):    d = {}    for idx, col in enumerate(cursor.description):        d[col[0]] = row[idx]    return ddef Create_DB(db):          #Create DB and format it as needed    with sqlite3.connect(db) as conn:        conn.row_factory = dict_factory        conn.text_factory = str        cursor = conn.cursor()        cursor.execute("CREATE TABLE [Price_History] ([id] INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE, [date] DATE, [opn] FLOAT, [hi] FLOAT, [lo] FLOAT, [close] FLOAT, [vol] INTEGER);")def Add_Record(db, data):    #Insert record into table    with sqlite3.connect(db) as conn:        conn.row_factory = dict_factory        conn.text_factory = str        cursor = conn.cursor()        cursor.execute("INSERT INTO Price_History({cols}) VALUES({vals});".format(cols = str(data.keys()).strip('[]'),          vals=str([data[i] for i in data]).strip('[]')         ))def Load_Data(file_name):    data = genfromtxt(file_name, delimiter=',', skiprows=1, converters={0: lambda s: str(s)})    return data.tolist()if __name__ == "__main__":    t = time.time()    db = 'csv_test_sql.db' #Database filename     file_name = "t.csv" #sample CSV file used:  http://www.google.com/finance/historical?q=NYSE%3AT&ei=W4ikVam8LYWjmAGjhoHACw&output=csv    data = Load_Data(file_name) #Get data from CSV    Create_DB(db) #Create DB    #For every record, format and insert to table    for i in data:        record = {     'date' : i[0],     'opn' : i[1],     'hi' : i[2],     'lo' : i[3],     'close' : i[4],     'vol' : i[5] }        Add_Record(db, record)    print "Time elapsed: " + str(time.time() - t) + " s." #3.604s

(注意:即使以“旧”方式,这也绝不是最好的方法,但是它非常易读,并且是SQLAlchemy方式与“旧”方式的“一对一”转换。)

注意SQL语句:一个创建表,另一个插入记录。此外,请注意,与添加简单的类属性相比,维护长的SQL字符串要麻烦一些。到目前为止喜欢SQLAlchemy?

当然,对于您的外键查询。SQLAlchemy也具有执行此 *** 作的能力。这是一个带有外键分配的类属性的示例(假设

ForeignKey
该类也已从
sqlalchemy
模块中导入):

class Asset_Analysis(base):    #Tell SQLAlchemy what the table name is and if there's any table-specific arguments it should know about    __tablename__ = 'Asset_Analysis'    __table_args__ = {'sqlite_autoincrement': True}    #tell SQLAlchemy the name of column and its attributes:    id = Column(Integer, primary_key=True, nullable=False)     fid = Column(Integer, ForeignKey('Price_History.id'))

将“ fid”列指向Price_History的id列的外键。

希望有帮助!



欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/5639607.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-12-16
下一篇 2022-12-16

发表评论

登录后才能评论

评论列表(0条)

保存