import re
import pandas as pd
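
def squeeze(value: str, replace=" ") -> list:
    """Extract (date, order count) pairs from a whitespace-separated string."""
    # Replace each run of whitespace/control characters (\x00-\x20) with `replace`.
    cleaned = re.sub(r"[\x00-\x20]+", replace, value).strip()
    # [:10] keeps at most the first 10 dates, so zip() below yields at most 10 pairs.
    dates = re.findall(r"\d{4}-\d{2}-\d{2}", cleaned)[:10]
    # Capture the two digits that immediately precede "万" (units of ten thousand).
    orders = re.findall(r"(\d{2})万", cleaned)
    return list(zip(dates, orders))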

if __name__ == '__main__':
    # Do not rebind the built-in name `str`; a single variable is enough.
    s = "2020-07-15 37万 2020-07-16 30万 2020-07-17 31万 2020-07-18 32万 2020-07-19 33万 2020-07-20 34万 2020-07-21 33万 2020-07-22 32万 2020-07-23 38万 2020-07-24 39万 2020-07-25 40万 2020-07-26 41万 2020-07-27 42万 2020-07-28 41万 2020-07-29 40万 2020-07-30 43万"
    ret = squeeze(s)
    print(ret)
    # Build a two-column DataFrame and save it to the output path.
    name = ['date', 'order_cnt']
    test = pd.DataFrame(columns=name, data=ret)
    test.to_csv('D:/applied data/test.csv', encoding='gbk')