Using pandas

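pandas is built around two core data structures: Series and DataFrame.

A Series is a single column; a DataFrame is a two-dimensional table made up of a collection of Series.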
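To make the relationship between the two structures concrete, here is a minimal sketch. The variable names (apples, oranges, fruit, column) are chosen only for illustration, and pd.concat is just one of several ways to combine Series into a DataFrame.

```python
import pandas as pd

# A Series is a single labelled column of values.
apples = pd.Series([3, 2, 0, 1], name="apples")
oranges = pd.Series([0, 3, 7, 2], name="oranges")

# Placing Series side by side (axis=1) gives a DataFrame;
# each Series becomes one column, named after the Series.
fruit = pd.concat([apples, oranges], axis=1)

# Selecting a single column from a DataFrame gives back a Series.
column = fruit["apples"]
print(type(column).__name__)  # Series
print(fruit.shape)            # (4, 2): 4 rows, 2 columns
```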

Creating a DataFrame:

import pandas as pd
data = {
    'apples': [3, 2, 0, 1],
    'oranges': [0, 3, 7, 2]
}
purchases = pd.DataFrame(data)
print(purchases)
   apples  oranges
0       3        0
1       2        3
2       0        7
3       1        2

Here the index is created automatically, starting from 0.

purchases = pd.DataFrame(data, index=['June', 'Robert', 'Lily', 'David'])
print(purchases)
        apples  oranges
June         3        0
Robert       2        3
Lily         0        7
David        1        2

Here the index is specified manually.
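A custom index is mostly useful for looking rows up by label. The sketch below rebuilds the same purchases table and compares label-based selection with .loc against position-based selection with .iloc; the names by_label and by_position are illustrative only.

```python
import pandas as pd

data = {
    "apples": [3, 2, 0, 1],
    "oranges": [0, 3, 7, 2],
}
purchases = pd.DataFrame(data, index=["June", "Robert", "Lily", "David"])

# .loc selects a row by its index label.
by_label = purchases.loc["June"]

# .iloc selects a row by its integer position, regardless of the labels.
by_position = purchases.iloc[0]

print(by_label["apples"], by_label["oranges"])  # 3 0
print(by_label.equals(by_position))             # True: same row either way
```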

Reading data:

data = pd.read_csv('my_file.csv')
data = pd.read_json('purchases.json')
data = pd.read_csv('my_file.csv', index_col="Title")  # use the Title column as the index
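The effect of index_col is easier to see with a small self-contained example. The CSV text below is made up for illustration (it stands in for my_file.csv, whose real contents are not shown here) and is read from an in-memory buffer instead of a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for my_file.csv.
csv_text = "Title,Year,Rating\nAlpha,2001,7.5\nBeta,2010,8.1\n"

# Default: pandas generates a 0-based integer index and keeps Title as a column.
default_index = pd.read_csv(io.StringIO(csv_text))

# index_col="Title": the Title column becomes the row labels instead.
title_index = pd.read_csv(io.StringIO(csv_text), index_col="Title")

print(list(default_index.columns))      # ['Title', 'Year', 'Rating']
print(list(title_index.columns))        # ['Year', 'Rating']
print(title_index.loc["Beta", "Year"])  # 2010
```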

Print a few rows for reference:

data.head()   # print the first 5 rows
data.head(x)  # print the first x rows
data.tail()   # print the last 5 rows
data.tail(x)  # print the last x rows
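As a quick check of what these calls return, the following sketch uses a hypothetical ten-row table named numbers whose single column holds the values 1 to 10, so the selected rows are easy to read off.

```python
import pandas as pd

# Hypothetical table: one column "n" holding 1..10, one value per row.
numbers = pd.DataFrame({"n": range(1, 11)})

print(numbers.head()["n"].tolist())   # [1, 2, 3, 4, 5]
print(numbers.head(3)["n"].tolist())  # [1, 2, 3]
print(numbers.tail()["n"].tolist())   # [6, 7, 8, 9, 10]
print(numbers.tail(2)["n"].tolist())  # [9, 10]
```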

pandas can also connect to a database (MySQL is used as the example here):

from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('mysql+pymysql://root:12345678@localhost:3306/testdb')
sql_query = 'select * from product;'
df_read = pd.read_sql_query(sql_query, engine)
print(df_read)
df_write = pd.DataFrame({'id': [10, 27, 34, 46], 'name': ['张三', '李四', '王五', '赵六'], 'score': [80, 75, 56, 99]})
# Store the DataFrame as a table in MySQL, without writing the index as a column
df_write.to_sql('testdf', engine, index=False)
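The MySQL example needs a running server and credentials. To try the same write-then-read round trip without one, a sketch along the following lines should work, with the caveat that it swaps MySQL for an in-memory SQLite database from Python's standard library. The WHERE clause and the if_exists="replace" argument are additions for illustration and are not part of the example above.

```python
import sqlite3
import pandas as pd

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")

df_write = pd.DataFrame({
    "id": [10, 27, 34, 46],
    "name": ["张三", "李四", "王五", "赵六"],
    "score": [80, 75, 56, 99],
})

# index=False keeps the DataFrame index out of the table.
# if_exists="replace" overwrites an existing table; the default ("fail") raises an error.
df_write.to_sql("testdf", conn, index=False, if_exists="replace")

# Read back only the rows with a score of at least 75.
df_read = pd.read_sql_query(
    "SELECT * FROM testdf WHERE score >= 75 ORDER BY id", conn
)
print(df_read)

conn.close()
```

Only the connection object differs from the MySQL version; the to_sql and read_sql_query calls are the same.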

Writing data back:

df.to_csv('test.csv')
df.to_json('test.json')
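One detail worth knowing when writing CSV: by default the index is written as an extra, unnamed first column. The sketch below uses a small made-up DataFrame and calls to_csv without a path, which returns the CSV text as a string, so the difference is visible without touching the disk.

```python
import io
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({"id": [10, 27], "score": [80, 75]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_with_index = df.to_csv()
csv_without_index = df.to_csv(index=False)

print(csv_with_index.splitlines()[0])     # ,id,score  (leading empty index header)
print(csv_without_index.splitlines()[0])  # id,score

# Reading the index-free CSV back reproduces the original table.
restored = pd.read_csv(io.StringIO(csv_without_index))
print(restored.equals(df))                # True
```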
  • Copyright: Copyright is owned by the author. For commercial reprints, please contact the author for authorization. For non-commercial reprints, please indicate the source.
  • Copyrights © 2023 J-sycamore