import pandas as pd
DataFrame Initialization
from dictionary
It is perhaps easiest to construct a DataFrame object from a list of dictionaries, one dictionary per row. I prefer a list of dictionaries over a dictionary of dictionaries because pandas then numbers the rows 0, 1, 2, ... automatically, so the indices are guaranteed to be unique.
import numpy as np
data=[]
for run in runs:
    data.append({
        'energy':energy_of_run,
        'dynamic':True,  # or False
        'force':np.array([Fx,Fy,Fz]),
        'param1':param1_of_run,
        'param2':param2_of_run,
        'zpe':zpe,
        # ...
    })
df=pd.DataFrame(data)  # each dictionary becomes one row, each key one column
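As a minimal runnable sketch of the construction above, the snippet below fills the list with two made-up runs. The `runs` tuples and every number in them are hypothetical placeholders, not results from a real calculation.

```python
import numpy as np
import pandas as pd

# Hypothetical runs: (energy, dynamic, force, param1, zpe); all values are made up.
runs = [
    (-1.5, True, (0.0, 0.1, -0.2), 10, 0.05),
    (0.3, False, (0.4, 0.0, 0.0), 20, 0.07),
]

data = []
for energy, dynamic, force, param1, zpe in runs:
    data.append({
        'energy': energy,
        'dynamic': dynamic,
        'force': np.array(force),  # the whole array is stored in a single cell
        'param1': param1,
        'zpe': zpe,
    })

df = pd.DataFrame(data)  # one row per dictionary, one column per key
print(df.shape)          # (2, 5)
print(list(df.index))    # [0, 1]
```

The row labels come from the position of each dictionary in the list, which is why they cannot collide.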
Data selection
select rows at integer positions idx1, idx2, idx3
df.iloc[ [idx1,idx2,idx3] ]
select rows by name
df.loc[ [row1,row2,row3] ]
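select columns 1,4 by position (df[ [1,4] ] only works if the columns are labeled with the integers 1 and 4)
df.iloc[ :, [1,4] ]
select columns by label, which is more readable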
df[ ["energy","param1"] ]
select rows whose column entries satisfy some constraint
df[ df["energy"]<0 ]
same as above, but with multiple constraints
df[ (df["energy"]<0) & (df['dynamic']==True) ]
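The selection idioms in this section can be tried on a small table. This is only a sketch: the row labels `run_a`, `run_b`, `run_c` and all values are invented for illustration.

```python
import pandas as pd

# Made-up table with string row labels, for illustration only.
df = pd.DataFrame(
    {
        'energy': [-1.5, 0.3, -0.7],
        'dynamic': [True, False, False],
        'param1': [10, 20, 30],
    },
    index=['run_a', 'run_b', 'run_c'],
)

by_position = df.iloc[[0, 2]]            # rows at positions 0 and 2
by_label = df.loc[['run_a', 'run_c']]    # the same rows, selected by label
two_columns = df[['energy', 'param1']]   # columns by label
same_columns = df.iloc[:, [0, 2]]        # the same columns, by position
negative = df[df['energy'] < 0]          # boolean mask on one column
negative_dynamic = df[(df['energy'] < 0) & df['dynamic']]  # two conditions

print(list(negative.index))          # ['run_a', 'run_c']
print(list(negative_dynamic.index))  # ['run_a']
```

Each comparison needs its own parentheses because `&` binds more tightly than `<` in Python.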
Data Modification
add a column to the DataFrame by combining two columns
df['E+ZPE']=df['energy']+df['zpe']
combine dataframes
combineddf=pd.concat([df1,df2])
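drop row (drop returns a new DataFrame, so assign the result)
df=df.drop('row name')
drop column (axis must be passed by keyword in pandas 2.0 and later)
df=df.drop('column name',axis=1)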
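The modification steps above can be chained as in the following sketch. The two small frames and their values are hypothetical, and `ignore_index=True` is an addition not used in the notes above: without it, `pd.concat` keeps the original row labels, so rows from both frames would share labels.

```python
import pandas as pd

# Two made-up frames with the same columns.
df1 = pd.DataFrame({'energy': [-1.5, 0.3], 'zpe': [0.05, 0.07]})
df2 = pd.DataFrame({'energy': [-0.7], 'zpe': [0.02]})

combined = pd.concat([df1, df2], ignore_index=True)  # rows renumbered 0, 1, 2
combined['E+ZPE'] = combined['energy'] + combined['zpe']

without_row = combined.drop(index=0)        # new frame without row 0
without_col = combined.drop(columns='zpe')  # new frame without the 'zpe' column

print(list(without_row.index))    # [1, 2]
print(list(without_col.columns))  # ['energy', 'E+ZPE']
print(combined.shape)             # (3, 3)
```

`drop(index=...)` and `drop(columns=...)` are keyword forms of the same method and leave `combined` itself untouched.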