Merging a dataframe and a series on 2 columns using Pandas/Python
I am using Python/pandas and have the dataframe (1) below. I have grouped it by id and taken the max of the revision number in each group of revisions against each id to produce the series (2) below.
I want to merge (1) with (2) in such a way that I match the first 2 columns of (1) with the corresponding columns of (2), pulling the other column into (2) appropriately [in the full data set of (1), 'id', 'revision' and 'colour' are not consecutive columns, and there are other columns].
I am treating (2) as the key and pulling in the appropriate data from (1).
How do I do this using pandas?
Thanks in advance.
Max.
(1) dataframe
id     revision  colour
14446  0         red
14446  0         red
14446  0         red
14466  1         red
14466  1         red
14466  0         red
14466  1         red
14466  1         red
14466  0         red
14466  2         red
14466  0         red
14466  1         red
14466  0         red
14471  0         green
14471  0         green
14471  0         green
14471  0         green
14473  0         blue
14473  1         blue
14473  0         blue
(2) series
id     revision
13125  1
13213  0
13266  0
13276  0
13277  1
13278  0
13280  2
13285  0
13287  1
13288  0
13291  1
13292  1
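For reference, the merge described in the question could look roughly like the sketch below. It assumes 'id' and 'revision' are ordinary columns of (1), uses a small subset of the rows from (1) as sample data, and rebuilds (2) with a groupby; the names df, max_rev, key and merged are illustrative, not from the original post.

```python
import pandas as pd

# Small subset of dataframe (1); the real frame has more rows and columns.
df = pd.DataFrame({
    "id": [14446, 14466, 14466, 14466, 14471, 14473, 14473],
    "revision": [0, 1, 2, 0, 0, 0, 1],
    "colour": ["red", "red", "red", "red", "green", "blue", "blue"],
})

# Series (2): the maximum revision per id, indexed by id.
max_rev = df.groupby("id")["revision"].max()

# Turn the series into a two-column key frame with columns id and revision.
key = max_rev.reset_index()

# Match on both columns. Dropping duplicate (id, revision) pairs from (1)
# first keeps one row per key, since (1) repeats those pairs.
merged = key.merge(
    df.drop_duplicates(["id", "revision"]),
    on=["id", "revision"],
    how="left",
)
print(merged)
```

Using how="left" keeps every row of the key frame even if (1) has no matching row, in which case the pulled-in columns would be NaN.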
Sort by revision, group by id, and take the last element of each group.
In [2]: df.sort_values('revision').groupby(level=0).last()
Out[2]:
       revision colour
id
14446         0    red
14466         2    red
14471         0  green
14473         1   blue
I assumed id is the index. If it's a column, use groupby('id') instead.
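A self-contained sketch of the column case, assuming a recent pandas version and a small subset of the rows from (1) as sample data. The names latest and latest_rows are illustrative. The second variant, based on idxmax, is an alternative not mentioned above; it may be preferable when whole original rows are wanted, because groupby(...).last() picks the last non-null value per column rather than a single row.

```python
import pandas as pd

# Small subset of dataframe (1), with id as an ordinary column.
df = pd.DataFrame({
    "id": [14446, 14466, 14466, 14466, 14471, 14473, 14473],
    "revision": [0, 1, 2, 0, 0, 0, 1],
    "colour": ["red", "red", "red", "red", "green", "blue", "blue"],
})

# Sort by revision, group by the id column, take the last entry per group.
# The result is indexed by id.
latest = df.sort_values("revision").groupby("id").last()
print(latest)

# Alternative: select the original row holding the maximum revision per id.
latest_rows = df.loc[df.groupby("id")["revision"].idxmax()]
print(latest_rows)
```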