How do I sum the totals for the same year? (pandas)

白鹭 - 2022-03-08

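I'm learning data analysis in Python with pandas.

I have a DataFrame of game sales that looks like this:

(This data is not real; it is only for the purpose of the question.)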

Name                Year    Publisher   Total Sales
GTA V               2013    Rockstar    133000
Super Mario Bros    1985    Nintendo    430500
GTA VI              2025    Rockstar    86000
RDR 3               2025    Rockstar    129030
Super Mario Sister  1985    Nintendo    308900
Super Mario End     2000    Nintendo    112100

Then I dropped the Name column and grouped the data by publisher with the following commands:

df.drop(columns='Name', inplace=True)
df.groupby(['Publisher','Year','Total Sales']).sum().reset_index()

The DataFrame now looks like this:

Publisher   Year    Total Sales
Nintendo    1985    308900
Nintendo    1985    430500
Nintendo    2000    112100
Rockstar    2013    133000
Rockstar    2025    129030
Rockstar    2025    86000

That's fine, but I want to sum the total sales for the same publisher in the same year.

I'd like the DataFrame to look like this:

Publisher   Year    Total Sales
Nintendo    1985    739400
Nintendo    2000    112100
Rockstar    2013    133000
Rockstar    2025    215030

Is there a way to do this?

Here is the code for my df:

import pandas as pd

# Note: Year and Total Sales are stored as strings here
data = {'Name':['GTA V','Super Mario Bros','GTA VI','RDR 3','Super Mario Sister','Super Mario End'],'Year':['2013','1985','2025','2025','1985','2000'],
        'Publisher':['Rockstar','Nintendo','Rockstar','Rockstar','Nintendo','Nintendo'],'Total Sales':['133000','430500','86000','129030','308900','112100']}

df = pd.DataFrame(data)
df

Answer from a uj5u.com user:

Use pivot_table:

>>> df.pivot_table('Total Sales', ['Year', 'Publisher'], aggfunc='sum').reset_index()

   Year Publisher  Total Sales
0  1985  Nintendo       739400
1  2000  Nintendo       112100
2  2013  Rockstar       133000
3  2025  Rockstar       215030

Note: if the Total Sales column contains strings, convert it to int (or float) first:

>>> df.astype({'Total Sales': int}).pivot_table(...)
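A minimal, self-contained sketch of the pivot_table approach with the conversion applied, using the sample data from the question. The names numeric_df and result are only illustrative, and the elided arguments above are filled in with the same ones used in the first call; treat this as one possible way to write it rather than the only one.

```python
import pandas as pd

data = {
    'Name': ['GTA V', 'Super Mario Bros', 'GTA VI', 'RDR 3',
             'Super Mario Sister', 'Super Mario End'],
    'Year': ['2013', '1985', '2025', '2025', '1985', '2000'],
    'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar',
                  'Nintendo', 'Nintendo'],
    'Total Sales': ['133000', '430500', '86000', '129030',
                    '308900', '112100'],
}
df = pd.DataFrame(data)

# Convert the string column to integers so that 'sum' adds numbers.
# astype returns a new DataFrame; df itself is left unchanged.
numeric_df = df.astype({'Total Sales': int})

# One row per (Year, Publisher) pair, with the sales summed.
result = (
    numeric_df
    .pivot_table(values='Total Sales', index=['Year', 'Publisher'], aggfunc='sum')
    .reset_index()
)
print(result)
```

Year is left as a string here, as in the question; converting it too (for example with astype({'Year': int})) would only matter if the years need to be sorted or compared numerically.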

Answer from a uj5u.com user:

import pandas as pd

data = {'Name':['GTA V','Super Mario Bros','GTA VI','RDR 3','Super Mario Sister','Super Mario End'],'Year':['2013','1985','2025','2025','1985','2000'],
        'Publisher':['Rockstar','Nintendo','Rockstar','Rockstar','Nintendo','Nintendo'],'Total Sales':['133000','430500','86000','129030','308900','112100']}

df = pd.DataFrame(data)
df['Total Sales'] = df['Total Sales'].astype(int)


df.groupby(['Year', 'Publisher'])['Total Sales'].agg('sum').reset_index()

Answer from a uj5u.com user:

Here is one way:

df.drop(columns='Name', inplace=True)
df['Total Sales'] = pd.to_numeric(df['Total Sales'])
df2 = df.groupby(['Publisher','Year']).sum().reset_index()
df2
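For context, the grouping in the question left the rows unmerged most likely because Total Sales was itself one of the group keys, so every row formed its own group. The sketch below illustrates this, assuming the sales are already numeric (unlike the strings in the question); by_three and by_two are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Publisher': ['Rockstar', 'Nintendo', 'Rockstar', 'Rockstar',
                  'Nintendo', 'Nintendo'],
    'Year': [2013, 1985, 2025, 2025, 1985, 2000],
    # Assumption: numeric values, not the strings used in the question.
    'Total Sales': [133000, 430500, 86000, 129030, 308900, 112100],
})

# Grouping by all three columns: each (Publisher, Year, Total Sales)
# combination is unique, so no rows are combined.
by_three = df.groupby(['Publisher', 'Year', 'Total Sales']).size()
print(len(by_three))  # 6 groups, one per original row

# Grouping by Publisher and Year only, then summing Total Sales,
# collapses the rows that share a publisher and a year.
by_two = df.groupby(['Publisher', 'Year'], as_index=False)['Total Sales'].sum()
print(by_two)
```

Passing as_index=False keeps Publisher and Year as ordinary columns, which has the same effect as calling reset_index() afterwards.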