Problem description
I have a DataFrame like the one shown below:
import pandas as pd

# Create an example DataFrame about a fictional army
raw_data = {'regiment': ['nighthawks', 'nighthawks', 'nighthawks', 'nighthawks'],
            'company': ['1st', '1st', '2nd', '2nd'],
            'deaths': ['kkk', 52, '25', 616],
            'battles': [5, '42', 2, 2],
            'size': ['l', 'll', 'l', 'm']}
df = pd.DataFrame(raw_data, columns=['regiment', 'company', 'deaths', 'battles', 'size'])
My goal is to convert every string in the DataFrame to upper case.
Note: all columns have dtype object, and this must not change; the output must contain only objects. I would like to avoid converting each column one by one and instead do it over the whole DataFrame at once, if possible.
What I have tried so far is the following, without success:
df.str.upper()
Recommended answer
df.apply() passes each column to the lambda as a Series. astype(str) casts that Series to strings, and the .str accessor then exposes vectorized string methods, so upper() is applied to every element. Note that after this, every column has dtype object, and values that were numbers are now strings.
In [17]: df
Out[17]:
     regiment company deaths battles size
0  nighthawks     1st    kkk       5    l
1  nighthawks     1st     52      42   ll
2  nighthawks     2nd     25       2    l
3  nighthawks     2nd    616       2    m

In [18]: df.apply(lambda x: x.astype(str).str.upper())
Out[18]:
     regiment company deaths battles size
0  NIGHTHAWKS     1ST    KKK       5    L
1  NIGHTHAWKS     1ST     52      42   LL
2  NIGHTHAWKS     2ND     25       2    L
3  NIGHTHAWKS     2ND    616       2    M
You can convert individual columns back to a numeric dtype afterwards with to_numeric():
In [42]: df2 = df.apply(lambda x: x.astype(str).str.upper())

In [43]: df2['battles'] = pd.to_numeric(df2['battles'])

In [44]: df2
Out[44]:
     regiment company deaths  battles size
0  NIGHTHAWKS     1ST    KKK        5    L
1  NIGHTHAWKS     1ST     52       42   LL
2  NIGHTHAWKS     2ND     25        2    L
3  NIGHTHAWKS     2ND    616        2    M

In [45]: df2.dtypes
Out[45]:
regiment    object
company     object
deaths      object
battles      int64
size        object
dtype: object
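If the goal is to leave non-string values (such as the integers in deaths and battles) as they are rather than turning them into strings and converting back, one possible alternative is to upper-case only the values that are already str. The following is a minimal sketch and not part of the original answer; the helper name upper_if_str is hypothetical, and it assumes the same example data as in the question.

```python
import pandas as pd

# Same example data as in the question.
raw_data = {
    'regiment': ['nighthawks', 'nighthawks', 'nighthawks', 'nighthawks'],
    'company': ['1st', '1st', '2nd', '2nd'],
    'deaths': ['kkk', 52, '25', 616],
    'battles': [5, '42', 2, 2],
    'size': ['l', 'll', 'l', 'm'],
}
df = pd.DataFrame(raw_data, columns=['regiment', 'company', 'deaths', 'battles', 'size'])


def upper_if_str(value):
    """Hypothetical helper: upper-case str values, return anything else unchanged."""
    return value.upper() if isinstance(value, str) else value


# Apply the helper element-wise, one column (Series) at a time.
df_upper = df.apply(lambda col: col.map(upper_if_str))

print(df_upper)
```

With this approach the integer 52 stays an integer while the string '25' stays a string, so no to_numeric() step is needed afterwards. The trade-off is that it runs a Python function per element, which will likely be slower than the vectorized .str.upper() on large frames.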