`TypeError: '_LocIndexer' object does not support item assignment` when updating column with computation results

Hi! New to dask here.

I have the following code using pandas:

```python
import re
from string import punctuation

import numpy as np
import pandas as pd
from nltk.corpus import stopwords
from unidecode import unidecode


# Translation table that replaces every punctuation character with a space.
punct_remover = str.maketrans(punctuation, len(punctuation)*" ")

def replace_accented_chars(s):
    return s if s == "" else unidecode(s).translate(punct_remover)

def clean_white_chars(s):
    # str.replace does not interpret regular expressions, so re.sub is used
    # to collapse runs of whitespace into a single space.
    return s if s == "" else re.sub(r"\s+", " ", s).strip()

def convert_to_list(s):
    if s == "": return (0, list())

    the_list = s.split()
    return (len(the_list), the_list)

def filter_stopwords(tup):
    if tup[0] == 0: return tup

    return (tup[0], list(filter(lambda x: x not in stopwords.words("portuguese"), tup[1])))

def rejoin_string(tup):
    return "" if tup[0] == 0 else " ".join(tup[1])

# np.nan is used directly because the pd.np alias was removed in pandas 2.0.
df = pd.DataFrame(data={
    "column1": [np.nan, "isto é um teste", "isto é um teste", "isto é um teste", "isto é um teste"],
    "column2": ["isto é um teste", np.nan, "isto é um teste", "isto é um teste", "isto é um teste"],
    "column3": ["isto é um teste", "isto é um teste", np.nan, "isto é um teste", "isto é um teste"],
    "column4": ["isto é um teste", "isto é um teste", "isto é um teste", np.nan, "isto é um teste"],
    "column5": ["isto é um teste", "isto é um teste", "isto é um teste", "isto é um teste", np.nan],
    # more columns here...
})

# above DataFrame
#            column1          column2          column3          column4          column5
# 0              NaN  isto é um teste  isto é um teste  isto é um teste  isto é um teste
# 1  isto é um teste              NaN  isto é um teste  isto é um teste  isto é um teste
# 2  isto é um teste  isto é um teste              NaN  isto é um teste  isto é um teste
# 3  isto é um teste  isto é um teste  isto é um teste              NaN  isto é um teste
# 4  isto é um teste  isto é um teste  isto é um teste  isto é um teste              NaN


df.loc[:, ["column1", "column2", "column3", "column4", "column5"]] = (
    df[["column1", "column2", "column3", "column4", "column5"]]
    .fillna("")
    .applymap(replace_accented_chars)
    .applymap(clean_white_chars)
    .applymap(convert_to_list)
    .applymap(filter_stopwords)
    .applymap(rejoin_string)
)

# resulting DataFrame from above
#   column1 column2 column3 column4 column5
# 0           teste   teste   teste   teste
# 1   teste           teste   teste   teste
# 2   teste   teste           teste   teste
# 3   teste   teste   teste           teste
# 4   teste   teste   teste   teste
```

I was trying to rewrite the above using dask, so I tried the following (not rewriting all of it to avoid clutter):


```python
import dask.dataframe as dd

# df = pd.DataFrame...

df_dask = dd.from_pandas(df, npartitions=3)

df_dask.loc[:, ["column1", "column2", "column3", "column4", "column5"]] = (df_dask[["column1", "column2", "column3", "column4", "column5"]]
    .fillna("")
    .applymap(strproc1)
    .applymap(strproc2)
    .applymap(strproc3)
    .applymap(strproc4)
    .applymap(strproc5))
```

which gave me the error:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-112-e2c3c73cfa35> in <module>()
      5     .applymap(strproc3)
      6     .applymap(strproc4)
----> 7     .applymap(strproc5))

TypeError: '_LocIndexer' object does not support item assignment
```

I was wondering whether this is a bug or a feature and, if I'm doing it wrong, how I should do it instead.

Thanks!

Issue Analytics

Created: 6 years ago
Comments: 9 (2 by maintainers)

Top GitHub Comments

quantumds commented, May 7, 2018 (7 reactions)

I have already resolved my question. The way to write the conditional assignment is: `ddf['target'] = ddf.target.where(ddf.target >= 1, 1)`

mrocklin commented, Jun 6, 2017 (2 reactions)

You can use the assign method if you like. In-place insertions are a bit error-prone in parallel and distributed settings.
