`TypeError: '_LocIndexer' object does not support item assignment` when updating column with computation results

Hi! New to dask here.

I have the following code using pandas:

import re

import numpy as np
import pandas as pd

from nltk.corpus import stopwords  # needs the NLTK "stopwords" corpus to be downloaded
from string import punctuation
from unidecode import unidecode


# Translation table that maps every punctuation character to a space.
punct_remover = str.maketrans(punctuation, len(punctuation)*" ")

# Load the stopword list once instead of on every call.
pt_stopwords = set(stopwords.words("portuguese"))

def replace_accented_chars(s):
    return s if s == "" else unidecode(s).translate(punct_remover)

def clean_white_chars(s):
    # str.replace() does not interpret regular expressions, so use re.sub().
    return s if s == "" else re.sub(r"\s+", " ", s).strip()

def convert_to_list(s):
    if s == "": return (0, list())
    
    the_list = s.split()
    return (len(the_list), the_list)

def filter_stopwords(tup):
    if tup[0] == 0: return tup
    
    return (tup[0], [w for w in tup[1] if w not in pt_stopwords])
    
def rejoin_string(tup):
    return "" if tup[0] == 0 else " ".join(tup[1])

# pd.np was removed in pandas 2.0, so NaN comes from numpy directly.
df = pd.DataFrame(data={
    "column1": [np.nan, "isto é um teste", "isto é um teste", "isto é um teste", "isto é um teste"],
    "column2": ["isto é um teste", np.nan, "isto é um teste", "isto é um teste", "isto é um teste"],
    "column3": ["isto é um teste", "isto é um teste", np.nan, "isto é um teste", "isto é um teste"],
    "column4": ["isto é um teste", "isto é um teste", "isto é um teste", np.nan, "isto é um teste"],
    "column5": ["isto é um teste", "isto é um teste", "isto é um teste", "isto é um teste", np.nan],
    # more columns here...
})

# above DataFrame
#            column1          column2          column3          column4          column5
# 0              NaN  isto é um teste  isto é um teste  isto é um teste  isto é um teste
# 1  isto é um teste              NaN  isto é um teste  isto é um teste  isto é um teste
# 2  isto é um teste  isto é um teste              NaN  isto é um teste  isto é um teste
# 3  isto é um teste  isto é um teste  isto é um teste              NaN  isto é um teste
# 4  isto é um teste  isto é um teste  isto é um teste  isto é um teste              NaN


df.loc[:, ["column1", "column2", "column3", "column4", "column5"]] = (
    df[["column1", "column2", "column3", "column4", "column5"]]
    .fillna("")
    .applymap(replace_accented_chars)
    .applymap(clean_white_chars)
    .applymap(convert_to_list)
    .applymap(filter_stopwords)
    .applymap(rejoin_string)
)

# resulting DataFrame from above
#   column1 column2 column3 column4 column5
# 0           teste   teste   teste   teste
# 1   teste           teste   teste   teste
# 2   teste   teste           teste   teste
# 3   teste   teste   teste           teste
# 4   teste   teste   teste   teste           

I was trying to rewrite the above using dask, so I tried the following (not rewriting all of it to avoid clutter):

import dask.dataframe as dd

# df = pd.DataFrame...

df_dask = dd.from_pandas(df, npartitions=3)

# strproc1 ... strproc5 are the names used in the original post; they are not defined there.
df_dask.loc[:, ["column1", "column2", "column3", "column4", "column5"]] = (
    df_dask[["column1", "column2", "column3", "column4", "column5"]]
    .fillna("")
    .applymap(strproc1)
    .applymap(strproc2)
    .applymap(strproc3)
    .applymap(strproc4)
    .applymap(strproc5)
)

gave me the error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-112-e2c3c73cfa35> in <module>()
      5     .applymap(strproc3)
      6     .applymap(strproc4)
----> 7     .applymap(strproc5))

TypeError: '_LocIndexer' object does not support item assignment
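
For context, this is the generic message Python raises when an object is used on the left side of an indexed assignment but its class defines no `__setitem__`. The sketch below reproduces the same kind of error with a hypothetical `ReadOnlyIndexer` class; it is not dask's actual `_LocIndexer`, only an illustration of an indexer that supports reads but not writes.

```python
class ReadOnlyIndexer:
    """Hypothetical stand-in for an indexer that only supports reads."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # Reading works because __getitem__ is defined.
        return self._data[key]

    # No __setitem__ is defined, so `obj[key] = value` raises TypeError.


loc = ReadOnlyIndexer({"column1": "isto é um teste"})
value = loc["column1"]

try:
    loc["column1"] = "teste"
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

If dask's `.loc` behaves like this stand-in, selection through `.loc` works while assignment through it does not, which matches the traceback above.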

I was wondering whether this is a bug or a feature and, if I'm doing it wrong, how I should do it instead.

Thanks!

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments:9 (2 by maintainers)

Top GitHub Comments

7 reactions
quantumds commented, May 7, 2018

I've already resolved my question. The way to write the conditional assignment is: `ddf['target'] = ddf.target.where(ddf.target >= 1, 1)`

2 reactions
mrocklin commented, Jun 6, 2017

You can use the assign method if you like. In-place insertions are a bit error-prone in parallel and distributed settings.
