
Keyword "Apache Spark" shows up on Warehouse as two keywords "Apache", "Spark"


Here’s how I set my project’s keywords:

keywords=['Apache Spark'],

So my intention was one keyword, “Apache Spark”. However, here on Warehouse this shows up as two keywords, “Apache” and “Spark”.

Is this intentional?

I know that setup() also accepts a single string for keywords, instead of a list of strings, as follows:

keywords='Apache Spark',

In this case I would expect the keywords to be interpreted as whitespace-delimited – that is, two keywords, “Apache” and “Spark”, as they currently are on Warehouse.
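
Putting the two forms side by side (a minimal setup.py sketch; the project name and version are hypothetical):

from setuptools import setup

setup(
    name="example-project",  # hypothetical name for illustration
    version="0.1",
    # List form: the intent is one two-word keyword, but Warehouse
    # currently renders it as two keywords, "Apache" and "Spark".
    keywords=["Apache Spark"],
    # String form (alternative): conventionally whitespace-delimited,
    # so two keywords is the expected result here.
    # keywords="Apache Spark",
)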

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
dstufft commented, Jun 9, 2016

I think that if you use keywords="Apache Spark," you might get the behavior you want, but yea, closing this as WONTFIX. Thanks!
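
To see why the trailing comma might help, here is a small sketch of the splitting behavior discussed in the comment below (the prefer-commas-when-present rule is an assumption about the filter, not a quote of Warehouse’s code):

import re

def split_keywords(tags):
    # Assumed rule: split on commas when any comma is present,
    # otherwise fall back to whitespace.
    pattern = r"\s*,\s*" if "," in tags else r"\s+"
    return [t for t in re.split(pattern, tags) if t]

print(split_keywords("Apache Spark"))   # ['Apache', 'Spark']
print(split_keywords("Apache Spark,"))  # ['Apache Spark']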

1 reaction
karan commented, Jun 4, 2016

This seems to be the doing of the format_tags filter.

https://github.com/pypa/warehouse/blob/master/warehouse/filters.py#L101

Specifically:

L108: split_tags = re.split(r'\s+', tags)
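
That split is exactly what turns the phrase into two tags:

>>> import re
>>> re.split(r'\s+', 'Apache Spark')
['Apache', 'Spark']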

I think the correct behavior is this:

  • If the tags argument is a list, call format_tags recursively (or iteratively) to clean each individual entry (see the sketch after this list).
  • If tags is a string, execute the existing flow.
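
A sketch of what that proposal could look like (hypothetical; the string branch is a simplified stand-in for the existing filter, not Warehouse’s actual code):

import re

def format_tags(tags):
    # Proposed: a list from setup.py is taken as-is, one tag per
    # entry, with each entry cleaned individually.
    if isinstance(tags, list):
        return [tag.strip() for tag in tags if tag and tag.strip()]
    # Existing flow (simplified): split a plain string on commas
    # when present, otherwise on whitespace.
    pattern = r"\s*,\s*" if "," in tags else r"\s+"
    return [tag for tag in re.split(pattern, tags) if tag]

print(format_tags(["Apache Spark"]))  # ['Apache Spark']
print(format_tags("Apache Spark"))    # ['Apache', 'Spark']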

Legacy PyPI doesn’t seem to do any formatting; it just dumps the keywords string as-is on the UI.

@dstufft or @rjwebb can confirm.
