Keyword "Apache Spark" shows up on Warehouse as two keywords "Apache", "Spark"
Here’s how I set my project’s keywords (ref):
keywords=['Apache Spark'],
So my intention was 1 keyword, “Apache Spark”. However, here on Warehouse this shows up as two keywords, “Apache” and “Spark”.
Is this intentional?
I know that setup() also accepts a single string for keywords, instead of a list of strings, as follows:
keywords='Apache Spark',
In this case I would expect the keywords to be interpreted as whitespace-delimited – that is, two keywords, “Apache” and “Spark”, as they currently are on Warehouse.
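To make that concrete, here is what whitespace-delimited splitting does to the string form. The regex matches the line quoted from Warehouse’s filters.py later in this thread; the snippet itself is just an illustration:

import re

re.split(r'\s+', 'Apache Spark')  # -> ['Apache', 'Spark']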
I think that if you use

keywords="Apache Spark,"

you might get the behavior you want, but yea, closing this as WONTFIX. Thanks!

This seems to be the doing of the format_tags filter:
https://github.com/pypa/warehouse/blob/master/warehouse/filters.py#L101
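As an aside on the trailing-comma suggestion above: it would only help if format_tags splits on commas whenever one is present. That branch is not shown in the snippet quoted next, so treat this as a guess about the filter’s behavior:

import re

# Hypothetical comma branch: split on commas, drop the empty trailing entry.
[t for t in re.split(r'\s*,\s*', 'Apache Spark,') if t]  # -> ['Apache Spark']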
Specifically, the split happens at L108:

split_tags = re.split(r'\s+', tags)
I think the correct behavior is this (sketched below):

- If tags is a list, then call format_tags recursively (or iteratively) to clean each individual one.
- If tags is a string, then execute the existing flow.

Legacy PyPI doesn’t seem to be doing any formatting - it just dumps the keywords as a string in the UI.
@dstufft or @rjwebb can confirm.
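For what it’s worth, a minimal sketch of that proposal. Only the whitespace-split line comes from the quoted filters.py; the isinstance branch, reading “clean” as whitespace-stripping, and the dropping of empty entries are assumptions for illustration:

import re

def format_tags(tags):
    # Proposed behavior: a list is already split into individual
    # keywords, so clean each entry rather than re-splitting it
    # ("clean" is interpreted here as stripping whitespace).
    if isinstance(tags, list):
        return [t.strip() for t in tags if t and t.strip()]
    # Existing flow: split a plain string on whitespace, as in
    # warehouse/filters.py (L108 in the linked revision).
    split_tags = re.split(r'\s+', tags)
    return [t for t in split_tags if t]

format_tags(['Apache Spark'])  # -> ['Apache Spark']
format_tags('Apache Spark')    # -> ['Apache', 'Spark']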