Episode regex needs to match against sanitized entry

Expected behaviour:

The entry [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv should be accepted by the series plugin, as instructed by the configured ep_regexp S(\d) - (\d\d).

Actual behaviour:

The episode number is not recognized, so FlexGet falls back to season_pack detection, which succeeds. This is caused by a splitting step that runs before the remaining data string is passed to the parse_episode method. That step removes the hyphen from the string, although I would have expected the hyphen to still be there when the string is matched against the user-defined episode regex.

My workaround for now is to define the regex without a hyphen, which seems to match the correct entries. Whether this is a documentation issue or a code issue is up to you. I hope this helps someone, as it wasn't clear to me.
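The mismatch can be reproduced outside FlexGet. The following is a minimal sketch, not FlexGet's actual code: `sanitize` is a hypothetical stand-in for the splitting step, written only so that it reproduces the string seen in the trace log, and compiling the patterns with `re.IGNORECASE` is an assumption based on the log showing `S(\d) (\d\d)` matching the lowercased `s2 07`.

```python
import re


def sanitize(data: str) -> str:
    """Hypothetical stand-in for the splitting step (not FlexGet's code):
    collapse every run of non-alphanumeric characters into one space."""
    return re.sub(r"[\W_]+", " ", data).strip()


# Remaining data as logged after the name and quality were removed.
remaining = " s2 - 07 () [00f5bce4].mkv [subsplease] "
sanitized = sanitize(remaining)

# The regex from the original config and the workaround without the hyphen.
with_hyphen = re.compile(r"S(\d) - (\d\d)", re.IGNORECASE)
without_hyphen = re.compile(r"S(\d) (\d\d)", re.IGNORECASE)

print(sanitized)                                  # s2 07 00f5bce4 mkv subsplease
print(with_hyphen.search(sanitized))              # None: the hyphen is gone
print(without_hyphen.search(sanitized).groups())  # ('2', '07')
```

Testing a candidate ep_regexp against the sanitized string like this, rather than against the original title, is a quick way to check it before putting it into the config.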


Steps to reproduce:

  1. Set ep_regexp for a show as follows: S(\d) - (\d\d)
  2. Execute the task against an RSS feed that contains an entry titled <series name> S2 - 123 …

Config:


--- config from task: rssAnime
deluge:
  magnetization_timeout: 30
  password:
  username:
rss: https://nyaa.si/?page=rss&q=720p+%28SubsPlease%7CErai-raws%29&c=1_2&f=0
series:
  720p+:
    - Dr. Stone:
        ep_regexp: S(\d) - (\d\d)
        identified_by: ep
        target: 720p+
set:
  add_paused: 0
  path: /mnt/Downloads/DLTV/{{ series_name }}
  ratio: 5
template:
  - myAnime
  - myDeluge

---

Log:

2021-02-25 21:03:17 DEBUG    parser_internal rssAnime        Parsing series: `[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv` kwargs: {
	'name': 'Dr. Stone',
	'identified_by': 'ep',
	'alternate_names': [],
	'name_regexps': [],
	'strict_name': False,
	'allow_groups': [],
	'date_yearfirst': None,
	'date_dayfirst': None,
	'special_ids': [],
	'prefer_specials': None,
	'assume_special': None,
	'ep_regexps': ['S(\\d) - (\\d\\d)'],
	'date_regexps': [],
	'sequence_regexps': [],
	'id_regexps': []
}
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        name: dr stone data: [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        NAME SUCCESS: ^(?:(?:\[[^\[\]]*\])|(?:HD.720p?:)|(?:HD.1080p?:)|(?:HD.2160p?:))?(?:[^\w&]|_)*(Dr(?:[^\w&]|_)*Stone)(?:\b|_)(?:[^\w&]|_)* matched to [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        data stripped:  s2 - 07 (720p) [00f5bce4].mkv [subsplease] 
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        parsing quality ->
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        quality detected, using remaining data ` s2 - 07 () [00f5bce4].mkv [subsplease] `

2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        data for date/ep/id parsing 's2 07 00f5bce4 mkv subsplease'
2021-02-25 21:03:17 TRACE    seriesparser  rssAnime        season pack regexp (?:season\s?|s)(\d{1,})(?:\s|$)(?!(?:(?:.*?\s)?(?:episode|e|ep|part|pt)\s?(?:\d{1,3}|X{0,3}(?:IX|XI{0,4}|VI{0,4}|IV|V|I{1,4}))|(?:\d{1,3})\s?of\s?(?:\d{1,3}))) match ('2',)
2021-02-25 21:03:17 DEBUG    parser_internal rssAnime        Parsing result: <SeriesParser(data=[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv,name=Dr. Stone,id=(2, 0),season=2,season_pack=True,episode=None,quality=720p,proper=0,status=OK)> (in 1.322040999999885 ms)

2021-02-25 21:03:17 DEBUG    series        rssAnime        `[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv` detected as `<SeriesParseResult(data=[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv,name=Dr. Stone,id=(2, 0),season=2,season_pack=True,episode=0,quality=720p,proper=0,special=False,status=OK)>`, field: `title`
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_parser = <flexget.components.parsing.parsers.parser_common.SeriesParseResult object at 0x7f4982d5e7d0>
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_name = 'Dr. Stone'
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: quality = <Quality(resolution=720p,source=unknown,codec=unknown,audio=unknown)>
2021-02-25 21:03:17 TRACE    metainfo_quality rssAnime        Found quality 720p for [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: quality = <Quality(resolution=720p,source=unknown,codec=unknown,audio=unknown)>
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: proper = False
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: proper_count = 0
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: release_group = None
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: season_pack = True
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_season = 2
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_episodes = 1
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_id = 'S02'
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_id_type = 'ep'
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_identified_by = 'ep'
2021-02-25 21:03:17 TRACE    entry         rssAnime        ENTRY SET: series_exact = False
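
The fallback seen in the trace above can be replayed in isolation. This is a rough sketch, assuming the season pack pattern is applied to the sanitized string with `re.IGNORECASE` (the flag is an assumption; the pattern itself is copied from the TRACE line). It shows that `s2` followed by a bare number counts as a season pack, while an explicit episode marker such as `e07` is rejected by the negative lookahead.

```python
import re

# Pattern copied from the "season pack regexp" TRACE line; the
# IGNORECASE flag is an assumption, not confirmed by the log.
SEASON_PACK = re.compile(
    r"(?:season\s?|s)(\d{1,})(?:\s|$)"
    r"(?!(?:(?:.*?\s)?(?:episode|e|ep|part|pt)\s?"
    r"(?:\d{1,3}|X{0,3}(?:IX|XI{0,4}|VI{0,4}|IV|V|I{1,4}))"
    r"|(?:\d{1,3})\s?of\s?(?:\d{1,3})))",
    re.IGNORECASE,
)

# Sanitized data from the log: nothing marks "07" as an episode number.
pack_match = SEASON_PACK.search("s2 07 00f5bce4 mkv subsplease")
print(pack_match.groups())  # ('2',)

# With an explicit episode marker the lookahead blocks the match.
marked_match = SEASON_PACK.search("s2 e07")
print(marked_match)  # None
```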


After changing the regex to match the post-split syntax:


2021-02-25 21:08:08 DEBUG    parser_internal rssAnime        Parsing series: `[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv` kwargs: {
	'name': 'Dr. Stone',
	'identified_by': 'ep',
	'alternate_names': [],
	'name_regexps': [],
	'strict_name': False,
	'allow_groups': [],
	'date_yearfirst': None,
	'date_dayfirst': None,
	'special_ids': [],
	'prefer_specials': None,
	'assume_special': None,
	'ep_regexps': ['S(\\d) (\\d\\d)'],
	'date_regexps': [],
	'sequence_regexps': [],
	'id_regexps': []
}
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        name: dr stone data: [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        NAME SUCCESS: ^(?:(?:\[[^\[\]]*\])|(?:HD.720p?:)|(?:HD.1080p?:)|(?:HD.2160p?:))?(?:[^\w&]|_)*(Dr(?:[^\w&]|_)*Stone)(?:\b|_)(?:[^\w&]|_)* matched to [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        data stripped:  s2 - 07 (720p) [00f5bce4].mkv [subsplease] 
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        parsing quality ->
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        quality detected, using remaining data ` s2 - 07 () [00f5bce4].mkv [subsplease] `

2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        data for date/ep/id parsing 's2 07 00f5bce4 mkv subsplease'
2021-02-25 21:08:08 TRACE    seriesparser  rssAnime        found episode number with regexp S(\d) (\d\d) (('2', '07'))
2021-02-25 21:08:08 DEBUG    parser_internal rssAnime        Parsing result: <SeriesParser(data=[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv,name=Dr. Stone,id=(2, 7),season=2,season_pack=None,episode=7,quality=720p,proper=0,status=OK)> (in 1.2556020000000778 ms)

2021-02-25 21:08:08 DEBUG    series        rssAnime        `[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv` detected as `<SeriesParseResult(data=[SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv,name=Dr. Stone,id=(2, 7),season=2,season_pack=None,episode=7,quality=720p,proper=0,special=False,status=OK)>`, field: `title`
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_parser = <flexget.components.parsing.parsers.parser_common.SeriesParseResult object at 0x7faa49b30d50>
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_name = 'Dr. Stone'
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: quality = <Quality(resolution=720p,source=unknown,codec=unknown,audio=unknown)>
2021-02-25 21:08:08 TRACE    metainfo_quality rssAnime        Found quality 720p for [SubsPlease] Dr. Stone S2 - 07 (720p) [00F5BCE4].mkv
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: quality = <Quality(resolution=720p,source=unknown,codec=unknown,audio=unknown)>
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: proper = False
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: proper_count = 0
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: release_group = None
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: season_pack = None
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_season = 2
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_episode = 7
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_episodes = 1
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_id = 'S02E07'
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_id_type = 'ep'
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_identified_by = 'ep'
2021-02-25 21:08:08 TRACE    entry         rssAnime        ENTRY SET: series_exact = False


Additional information:

  • FlexGet version: 3.1.103
  • Python version: 3.7.9
  • Installation method: pip
  • Using daemon (yes/no): no
  • OS and version: Ubuntu 18.04.5 LTS
  • Link to crash log:

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 27 (14 by maintainers)

Top GitHub Comments

BrutuZ commented, Mar 23, 2021

I agree, they are the most common ones.

Roman Numerals for me are the most boring to detect correctly, but I don’t know how to create such good expressions either.

{{series_name|re_replace('\bII\b$','S02')|re_replace('\bIII\b$','S03')|re_replace('\bIV\b$','S04')|re_replace('\bV\b$','S05')|re_replace('\bVI\b$','S06')|re_replace('\bVII\b$','S07')|re_replace('\bVIII\b$','S08')|re_replace('\bIX\b$','S09')}}
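
The same replacement can be written as a single lookup instead of a chain of filters. This is a minimal Python sketch, assuming the goal is only to turn a trailing Roman numeral (II through IX) into an SNN suffix. The names `ROMAN_TO_SEASON` and `roman_suffix_to_season` are hypothetical and not part of FlexGet.

```python
import re

# Trailing Roman numerals covered by the filter chain above.
ROMAN_TO_SEASON = {
    "II": 2, "III": 3, "IV": 4, "V": 5,
    "VI": 6, "VII": 7, "VIII": 8, "IX": 9,
}

# A whole-word numeral at the very end of the name.
_ROMAN_SUFFIX = re.compile(r"\b(II|III|IV|V|VI|VII|VIII|IX)$")


def roman_suffix_to_season(series_name: str) -> str:
    """Replace a trailing Roman numeral with a zero-padded SNN suffix."""
    return _ROMAN_SUFFIX.sub(
        lambda m: f"S{ROMAN_TO_SEASON[m.group(1)]:02d}", series_name
    )


print(roman_suffix_to_season("Some Show III"))  # Some Show S03
print(roman_suffix_to_season("Some Show"))      # Some Show
```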

BrutuZ commented, Mar 23, 2021

If I had to list common suffixes for sequel series, it'd be something like:

  • #
  • S#
  • #(nd|rd|th) Season
  • Season #
  • Roman Numerals (II|III|IV|V|VI|VII|VIII|IX)
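
The numeric suffix shapes in this list could be detected along these lines. This is a rough sketch rather than anything FlexGet does: the pattern list and `split_season_suffix` are hypothetical, the "st" ordinal is an addition for completeness, and Roman numerals are left out.

```python
import re

# One hypothetical pattern per numeric suffix shape; each captures the
# season number. Order matters: the bare "#" pattern must come last.
SUFFIX_PATTERNS = [
    re.compile(r"\s+S(\d+)$", re.IGNORECASE),                         # "Title S2"
    re.compile(r"\s+(\d+)(?:st|nd|rd|th)\s+Season$", re.IGNORECASE),  # "Title 2nd Season"
    re.compile(r"\s+Season\s+(\d+)$", re.IGNORECASE),                 # "Title Season 2"
    re.compile(r"\s+(\d+)$"),                                         # "Title 2"
]


def split_season_suffix(title: str):
    """Return (base_title, season), or (title, None) if no suffix is found."""
    for pattern in SUFFIX_PATTERNS:
        match = pattern.search(title)
        if match:
            return title[: match.start()], int(match.group(1))
    return title, None


print(split_season_suffix("Dr. Stone S2"))      # ('Dr. Stone', 2)
print(split_season_suffix("The Final Season"))  # ('The Final Season', None)
```

Titles without a number fall through untouched, so they still need manual numbering.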

Other than that, they can and will always come up with a suffix that can't be easily guessed and requires manual numbering anyway, like The Final Season, Banana-hen (Banana arc\chapter), Ni no Shou (The Second Act), or even my personal favorite oddball for this season:
