Indexing through symlinks

This is just initial playing around, so sorry if this is well-known. I made a robust04 directory and symlinked in my usual locations for the CD45-cr subcollections, but it didn’t work. It seems that the filesystem walker fails either when following symlinks or when crossing NFS boundaries:

$ cat log.robust04.pos+docvectors+rawdocs 
2018-11-21 15:21:56,782 INFO  [main] index.IndexCollection (IndexCollection.java:248) - DocumentCollection path: /Users/soboroff/robust04
2018-11-21 15:21:56,783 INFO  [main] index.IndexCollection (IndexCollection.java:249) - Index path: lucene-index.robust04.pos+docvectors
2018-11-21 15:21:56,783 INFO  [main] index.IndexCollection (IndexCollection.java:250) - CollectionClass: TrecCollection
2018-11-21 15:21:56,783 INFO  [main] index.IndexCollection (IndexCollection.java:251) - Generator: JsoupGenerator
2018-11-21 15:21:56,783 INFO  [main] index.IndexCollection (IndexCollection.java:252) - Threads: 16
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:253) - Stemmer: porter
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:254) - Keep stopwords? false
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:255) - Store positions? true
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:256) - Store docvectors? true
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:257) - Store transformed docs? false
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:258) - Store raw docs? true
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:259) - Optimize (merge segments)? false
2018-11-21 15:21:56,784 INFO  [main] index.IndexCollection (IndexCollection.java:260) - Whitelist: null
2018-11-21 15:21:56,799 INFO  [main] index.IndexCollection (IndexCollection.java:291) - Starting indexer...
2018-11-21 15:21:57,117 INFO  [main] index.IndexCollection (IndexCollection.java:314) - 4 files found in /Users/soboroff/robust04
2018-11-21 15:21:57,157 ERROR [pool-2-thread-1] index.IndexCollection$IndexerThread (IndexCollection.java:231) - pool-2-thread-1: Unexpected Exception:
java.io.FileNotFoundException: /Users/soboroff/robust04/FR94 (Is a directory)
	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_141]
	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_141]
	at java.io.FileReader.<init>(FileReader.java:58) ~[?:1.8.0_141]
	at io.anserini.collection.TrecCollection$FileSegment.<init>(TrecCollection.java:83) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:59) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:43) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.index.IndexCollection$IndexerThread.run(IndexCollection.java:187) [anserini-0.2.1-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
2018-11-21 15:21:57,157 ERROR [pool-2-thread-2] index.IndexCollection$IndexerThread (IndexCollection.java:231) - pool-2-thread-2: Unexpected Exception:
java.io.FileNotFoundException: /Users/soboroff/robust04/LATIMES (Is a directory)
	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_141]
	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_141]
	at java.io.FileReader.<init>(FileReader.java:58) ~[?:1.8.0_141]
	at io.anserini.collection.TrecCollection$FileSegment.<init>(TrecCollection.java:83) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:59) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:43) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.index.IndexCollection$IndexerThread.run(IndexCollection.java:187) [anserini-0.2.1-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
2018-11-21 15:21:57,157 ERROR [pool-2-thread-3] index.IndexCollection$IndexerThread (IndexCollection.java:231) - pool-2-thread-3: Unexpected Exception:
java.io.FileNotFoundException: /Users/soboroff/robust04/FT (Is a directory)
	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_141]
	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_141]
	at java.io.FileReader.<init>(FileReader.java:58) ~[?:1.8.0_141]
	at io.anserini.collection.TrecCollection$FileSegment.<init>(TrecCollection.java:83) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:59) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:43) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.index.IndexCollection$IndexerThread.run(IndexCollection.java:187) [anserini-0.2.1-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
2018-11-21 15:21:57,157 ERROR [pool-2-thread-4] index.IndexCollection$IndexerThread (IndexCollection.java:231) - pool-2-thread-4: Unexpected Exception:
java.io.FileNotFoundException: /Users/soboroff/robust04/FBIS (Is a directory)
	at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_141]
	at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_141]
	at java.io.FileInputStream.<init>(FileInputStream.java:93) ~[?:1.8.0_141]
	at java.io.FileReader.<init>(FileReader.java:58) ~[?:1.8.0_141]
	at io.anserini.collection.TrecCollection$FileSegment.<init>(TrecCollection.java:83) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:59) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.collection.TrecCollection.createFileSegment(TrecCollection.java:43) ~[anserini-0.2.1-SNAPSHOT.jar:?]
	at io.anserini.index.IndexCollection$IndexerThread.run(IndexCollection.java:187) [anserini-0.2.1-SNAPSHOT.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
2018-11-21 15:21:57,187 INFO  [main] index.IndexCollection (IndexCollection.java:359) - # Final Counter Values
2018-11-21 15:21:57,187 INFO  [main] index.IndexCollection (IndexCollection.java:360) - indexed:                0
2018-11-21 15:21:57,187 INFO  [main] index.IndexCollection (IndexCollection.java:361) - empty:                  0
2018-11-21 15:21:57,188 INFO  [main] index.IndexCollection (IndexCollection.java:362) - unindexed:              0
2018-11-21 15:21:57,188 INFO  [main] index.IndexCollection (IndexCollection.java:363) - unindexable:            0
2018-11-21 15:21:57,188 INFO  [main] index.IndexCollection (IndexCollection.java:364) - skipped:                0
2018-11-21 15:21:57,188 INFO  [main] index.IndexCollection (IndexCollection.java:365) - errors:                 0
2018-11-21 15:21:57,197 INFO  [main] index.IndexCollection (IndexCollection.java:368) - Total 0 documents indexed in 00:00:00
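
The log above is consistent with a directory walk that does not follow symbolic links: the four symlinked subcollections are counted as "files", and opening each one then fails with "Is a directory". The following is a minimal sketch of that behavior using the standard `java.nio.file` API. It is not Anserini's code, and whether Anserini's walker works this way is an assumption; the class name `SymlinkWalkSketch` and the method `listFiles` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not part of Anserini.
public class SymlinkWalkSketch {

  // Collects every path that walkFileTree reports through visitFile.
  // Without FOLLOW_LINKS, a symlink that points at a directory is
  // reported as a single "file" and is not descended into.
  static List<Path> listFiles(Path root, boolean followLinks) throws IOException {
    Set<FileVisitOption> options = followLinks
        ? EnumSet.of(FileVisitOption.FOLLOW_LINKS)
        : EnumSet.noneOf(FileVisitOption.class);
    List<Path> found = new ArrayList<>();
    Files.walkFileTree(root, options, Integer.MAX_VALUE, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        found.add(file);
        return FileVisitResult.CONTINUE;
      }
    });
    return found;
  }

  public static void main(String[] args) throws IOException {
    // A real directory holding one document, standing in for a subcollection.
    Path target = Files.createTempDirectory("target");
    Files.write(target.resolve("doc1.txt"), "hello".getBytes());

    // A collection root that contains only a symlink to that directory.
    Path root = Files.createTempDirectory("root");
    Files.createSymbolicLink(root.resolve("FR94"), target);

    System.out.println("not following links: " + listFiles(root, false));
    System.out.println("following links:     " + listFiles(root, true));
  }
}
```

In the first call the only entry returned is the `FR94` link itself, which is what a reader such as `FileReader` would then fail to open. In the second call the walk descends through the link and returns `doc1.txt`. Note that with `FOLLOW_LINKS` the walker reports link cycles through `visitFileFailed`, which `SimpleFileVisitor` rethrows by default.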

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 8 (3 by maintainers)

Top GitHub Comments

isoboroff commented, Nov 29, 2018

Confirming it detects 0 documents in the DTD, AUX/, and .C files in those trees. Maybe you knew that already.
