Tensorboard Projector - cosine distance "Nearest points in the original space" not correct
Environment information (required)
--- check: autoidentify
INFO: diagnose_tensorboard.py version 393931f9685bd7e0f3898d7dcdf28819fef54c43
--- check: general
INFO: sys.version_info: sys.version_info(major=3, minor=6, micro=8, releaselevel='final', serial=0)
INFO: os.name: nt
INFO: os.uname(): N/A
INFO: sys.getwindowsversion(): sys.getwindowsversion(major=10, minor=0, build=17763, platform=2, service_pack='')
--- check: package_management
INFO: has conda-meta: False
INFO: $VIRTUAL_ENV: None
--- check: installed_packages
INFO: installed: tensorboard==1.13.1
INFO: installed: tensorflow-gpu==1.13.1
INFO: installed: tensorflow==1.14.0
WARNING: conflicting installations: ['tensorflow', 'tensorflow-gpu']
INFO: installed: tensorflow-estimator==1.13.0
--- check: tensorboard_python_version
INFO: tensorboard.version.VERSION: '1.13.1'
--- check: tensorflow_python_version
INFO: tensorflow.__version__: '1.13.1'
INFO: tensorflow.__git_version__: "b'v1.13.1-0-g6612da8951'"
--- check: tensorboard_binary_path
INFO: which tensorboard: b'F:\\Desktop\\Thesis\\Python3.6\\Scripts\\tensorboard.exe\r\n'
--- check: readable_fqdn
INFO: socket.getfqdn(): 'DESKTOP-LD8UUFN.home'
--- check: stat_tensorboardinfo
INFO: directory: C:\Users\josch\AppData\Local\Temp\.tensorboard-info
INFO: os.stat(...): os.stat_result(st_mode=16895, st_ino=61361544923004089, st_dev=3506408066, st_nlink=1, st_uid=0, st_gid=0, st_size=24576, st_atime=1562950451, st_mtime=1562950451, st_ctime=1560964117)
INFO: mode: 0o40777
--- check: source_trees_without_genfiles
INFO: tensorboard_roots (1): ['F:\\Python3.6\\lib\\site-packages']; bad_roots (0): []
--- check: full_pip_freeze
INFO: pip freeze --all:
absl-py==0.7.1
astor==0.8.0
attrs==19.1.0
backcall==0.1.0
bleach==3.1.0
boto==2.49.0
boto3==1.9.171
botocore==1.12.171
certifi==2019.6.16
chardet==3.0.4
colorama==0.4.1
cycler==0.10.0
decorator==4.4.0
defusedxml==0.6.0
docutils==0.14
entrypoints==0.3
gast==0.2.2
gensim==3.7.3
google-pasta==0.1.7
grpcio==1.21.1
h5py==2.9.0
idna==2.8
ipykernel==5.1.1
ipython==7.5.0
ipython-genutils==0.2.0
ipywidgets==7.4.2
jedi==0.13.3
Jinja2==2.10.1
jmespath==0.9.4
joblib==0.13.2
jsonschema==3.0.1
jupyter==1.0.0
jupyter-client==5.2.4
jupyter-console==6.0.0
jupyter-core==4.5.0
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
Markdown==3.1.1
MarkupSafe==1.1.1
matplotlib==3.1.0
mistune==0.8.4
mock==3.0.5
nbconvert==5.5.0
nbformat==4.4.0
notebook==5.7.8
numpy==1.16.4
pandas==0.24.2
pandocfilters==1.4.2
parso==0.4.0
pickleshare==0.7.5
pip==18.1
prometheus-client==0.7.0
prompt-toolkit==2.0.9
protobuf==3.8.0
Pygments==2.4.2
pyparsing==2.4.0
pyrsistent==0.15.2
python-dateutil==2.8.0
pytz==2019.1
pywinpty==0.5.5
pyzmq==18.0.1
qtconsole==4.5.1
requests==2.22.0
s3transfer==0.2.1
scikit-learn==0.21.2
scipy==1.3.0
Send2Trash==1.5.0
setuptools==41.0.1
six==1.12.0
sklearn==0.0
smart-open==1.8.4
tensorboard==1.13.1
tensorflow==1.14.0
tensorflow-estimator==1.13.0
tensorflow-gpu==1.13.1
termcolor==1.1.0
terminado==0.8.2
testpath==0.4.2
tornado==6.0.2
traitlets==4.3.2
urllib3==1.25.3
wcwidth==0.1.7
webencodings==0.5.1
Werkzeug==0.15.4
wheel==0.33.4
widgetsnbextension==3.4.2
wrapt==1.11.2
xlrd==1.2.0
Issue description
I am currently visualizing word embeddings (shape=60,300) from my TensorFlow model in the TensorBoard Projector, and I am having trouble with the cosine distance.
The displayed distances distort the results and do not match the real cosine distances.
This was a test run with different category embeddings:
- TensorBoard: (image not preserved)
- sklearn: (image not preserved)
Both use the same data and the results are not even close.
Is TensorBoard reducing the dimensionality of the vectors, which would make the label “Nearest points in the original space” incorrect?
Top GitHub Comments
Yes, with one small detail: you will have to call np.linalg.norm(N_vectors, axis=-1, keepdims=True) so that the division broadcasts correctly in the last line of code.
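The code this comment refers to is not preserved here, so the following is only a minimal sketch of the normalization step it describes. The function name cosine_distances and the random stand-in data are assumptions for illustration; only np.linalg.norm with axis=-1 and keepdims=True and the name N_vectors come from the comment.

```python
import numpy as np

def cosine_distances(vectors):
    """Pairwise cosine distances (1 - cosine similarity) between rows."""
    vectors = np.asarray(vectors, dtype=np.float64)
    # keepdims=True keeps the norms as shape (N, 1) instead of (N,),
    # so the division below broadcasts row-wise over the (N, D) array.
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    unit = vectors / norms
    return 1.0 - unit @ unit.T

# Stand-in data (assumption): 60 random vectors with 300 dimensions each.
rng = np.random.default_rng(0)
N_vectors = rng.normal(size=(60, 300))

dist = cosine_distances(N_vectors)
# Five nearest neighbours of point 0, skipping the point itself.
nearest = np.argsort(dist[0])[1:6]
print(nearest, dist[0, nearest])
```

Without keepdims=True the norms have shape (60,), which cannot be broadcast against a (60, 300) array, so the division would raise an error.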
Hi!
Yes, the functionality hasn’t changed. A couple of notes:
Is 60,300 the dimensionality of your data, or the number of points? This random projection could lead to loss of information.
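The excerpt does not say which projection the Projector applies, so the sketch below uses a generic Gaussian random projection purely as an assumption, to illustrate how projecting to fewer dimensions can shift cosine distances. The target dimension of 50, the stand-in data, and all names are hypothetical.

```python
import numpy as np

def cosine_distance_matrix(x):
    """Pairwise cosine distances between the rows of x."""
    unit = x / np.linalg.norm(x, axis=-1, keepdims=True)
    return 1.0 - unit @ unit.T

rng = np.random.default_rng(0)

# Stand-in data (assumption): 60 points with 300 dimensions each.
points = rng.normal(size=(60, 300))

# Hypothetical target dimension; not taken from TensorBoard.
target_dim = 50
projection = rng.normal(size=(300, target_dim)) / np.sqrt(target_dim)
projected = points @ projection

original = cosine_distance_matrix(points)
approx = cosine_distance_matrix(projected)

# Largest change in any pairwise cosine distance caused by the projection.
max_error = np.abs(original - approx).max()
print("max absolute change in cosine distance:", max_error)
```

Comparing the neighbour ranking from original with the one from approx is one way to check whether a projection of this kind could explain a mismatch with sklearn.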