--num-procs X and curdoc().session_context.request.arguments don't go well together


If I add these two lines:

args = curdoc().session_context.request.arguments
print(args)

to examples/app/sliders.py and run bokeh with:

bokeh serve sliders.py --num-procs 2

and then reload the page it crashes in roughly 50% of the cases with:

Error running application handler <bokeh.application.handlers.script.ScriptHandler object at 0x7f468eafe978>: 'NoneType' object has no attribute 'arguments'

I guess one of the processes has the request not set properly. It works if I don’t add --num-procs.
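As a stopgap, the access can be guarded so that a missing request yields empty arguments instead of a crash. The sketch below is only an illustration: `get_request_arguments` is a hypothetical helper (not a Bokeh API), and the `SimpleNamespace` objects merely mimic the attribute chain used above so the snippet runs without a server.

```python
from types import SimpleNamespace


def get_request_arguments(doc):
    """Return the request arguments for a document, or {} if unavailable.

    Hypothetical helper: it only assumes the attribute chain used above,
    doc.session_context.request.arguments, where any link may be None.
    """
    session_context = getattr(doc, "session_context", None)
    request = getattr(session_context, "request", None)
    arguments = getattr(request, "arguments", None)
    return dict(arguments) if arguments else {}


# Stand-ins for curdoc() in the two situations described in the issue.
doc_with_request = SimpleNamespace(
    session_context=SimpleNamespace(
        request=SimpleNamespace(arguments={"x": [b"1"]})
    )
)
doc_without_request = SimpleNamespace(
    session_context=SimpleNamespace(request=None)
)

print(get_request_arguments(doc_with_request))     # {'x': [b'1']}
print(get_request_arguments(doc_without_request))  # {}
```

In the app this would be called as `args = get_request_arguments(curdoc())`. It avoids the exception but does not recover the arguments in the failing process.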

This happens with bokeh 0.12.3 and latest master (unrelated: master does not show the example at all, is it currently broken?)

For reference, this is the full example to reproduce:

''' Present an interactive function explorer with slider widgets.

Scrub the sliders to change the properties of the ``sin`` curve, or
type into the title text box to update the title of the plot.

Use the ``bokeh serve`` command to run the example by executing:

    bokeh serve sliders.py

at your command prompt. Then navigate to the URL

    http://localhost:5006/sliders

in your browser.

'''
import numpy as np

from bokeh.io import curdoc
from bokeh.layouts import row, widgetbox
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider, TextInput
from bokeh.plotting import figure

args = curdoc().session_context.request.arguments
print(args)

# Set up data
N = 200
x = np.linspace(0, 4*np.pi, N)
y = np.sin(x)
source = ColumnDataSource(data=dict(x=x, y=y))


# Set up plot
plot = figure(plot_height=400, plot_width=400, title="my sine wave",
              tools="crosshair,pan,reset,save,wheel_zoom",
              x_range=[0, 4*np.pi], y_range=[-2.5, 2.5])

plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)


# Set up widgets
text = TextInput(title="title", value='my sine wave')
offset = Slider(title="offset", value=0.0, start=-5.0, end=5.0, step=0.1)
amplitude = Slider(title="amplitude", value=1.0, start=-5.0, end=5.0)
phase = Slider(title="phase", value=0.0, start=0.0, end=2*np.pi)
freq = Slider(title="frequency", value=1.0, start=0.1, end=5.1)


# Set up callbacks
def update_title(attrname, old, new):
    plot.title.text = text.value

text.on_change('value', update_title)

def update_data(attrname, old, new):

    # Get the current slider values
    a = amplitude.value
    b = offset.value
    w = phase.value
    k = freq.value

    # Generate the new curve
    x = np.linspace(0, 4*np.pi, N)
    y = a*np.sin(k*x + w) + b

    source.data = dict(x=x, y=y)

for w in [offset, amplitude, phase, freq]:
    w.on_change('value', update_data)


# Set up layouts and add to document
inputs = widgetbox(text, offset, amplitude, phase, freq)

curdoc().add_root(row(inputs, plot, width=800))
curdoc().title = "Sliders"

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 17 (10 by maintainers)

Top GitHub Comments

1 reaction
havocp commented, Feb 16, 2017

Yeah, something here isn’t quite right. Sorry about that…

Session affinity / sticky sessions are required for this to work, probably. This is a consequence of keeping server-side state so apps can be written in Python (if we kept getting a new server-side context, then we’d make the app development model more complex). The original plan we discussed, IIRC, was always for scaled-out production deployments of Bokeh to have sticky sessions.

There’s kind of an inherent session stickiness due to the websocket (once open, it always goes to the same server process). In simple cases without sticky sessions in the reverse proxy / load balancer, the original http request creates the session state, and then that session state is pretty much discarded in favor of the state the websocket request creates. But then the websocket stays connected to the same app server node and we don’t need to create session state after that second time. So as long as the app doesn’t care about the state created in the initial http request, things behave as if sessions are sticky without special behavior from your reverse proxy or load balancer. This has let us kick the sticky sessions can down the road.

request.arguments breaks this and now it’s necessary to deal with state created by the initial http request.
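To make the failure mode concrete, here is a toy model of the two-request sequence. It is not Bokeh's implementation: `ToyProcess`, `handle_http`, `handle_websocket` and `load_page` are invented for illustration, and the model simplifies the single-process case by reusing the session the HTTP request created.

```python
class ToyProcess:
    """One server process with its own private, in-memory session table."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def handle_http(self, session_id, arguments):
        # The page request creates the session and records its arguments.
        self.sessions[session_id] = {"request_arguments": arguments}

    def handle_websocket(self, session_id):
        # An unknown ID gets a fresh session with no HTTP request attached.
        return self.sessions.setdefault(
            session_id, {"request_arguments": None}
        )


def load_page(http_proc, ws_proc, session_id, arguments):
    """Simulate one page load: an HTTP request, then a websocket request."""
    http_proc.handle_http(session_id, arguments)
    return ws_proc.handle_websocket(session_id)["request_arguments"]


a, b = ToyProcess("a"), ToyProcess("b")
same = load_page(a, a, "s1", {"x": [b"1"]})
different = load_page(a, b, "s2", {"x": [b"1"]})
print(same)       # {'x': [b'1']}
print(different)  # None
```

If two processes each receive about half of the requests, the mismatched case would come up roughly half the time, which would be consistent with the 50% failure rate in the report.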

It looks like we actually noticed this when adding request.arguments in https://github.com/bokeh/bokeh/pull/4858, where I said:

A potential point of confusion here may be that the args are always for the session-creating request, not any subsequent request using the same session. But most of the time that shouldn’t matter (?).

Obviously, though, that wasn’t thought through fully: it isn’t just confusing, and most of the time it does matter when using more than one process.

What I forgot when saying “that shouldn’t matter” is probably that there are two requests, the http one and the websocket one, in typical usage. Maybe I was wrongly remembering that in typical usage each session is only created for the websocket.

As you say, this has been figured out for web applications. However, Bokeh can’t easily use the same answer because it isn’t a regular web framework; in general, it hides http entirely! Bokeh gives a Python-data-science type of programming model that doesn’t require people to be web devs (mess with http, JavaScript, and all that). This is done by having a big blob of Python state on the server (the Document) and syncing it to the client automatically, which of course is not how most web apps are written (they would keep state in a database, instead).

request.arguments was bolted on post-initial-Bokeh design, as a little escape hatch to get a little info from http. The problem now is that this sort of cascades; once we introduce the notion that web apps have requests, then we’ve also introduced the issue that each request should be stateless, and now you need stuff like cookies or a database or Redis to store your state across requests… Bokeh of course doesn’t support setting cookies because it doesn’t use the stateless http request/response web app model in the first place.

Some possible solutions:

  • document that request.arguments won’t always be present
  • document possible solutions: sticky sessions; stuff your cross-app-process data in Redis; ?
  • when the initial page load has query params, embed them in the html so the same params are passed to the websocket request ?
  • make Bokeh have some sort of built-in, simplified API for cross-node shared data (has to be cross-machine, not just cross-process - Redis is a typical solution for stuff like this) - the idea would be to give a facility similar in spirit to Redis, maybe even configurable to use Redis, but much simpler API-wise
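The third option above (passing the page's query params along to the websocket request) could look roughly like the following. This is a sketch of the idea only, not something Bokeh does: `websocket_url_for` is a hypothetical helper and the `/sliders/ws` path is a placeholder.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def websocket_url_for(page_url, ws_path="/sliders/ws"):
    """Build a websocket URL that repeats the page's query string.

    The "/sliders/ws" default is a placeholder path for this sketch.
    """
    parts = urlsplit(page_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, ws_path, parts.query, ""))


page = "http://localhost:5006/sliders?x=1&y=two"
ws = websocket_url_for(page)
print(ws)  # ws://localhost:5006/sliders/ws?x=1&y=two

# Both requests now parse to the same arguments.
print(parse_qs(urlsplit(ws).query) == parse_qs(urlsplit(page).query))  # True
```

With this approach whichever process receives the websocket request sees the same query parameters as the process that served the page.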

I’m a little skeptical of automatically forwarding request.arguments around; after all, if dropping to the http layer, maybe you actually care about this request, and might even want to know that the websocket request did not have arguments. But supporting some way to copy request.arguments into a shared location could be handy.

The danger is that if we go too far down the road of trying to allow writing a full-blown web app with full-blown http access in Bokeh, it will lose track of the actual original point which was to enable writing apps without learning http/javascript/etc.

My instinct is probably to focus on making session affinity work well; the --num-procs stuff should do affinity out of the box, and we should document how to do it with nginx or whatever. I feel like there’s a slippery slope trying to be a general web framework and it’d be better to ensure the assumption is accurate that we have server-side state.
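For illustration, affinity boils down to routing every request that carries a given session ID to the same process. The sketch below assumes a router that can see the session ID on both requests; `pick_process` is a hypothetical function and says nothing about how --num-procs actually distributes connections.

```python
import zlib


def pick_process(session_id, num_procs):
    """Map a session ID to a process index, the same way on every call."""
    # crc32 is stable across interpreter runs, unlike the built-in hash().
    return zlib.crc32(session_id.encode("utf-8")) % num_procs


session_id = "example-session-id"
http_target = pick_process(session_id, num_procs=2)
ws_target = pick_process(session_id, num_procs=2)
print(http_target == ws_target)  # True
```

Because the mapping depends only on the session ID, the HTTP request and the websocket request for one session always reach the process that holds that session's state.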

But I don’t know. Hope the above gives someone else some ideas.

Note that you certainly can today use Redis or a database with Bokeh to store stuff keyed by session ID, and that’s no worse than where you’d be with Django or Flask or something, perhaps.
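A minimal sketch of that idea, using SQLite from the standard library as a stand-in for "Redis or a database". The table name and the helpers `open_store`, `save_args` and `load_args` are assumptions made for this example, and how an app obtains its session ID is left out.

```python
import json
import sqlite3


def open_store(path):
    """Open (and create if needed) a small table keyed by session ID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS session_args "
        "(session_id TEXT PRIMARY KEY, args TEXT NOT NULL)"
    )
    return conn


def save_args(conn, session_id, args):
    """Store the arguments for a session, replacing any earlier value."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO session_args VALUES (?, ?)",
            (session_id, json.dumps(args)),
        )


def load_args(conn, session_id):
    """Return the stored arguments, or None for an unknown session."""
    row = conn.execute(
        "SELECT args FROM session_args WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store(":memory:")
save_args(conn, "s1", {"x": ["1"]})
print(load_args(conn, "s1"))  # {'x': ['1']}
print(load_args(conn, "s2"))  # None
```

An in-memory database is private to one connection, so sharing across processes would need a file path (or a real Redis or database server across machines). Argument values that arrive as bytes would also have to be decoded before `json.dumps`.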

1 reaction
leopd commented, Feb 16, 2017

I think I know what’s going on. First the HTML delivers the javascript to open the websocket, and includes the sessionId so that the websocket request can get access to the Session object that was created with the initial HTML request. But the websocket request is likely to land on a different server process where that sessionId is meaningless.

I think the demo apps have only been working because they don’t need anything from the original HTTP request, so I’m guessing whatever process the websocket request happens to land on is just creating a brand new session object when it can’t find the one specified by the id.

I think this can’t be solved easily. The server architecture assumes that any incoming websocket connection will be able to find the ServerSession object by its id, which only works if there’s a single shared memory space for all the sessions. I don’t know exactly how tornado does its forking, but I’d be really surprised if the session dictionary is somehow in a shared memory space between all the server processes. The same problem gets even worse if you’re actually scaled out to multiple machines – then the only solution is to have a shared session store like redis/memcache or something, and then the locking gets more complicated. But I think this is a symptom of a basic design flaw. People figured all this stuff out in the 1990’s with web applications, so the correct design patterns should be well known. But they get harder when you’re trying to support real-time updates.
