Passing my observed data to distance()

I have a rather simple setup: a list of observed data (plus some other parameters), a custom synthetic data generator function, and a custom distance estimator between the observed and synthetic data.

I don’t understand how to pass the observed data to the distance() function. Below is a (non-working) minimal example of my setup. Could you point out how I should modify it to make it work, please?

import pyabc
import os
import tempfile
from . import synth_data_generator, custom_dist


# This list contains data required for the synth_data_generator() function
synth_args = [...]
# This list contains my observed data and some other objects required for
# my custom_dist() function
obs_data = [...]

def model(parameter):
    model_p = (parameter['A'], parameter['B'])
    synth_data = synth_data_generator(model_p, *synth_args)
    return synth_data

def distance(synth_data):
    return custom_dist(synth_data, obs_data)


prior = pyabc.Distribution(A=pyabc.RV("uniform", 0, 1), B=pyabc.RV("uniform", 100, 10000))
abc = pyabc.ABCSMC(model, prior, distance, population_size=10)

db_path = ("sqlite:///" + os.path.join(tempfile.gettempdir(), "test.db"))
abc.new(db_path)

history = abc.run(minimum_epsilon=.1, max_nr_populations=10)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
yannikschaelte commented, Nov 30, 2021

With minimal modifications, the code in your third comment should look like:

import numpy as np
import pyabc
import os
import tempfile


def main(obs_data, synth_args):
    """Run ABC-SMC with pyABC, comparing simulations against obs_data."""
    prior = pyabc.Distribution(
        A=pyabc.RV("uniform", 0, 1),
        B=pyabc.RV("uniform", 100, 10000),
    )

    def model(parameter):
        model_p = (parameter['A'], parameter['B'])
        synth_data = synth_data_generator(model_p, synth_args)
        return {"y": synth_data}

    # Y: The distance takes two arguments, simulation and observation (both filled in automatically by pyABC)
    def distance(synth, obs):
        return custom_dist(synth['y'], obs["y"])

    abc = pyabc.ABCSMC(model, prior, distance, population_size=10)

    db_path = ("sqlite:///" + os.path.join(tempfile.gettempdir(), "test.db"))
    # Y: Pass the observed data (not strictly necessary if the distance handles its 2nd argument itself, but always better)
    abc.new(db_path, obs_data)

    history = abc.run(minimum_epsilon=-.1, max_nr_populations=10)
    return history


def synth_data_generator(model_p, synth_args):
    synth_data = np.random.uniform(4.)
    return synth_data


def custom_dist(synth_data, obs_data):
    return np.random.uniform()


if __name__ == '__main__':
    # This list contains data required for the synth_data_generator() function
    synth_args = [0.]
    # This list contains my observed data and some other objects required for
    # my custom_dist() function
    # Y: Observed data must be of same form as model, a dictionary
    obs_data = {"y": synth_data_generator({"A": 0.5, "B": 0.6}, synth_args)}
    main(obs_data, synth_args)
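
The question mentions that custom_dist() also needs "some other objects" besides the observed data. One way to handle that while keeping the two-argument distance(synth, obs) signature is to bind the extra objects in a closure, the same way the code above binds synth_args inside main(). The following is only a sketch under assumptions: make_distance, weights, and this weighted Euclidean custom_dist are hypothetical stand-ins, and pyABC itself is not imported; the last lines call the function by hand with two dictionaries of the same form.

```python
import math


def custom_dist(synth_values, obs_values, weights):
    """Hypothetical distance: weighted Euclidean norm of the differences."""
    return math.sqrt(
        sum(w * (s - o) ** 2 for s, o, w in zip(synth_values, obs_values, weights))
    )


def make_distance(weights):
    """Return a two-argument distance that closes over the extra objects."""

    def distance(synth, obs):
        # synth and obs are dictionaries with the same keys as the model output
        return custom_dist(synth["y"], obs["y"], weights)

    return distance


# Extra object needed by custom_dist (assumed for illustration)
distance = make_distance(weights=[1.0, 4.0])

# Observed and simulated data share the same dictionary form
obs_data = {"y": [1.0, 2.0]}
synth_data = {"y": [4.0, 4.0]}

result = distance(synth_data, obs_data)
print(result)
```

The returned distance is an ordinary function of (synth, obs), so it could be handed to pyabc.ABCSMC in place of the distance defined above, with the observed dictionary still passed through abc.new(db_path, obs_data).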
0 reactions
Gabriel-p commented, Dec 3, 2021

Thank you Yannik
