Fitness function from NodeClustering object versus from evaluation module

When I do this:

from cdlib import evaluation as ev

...

cs = al.infomap(g)
print(f"Average internal degree {ev.average_internal_degree(g, cs)}")
print(f"Average internal degree {cs.average_internal_degree()}")

I get identical results. When I change the algorithm like this:

from cdlib import evaluation as ev

...

cs = al.leiden(g)
print(f"Average internal degree {ev.average_internal_degree(g, cs)}")
print(f"Average internal degree {cs.average_internal_degree()}")

The latter is FitnessResult(min=0, max=0, score=0.0, std=0.0), while the former looks fine.

For other algorithms, the latter even raises an error: max_odf works fine with the former method (the evaluation module), but with the latter (the NodeClustering method) it throws a ValueError.

Am I confused about something or is this a bug?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

2 reactions
Yquetzal commented, Feb 13, 2020

I think I found 2 problems. The first one is a bug in the conversion from nx to igraph when node names are integers: in __from_nx_to_igraph, nodes are added using add_vertices, which expects strings (not ints). This causes problems later in the code, I think through confusion between node names and node IDs. I propose a fix below that converts the names to strings, although it will somewhat affect performance. @GiulioRossetti, what do you think of this fix?
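To make the name/ID confusion concrete, here is a minimal pure-Python sketch. It is not igraph or cdlib code: the helper `edges_to_indices` and the toy data are hypothetical, and the sketch only assumes a graph store that addresses vertices by position (0 to n-1) while the caller labels nodes with arbitrary integers.

```python
# Hypothetical illustration only: this is NOT igraph or cdlib code.
# Vertices are stored by position, while the caller uses integer labels.

def edges_to_indices(nodes, edges):
    """Translate edges given as node labels into positional vertex indices."""
    index_of = {label: i for i, label in enumerate(nodes)}
    return [(index_of[u], index_of[v]) for u, v in edges]


nodes = [2, 0, 1]      # integer node labels, in insertion order
edges = [(2, 0)]       # one edge, expressed with those labels

# Naive reading: the labels are used directly as positions.
# Position 2 holds label 1 and position 0 holds label 2, so the edge is mis-wired.
naive_labels = [(nodes[u], nodes[v]) for u, v in edges]

# Explicit translation: labels are mapped to positions first.
indexed_edges = edges_to_indices(nodes, edges)
translated_labels = [(nodes[u], nodes[v]) for u, v in indexed_edges]

print(naive_labels)       # the edge silently connects the wrong nodes
print(translated_labels)  # the edge connects the intended nodes
```

Nothing fails in the naive case; the edge just lands on different nodes, which is why this kind of bug tends to surface later as odd scores rather than as an immediate error.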

The second problem is that in our interface with the leiden algorithm (and several others, I think), we systematically convert the graph to undirected. In your case the graph is originally directed, so that might introduce changes too. One fix would be to preserve directionality by default: the call to __from_nx_to_igraph would keep the directionality of the input graph, instead of the current behavior where the graph is converted to undirected unless directionality is explicitly provided. @GiulioRossetti, am I right? Should I change the function accordingly?

Proposed changes:

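import networkx as nx

try:
    import igraph as ig
except ModuleNotFoundError:
    ig = None


def __from_nx_to_igraph(g, directed=None):
    """
    Convert a networkx graph to an igraph graph.

    :param g: a networkx graph
    :param directed: whether the igraph graph is directed; if None, the directionality of g is kept
    :return: an igraph Graph whose vertex names are the string form of the networkx node labels
    """

    if ig is None:
        raise ModuleNotFoundError(
            "Optional dependency not satisfied: install igraph to use the selected feature.")

    # Proposed change: keep the directionality of g by default
    if directed is None:
        directed = g.is_directed()

    gi = ig.Graph(directed=directed)

    # Proposed change: add nodes and edges by string name, so that integer
    # node labels are not confused with igraph vertex IDs
    # (previously: gi.add_vertices(list(g.nodes())) and gi.add_edges(list(g.edges())))
    gi.add_vertices([str(n) for n in g.nodes()])
    gi.add_edges([(str(u), str(v)) for (u, v) in g.edges()])

    # Copy edge attributes (the columns after source and target)
    edgelist = nx.to_pandas_edgelist(g)
    for attr in edgelist.columns[2:]:
        gi.es[attr] = edgelist[attr]

    return gi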
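Why the directionality default matters can be seen in a small pure-Python sketch. The helpers `to_undirected` and `internal_edge_count` are hypothetical and are not how cdlib computes its fitness functions; the sketch only shows that dropping direction merges reciprocal edges, so any score derived from edge counts can change.

```python
# Hypothetical illustration only: not cdlib's fitness implementation.

def to_undirected(edges):
    """Collapse directed edges into unordered pairs, dropping direction and self-loops."""
    return {frozenset((u, v)) for u, v in edges if u != v}


def internal_edge_count(edges, community, directed):
    """Count the edges whose two endpoints both belong to the community."""
    members = set(community)
    internal = [(u, v) for u, v in edges if u in members and v in members]
    return len(set(internal)) if directed else len(to_undirected(internal))


directed_edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]
community = ["a", "b", "c"]

kept = internal_edge_count(directed_edges, community, directed=True)
collapsed = internal_edge_count(directed_edges, community, directed=False)

print(kept, collapsed)  # the reciprocal pair a<->b counts twice when directed, once when undirected
```

If the clustering is computed on one version of the graph and evaluated on the other, differences of this kind are one plausible source of mismatched scores.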

0 reactions
Yquetzal commented, Mar 8, 2020

I checked quickly: 5 of the 6 methods are defined only for undirected networks in the original articles. The only exception is the “em” method, which is also defined for directed networks. As far as I understand the code, the problem comes from a division by zero caused, I think, by nodes with an out-degree of 0. I could not work out in less than 5 minutes whether the problem lies in the implementation or in the article, and I could not find the origin of the code for this method either. I have also listed whether each method accepts weights:

  • CONGA => undirected, unweighted
  • DER => undirected, weighted
  • lfm => undirected, unweighted (could be extended to directed, weighted…)
  • em => DIRECTED, extended to undirected
  • modularity_max (greedy_modularity from networkx) => undirected, unweighted
  • LEMON => undirected, unweighted
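The division by zero suspected above for the “em” method can be illustrated generically. The sketch below is not the em implementation, whose origin is unclear in this thread; the helpers `out_degrees` and `inverse_out_degree` are hypothetical. It only shows how nodes with an out-degree of 0 can be detected and guarded before any per-node normalization.

```python
from collections import Counter

# Hypothetical illustration only: this is NOT the "em" algorithm.

def out_degrees(nodes, edges):
    """Return the out-degree of every node, including nodes with no outgoing edge."""
    counts = Counter(u for u, _ in edges)
    return {n: counts.get(n, 0) for n in nodes}


def inverse_out_degree(nodes, edges):
    """Return 1/out_degree per node, using None for sinks instead of dividing by zero."""
    deg = out_degrees(nodes, edges)
    return {n: (1.0 / d if d else None) for n, d in deg.items()}


nodes = ["a", "b", "c"]
edges = [("a", "b"), ("a", "c"), ("b", "c")]   # "c" has no outgoing edge

inverse = inverse_out_degree(nodes, edges)
sinks = [n for n, value in inverse.items() if value is None]

print(inverse)
print(sinks)
```

Listing the sinks of the input graph before calling a directed method is a cheap way to check whether this is the trigger.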


