
Create An Adjacency List From A Pandas Dataframe Containing Nodes

I have a pandas DataFrame containing rows of nodes that I would ultimately like to connect and turn into a graph-like object. For this, I first thought of converting this DataFrame

Solution 1:

A self-join option: for each row, collect the index of every other row whose start equals that row's end:

df['adjacency_list'] = df.apply(lambda s: df[(df['start'] == s.end) &
                                             (df['id'] != s.id)].index.tolist(), axis=1)
print(df)

Output:

   id start end          cases adjacency_list
0   0     A   B  [c1, c2, c44]         [1, 6]
1   1     B   C   [c2, c1, c3]             []
2   2     D   F           [c4]            [5]
3   3     A   G           [c1]             []
4   4     X   X       [c1, c7]             []
5   5     F   X           [c4]            [4]
6   6     B   E      [c44, c7]             []
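The same result can also be sketched as an explicit self-join with `DataFrame.merge` instead of a row-wise `apply`. This is only a minimal sketch: the sample DataFrame is reconstructed from the printed output above (an assumption, since the original construction code is not shown), and it uses the `id` column rather than the index, which happen to coincide here. The names `pairs` and `neighbours` are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the printed output above (an assumption).
df = pd.DataFrame({
    "id": range(7),
    "start": ["A", "B", "D", "A", "X", "F", "B"],
    "end": ["B", "C", "F", "G", "X", "X", "E"],
    "cases": [["c1", "c2", "c44"], ["c2", "c1", "c3"], ["c4"], ["c1"],
              ["c1", "c7"], ["c4"], ["c44", "c7"]],
})

# Pair each row (left) with every row (right) whose start equals the left row's end.
pairs = df[["id", "end"]].merge(
    df[["id", "start"]],
    left_on="end",
    right_on="start",
    suffixes=("", "_next"),
)

# Drop self-matches, mirroring the `df['id'] != s.id` condition.
pairs = pairs[pairs["id"] != pairs["id_next"]].sort_values(["id", "id_next"])

# Collect the matched ids per source row; rows without a match get an empty list.
neighbours = pairs.groupby("id")["id_next"].agg(list)
df["adjacency_list"] = [neighbours.get(i, []) for i in df["id"]]

print(df)
```

On larger frames a merge like this avoids filtering the whole DataFrame once per row, at the cost of materialising every matching pair.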

Solution 2:

One option is to apply the following function. It is not fully vectorised, because DataFrames don't handle embedded mutable objects such as lists well, and I don't think set operations can be applied in a vectorised way. It does cut down the number of comparisons, though: only rows whose start matches the current row's end are checked for a shared case.

def f(x):
    check = df[(x["end"] == df["start"])]
    return [
        row["id"]
        for i, row in check.iterrows()
        if not set(row["cases"]).isdisjoint(x["cases"])
    ]


df["adjacency_list"] = df.apply(f, axis=1)

Or, as a big lambda function:

df["adjacency_list"] = df.apply(
    lambda x: [
        row["id"]
        for i, row in df[(x["end"] == df["start"])].iterrows()
        if not set(row["cases"]).isdisjoint(x["cases"])
    ],
    axis=1,
)

Output:

   id start end          cases adjacency_list
0   0     A   B  [c1, c2, c44]         [1, 6]
1   1     B   C   [c2, c1, c3]             []
2   2     D   F           [c4]            [5]
3   3     A   G           [c1]             []
4   4     X   X       [c1, c7]            [4]
5   5     F   X           [c4]             []
6   6     B   E      [c44, c7]             []
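If the row-wise set check becomes a bottleneck, one possible workaround is to flatten the lists with `DataFrame.explode` so that "shares at least one case" turns into an ordinary merge key. The following is a sketch under the same assumption as before, namely that the sample DataFrame is reconstructed from the printed output; `long`, `pairs` and `neighbours` are illustrative names.

```python
import pandas as pd

# Sample data reconstructed from the printed output above (an assumption).
df = pd.DataFrame({
    "id": range(7),
    "start": ["A", "B", "D", "A", "X", "F", "B"],
    "end": ["B", "C", "F", "G", "X", "X", "E"],
    "cases": [["c1", "c2", "c44"], ["c2", "c1", "c3"], ["c4"], ["c1"],
              ["c1", "c7"], ["c4"], ["c44", "c7"]],
})

# One row per (id, case), so list membership becomes a plain column value.
long = df.explode("cases")

# Match on both the junction node (end == start) and a shared case.
pairs = long[["id", "end", "cases"]].merge(
    long[["id", "start", "cases"]],
    left_on=["end", "cases"],
    right_on=["start", "cases"],
    suffixes=("", "_next"),
)

# Several shared cases yield duplicate pairs; keep each pair once.
pairs = pairs[["id", "id_next"]].drop_duplicates().sort_values(["id", "id_next"])

neighbours = pairs.groupby("id")["id_next"].agg(list)
df["adjacency_list"] = [neighbours.get(i, []) for i in df["id"]]

print(df)
```

As in the function above, a row may list itself (row 4 here), because nothing excludes self-matches.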

Solution 3:

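Try this, which only looks for matches in the rows after the current one:

k = 0  # number of rows processed so far; reset to 0 before re-running

def test(x):
    # Relies on apply() visiting the rows in order.
    global k
    k += 1
    test_df = df[k:]  # rows below the current one
    return list(test_df[test_df['start'] == x].index)

df['adjancy_matrix'] = df.end.apply(test)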

Output:

   id start end        cases adjancy_matrix
0   0     A   B  [c1,c2,c44]         [1, 6]
1   1     B   C   [c2,c1,c3]             []
2   2     D   F         [c4]            [5]
3   3     A   G         [c1]             []
4   4     X   X      [c1,c7]             []
5   5     F   X         [c4]             []
6   6     B   E     [c44,c7]             []
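Whichever solution is used, the resulting column can be turned into the graph-like object the question asks for. A minimal sketch, assuming the adjacency lists from the Solution 1 output and a plain dict as the graph representation; the helper `reachable` is hypothetical and not part of pandas.

```python
from collections import deque

import pandas as pd

# Adjacency lists copied from the Solution 1 output above.
df = pd.DataFrame({
    "id": range(7),
    "adjacency_list": [[1, 6], [], [5], [], [], [4], []],
})

# A plain dict {row id: [neighbour ids]} already behaves like a directed graph.
graph = dict(zip(df["id"], df["adjacency_list"]))


def reachable(graph, source):
    """Return the ids reachable from `source`, in breadth-first order."""
    seen = [source]
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


print(reachable(graph, 2))  # [2, 5, 4]
```

A dict keeps the example dependency-free; a dedicated graph library may be a better fit if path or component algorithms are needed.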
