Skip to content

Handling of ambiguous updates

Whereas there is a natural way to construct a link-stream graph regardless of the order the updates come in, throwing deletions into the mix can lead to questions about what graph to create. Let's look at the below example.

Order of resolving additions and deletions

Graph

G = PersistentGraph()

G.add_edge(1, "Alice", "Bob")
G.delete_edge(5, "Alice", "Bob")
G.add_edge(3, "Alice", "Bob")
G.delete_edge(7, "Alice", "Bob")

print(G.edges.explode())

Reading this line by line, we may see this as two edges between Alice and Bob which overlap in time: one starting at time 1 and ending at time 5, another starting at time 3 and ending at time 7. After all, the link-stream graphs in Raphtory are allowed to have edges between the same pair of nodes happening at the same instant. However, when we look at the exploded edges of this graph, the following is returned:

Output

Edges(Edge(source=Alice, target=Bob, earliest_time=1, latest_time=3), Edge(source=Alice, target=Bob, earliest_time=3, latest_time=5))

Here, we see two edges which are contingent to each other in time: one from time 1 to time 3 and one from time 3 to time 5. The second deletion at time 7 is ignored. The reason for this is that Raphtory's graph updates are inserted in chronological order, such that the same graph should be constructed regardless of the line order in which the updates are made (apart from events which have the same timestamp, which will be covered shortly). Here, the order is: edge addition at time 1, edge addition at time 3, edge deletion at time 5 and edge deletion at time 7. This second edge deletion is now redundant.

Hanging deletions

We saw on the previous page that adding edges without a deletion afterwards results in an edge which lasts forever. What about a deletion without a prior addition?

Graph

G = PersistentGraph()

G.delete_edge(5, "Alice", "Bob")

print(G.edges.explode())

This results in the following:

Output

Edges(Edge(source=Alice, target=Bob, earliest_time=-9223372036854775808, latest_time=5))

which assumes that an edge was once present for it to be deleted. This can be useful if we have a dataset of changes to a graph but we're missing some starting period of the system which would have once included some original edge additions.

Additions and deletions in the same instant

We have seen that if the update times to an edge are all distinct from each other, the graph that is constructed is fully unambiguous. What about when events have the same timestamp?

Graph

G1 = PersistentGraph()

G1.add_edge(1, 1, 2, properties={"message":"hi"})
G1.delete_edge(1, 1, 2)

G2 = PersistentGraph()

G2.delete_edge(1, 1, 2)
G2.add_edge(1, 1, 2, properties={"message":"hi"})

print(f"G1's edges are {G1.edges.explode()}")
print(f"G2's edges are {G2.edges.explode()}")

Output

G1's edges are Edges(Edge(source=1, target=2, earliest_time=1, latest_time=1, properties={message: hi}))
G2's edges are Edges(Edge(source=1, target=2, earliest_time=-9223372036854775808, latest_time=1), Edge(source=1, target=2, earliest_time=1, latest_time=9223372036854775807, properties={message: hi}))

In this example, it is impossible to infer what the intended update order is (particularly since we allow hanging deletions/additions). In this case, we tie-break the updates by the order in which they are executed. This means that the first graph has an edge which instantaneously appears and disappears at time 1, and in the second an edge which instantaneously disappears at time 1 but is present for all time before and all after. It is important to bear this in mind in your analysis if you are aware of some cases like this.

Interaction with layers

Remember the example earlier with how deletions and additions are resolved?

Graph

G = PersistentGraph()

G.add_edge(1, "Alice", "Bob")
G.delete_edge(5, "Alice", "Bob")
G.add_edge(3, "Alice", "Bob")
G.delete_edge(7, "Alice", "Bob")

print(G.edges.explode())

Output

Edges(Edge(source=Alice, target=Bob, earliest_time=1, latest_time=3), Edge(source=Alice, target=Bob, earliest_time=3, latest_time=5))

Now take a look at this slightly modified example.

Graph

G = PersistentGraph()

G.add_edge(1, "Alice", "Bob", layer="colleagues")
G.delete_edge(5, "Alice", "Bob", layer="colleagues")
G.add_edge(3, "Alice", "Bob", layer ="friends")
G.delete_edge(7, "Alice", "Bob", layer="friends")

print(G.edges.explode())

The only difference is that we've added layer names to the different edge instances. Now let's look at how the output differs.

Output

Edges(Edge(source=Alice, target=Bob, earliest_time=1, latest_time=5), Edge(source=Alice, target=Bob, earliest_time=3, latest_time=7))

Here we have two edges, one starting and ending at 1 and 5 respectively with the 'colleague' layer, the other starting and ending at 3 and 7 on the 'friends' layer. Layering allows different types of interaction to exist, and edges on different layers can have overlapping times in a way that doesn't make sense for edges in the same layer or for edges with no layer.