Saving and loading graphs
The fastest way to ingest a graph is to load one from Raphtory's on-disk format using the load_from_file() function on the graph. This requires first ingesting the data via one of the prior methods and saving the resulting graph via save_to_file() or save_to_zip(), but it means that for large datasets you do not need to parse the data every time you run a Raphtory script.
Info
You can also pickle Raphtory graphs, which uses these functions under the hood.
In the example below we ingest the edge dataframe from the last section, save this graph and reload it into a second graph. These are both printed to show they contain the same data.
from raphtory import Graph
import pandas as pd
edges_df = pd.read_csv("data/network_traffic_edges.csv")
edges_df["timestamp"] = pd.to_datetime(edges_df["timestamp"])
g = Graph()
g.load_edges_from_pandas(
    df=edges_df,
    time="timestamp",
    src="source",
    dst="destination",
    properties=["data_size_MB"],
    layer_col="transaction_type",
)
g.save_to_file("/tmp/saved_graph")
loaded_graph = Graph.load_from_file("/tmp/saved_graph")
print(g)
print(loaded_graph)
Output
Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
Graph(number_of_nodes=5, number_of_edges=7, number_of_temporal_edges=7, earliest_time=1693555200000, latest_time=1693557000000)
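The "parse once, load thereafter" workflow described above can be sketched as a small helper. This is only an illustrative pattern, not part of Raphtory: load_or_build and its build, load and save parameters are hypothetical names introduced here, and the helper assumes the saved graph can be detected by checking whether the path exists.

```python
import os
from typing import Callable, TypeVar

G = TypeVar("G")


def load_or_build(
    path: str,
    build: Callable[[], G],
    load: Callable[[str], G],
    save: Callable[[G, str], None],
) -> G:
    """Return the graph saved at `path`, building and saving it first if absent.

    Hypothetical helper: `build`, `load` and `save` are supplied by the caller.
    """
    if os.path.exists(path):
        # A saved copy exists, so skip parsing the raw data entirely.
        return load(path)
    graph = build()    # slow path: ingest the source data once
    save(graph, path)  # persist so that later runs take the fast path
    return graph


# Possible usage with Raphtory, assuming build_graph is your own function that
# runs the pandas ingestion from the example above and returns the Graph:
#
# g = load_or_build(
#     "/tmp/saved_graph",
#     build=build_graph,
#     load=Graph.load_from_file,
#     save=lambda graph, p: graph.save_to_file(p),
# )
```

Passing the load and save steps in as callables keeps the helper independent of any particular storage format, so the same pattern would apply if you chose save_to_zip() instead.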