The Marvel Universe is American comic book series such as the Avengers, the Iron Man and others. It consisting of thousands of separate universes. I like the Iron Man and the Guardians of the Galaxy series.
Well, I explain how to import the Marvel Universe dataset into agensgraph database. I referrenced these datasets from Kaggle
Requirements
- Agensgraph
- The Marvel universe datasets. (the-marvel-universe-social-network.zip)
data structure
Files | Columns | Description |
---|---|---|
edge.csv | hero / comic | Heroes and the comic in which they appear. |
hero-network.csv | hero1 / hero2 | Edges between heroes that appear in the same comic. |
edges.csv | node/ type | Node name and type |
Graph Modeling
simple graph model
- Vertex Label : hero {name} / comic {title}
- Edge Label : appeared_in / knows
this means
- hero appeared in comic.
- hero knows hero.
Data Import
unzip dataset files.
unzip files into /path/to/ in my case, path is /home/agens/marvel_universe
import query
scripts using agens cli command.
CREATE EXTENSION file_fdw;
CREATE SERVER import_server FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE nodes
(
node TEXT,
type TEXT
)
SERVER import_server
OPTIONS( FORMAT 'csv', HEADER 'true', FILENAME '/home/agens/marvel_universe/nodes.csv', delimiter ',');
CREATE FOREIGN TABLE hero_network
(
hero1 TEXT,
hero2 TEXT
)
SERVER import_server
OPTIONS( FORMAT 'csv', HEADER 'true', FILENAME '/home/agens/marvel_universe/hero-network.csv', delimiter ',');
CREATE FOREIGN TABLE edges
(
hero TEXT,
comic TEXT
)
SERVER import_server
OPTIONS( FORMAT 'csv', HEADER 'true', FILENAME '/home/agens/marvel_universe/edges.csv', delimiter ',');
create graph marvel_graph;
set graph_path = marvel_graph;
# hero vertex create
LOAD FROM nodes AS source
WITH { name : to_jsonb( source.node ), type : to_jsonb( source.type ) } as nodes
WHERE nodes.type = 'hero'
CREATE (a:hero {name: nodes.name } )
;
# comic vertex create
LOAD FROM nodes AS source
WITH { title : to_jsonb( source.node ), type : to_jsonb( source.type ) } as nodes
WHERE nodes.type = 'comic'
CREATE (a:comic { title: nodes.title } )
;
# appered_in edge create
LOAD FROM edges AS source
MATCH (h:hero {name : to_jsonb(source.hero) })
MATCH (c:comic {title : to_jsonb(source.comic) })
CREATE (h)-[:appered_in]->(c)
;
# knows edge create
CREATE ELABEL KNOWS;
LOAD FROM hero_network AS source
MATCH (h1:hero {name : to_jsonb(source.hero1) })
MATCH (h2:hero {name : to_jsonb(source.hero2) })
MERGE (h1)-[:knows]->(h2)
;
# check import result
MATCH (a:hero )-[e:knows]->(b:hero)
WHERE a.name = 'CAPTAIN AMERICA'
RETURN *
LIMIT 20
;
Result
agens=#
agens=# create graph marvel_graph;
CREATE GRAPH
agens=# set graph_path = marvel_graph;
SET
agens=# LOAD FROM nodes AS source
agens-# WITH { name : to_jsonb( source.node ), type : to_jsonb( source.type ) } as nodes
agens-# WHERE nodes.type = 'hero'
agens-# CREATE (a:hero {name: nodes.name } )
agens-# ;
GRAPH WRITE (INSERT VERTEX 6439, INSERT EDGE 0)
agens=# LOAD FROM nodes AS source
agens-# WITH { title : to_jsonb( source.node ), type : to_jsonb( source.type ) } as nodes
agens-# WHERE nodes.type = 'comic'
agens-# CREATE (a:comic { title: nodes.title } )
agens-# ;
GRAPH WRITE (INSERT VERTEX 12651, INSERT EDGE 0)
agens=# LOAD FROM edges AS source
agens-# MATCH (h:hero {name : to_jsonb(source.hero) })
agens-# MATCH (c:comic {title : to_jsonb(source.comic) })
agens-# CREATE (h)-[:appered_in]->(c)
agens-# ;
GRAPH WRITE (INSERT VERTEX 0, INSERT EDGE 94527)
agens=#
agens=# CREATE ELABEL KNOWS;
CREATE ELABEL
agens=# LOAD FROM hero_network AS source
agens-# MATCH (h1:hero {name : to_jsonb(source.hero1) })
agens-# MATCH (h2:hero {name : to_jsonb(source.hero2) })
agens-# MERGE (h1)-[:knows]->(h2)
agens-# ;
GRAPH WRITE (INSERT VERTEX 0, INSERT EDGE 183645)
agens=#
agens=# MATCH (a:hero )-[e:knows]->(b:hero)
agens-# WHERE a.name = 'CAPTAIN AMERICA'
agens-# RETURN a.name, label(e), b.name
agens-# LIMIT 20
agens-# ;
name | label | name
-------------------+---------+------------------------
"CAPTAIN AMERICA" | "knows" | "3-D MAN/CHARLES CHAN"
"CAPTAIN AMERICA" | "knows" | "ABOMINATION/EMIL BLO"
"CAPTAIN AMERICA" | "knows" | "ABSORBING MAN/CARL C"
"CAPTAIN AMERICA" | "knows" | "ACHEBE, REVEREND DOC"
"CAPTAIN AMERICA" | "knows" | "ACHILLES II/HELMUT"
"CAPTAIN AMERICA" | "knows" | "ADAMS, CINDY"
"CAPTAIN AMERICA" | "knows" | "ADAMS, NICOLE NIKKI"
"CAPTAIN AMERICA" | "knows" | "AGAMEMNON III/"
"CAPTAIN AMERICA" | "knows" | "AGENT AXIS/"
"CAPTAIN AMERICA" | "knows" | "AKUTAGAWA, OSAMU"
"CAPTAIN AMERICA" | "knows" | "ALANYA"
"CAPTAIN AMERICA" | "knows" | "ALDEN, PROF. MEREDIT"
"CAPTAIN AMERICA" | "knows" | "ALEXANDER, CARRIE"
"CAPTAIN AMERICA" | "knows" | "ALVAREZ, FELIX"
"CAPTAIN AMERICA" | "knows" | "AMERICAN EAGLE III/J"
"CAPTAIN AMERICA" | "knows" | "AMERICOP/"
"CAPTAIN AMERICA" | "knows" | "AMPHIBIAN/KINGLEY RI"
"CAPTAIN AMERICA" | "knows" | "ANACONDA/BLANCHE SIT"
"CAPTAIN AMERICA" | "knows" | "ANELLE"
"CAPTAIN AMERICA" | "knows" | "ANGEL DOPPELGANGER"
(20 rows)
Conclusion
graph data/model is a such a complicated. but graph database can solve the problem step by step . As you can see above, Agensgraph can store csv data with graph model (ex: social network data) More information you can visit website https://bitnine.net
References
[1] https://en.wikipedia.org/wiki/Marvel_Universe
[2] https://www.kaggle.com/csanhueza/the-marvel-universe-social-network
Comments