Graph-Like Data Models
Basic components of graph-like data models
- nodes, also called vertices (entities)
- edges (relationships)
Examples:
Social graphs
Vertices are people, and edges indicate which people know each other.
The web graph
Vertices are web pages, and edges indicate HTML links to other pages.
Road or rail networks
Vertices are junctions, and edges represent the roads or railway lines between
them.
1. Property Graph
In the property graph model, each vertex consists of:
• A unique identifier
• A set of outgoing edges
• A set of incoming edges
• A collection of properties (key-value pairs)
Each edge consists of:
• A unique identifier
• The vertex at which the edge starts (the tail vertex)
• The vertex at which the edge ends (the head vertex)
• A label to describe the kind of relationship between the two vertices (this matters: to connect different kinds of data in one graph, edges must be categorized by label)
• A collection of properties (key-value pairs)
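As a rough in-memory sketch of the two lists above (not how any particular graph database stores its data), the model could be written with Python dataclasses. The names `Vertex`, `Edge` and `connect` are made up for this illustration.

```python
from dataclasses import dataclass, field


# eq=False keeps identity comparison, which avoids recursing through the
# vertex <-> edge cycles when two objects are compared.
@dataclass(eq=False)
class Vertex:
    vertex_id: int
    properties: dict = field(default_factory=dict)
    outgoing: list = field(default_factory=list)  # edges whose tail is this vertex
    incoming: list = field(default_factory=list)  # edges whose head is this vertex


@dataclass(eq=False)
class Edge:
    edge_id: int
    tail: Vertex  # vertex at which the edge starts
    head: Vertex  # vertex at which the edge ends
    label: str    # kind of relationship
    properties: dict = field(default_factory=dict)


def connect(edge_id, tail, head, label, **properties):
    """Create an edge and register it on both of its endpoints."""
    edge = Edge(edge_id, tail, head, label, properties)
    tail.outgoing.append(edge)
    head.incoming.append(edge)
    return edge


lucy = Vertex(1, {"name": "Lucy", "type": "person"})
idaho = Vertex(2, {"name": "Idaho", "type": "state"})
born = connect(1, lucy, idaho, "born_in")

print([e.label for e in lucy.outgoing])                    # ['born_in']
print([e.tail.properties["name"] for e in idaho.incoming])  # ['Lucy']
```

Because every vertex keeps both its outgoing and its incoming edges, the graph can be traversed in either direction.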
PostgreSQL implementation
CREATE TABLE vertices (
    vertex_id   integer PRIMARY KEY,
    properties  json                  -- key-value pairs describing the vertex
);

CREATE TABLE edges (
    edge_id     integer PRIMARY KEY,
    tail_vertex integer REFERENCES vertices (vertex_id),  -- where the edge starts
    head_vertex integer REFERENCES vertices (vertex_id),  -- where the edge ends
    label       text,                 -- kind of relationship
    properties  json
);

-- Indexes for looking up a vertex's outgoing and incoming edges
CREATE INDEX edges_tails ON edges (tail_vertex);
CREATE INDEX edges_heads ON edges (head_vertex);
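With this schema you can look up any vertex and, through the head_vertex and tail_vertex columns of the edges table, find all of its incoming and outgoing edges.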
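A minimal sketch of those two lookups follows. It assumes SQLite (through Python's `sqlite3` module) as a stand-in for PostgreSQL, so the `json` columns become plain `text` holding JSON strings. The sample rows and the helper names `outgoing_edges` and `incoming_edges` are made up for the illustration.

```python
import json
import sqlite3

# Same shape as the schema above; SQLite stands in for PostgreSQL.
SCHEMA = """
CREATE TABLE vertices (
    vertex_id   integer PRIMARY KEY,
    properties  text
);
CREATE TABLE edges (
    edge_id     integer PRIMARY KEY,
    tail_vertex integer REFERENCES vertices (vertex_id),
    head_vertex integer REFERENCES vertices (vertex_id),
    label       text,
    properties  text
);
CREATE INDEX edges_tails ON edges (tail_vertex);
CREATE INDEX edges_heads ON edges (head_vertex);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO vertices VALUES (?, ?)",
    [
        (1, json.dumps({"name": "Lucy", "type": "person"})),
        (2, json.dumps({"name": "Idaho", "type": "state"})),
        (3, json.dumps({"name": "United States", "type": "country"})),
    ],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 2, "born_in", "{}"),  # Lucy  -> Idaho
        (2, 2, 3, "within", "{}"),   # Idaho -> United States
    ],
)


def outgoing_edges(conn, vertex_id):
    """Edges that start at vertex_id, as (label, head_vertex) pairs."""
    return conn.execute(
        "SELECT label, head_vertex FROM edges"
        " WHERE tail_vertex = ? ORDER BY edge_id",
        (vertex_id,),
    ).fetchall()


def incoming_edges(conn, vertex_id):
    """Edges that end at vertex_id, as (label, tail_vertex) pairs."""
    return conn.execute(
        "SELECT label, tail_vertex FROM edges"
        " WHERE head_vertex = ? ORDER BY edge_id",
        (vertex_id,),
    ).fetchall()


print(outgoing_edges(conn, 2))  # [('within', 3)]
print(incoming_edges(conn, 2))  # [('born_in', 1)]
```

The two indexes are what keep both lookups cheap: one covers the tail_vertex filter, the other the head_vertex filter.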
2. The Cypher Query Language
Cypher is a declarative query language for property graphs, created for the Neo4j
graph database.
For example, to find the people who were born in the United States and now live in Europe:
MATCH
(person) -[:BORN_IN]-> () -[:WITHIN*0..]-> (us:Location {name:'United States'}),
(person) -[:LIVES_IN]-> () -[:WITHIN*0..]-> (eu:Location {name:'Europe'})
RETURN person.name
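If the graph is stored in a relational database, it can of course be queried with SQL as well, but the query is noticeably more complicated, because the number of WITHIN edges to follow is not known in advance. This shows that every data model has the use cases it suits; there is no silver bullet.
Personal note: compared with a traditional relational database, a graph effectively merges many separate tables into one structure; each kind of edge plays the role of what would otherwise be its own table.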
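To give a feel for the extra complexity, here is a hedged sketch of the same question asked in SQL with recursive common table expressions. It runs on SQLite through Python's `sqlite3` module. To keep it short, the vertices table has a plain `name` column instead of JSON properties, and the sample data (Lucy, Alain, Idaho, France) is invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vertices (vertex_id integer PRIMARY KEY, name text);
CREATE TABLE edges (
    edge_id     integer PRIMARY KEY,
    tail_vertex integer REFERENCES vertices (vertex_id),
    head_vertex integer REFERENCES vertices (vertex_id),
    label       text
);
""")
conn.executemany("INSERT INTO vertices VALUES (?, ?)", [
    (1, "United States"), (2, "Idaho"), (3, "Europe"), (4, "France"),
    (5, "Lucy"), (6, "Alain"),
])
conn.executemany("INSERT INTO edges VALUES (?, ?, ?, ?)", [
    (1, 2, 1, "within"),    # Idaho  -> United States
    (2, 4, 3, "within"),    # France -> Europe
    (3, 5, 2, "born_in"),   # Lucy   -> Idaho
    (4, 5, 4, "lives_in"),  # Lucy   -> France
    (5, 6, 4, "born_in"),   # Alain  -> France
    (6, 6, 4, "lives_in"),  # Alain  -> France
])

QUERY = """
WITH RECURSIVE
  -- every location inside the United States, following 0 or more 'within' edges
  in_usa(vertex_id) AS (
    SELECT vertex_id FROM vertices WHERE name = 'United States'
    UNION
    SELECT edges.tail_vertex FROM edges
      JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
     WHERE edges.label = 'within'
  ),
  -- the same for Europe
  in_europe(vertex_id) AS (
    SELECT vertex_id FROM vertices WHERE name = 'Europe'
    UNION
    SELECT edges.tail_vertex FROM edges
      JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
     WHERE edges.label = 'within'
  ),
  born_in_usa(vertex_id) AS (
    SELECT edges.tail_vertex FROM edges
      JOIN in_usa ON edges.head_vertex = in_usa.vertex_id
     WHERE edges.label = 'born_in'
  ),
  lives_in_europe(vertex_id) AS (
    SELECT edges.tail_vertex FROM edges
      JOIN in_europe ON edges.head_vertex = in_europe.vertex_id
     WHERE edges.label = 'lives_in'
  )
SELECT vertices.name FROM vertices
  JOIN born_in_usa ON vertices.vertex_id = born_in_usa.vertex_id
  JOIN lives_in_europe ON vertices.vertex_id = lives_in_europe.vertex_id
 ORDER BY vertices.name
"""

names = [row[0] for row in conn.execute(QUERY)]
print(names)  # ['Lucy']
```

The four-line Cypher pattern turns into four common table expressions plus a final join, which is the point of the comparison.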
3. Triple-Stores and SPARQL
This model is similar to the property graph model, but many tools use it as their storage format, so it is worth covering.
Example: Jim likes bananas
Parts of a triple
- Subject: Jim
- Predicate: likes
- Object: bananas
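The subject corresponds to a vertex in the property graph model. The object is one of two things:
1. A value of a primitive datatype: in this case the predicate and the object together correspond to a property (key and value) of the subject vertex.
For example, (lucy, age, 33) corresponds to a vertex lucy with the property {age: 33}.
2. Another vertex: in this case the predicate is an edge, the subject is the tail vertex, and the object is the head vertex. For example, in (lucy, marriedTo, alain) both lucy and alain are vertices, and marriedTo is the label of the edge between them.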
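The two cases can be made concrete with a small sketch that turns a list of triples back into vertex properties and labelled edges. The `Ref` wrapper and the `to_property_graph` function are hypothetical names used only here; a real triple-store distinguishes literals from vertex references in its own way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Marks an object that refers to another vertex rather than a plain value."""
    name: str


triples = [
    ("lucy", "age", 33),                  # case 1: object is a primitive value
    ("lucy", "marriedTo", Ref("alain")),  # case 2: object is another vertex
]


def to_property_graph(triples):
    properties = {}  # vertex name -> {key: value}
    edges = []       # (tail_vertex, label, head_vertex)
    for subject, predicate, obj in triples:
        properties.setdefault(subject, {})
        if isinstance(obj, Ref):
            # The predicate is an edge from the subject (tail) to the object (head).
            properties.setdefault(obj.name, {})
            edges.append((subject, predicate, obj.name))
        else:
            # The predicate and object form a property of the subject vertex.
            properties[subject][predicate] = obj
    return properties, edges


properties, edges = to_property_graph(triples)
print(properties)  # {'lucy': {'age': 33}, 'alain': {}}
print(edges)       # [('lucy', 'marriedTo', 'alain')]
```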
The same data expressed as triples (Turtle format):
@prefix : <urn:example:>.
_:lucy a :Person.
_:lucy :name "Lucy".
_:lucy :bornIn _:idaho.
_:idaho a :Location.
_:idaho :name "Idaho".
_:idaho :type "state".
_:idaho :within _:usa.
_:usa a :Location.
_:usa :name "United States".
_:usa :type "country".
_:usa :within _:namerica.
_:namerica a :Location.
_:namerica :name "North America".
_:namerica :type "continent".
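SPARQL is a query language and data-access protocol developed for RDF; queries are written against data that follows the W3C RDF specification. The whole database is therefore a set of subject-predicate-object triples. These notes do not go into further detail.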
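As a very rough illustration of the idea of querying a set of triples by pattern (this is not SPARQL, and not how an RDF store is implemented), the triples can be held as Python tuples and filtered with wildcards. The `match` helper is a made-up name, and this flat representation does not distinguish literals from vertex references.

```python
# A subset of the Turtle data above, with prefixes dropped for brevity.
TRIPLES = [
    ("lucy", "a", "Person"),
    ("lucy", "name", "Lucy"),
    ("lucy", "bornIn", "idaho"),
    ("idaho", "a", "Location"),
    ("idaho", "name", "Idaho"),
    ("idaho", "type", "state"),
    ("idaho", "within", "usa"),
    ("usa", "a", "Location"),
    ("usa", "name", "United States"),
    ("usa", "type", "country"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# "Names of everyone born in idaho": match one pattern, then join on the subject.
born_in_idaho = match(TRIPLES, predicate="bornIn", obj="idaho")
names = [
    name
    for subject, _, _ in born_in_idaho
    for _, _, name in match(TRIPLES, subject=subject, predicate="name")
]
print(names)  # ['Lucy']
```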
4. The Foundation: Datalog
Datalog is a much older query language, but it provides the foundation that many later query languages build on.