
Fill in the missing links in the graph
You could run this type of query once a day during a quiet period
On bigger graphs we'd run it in batches to avoid loading the whole database into memory
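A minimal sketch of one such batch, assuming the Member, MEMBER_OF, HAS_TOPIC and INTERESTED_IN names used in the queries later in these notes. The batch size of 1000 is an arbitrary assumption. The idea is to rerun the statement until it reports 0, so that each write transaction stays small; note that the read side still scans the matching paths on every run.

```cypher
// One batch: link up to 1000 (member, topic) pairs that are not linked yet.
// The batch size (1000) is an assumption; tune it for your data.
MATCH (m:Member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WHERE NOT (m)-[:INTERESTED_IN]->(topic)
WITH m, topic, COUNT(*) AS times
WHERE times > 3
WITH m, topic LIMIT 1000
MERGE (m)-[:INTERESTED_IN]->(topic)
RETURN COUNT(*) AS linked
```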

Common nouns ⇒ Labels

user ⇒ :User
email ⇒ :Email

Verbs that take an object ⇒ Relationships

sent ⇒ SENT
wrote ⇒ WROTE

Proper noun ⇒ Node with properties

Ian ⇒ ({name: 'Ian'})
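As an illustration, the sentence "Ian sent an email" could be turned into a graph roughly like this. The content property and its value are assumptions made up for the example.

```cypher
// Proper noun "Ian" becomes a node with a name property, labelled with the common noun "user".
CREATE (ian:User {name: 'Ian'})
// Common noun "email" becomes the label of a second node (the content value is hypothetical).
CREATE (email:Email {content: 'Hello'})
// The verb "sent" becomes the relationship between them.
CREATE (ian)-[:SENT]->(email)
```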

You need to specify the weight, strength, or some other quality of the relationship:

Friendship strength
Proficiency in a skill

Attribute value comprises a complex value type:

Address (first line, second line, zip code, etc)

Attribute values are interconnected:

Taxonomy of skills
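A small sketch of skills modeled as nodes, covering two of the cases above: a quality on the relationship (proficiency) and interconnected values (a taxonomy). The Person and Skill labels, the HAS_SKILL and SUBCATEGORY_OF relationship types, and the level property are all hypothetical names chosen for the example.

```cypher
CREATE (ian:Person {name: 'Ian'})
CREATE (java:Skill {name: 'Java'})
CREATE (programming:Skill {name: 'Programming'})
// Interconnected attribute values: the skill taxonomy is part of the graph.
CREATE (java)-[:SUBCATEGORY_OF]->(programming)
// Quality of the relationship: proficiency is stored on the relationship itself.
CREATE (ian)-[:HAS_SKILL {level: 3}]->(java)
```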

Rich Context, Multiple Dimensions

Be as simple as possible
But beware verbing
- Language habit: turning a noun into a verb
  - Send an email ⇒ EMAIL
  - Search Google ⇒ GOOGLE

An intermediate node provides flexibility
- It allows more than two nodes to be connected in a single context
But it can be overkill, and will have an impact on performance
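For example, once the email is a node rather than an EMAILED relationship, it can be connected to more than two users. This is only a sketch: the user names Jim and Alistair, the CC relationship type, and the content value are assumptions made up for illustration.

```cypher
CREATE (ian:User {name: 'Ian'})
CREATE (jim:User {name: 'Jim'})
CREATE (alistair:User {name: 'Alistair'})
// The intermediate node carries the context of the interaction.
CREATE (email:Email {content: 'Hi'})
CREATE (ian)-[:SENT]->(email)
CREATE (email)-[:TO]->(jim)
// A third participant, which a single EMAILED relationship could not express.
CREATE (email)-[:CC]->(alistair)
```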

Entities are linked in a sequence
You need to traverse the sequence
You may need to identify the beginning or end (first/last, earliest/latest, etc.)
Examples
- Event stream
- Episodes of a TV series
- Job history
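A minimal sketch of a linked list with pointers to its head and tail, using the TV series example. The Series and Episode labels, the FIRST and LAST relationship types, and the property names are hypothetical. The second statement walks the sequence from the head.

```cypher
// Build a three-episode sequence with head and tail pointers.
CREATE (series:Series {name: 'Example Series'})
CREATE (e1:Episode {number: 1})
CREATE (e2:Episode {number: 2})
CREATE (e3:Episode {number: 3})
CREATE (e1)-[:NEXT]->(e2)-[:NEXT]->(e3)
CREATE (series)-[:FIRST]->(e1)
CREATE (series)-[:LAST]->(e3);

// Traverse the sequence in order, starting from the first episode.
MATCH (:Series {name: 'Example Series'})-[:FIRST]->(first)
MATCH path = (first)-[:NEXT*0..]->(episode)
RETURN episode.number AS number, length(path) AS position
ORDER BY position
```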

Time-based
- Universal versioning schema
- Discrete, continuous sequence
  - Millis since the epoch

Structure
- Identity nodes
  - Placeholders
- Timestamped identity relationships
  - i.e. normal domain relationships
State
- State nodes
  - Snapshot of entity state
- Timestamped state relationships

MATCH (s:Shop{shop_id:1})-[r1:SELLS]->(p:Product)
WHERE (r1.from <= 1391558400000 AND r1.to > 1391558400000)
MATCH (p)-[r2:STATE]->(ps:ProductState)
WHERE (r2.from <= 1391558400000 AND r2.to > 1391558400000)
RETURN p.product_id AS productId,
       ps.name AS product,
       ps.price AS price
ORDER BY price DESC

Purely additive
- No deletions
- Store file locality for node and relationship properties
Creates a lot more data
- Nodes and relationships
Queries will be more complex
Some queries will be slower
- Because they have to search more of the graph

Definition

Restructure graph without changing informational semantics

Reasons

Improve design
Enhance performance
Accommodate new functionality
Enable iterative and incremental development of data model

Execute in repeatable order
Backup database
Execute in batches
- Unbounded results will generate large transactions and may trigger Out of Memory exceptions
Apply migrations to test data to ensure existing functionality doesn't break
Ensure the application can handle both old and new structures if the migration runs against live data

Problem

You've modeled something as a relationship (with properties), but now need to connect it to more than two things

Solution

Extract relationship into a new node (and two new relationships)
Copy old relationship properties onto new node
Delete old relationship

MATCH (a:User)-[r:EMAILED]->(b:User)
WITH a, r, b LIMIT 2
CREATE (email:Email{content:r.content})
MERGE (a)-[:SENT]->(email)-[:TO]->(b)
DELETE r
RETURN count(r) AS numberDeleted

MATCH (group:Group {name:"Neo4j - London User Group"})-[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
RETURN otherGroup.name,
       COUNT(topic) AS topicsInCommon,
       COLLECT(topic.name) as topics
ORDER BY topicsInCommon DESC, otherGroup.name
LIMIT 10

MATCH (group:Group {name:"Neo4j - London User Group"})-[:HAS_TOPIC]->(topic)<-[:HAS_TOPIC]-(otherGroup)
WHERE NOT ((:Member {name:"Mark Needham"})-[:MEMBER_OF]->(otherGroup))
RETURN otherGroup.name,
       COUNT(topic) AS topicsInCommon,
       COLLECT(topic.name) as topics
ORDER BY topicsInCommon DESC, otherGroup.name
LIMIT 10

MATCH (m:Member)-[:MEMBER_OF]->(group)-[:HAS_TOPIC]->(topic)
WITH m, topic, COUNT(*) AS times
WHERE times > 3

MERGE (m)-[:INTERESTED_IN]->(topic)

MATCH (member:Member)-[rel:MEMBER_OF]->(group)

MERGE (membership:Membership {id: member.id + "_" + group.id})
SET membership.joined = rel.joined

MERGE (member)-[:HAS_MEMBERSHIP]->(membership)
MERGE (membership)-[:OF_GROUP]->(group)

MATCH (member:Member)-[:HAS_MEMBERSHIP]->(membership)

WITH member, membership ORDER BY member.id, membership.joined

WITH member, COLLECT(membership) AS memberships
UNWIND RANGE(0,SIZE(memberships) - 2) as idx

WITH memberships[idx] AS m1, memberships[idx+1] AS m2
MERGE (m1)-[:NEXT]->(m2)

MATCH (group:Group {name:"Neo4j"})<-[:OF_GROUP]-(membership)-[:NEXT]->(nextMembership),
      (membership)<-[:HAS_MEMBERSHIP]-(member:Member)-[:HAS_MEMBERSHIP]->(nextMembership),
      (nextMembership)-[:OF_GROUP]->(nextGroup)
RETURN nextGroup.name, COUNT(*) AS times
ORDER BY times DESC

Test Driven Data Modeling
TigerGraph


Graph Data Modeling

Tip: Make the implicit explicit

Attributes: Property or Relationship?

Use Relationships When...

AND/OR

AND/OR

Modeling Skills as Nodes

Common Graph Structures

Rich Context, Multiple Dimensions

Trap: Verbing

Example: [:EMAILED] to (:Email)

Considerations

Linked List

Linked List

Interleaved Linked Lists

Pointers to Head and Tail

Versioning Graphs

Separate Structure from State

Return Results

Considerations

Refactoring

Data Migrations

Extract Node From Relationship

Find similar groups to Neo4j

Exclude groups I'm a member of

What is Jonny interested in?

Facts can become nodes

Refactors to facts

Find next group people join

Docs

Refs

Related Documents

Various Ways