Fix falkordb retrievers and memory
YoanSallami committed Sep 19, 2024
1 parent 9b7a2e6 commit 00b619b
Showing 56 changed files with 1,007 additions and 343 deletions.
126 changes: 0 additions & 126 deletions docs/Core API/Data Types.md

This file was deleted.

File renamed without changes.
65 changes: 65 additions & 0 deletions docs/Core API/Data Types/Document.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# Document

Documents are the atomic data used in HybridAGI's Document Memory. They represent textual data and its chunks, allowing the system to implement vector-only [Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) systems.

`Document`: Represents unstructured textual data to be processed or saved into memory

`DocumentList`: A list of documents to be processed or saved into memory

## Definition

```python

class Document(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the document", default_factory=uuid4)
    text: str = Field(description="The actual text content of the document")
    parent_id: Optional[Union[UUID, str]] = Field(description="Identifier for the parent document", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the document", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the document", default={})

    def to_dict(self):
        if self.metadata:
            return {"text": self.text, "metadata": self.metadata}
        else:
            return {"text": self.text}

class DocumentList(BaseModel, dspy.Prediction):
    docs: Optional[List[Document]] = Field(description="List of documents", default=[])

    def __init__(self, **kwargs):
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_dict(self):
        return {"documents": [d.to_dict() for d in self.docs]}

```

## Usage

```python

input_data = [
    {
        "title": "The Catcher in the Rye",
        "content": "The Catcher in the Rye is a novel by J. D. Salinger, partially published in serial form in 1945–1946 and as a novel in 1951. It is widely considered one of the greatest American novels of the 20th century. The novel's protagonist, Holden Caulfield, has become an icon for teenage rebellion and angst. The novel also deals with complex issues of innocence, identity, belonging, loss, and connection."
    },
    {
        "title": "To Kill a Mockingbird",
        "content": "To Kill a Mockingbird is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature. The plot and characters are loosely based on the author's observations of her family and neighbors, as well as on an event that occurred near her hometown in 1936, when she was 10 years old. The novel is renowned for its sensitivity and depth in addressing racial injustice, class, gender roles, and destruction of innocence."
    }
]

document_list = DocumentList()

# Wrap each record in a Document, keeping the title as metadata
for data in input_data:
    document_list.docs.append(
        Document(
            text=data["content"],
            metadata={"title": data["title"]},
        )
    )
```
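
The `parent_id` field suggests how a chunk can point back to the document it was cut from. The following is only a sketch under stated assumptions: it uses a plain dataclass as a stand-in for the Pydantic `Document` (so it runs without `pydantic` or `dspy` installed), and `split_into_chunks` is a hypothetical helper, not part of HybridAGI.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import uuid4


@dataclass
class Document:
    """Stand-in mirroring the fields of the `Document` definition above."""
    text: str
    id: str = field(default_factory=lambda: str(uuid4()))
    parent_id: Optional[str] = None
    vector: Optional[List[float]] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Same rule as the definition above: metadata is omitted when empty.
        if self.metadata:
            return {"text": self.text, "metadata": self.metadata}
        return {"text": self.text}


def split_into_chunks(doc: Document, chunk_size: int) -> List[Document]:
    """Hypothetical helper: fixed-size character chunks linked to their parent."""
    return [
        Document(
            text=doc.text[i:i + chunk_size],
            parent_id=doc.id,
            metadata=dict(doc.metadata),  # copy so chunks do not share one dict
        )
        for i in range(0, len(doc.text), chunk_size)
    ]


parent = Document(text="abcdefghij", metadata={"title": "Example"})
chunks = split_into_chunks(parent, chunk_size=4)
print([c.text for c in chunks])
print(chunks[0].to_dict())
```

Character-based splitting is used here only to keep the example short; a real pipeline would more likely split on sentences or tokens.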
125 changes: 125 additions & 0 deletions docs/Core API/Data Types/Fact.md
@@ -0,0 +1,125 @@
# Fact

Facts are the atomic data of a [Knowledge Graph](https://en.wikipedia.org/wiki/Knowledge_graph). They represent the relation between two entities (a subject and an object). They are the basis of knowledge-based systems and make it possible to represent precise, formal knowledge. With them you can implement [Knowledge Graph based Retrieval Augmented Generation]().

`Entity`: Represents an entity such as a person, object, place or document to be processed or saved into memory

`Fact`: Represents a first-order predicate to be processed or saved into the `FactMemory`

`EntityList`: A list of entities to be processed or saved into memory

`FactList`: A list of facts to be processed or saved into memory

## Definition

```python

class Entity(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the entity", default_factory=uuid4)
    label: str = Field(description="Label or category of the entity")
    name: str = Field(description="Name or title of the entity")
    description: Optional[str] = Field(description="Description of the entity", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the entity", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the entity", default={})

    def to_dict(self):
        if self.metadata:
            if self.description is not None:
                return {"name": self.name, "label": self.label, "description": self.description, "metadata": self.metadata}
            else:
                return {"name": self.name, "label": self.label, "metadata": self.metadata}
        else:
            if self.description is not None:
                return {"name": self.name, "label": self.label, "description": self.description}
            else:
                return {"name": self.name, "label": self.label}

class EntityList(BaseModel, dspy.Prediction):
    entities: List[Entity] = Field(description="List of entities", default=[])

    def __init__(self, **kwargs):
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_dict(self):
        return {"entities": [e.to_dict() for e in self.entities]}

class Relationship(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the relation", default_factory=uuid4)
    name: str = Field(description="Relationship name")
    vector: Optional[List[float]] = Field(description="Vector representation of the relationship", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the relationship", default={})

    def to_dict(self):
        if self.metadata:
            return {"name": self.name, "metadata": self.metadata}
        else:
            return {"name": self.name}

class Fact(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the fact", default_factory=uuid4)
    subj: Entity = Field(description="Entity that is the subject of the fact", default=None)
    rel: Relationship = Field(description="Relation between the subject and object entities", default=None)
    obj: Entity = Field(description="Entity that is the object of the fact", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the fact", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the fact", default={})

    def to_cypher(self) -> str:
        if self.subj.description is not None:
            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\", description:\""+self.subj.description+"\"})"
        else:
            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\"})"
        if self.obj.description is not None:
            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\", description:\""+self.obj.description+"\"})"
        else:
            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\"})"
        return subj+"-[:"+self.rel.name+"]->"+obj

    def from_cypher(self, cypher_fact: str, metadata: Dict[str, Any] = {}) -> "Fact":
        match = re.match(CYPHER_FACT_REGEX, cypher_fact)
        if match:
            self.subj = Entity(label=match.group(1), name=match.group(2))
            self.rel = Relationship(name=match.group(3))
            self.obj = Entity(label=match.group(4), name=match.group(5))
            self.metadata = metadata
            return self
        else:
            raise ValueError("Invalid Cypher fact provided")

    def to_dict(self):
        if self.metadata:
            return {"fact": self.to_cypher(), "metadata": self.metadata}
        else:
            return {"fact": self.to_cypher()}

class FactList(BaseModel, dspy.Prediction):
    facts: List[Fact] = Field(description="List of facts", default=[])

    def __init__(self, **kwargs):
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_cypher(self) -> str:
        return ",\n".join([f.to_cypher() for f in self.facts])

    def from_cypher(self, cypher_facts: str, metadata: Dict[str, Any] = {}):
        triplets = re.findall(CYPHER_FACT_REGEX, cypher_facts)
        for triplet in triplets:
            subject_label, subject_name, predicate, object_label, object_name = triplet
            self.facts.append(Fact(
                subj=Entity(name=subject_name, label=subject_label),
                rel=Relationship(name=predicate),
                obj=Entity(name=object_name, label=object_label),
                metadata=metadata,
            ))
        return self

    def to_dict(self):
        return {"facts": [f.to_dict() for f in self.facts]}

```

## Usage

```
```
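
As an illustration of the Cypher round trip defined above, here is a minimal, self-contained sketch. It rests on several assumptions: plain dataclasses stand in for the Pydantic models, the relationship is reduced to its name, and `FACT_PATTERN` is a hypothetical regular expression, since the actual `CYPHER_FACT_REGEX` is not shown in this document. Like `from_cypher`, it only recovers labels and names, not descriptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern (the real CYPHER_FACT_REGEX is not shown here).
# Matches: (:Label {name:"..."})-[:REL]->(:Label {name:"..."})
FACT_PATTERN = re.compile(
    r'\(:(\w+)\s*\{name:"([^"]*)"\}\)\s*-\[:(\w+)\]->\s*\(:(\w+)\s*\{name:"([^"]*)"\}\)'
)


@dataclass
class Entity:
    label: str
    name: str
    description: Optional[str] = None


def node(e: Entity) -> str:
    """Render an entity as a Cypher node pattern, as `Fact.to_cypher` does."""
    if e.description is not None:
        return f'(:{e.label} {{name:"{e.name}", description:"{e.description}"}})'
    return f'(:{e.label} {{name:"{e.name}"}})'


@dataclass
class Fact:
    subj: Entity
    rel: str  # stand-in for Relationship.name
    obj: Entity

    def to_cypher(self) -> str:
        return f"{node(self.subj)}-[:{self.rel}]->{node(self.obj)}"


def parse_facts(cypher: str) -> List[Fact]:
    """Extract every (subject, relation, object) triplet found in the text."""
    return [
        Fact(Entity(s_label, s_name), rel, Entity(o_label, o_name))
        for s_label, s_name, rel, o_label, o_name in FACT_PATTERN.findall(cypher)
    ]


facts = [
    Fact(Entity("Person", "Harper Lee"), "WROTE", Entity("Book", "To Kill a Mockingbird")),
    Fact(Entity("Person", "J. D. Salinger"), "WROTE", Entity("Book", "The Catcher in the Rye")),
]
cypher = ",\n".join(f.to_cypher() for f in facts)  # same separator as FactList.to_cypher
print(cypher)
print(parse_facts(cypher) == facts)
```

Names containing double quotes would break both the rendering and the pattern; escaping is left out to keep the sketch short.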
@@ -1,27 +1,23 @@
# Graph Program

The Graph Programs are a special data type representing a workflow of actions and decisions with calls to other programs. They are used by our own custom Agent, the `GraphProgramInterpreter`. To help you build them, we provide two ways of doing it: using Python or Cypher.

-The two ways are equivalent and allows you to choose the one you prefer.
+The two ways are equivalent, so you can choose the one you prefer. However, we recommend the Pythonic way to avoid syntax errors; programs can then optionally be saved in Cypher format for later use.

-### Python Usage:
+### Python Usage

```python
import hybridagi.core.graph_program as gp

main = gp.GraphProgram(
-    id = "main",
-    desc = "The main program",
+    name = "main",
+    description = "The main program",
)

main.add("answer", gp.Action(
-    tool = "Speak"
-    purpose = ""
-    prompt = \
-"""
-Please answer to the following question:
-{{objective}}
-"""
-    inputs=["objective"],
-    ouput="answer",
+    tool = "Speak",
+    purpose = "Answer the Objective's question",
+    prompt = "Please answer to the Objective's question",
))

main.connect("start", "answer")
File renamed without changes.
28 changes: 28 additions & 0 deletions docs/Core API/Data Types/Session.md
@@ -0,0 +1,28 @@
# Session

`UserProfile`: Represents the user profile, used both to personalize the interaction and to simulate the user.

```python

class UserProfile(BaseModel):
    id: str = Field(description="Unique identifier for the user", default_factory=uuid4)
    name: str = Field(description="The user name", default="Unknow")
    profile: str = Field(description="The user profile", default="An average User")

class RoleType(str, Enum):
    AI = "AI"
    User = "User"

class Message(BaseModel):
    role: RoleType
    message: str

class ChatHistory(BaseModel):
    msgs: List[Message] = Field(description="List of messages", default=[])

class InteractionSession(BaseModel):
    id: str = Field(description="Unique identifier for the interaction session", default_factory=uuid4)
    user_profile: UserProfile = Field(description="The user profile")
    chat_history: ChatHistory = Field(description="The chat history")

```
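
To show how these types might fit together, the sketch below builds a short chat history and renders it as text. It is an assumption-laden illustration: dataclasses stand in for the Pydantic models so the block runs on its own, and `render` is a hypothetical helper that does not exist in HybridAGI.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class RoleType(str, Enum):
    AI = "AI"
    User = "User"


@dataclass
class Message:
    role: RoleType
    message: str


@dataclass
class ChatHistory:
    # default_factory gives every history its own list
    msgs: List[Message] = field(default_factory=list)


def render(history: ChatHistory) -> str:
    """Hypothetical helper: one 'Role: text' line per message."""
    return "\n".join(f"{m.role.value}: {m.message}" for m in history.msgs)


history = ChatHistory()
history.msgs.append(Message(role=RoleType.User, message="What is a knowledge graph?"))
history.msgs.append(Message(role=RoleType.AI, message="A graph of entities linked by relations."))
print(render(history))
```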