
feat: [SKU modularization] remove sku_config from v1alpha1 and implement skuHandler interface #601

Closed
wants to merge 7 commits
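The PR title says `sku_config` is removed from v1alpha1 in favor of a `skuHandler` interface, but the commits shown here do not include the interface itself. A minimal Go sketch of what such a cloud-specific handler might look like; the method set, type names, and SKU data below are illustrative assumptions, not Kaito's actual API:

```go
package main

import "fmt"

// Hypothetical sketch of a skuHandler-style interface: each cloud provider
// would supply its own implementation instead of a shared sku_config.
type skuHandler interface {
	GetSupportedSKUs() []string
	GetGPUCount(sku string) int
}

// azureSKUHandler is a toy implementation backed by a static table of
// GPU counts per SKU (counts here are assumptions, for illustration only).
type azureSKUHandler struct {
	gpuCounts map[string]int
}

func newAzureSKUHandler() azureSKUHandler {
	return azureSKUHandler{gpuCounts: map[string]int{
		"Standard_NC24ads_A100_v4": 1,
		"Standard_NC96ads_A100_v4": 4,
	}}
}

func (h azureSKUHandler) GetSupportedSKUs() []string {
	skus := make([]string, 0, len(h.gpuCounts))
	for sku := range h.gpuCounts {
		skus = append(skus, sku)
	}
	return skus
}

func (h azureSKUHandler) GetGPUCount(sku string) int {
	return h.gpuCounts[sku]
}

func main() {
	// Callers program against the interface, not a concrete SKU table.
	var handler skuHandler = newAzureSKUHandler()
	fmt.Println(handler.GetGPUCount("Standard_NC24ads_A100_v4"))
}
```

The design benefit of an interface over a shared config file is that per-cloud SKU knowledge lives behind a small, testable contract rather than in the v1alpha1 API types.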

Commits on Sep 19, 2024

  1. delete sku_config

    smritidahal653 committed Sep 19, 2024 (1a0269e)
  2. 39cd273
  3. feat: Add RAGEngine CRD (#597)

    This PR adds the initial draft for the RAGEngine CRD in Kaito.
    
    A RAGEngine CRD defines all resources needed to run RAG on top of an
    LLM inference service. Upon creating a RAGEngine CR, a new controller
    creates a deployment that runs a RAG engine instance. The instance
    provides HTTP endpoints for both `index` and `query` services. The
    instance can optionally use a public model embedding service, or run a
    local embedding model on a GPU, to convert the input index data into
    vectors. The instance can also connect to a vector DB instance to
    persist the vectors; by default it uses an in-memory vector DB. The
    instance uses the `llamaIndex` library to orchestrate the workflow. When
    the RAGEngine instance is up and running, users should send questions to
    the `query` endpoint of the RAG instance instead of the normal `chat`
    endpoint of the inference service.
    
    The RAGEngine is intended to be "standalone". It can use any public
    inference service or an inference service hosted by a Kaito workspace.
    
    The RAG engine instance is designed to help retrieve prompts from
    unstructured data (arbitrary index data provided by the users).
    Retrieving from structured data or search engines is out of scope for
    now.

    Fei-Guo authored and smritidahal653 committed Sep 19, 2024 (f960215)
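    The commit message above can be pictured as a CR. A minimal sketch of what a RAGEngine custom resource might look like, based only on that description; the apiVersion, field names, and values are illustrative assumptions, not the finalized schema from #597:

    ```yaml
    apiVersion: kaito.sh/v1alpha1      # assumed group/version
    kind: RAGEngine
    metadata:
      name: ragengine-example
    spec:
      compute:
        instanceType: Standard_NC24ads_A100_v4   # GPU node for a local embedding model
      embedding:
        local:
          model: BAAI/bge-small-en-v1.5          # illustrative embedding model name
      inferenceService:
        url: http://workspace-example/chat       # any public or Kaito-hosted inference service
      # vectorDB omitted: the instance falls back to its in-memory vector store
    ```

    Once the controller brings the instance up, clients would index documents via the instance's `index` endpoint and then send questions to its `query` endpoint, rather than calling the inference service's `chat` endpoint directly.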
  4. delete sku_config

    smritidahal653 committed Sep 19, 2024 (2629234)
  5. feat: Add RAGEngine CRD (#597)

    Fei-Guo authored and smritidahal653 committed Sep 19, 2024 (4366c36)
  6. delete sku_config

    smritidahal653 committed Sep 19, 2024 (b75126a)
  7. feat: Add RAGEngine CRD (#597)

    Fei-Guo authored and smritidahal653 committed Sep 19, 2024 (b420f09)