Computing resource scheduling. #36

Open
streetycat opened this issue Aug 23, 2023 · 1 comment

Comments

@streetycat
Contributor

streetycat commented Aug 23, 2023

I have compiled a rough definition and design of the computing resource module, and I hope you can join the discussion to reach a consensus on this module.

Compute Node

A compute node is a computing resource node in the system. It should have the following functions:

  1. Install and start several services that support computing

  2. Accept calculation tasks submitted by users and execute them

  3. Schedule these tasks (various tasks may be executed in parallel or queued)

  4. Support some preset standard task types, and let developers define custom ones

  5. Some computing resources are public, and some may require authorization
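
The list above could be sketched roughly as follows. This is only an illustration in Python, not OpenDAN's actual API: the class name `ComputeNode`, the `max_parallel` limit, the `public` and `allowed_users` fields, and every method name are assumptions made for the example. A thread pool stands in for the node's scheduler, so tasks beyond `max_parallel` wait in the pool's queue.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional, Set

# Hypothetical handler signature: takes the task params, returns a result.
Handler = Callable[[dict], Any]


class ComputeNode:
    """Illustrative compute node; all names here are assumptions."""

    def __init__(self, node_id: str, max_parallel: int = 1,
                 public: bool = True,
                 allowed_users: Optional[Set[str]] = None) -> None:
        self.node_id = node_id
        self.public = public
        self.allowed_users = allowed_users or set()
        self.services: Dict[str, Handler] = {}  # task type -> handler
        # Up to max_parallel tasks run at once; the rest are queued (function 3).
        self._pool = ThreadPoolExecutor(max_workers=max_parallel)

    def install_service(self, task_type: str, handler: Handler) -> None:
        # Functions 1 and 4: preset and custom task types register the same way.
        self.services[task_type] = handler

    def is_authorized(self, user: str) -> bool:
        # Function 5: public nodes accept anyone, others check an allow list.
        return self.public or user in self.allowed_users

    def submit(self, user: str, task_type: str, params: dict) -> Future:
        # Function 2: accept a task and hand it to the scheduler.
        if not self.is_authorized(user):
            raise PermissionError(f"{user} may not use node {self.node_id}")
        if task_type not in self.services:
            raise KeyError(f"no service installed for task type {task_type!r}")
        return self._pool.submit(self.services[task_type], params)
```

A caller would install a handler with `install_service`, call `submit`, and then wait on the returned `Future` with `.result()`.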

Compute Task Manager

The singleton component responsible for managing computing resources in the system should have the following functions:

  1. Accept registration of 'Compute Node'

  2. Accept calculation tasks submitted by users and select appropriate nodes to execute them

  3. Maintain load balancing among various computing nodes
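
A minimal sketch of these three functions, under stated assumptions: the class and method names are hypothetical, `node_entry` is treated as an opaque address string, and "least in-flight tasks" is just one possible load-balancing policy that the proposal does not prescribe.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class ComputeTaskManager:
    """Illustrative manager; names and the least-loaded policy are assumptions."""

    def __init__(self) -> None:
        self.nodes: Dict[str, str] = {}                           # node_id -> node_entry
        self.services: Dict[str, List[str]] = defaultdict(list)   # task type -> node_ids
        self.in_flight: Dict[str, int] = defaultdict(int)         # node_id -> running tasks

    def register_node(self, node_id: str, node_entry: str,
                      service_types: Iterable[str]) -> None:
        # Function 1: remember the node and index it by the task types it serves.
        self.nodes[node_id] = node_entry
        for task_type in service_types:
            if node_id not in self.services[task_type]:
                self.services[task_type].append(node_id)

    def select_node(self, task_type: str) -> str:
        # Functions 2 and 3: pick the candidate with the fewest running tasks.
        candidates = self.services.get(task_type, [])
        if not candidates:
            raise LookupError(f"no node offers task type {task_type!r}")
        return min(candidates, key=lambda node_id: self.in_flight[node_id])

    def task_started(self, node_id: str) -> None:
        self.in_flight[node_id] += 1

    def task_finished(self, node_id: str) -> None:
        self.in_flight[node_id] = max(0, self.in_flight[node_id] - 1)
```

Since the component is meant to be a singleton, one module-level instance shared by all callers would be enough in Python.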

Flowchart

  1. Start up
graph TB
    subgraph ComputeNode["ComputeNode(node_id, node_entry)"]
        InstallService["InstallService(type, service_entry)"]-->StartService["StartService(type, service_entry)"]-->ServiceList["Services{type, service_entry}"]
    end

    ServiceList.->RegisterNode

    subgraph ComputeTaskManager
        RegisterNode["RegisterNode(node_id, node_entry)"]-->Nodes["Nodes{node_id, node_entry}, Services{type, node_id[]}"]
    end
  2. Execute task
graph TB
    subgraph ComputeTaskManager
        RunTask["Run(type, params, [node])"]-->SpecifyNode{"if (node)"}
        PostTask["PostTask(type, params, node)"]
        SpecifyNode--yes-->PostTask
        SpecifyNode--No-->FilterNode["nodes=Services(type)"]-->NextNode["node = nodes.next()"]
        WaitResult["result=WaitResult()"]
    end

    NextNode.->IsBusy-.yes.->NextNode
    IsBusy-.no.->PostTask
    PostTask.->ExecuteTask

    subgraph "ComputeNode(Any)"
        IsBusy{"is busy"}
    end

    subgraph "ComputeNode(Selected)"
        ExecuteTask["result=Execute(type, params)"]-->PostResult["PostResult(result)"]
    end

    PostResult.->WaitResult
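
The "Execute task" flow above could be read roughly as the following Python sketch. `Node`, `run`, and `max_rounds` are hypothetical names. The flowchart does not say what happens when every node stays busy, so the bounded retry is an assumption. The remote PostTask / PostResult round trip is collapsed into a direct method call.

```python
import itertools
from typing import Any, Callable, Dict, List, Optional


class Node:
    """In-process stand-in for a remote compute node (hypothetical)."""

    def __init__(self, node_id: str,
                 handlers: Dict[str, Callable[[dict], Any]],
                 busy: bool = False) -> None:
        self.node_id = node_id
        self.handlers = handlers
        self.busy = busy

    def is_busy(self) -> bool:
        return self.busy

    def execute(self, task_type: str, params: dict) -> Any:
        return self.handlers[task_type](params)


def run(nodes_by_type: Dict[str, List[Node]], task_type: str, params: dict,
        node: Optional[Node] = None, max_rounds: int = 3) -> Any:
    """Run(type, params, [node]) from the flowchart."""
    if node is None:
        # FilterNode: nodes = Services(type)
        candidates = nodes_by_type.get(task_type, [])
        if not candidates:
            raise LookupError(f"no node offers task type {task_type!r}")
        # NextNode / IsBusy loop, bounded so it cannot spin forever (assumption).
        attempts = max_rounds * len(candidates)
        for candidate in itertools.islice(itertools.cycle(candidates), attempts):
            if not candidate.is_busy():
                node = candidate
                break
        else:
            raise RuntimeError("all candidate nodes are busy")
    # PostTask -> Execute -> PostResult -> WaitResult
    return node.execute(task_type, params)
```

Passing an explicit `node` skips the busy check, which mirrors the "yes" branch of `if (node)` in the graph.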

I think we can first design a universal task scheduling framework, and then support various execution environments (e.g. Docker) and preset different task types within this framework.
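
One way to picture such a framework is a common interface that each execution environment implements. The sketch below is an assumption, not part of the proposal: `ExecutionEnvironment`, `InProcessEnvironment`, `DockerEnvironment`, the image mapping, and the convention of passing params as `--key=value` arguments are all hypothetical. Only the `docker run --rm IMAGE ARGS...` command shape comes from the Docker CLI.

```python
import subprocess
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class ExecutionEnvironment(ABC):
    """Hypothetical interface the scheduler would call, whatever the backend."""

    @abstractmethod
    def execute(self, task_type: str, params: dict) -> Any:
        ...


class InProcessEnvironment(ExecutionEnvironment):
    """Runs preset or developer-registered handlers in the current process."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[dict], Any]] = {}

    def register(self, task_type: str, handler: Callable[[dict], Any]) -> None:
        self.handlers[task_type] = handler

    def execute(self, task_type: str, params: dict) -> Any:
        return self.handlers[task_type](params)


class DockerEnvironment(ExecutionEnvironment):
    """Maps each task type to a container image (the mapping is an assumption)."""

    def __init__(self, images: Dict[str, str]) -> None:
        self.images = images  # task type -> image name

    def build_command(self, task_type: str, params: dict) -> List[str]:
        # Assumed convention: params become --key=value arguments to the image.
        args = [f"--{key}={value}" for key, value in sorted(params.items())]
        return ["docker", "run", "--rm", self.images[task_type], *args]

    def execute(self, task_type: str, params: dict) -> str:
        # Requires a local Docker installation; returns the container's stdout.
        done = subprocess.run(self.build_command(task_type, params),
                              capture_output=True, text=True, check=True)
        return done.stdout
```

With this shape, the scheduler only depends on `execute(task_type, params)`, so new environments can be added without touching the scheduling logic.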

@waterflier
Collaborator

waterflier commented Aug 23, 2023

I am delighted to read your design and suggestions. They are insightful and show some understanding of the system. I am also very excited about the potential that your participation could bring to OpenDAN.

You can read https://github.com/fiatrete/OpenDAN-Personal-AI-OS/blob/MVP/doc/mvp/compute_task.drawio for more details on the compute kernel. I am writing an article about the workflow now. The purpose of designing the compute_kernel subsystem is to enable our users to use their computational resources more efficiently. These computational resources can come from devices they own (such as their workstations and gaming laptops), as well as from cloud computing and decentralized computing networks.
