How MicroDC Works

A simple asynchronous job queue for distributed AI inference

1. Submit Job: Send your LLM inference request via API or SDK. Specify the model and parameters.

2. Queue Processing: The job enters our distributed queue and waits for an available worker with the right GPU.

3. Worker Execution: A connected worker picks up the job, runs the inference, and submits the results back.

4. Get Results: Retrieve your completed results via webhook or polling. Simple and efficient.
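The four steps above follow a classic producer/queue/worker pattern. The sketch below simulates it in-process with a single worker thread; every name here (submit_job, the queue, the results dict) is illustrative, not the real MicroDC SDK:

```python
import queue
import threading
import uuid

jobs = queue.Queue()   # step 2: the job queue (in-memory here, distributed in reality)
results = {}           # step 4: completed results, keyed by job id

def submit_job(model: str, prompt: str) -> str:
    """Step 1: enqueue a job and return its id."""
    job_id = str(uuid.uuid4())
    jobs.put({"id": job_id, "model": model, "prompt": prompt})
    return job_id

def worker():
    """Step 3: a worker pulls jobs and 'runs inference' (faked here)."""
    while True:
        job = jobs.get()
        results[job["id"]] = f"[{job['model']}] output for: {job['prompt']}"
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

job_id = submit_job("llama-3-8b", "Hello, world")
jobs.join()  # for the demo, wait instead of polling
print(results[job_id])
```

In the real platform the queue is distributed and workers run on remote GPUs, but the submit/queue/execute/retrieve flow is the same.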

Your App (submit jobs via API) → MicroDC Platform (queue & orchestration) → GPU Workers (run inference) → Results delivered to you

Why Asynchronous Processing?

The key benefits of our distributed approach

Cost Effective

Save up to 90% compared to dedicated GPU instances. Pay only for actual compute time, not idle resources.

Massively Scalable

Process thousands of jobs in parallel across our distributed worker network. No infrastructure to manage.

Developer Friendly

Simple API, comprehensive SDKs, and detailed documentation. Start integrating in minutes.

Under the Hood

A deeper look at how MicroDC processes your jobs

Job Submission

When you submit a job through our API or SDK, it's validated and added to our distributed queue. Each job includes the model specification, input prompt, and any configuration parameters.
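A submitted job might look like the payload below, with a minimal validation pass. The field names and rules are assumptions for illustration, not MicroDC's actual schema:

```python
# Hypothetical job payload: model specification, input prompt, and
# configuration parameters, as described above.
REQUIRED_FIELDS = {"model", "prompt"}

def validate_job(payload: dict) -> list:
    """Return a list of validation errors (empty means the job is accepted)."""
    errors = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - payload.keys())]
    params = payload.get("params", {})
    if not isinstance(params, dict):
        errors.append("params must be an object")
    elif not 0.0 <= params.get("temperature", 1.0) <= 2.0:
        errors.append("temperature must be in [0, 2]")
    return errors

job = {
    "model": "llama-3-8b",
    "prompt": "Summarize this text...",
    "params": {"temperature": 0.7},
}
print(validate_job(job))  # an empty list: the job would be queued
```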

Intelligent Routing

Our scheduler matches jobs with the most suitable available workers based on model requirements, GPU capabilities, and current load. This ensures optimal resource utilization.
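The matching described above can be sketched as a filter-then-score step: filter out workers that can't meet hard requirements (GPU memory), then prefer workers that already have the model cached and carry the least load. The worker fields and scoring rule here are illustrative assumptions, not MicroDC's actual scheduler:

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    name: str
    gpu_mem_gb: int        # GPU capability
    models_cached: set     # models already loaded on this worker
    load: int              # jobs currently running

def pick_worker(workers, model, mem_needed_gb):
    # Hard requirement first: enough GPU memory for the model.
    eligible = [w for w in workers if w.gpu_mem_gb >= mem_needed_gb]
    if not eligible:
        return None
    # Then prefer a warm cache, breaking ties by lowest current load.
    return min(eligible, key=lambda w: (model not in w.models_cached, w.load))

workers = [
    Worker("a", gpu_mem_gb=24, models_cached={"llama-3-8b"}, load=3),
    Worker("b", gpu_mem_gb=24, models_cached=set(), load=0),
    Worker("c", gpu_mem_gb=8, models_cached={"llama-3-8b"}, load=0),
]
best = pick_worker(workers, "llama-3-8b", mem_needed_gb=16)
print(best.name)  # "a": enough memory and the model is already cached
```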

Worker Execution

Workers pull jobs from the queue, load the required model (if not already cached), execute the inference, and submit results back to the platform. All communication is encrypted and authenticated.
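The "load if not already cached" behavior is the interesting part of the worker loop; a minimal sketch, with load_model and run_inference as stand-ins (the real worker also handles the encrypted, authenticated transport):

```python
model_cache = {}  # model name -> loaded model, kept warm between jobs

def load_model(name):
    # Placeholder for downloading and loading real weights onto the GPU.
    return f"<model:{name}>"

def run_inference(model, prompt):
    # Placeholder for the actual forward pass.
    return f"{model} -> {prompt.upper()}"

def handle(job):
    name = job["model"]
    if name not in model_cache:        # load only on a cache miss
        model_cache[name] = load_model(name)
    return run_inference(model_cache[name], job["prompt"])

out_cold = handle({"model": "llama-3-8b", "prompt": "hi"})     # cold: loads the model
out_warm = handle({"model": "llama-3-8b", "prompt": "again"})  # warm: reuses the cache
```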

Result Delivery

Once a job completes, its results are stored; you're notified via webhook (if configured), or you can poll the job's status. Results remain available for retrieval for a configurable retention period.
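Result storage with a retention window can be sketched as below. The TTL value and function names are illustrative assumptions, and the webhook call is elided:

```python
import time

RETENTION_SECONDS = 0.05  # tiny, so the demo expires quickly; configurable in practice

store = {}  # job_id -> (result, stored_at)

def deliver(job_id, result):
    store[job_id] = (result, time.monotonic())
    # In production, a webhook POST would also fire here, if one is configured.

def poll(job_id):
    entry = store.get(job_id)
    if entry is None:
        return None                    # unknown job, or already expired
    result, stored_at = entry
    if time.monotonic() - stored_at > RETENTION_SECONDS:
        del store[job_id]              # past the retention period
        return None
    return result

deliver("job-1", "42 tokens of output")
fresh = poll("job-1")    # available right after delivery
time.sleep(0.1)
expired = poll("job-1")  # gone once retention elapses
```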

Ready to Get Started?

Create a free account and start processing AI jobs in minutes.