Use compute to improve compute.

INT21 builds self-improving AI systems for the software beneath modern AI. Our first product generates and optimizes low-level GPU software, then proves its work with tests and benchmarks.

Explore PTX Kernel Factory


June 16, 2026 / Company launch

Introducing INT21 and PTX Kernel Factory

We are building self-improving AI systems for the software beneath modern AI. The first four implementations produced by PTX Kernel Factory are open source, and the product is entering beta.

Read the launch story

See launch results

01 / First proof

Generated by PTX Kernel Factory. Measured against the best available baseline.

PTX Kernel Factory produced four GPU kernel artifacts across Hopper and Blackwell. We tested them on matching hardware and inputs against established expert implementations, with correctness verified before timing.

Factory artifacts: Kimi Delta Attention
Expert baseline: FlashKDA CUTLASS
GH200 / Hopper: 1.59x (peak measured performance)
B200 / Blackwell: 1.52x (optimized integration)

Factory artifacts: RMSNorm
Expert baseline: QuACK CuTe DSL
GH200 / Hopper: 8.17% faster on geometric mean
B200 / Blackwell: 126 / 126, faster in every comparable case

Operator-level benchmark results, not full-model speedup claims.
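For readers unfamiliar with the two summary statistics used above, here is a minimal sketch of how a geometric-mean speedup and a per-case win count can be computed from paired timings. It is an illustration only, not INT21's benchmark harness: the function name `summarize_speedups`, the convention that speedup is baseline time divided by candidate time, and the timings themselves are all assumptions made for this example. It also assumes every case has already passed its correctness check before being timed.

```python
import math


def summarize_speedups(baseline_ms, candidate_ms):
    """Return (geometric-mean speedup, wins, total) over paired timings.

    Speedup per case is baseline time / candidate time, so values
    above 1.0 mean the candidate was faster (an assumed convention).
    """
    if len(baseline_ms) != len(candidate_ms) or not baseline_ms:
        raise ValueError("need equal-length, non-empty timing lists")
    ratios = [b / c for b, c in zip(baseline_ms, candidate_ms)]
    # Geometric mean: exponentiate the average of the log ratios.
    geomean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))
    wins = sum(1 for r in ratios if r > 1.0)
    return geomean, wins, len(ratios)


# Illustrative timings only; these are not measured results.
baseline = [2.0, 4.0, 8.0]
candidate = [1.0, 4.0, 4.0]
geomean, wins, total = summarize_speedups(baseline, candidate)
print(f"geomean speedup {geomean:.3f}x, faster in {wins} / {total} cases")
```

The geometric mean is the usual choice for averaging ratios because a 2x gain and a 2x loss cancel out, which an arithmetic mean would not reflect.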

02 / First product

PTX Kernel Factory

Every generation starts ahead of the last.

A fully autonomous agent swarm searches for GPU kernel implementations, proves them on real hardware, and carries every benchmark, failure, and successful optimization into the next generation. More runs create more evidence, better search strategies, and stronger kernels over time.

Autonomous optimization loop: evidence retained across generations

01 / Fully autonomous swarm
Specialized agents plan, implement, review, and optimize each kernel end to end.
02 / Grounded in real hardware
Every candidate is compiled, verified, and benchmarked on the target GPU.
03 / Improvement compounds
Reusable evidence improves both the search process and the kernels it produces.
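The idea that each generation starts ahead of the last can be pictured with a toy search loop. This is a sketch under stated assumptions, not PTX Kernel Factory's actual algorithm: `measure` is a hypothetical stand-in for compiling, verifying, and benchmarking a candidate, the single `tile` parameter and its cost function are invented for illustration, and the `evidence` dictionary stands in for whatever the product retains between generations.

```python
import random


def measure(tile):
    """Hypothetical stand-in for compile + verify + benchmark.

    Returns (is_correct, latency_ms) from a toy model, not real hardware.
    """
    is_correct = tile % 8 == 0              # toy rule: only multiples of 8 pass
    latency_ms = abs(tile - 96) / 16 + 1.0  # toy cost, lowest at tile == 96
    return is_correct, latency_ms


def run_generation(evidence, rng, n_candidates=8):
    """Explore around the best verified candidate and record every result."""
    passing = [(lat, t) for t, (ok, lat) in evidence.items() if ok]
    start = min(passing)[1] if passing else 32  # assumed default starting point
    if start not in evidence:
        evidence[start] = measure(start)
    for _ in range(n_candidates):
        tile = max(8, start + rng.choice([-16, -8, -4, 4, 8, 16]))
        if tile in evidence:
            continue  # already measured in this or an earlier generation
        evidence[tile] = measure(tile)  # failures are retained too
    passing = [(lat, t) for t, (ok, lat) in evidence.items() if ok]
    return min(passing)  # (latency_ms, tile) of the best verified candidate


rng = random.Random(0)
evidence = {}  # carried across generations
best_per_generation = []
for generation in range(5):
    best_per_generation.append(run_generation(evidence, rng))
print(best_per_generation)
```

Because the evidence only grows and candidates that fail verification are never selected, the best verified latency in this sketch cannot get worse from one generation to the next.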

Explore the product

Request access

03 / Platform

Built on SwarmOS.

Thousands of agents. One evolving system.

SwarmOS is a cloud-native platform for running specialized agents at elastic scale toward the same measurable goal. Agents explore in parallel, coordinate through shared evidence, and continuously converge on stronger solutions.

Cloud-native swarm control plane: live system model

01 / Elastic scale: Thousands of agents
Cloud-native scheduling expands the swarm around available compute.
02 / Shared direction: One measurable goal
Every agent works against the same constraints and acceptance criteria.
03 / Generational memory: Experience carries forward
Results, failures, and strategies become the next generation's starting point.

04 / Company

Founded by experts across the full AI stack.

INT21 brings together deep experience in agent systems, machine learning models, GPU software, distributed infrastructure, and cloud computing.

01 / Agents
02 / Models
03 / GPU
04 / Infrastructure
05 / Cloud

AI-native operating model

Engineering capacity scales with compute, not headcount.

Our experts set direction, constraints, and acceptance criteria. Autonomous agent swarms execute, evaluate, and retain the work, so adding compute expands how much engineering INT21 can perform.

Founders / Cross-stack operators

Research, systems, and infrastructure experience carried into one company.

Bing Xu

Founder & CEO

Agents
Models
GPU

Bing co-authored the original Generative Adversarial Nets paper, created XGBoost's Python package, and co-created MXNet and AITemplate. Before founding INT21, he was a Distinguished Engineer at NVIDIA following its acquisition of HippoML, the GPU inference company he founded.

Qingye Jiang

Founding Partner

Infrastructure
Cloud

Qingye has spent more than a decade building and tuning high-performance computing and distributed systems at AWS. His work spans workload analysis, performance engineering, cloud infrastructure, and real-time systems.

PTX Kernel Factory / Beta

Bring us a hard GPU workload.

Start with an operation that is too slow, a new architecture without a mature kernel, or an important workload that has not justified weeks of specialist time.

Request beta access
