Research Article
Taalas: Model-Specialized Hardware - Turning AI Models into Silicon

Background
Taalas is an AI chip startup founded approximately 2.5 years ago (around 2023), with a team including former AMD and Tenstorrent employees. The company has raised $169 million in funding with the goal of challenging NVIDIA’s dominance in the AI chip market.
CEO: Ljubisa Bajic (previously founded Tenstorrent)
Core Technology: “Model as Computer”
Traditional AI hardware runs models on general-purpose GPUs, requiring significant data movement between memory and compute units. Taalas takes a radical approach:
Traditional: General Hardware → Software Model → Data Movement
Taalas: Model Weights Hardwired → Compute-in-Memory → No Software Scheduling
Eliminating the “Memory Wall”
- Traditional problem: roughly 90% of energy consumption goes to moving data between memory and compute units
- Taalas solution: Embedding model parameters directly in hardware removes the dependency on external HBM
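
To make the energy argument concrete, the following minimal sketch computes the upper bound on energy-efficiency gain when the data-movement share of energy shrinks. It takes the 90% figure above at face value; the function name and the `remaining_movement` parameter are illustrative assumptions, not part of any Taalas tooling.

```python
def energy_gain_bound(data_movement_fraction: float,
                      remaining_movement: float = 0.0) -> float:
    """Upper bound on energy-efficiency gain when data movement shrinks.

    data_movement_fraction: share of baseline energy spent moving data (0..1).
    remaining_movement: fraction of that data-movement energy still spent
        after the redesign (0.0 means it is eliminated entirely).
    """
    if not 0.0 <= data_movement_fraction < 1.0:
        raise ValueError("data_movement_fraction must be in [0, 1)")
    compute_share = 1.0 - data_movement_fraction
    new_total = compute_share + data_movement_fraction * remaining_movement
    return 1.0 / new_total


# Assumption: 90% of baseline energy is data movement, as stated above.
print(round(energy_gain_bound(0.90), 2))        # all data movement removed
print(round(energy_gain_bound(0.90, 0.10), 2))  # 10% of it remains
```

Under this simple model, removing data movement entirely yields at most a 10x energy gain, so any larger improvement would have to come from additional sources such as lower-precision arithmetic or the absence of software scheduling.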
HC1 (Hardcore 1): First Generation Chip
Architecture
| Component | Technology |
|---|---|
| Fixed-Weight Layer (Mask ROM) | Mask ROM process encodes model weights in chip metal layers, co-located with compute units on the same silicon die |
| SRAM Dynamic Layer | Stores dynamic data (e.g., KV cache for Transformer models) |
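
The split between fixed weights and dynamic state can be sized with simple arithmetic. The sketch below estimates the weight footprint and the KV cache cost per token for Llama 3.1 8B. The layer count, KV head count, and head dimension come from the public model configuration. The rounded parameter count, the 16-bit KV precision, and the 256 MiB SRAM budget are assumptions for illustration only and are not HC1 specifications.

```python
# Llama 3.1 8B shape values from the public model configuration.
N_LAYERS = 32
N_KV_HEADS = 8        # grouped-query attention
HEAD_DIM = 128
N_PARAMS = 8.0e9      # rounded parameter count (assumption)


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Storage needed for the fixed weights at a given bit width."""
    return n_params * bits_per_weight / 8


def kv_bytes_per_token(bytes_per_value: int) -> int:
    """Keys and values (factor 2) for every layer and KV head."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value


def tokens_in_sram(sram_bytes: int, bytes_per_value: int) -> int:
    """How many tokens of KV cache fit in a given SRAM budget."""
    return sram_bytes // kv_bytes_per_token(bytes_per_value)


print(weight_bytes(N_PARAMS, 3) / 1e9)   # GB of weights at 3 bits per weight
print(kv_bytes_per_token(2) / 1024)      # KiB per token at 16-bit KV values

# Hypothetical SRAM budget, NOT an HC1 specification.
HYPOTHETICAL_SRAM = 256 * 1024 * 1024
print(tokens_in_sram(HYPOTHETICAL_SRAM, 2))
```

Because the KV cache grows linearly with context length while the weights never change, the dynamic layer rather than the weight layer is what bounds the usable context on a design of this kind.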
Performance Metrics
| Metric | Value |
|---|---|
| Inference Speed | 17,000 tokens/second (Llama 3.1 8B) |
| Performance/Watt | 1000x improvement |
| Performance/Dollar | 1000x improvement |
| Cost Reduction | 20x |
| Speed Improvement | 10x |
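
The throughput figure can be converted into latency terms with a short calculation. This sketch assumes the 17,000 tokens/second value is single-stream decode throughput, which the report does not specify; the function names are illustrative.

```python
TOKENS_PER_SECOND = 17_000  # reported figure for Llama 3.1 8B


def per_token_latency_us(tokens_per_second: float) -> float:
    """Average time per generated token, in microseconds."""
    return 1e6 / tokens_per_second


def generation_time_ms(n_tokens: int, tokens_per_second: float) -> float:
    """Time to generate n_tokens at a steady rate, in milliseconds."""
    return 1e3 * n_tokens / tokens_per_second


print(round(per_token_latency_us(TOKENS_PER_SECOND), 1))      # microseconds per token
print(round(generation_time_ms(1000, TOKENS_PER_SECOND), 1))  # ms for a 1000-token reply
```

At this rate a 1000-token response would take roughly 59 ms, ignoring prompt processing time.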
Hardware Specifications
- Process: TSMC 6nm
- Die Area: 815 mm²
- Data Type: 3-bit base quantization (mixed 3-bit/6-bit)
- Trade-off: Some accuracy loss due to quantization
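
The accuracy trade-off can be illustrated with a generic quantization round trip. The report does not describe the quantization scheme Taalas uses, so the sketch below swaps in plain symmetric uniform quantization as an assumption and compares the reconstruction error at 3 and 6 bits on synthetic Gaussian weights.

```python
import random


def quantize(weights, bits):
    """Symmetric uniform quantization to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1            # 3 for 3-bit, 31 for 6-bit
    scale = max(abs(w) for w in weights) / qmax
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def rms_error(weights, bits):
    """Root-mean-square error introduced by a quantize/dequantize round trip."""
    codes, scale = quantize(weights, bits)
    restored = dequantize(codes, scale)
    squared = sum((w - r) ** 2 for w, r in zip(weights, restored))
    return (squared / len(weights)) ** 0.5


# Synthetic weights (assumption): zero-mean Gaussian, not real model weights.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

err3 = rms_error(weights, 3)
err6 = rms_error(weights, 6)
print(err3 > err6)  # fewer bits give a coarser grid and a larger error
```

A mixed 3-bit/6-bit layout, as listed above, would presumably spend the wider format on the tensors most sensitive to this error, though the report does not say which ones.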
HC2: Next Generation
| Feature | Plan |
|---|---|
| Release | Expected Winter 2026 |
| Format | Standard 4-bit floating point |
| Target | Frontier-scale LLMs |
Key Innovations
1. Extreme Specialization
- Custom chip for each AI model
- Only 2 metal layers need modification (out of 100+ mask layers)
- Model-to-hardware in 2 months (vs. 6+ months traditionally)
2. Simplified Hardware Design
- No HBM required
- No 3D stacking
- No liquid cooling
- No high-speed I/O
Result: Single chip = complete AI system
3. Automated Design Flow
- Proprietary toolchain converts model computation graphs directly to chip physical layout
- Rapid adaptation to new models (e.g., Meta Llama series)
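
The toolchain itself is proprietary and its internals are not described in this report. As a loose illustration of only the first step such a flow needs, the sketch below turns a computation graph into a fixed operator order at design time, which is what makes runtime software scheduling unnecessary. The graph, operator names, and `fixed_schedule` helper are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical, heavily simplified computation graph for one transformer block.
# Each key is an operator; its value is the set of operators it depends on.
graph = {
    "rmsnorm_1": {"input"},
    "attention": {"rmsnorm_1"},
    "residual_1": {"input", "attention"},
    "rmsnorm_2": {"residual_1"},
    "mlp": {"rmsnorm_2"},
    "residual_2": {"residual_1", "mlp"},
}


def fixed_schedule(graph):
    """Return one dependency-respecting operator order, computed once."""
    return list(TopologicalSorter(graph).static_order())


order = fixed_schedule(graph)
stage_of = {op: index for index, op in enumerate(order)}

for op in order:
    print(stage_of[op], op)
```

A real flow would continue from this ordering to placement, routing, and weight encoding in the metal layers; none of those steps are modeled here.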
Market Position
- Target: AI inference market
- Scenarios: Edge computing, data center inference requiring low latency and high energy efficiency
- Strategy: Compete with general-purpose GPUs through specialization and cost advantages
Challenges
| Challenge | Description |
|---|---|
| Model Dependency | Chip tightly bound to specific model; model updates require hardware modification (though only 2 metal layers) |
| Accuracy Trade-off | Current 3-bit quantization may affect high-precision tasks |
| Ecosystem | Need ongoing adaptation with mainstream AI frameworks |
Conclusion
Taalas rethinks AI chip architecture by hardwiring models directly into silicon, aiming to carve out a new market position against NVIDIA’s dominance. The short-term goal is to validate technical feasibility with HC1; long-term expansion depends on HC2 and subsequent products broadening model support.
Report generated: 2026-03-11