Taalas Chip Architecture

Background

Taalas is an AI chip startup founded around 2023 (roughly 2.5 years before this report), with a team that includes former AMD and Tenstorrent employees. The company has raised $169 million in funding with the goal of challenging NVIDIA’s dominance in the AI chip market.

CEO: Ljubisa Bajic (previously founded Tenstorrent)

Core Technology: “Model as Computer”

Traditional AI hardware runs models on general-purpose GPUs, requiring significant data movement between memory and compute units. Taalas takes a radical approach:

Traditional: General Hardware → Software Model → Data Movement
Taalas:     Model Weights Hardwired → Compute-in-Memory → No Software Scheduling

Eliminating the “Memory Wall”

  • Traditional problem: By some estimates, around 90% of energy consumption goes to moving data between memory and compute units
  • Taalas solution: Embedding model parameters directly in hardware removes the dependency on external HBM
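
As a rough illustration of what the 90% figure implies, the sketch below applies an Amdahl-style bound: if only data-movement energy shrinks, the remaining compute energy caps the overall gain. The function name and the `movement_reduction` parameter are illustrative assumptions, not anything Taalas has published, and the bound says nothing about gains from other sources.

```python
def max_energy_gain(data_movement_fraction: float,
                    movement_reduction: float = float("inf")) -> float:
    """Overall energy gain when only data-movement energy is reduced.

    data_movement_fraction: share of total energy spent moving data (0 <= f < 1).
    movement_reduction: factor by which that share shrinks (inf = eliminated).
    """
    if not 0.0 <= data_movement_fraction < 1.0:
        raise ValueError("data_movement_fraction must be in [0, 1)")
    if movement_reduction <= 0:
        raise ValueError("movement_reduction must be positive")
    compute_share = 1.0 - data_movement_fraction
    remaining_movement = data_movement_fraction / movement_reduction
    return 1.0 / (compute_share + remaining_movement)


# Taking the 90% figure at face value (an assumption carried over from the text):
full_elimination = max_energy_gain(0.9)        # data movement removed entirely
partial_reduction = max_energy_gain(0.9, 10)   # data movement cut by 10x
print(f"eliminated: {full_elimination:.1f}x, cut 10x: {partial_reduction:.2f}x")
```

Under these assumptions, removing data movement entirely yields roughly a 10x energy gain from that source alone.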

HC1 (Hardcore 1): First Generation Chip

Architecture

Component                      Technology
Fixed-Weight Layer (Mask ROM)  Mask ROM process encodes model weights in the chip's metal layers, co-located with compute units on the same silicon die
SRAM Dynamic Layer             Stores dynamic data (e.g., KV cache for Transformer models)
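
To give a sense of how much dynamic state the SRAM layer would need to hold, here is a minimal sizing sketch for a Transformer KV cache. The layer count, KV-head count, and head dimension are the published Llama 3.1 8B configuration; the 16-bit cache precision and the 8,192-token context are assumptions chosen for illustration, since the source does not state what HC1 uses.

```python
def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int,
                             head_dim: int, bits_per_value: int) -> int:
    """Bytes of KV cache added per generated token.

    The factor of 2 accounts for storing both a key and a value vector
    in every layer for every KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bits_per_value // 8


# Llama 3.1 8B model configuration (grouped-query attention with 8 KV heads).
LLAMA_31_8B = {"n_layers": 32, "n_kv_heads": 8, "head_dim": 128}

# Assumptions for illustration only: 16-bit cache entries, 8,192-token context.
ASSUMED_BITS = 16
ASSUMED_CONTEXT_TOKENS = 8192

per_token = kv_cache_bytes_per_token(**LLAMA_31_8B, bits_per_value=ASSUMED_BITS)
total = per_token * ASSUMED_CONTEXT_TOKENS
print(f"{per_token / 1024:.0f} KiB per token, "
      f"{total / 2**30:.2f} GiB for {ASSUMED_CONTEXT_TOKENS} tokens")
```

Lower cache precision or a shorter context shrinks the requirement proportionally, which is why the precision assumption matters here.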

Performance Metrics

Metric              Value
Inference Speed     17,000 tokens/second (Llama 3.1 8B)
Performance/Watt    1000x improvement
Performance/Dollar  1000x improvement
Cost Reduction      20x
Speed Improvement   10x
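
The throughput figure can be turned into two more intuitive numbers: the time per token, and the memory bandwidth a conventional accelerator would need to match it when every weight must be read from memory once per generated token. The sketch below assumes the 17,000 tokens/second applies to a single decoding stream, a nominal 8e9 parameters, and 16-bit weights on the conventional device; none of these assumptions come from the source.

```python
def per_token_latency_us(tokens_per_second: float) -> float:
    """Average time per generated token, in microseconds."""
    return 1e6 / tokens_per_second


def weight_traffic_tb_per_s(n_params: float, bits_per_weight: int,
                            tokens_per_second: float) -> float:
    """Weight bytes streamed per second (in TB/s) if every weight is
    read from memory once for each generated token."""
    bytes_per_token = n_params * bits_per_weight / 8
    return bytes_per_token * tokens_per_second / 1e12


TOKENS_PER_SECOND = 17_000   # figure quoted for Llama 3.1 8B
N_PARAMS = 8e9               # nominal "8B" parameter count (assumption)
GPU_WEIGHT_BITS = 16         # assumed weight precision on a conventional device

latency_us = per_token_latency_us(TOKENS_PER_SECOND)
traffic = weight_traffic_tb_per_s(N_PARAMS, GPU_WEIGHT_BITS, TOKENS_PER_SECOND)
print(f"{latency_us:.1f} us per token")
print(f"{traffic:.0f} TB/s of weight traffic at batch size 1")
```

Under these assumptions the required weight traffic is in the hundreds of terabytes per second, which illustrates why keeping weights next to the compute units matters for this kind of throughput.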

Hardware Specifications

  • Process: TSMC 6nm
  • Die Area: 815 mm²
  • Data Type: Mixed 3-bit/6-bit quantization (3-bit base)
  • Trade-off: Some accuracy loss due to quantization
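
To make the quantization trade-off concrete, the sketch below shows generic symmetric uniform quantization of a small group of weights to 3-bit signed codes, plus the resulting weight-storage footprint. This is a textbook scheme used for illustration only; the source does not describe Taalas's actual quantization method, and the sample weights and the nominal 8e9 parameter count are assumptions.

```python
def quantize_group(weights, bits=3):
    """Symmetric uniform quantization of one weight group to signed codes.

    Returns (codes, scale); each code lies in [-(2**(bits-1)), 2**(bits-1) - 1].
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [c * scale for c in codes]


def weight_storage_gb(n_params: float, bits_per_weight: int) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


sample = [0.6, -0.25, 0.13, -0.55]            # made-up weights for illustration
codes, scale = quantize_group(sample, bits=3)
restored = dequantize_group(codes, scale)
max_error = max(abs(a - b) for a, b in zip(sample, restored))
print(codes, f"max abs error = {max_error:.3f}")

# Storage bounds for a nominal 8e9-parameter model (assumed parameter count).
print(weight_storage_gb(8e9, 3), "GB at 3 bits;",
      weight_storage_gb(8e9, 6), "GB at 6 bits")
```

With only eight levels per weight, the rounding error can reach half the step size, which is the source of the accuracy loss noted above. The real footprint of a mixed 3-bit/6-bit model falls between the two storage bounds, depending on a mix the source does not specify.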

HC2: Next Generation

Feature  Plan
Release  Expected Winter 2026
Format   Standard 4-bit floating point
Target   Frontier-scale LLMs

Key Innovations

1. Extreme Specialization

  • Custom chip for each AI model
  • Only 2 metal layers need modification (out of 100+)
  • Model-to-hardware in 2 months (vs. 6+ months traditionally)

2. Simplified Hardware Design

  • No HBM required
  • No 3D stacking
  • No liquid cooling
  • No high-speed I/O

Result: Single chip = complete AI system

3. Automated Design Flow

  • Proprietary toolchain converts model computation graphs directly to chip physical layout
  • Rapid adaptation to new models (e.g., Meta Llama series)

Market Position

  • Target: AI inference market
  • Scenarios: Edge computing, data center inference requiring low latency and high energy efficiency
  • Strategy: Compete with general-purpose GPUs through specialization and cost advantages

Challenges

Challenge           Description
Model Dependency    Each chip is tightly bound to a specific model; model updates require hardware modification (though only 2 metal layers)
Accuracy Trade-off  Current 3-bit quantization may affect high-precision tasks
Ecosystem           Requires ongoing adaptation to mainstream AI frameworks

Conclusion

Taalas rethinks AI chip architecture by hardwiring models directly into silicon, aiming to carve out a new market position against NVIDIA’s dominance. The short-term goal is to validate technical feasibility with HC1; long-term expansion depends on HC2 and subsequent products to broaden model support.

Report generated: 2026-03-11