Sunil Venkataram
Tag: architecture
3 notes
  • 🌰 Edge vs Server Model Architecture - Why One DNA Cannot Serve Both
  • 🌰 KV Cache Compression - The Primary Bottleneck in Long-Context Inference
  • 🌰 Per-Layer Embeddings - Trading Flash for DRAM on Edge Models

© 2025 | Sunil Venkataram