⏰ Limited-Time Free Models — Last Chance

Several premium models are about to exit their free tier. GLM 4.6 and Kimi K2 0905 expire first, and xAI’s entire Grok line follows one day later. The table lists each expiry date:

Model                       Expires  Context    Prompt / Completion (per 1K tokens)
Z.ai: GLM 4.6               May 14   204,800    $0.00039 / $0.0019
MoonshotAI: Kimi K2 0905    May 14   262,144    $0.0004 / $0.002
xAI: Grok 4.1 Fast          May 15   2,000,000  $0.0002 / $0.0005
xAI: Grok 4 Fast            May 15   2,000,000  $0.0002 / $0.0005
xAI: Grok Code Fast 1       May 15   256,000    $0.0002 / $0.0015
xAI: Grok 4                 May 15   256,000    $0.003 / $0.015
xAI: Grok 3 Mini            May 15   131,072    $0.0003 / $0.0005
xAI: Grok 3                 May 15   131,072    $0.003 / $0.015

Grok 4.1 Fast is a standout, offering 2 million tokens of context at $0.0002 per 1K prompt tokens.
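
To make the listed rates concrete, here is a minimal sketch that estimates the cost of a single request from the per-1K prices in the table. The function name, the dictionary layout, and the example workload are illustrative assumptions, not part of any provider's API.

```python
from decimal import Decimal

# Prices copied from the table above, in dollars per 1K tokens: (prompt, completion).
PRICES_PER_1K = {
    "Grok 4.1 Fast": (Decimal("0.0002"), Decimal("0.0005")),
    "GLM 4.6": (Decimal("0.00039"), Decimal("0.0019")),
    "Grok 4": (Decimal("0.003"), Decimal("0.015")),
}


def request_cost(model, prompt_tokens, completion_tokens):
    """Estimate the dollar cost of one request at the listed per-1K rates."""
    prompt_rate, completion_rate = PRICES_PER_1K[model]
    total = prompt_rate * prompt_tokens + completion_rate * completion_tokens
    return total / 1000  # rates are quoted per 1,000 tokens


# Hypothetical workload: fill the full 2,000,000-token context and
# receive a 1,000-token completion.
full_context = request_cost("Grok 4.1 Fast", 2_000_000, 1_000)
print(f"Grok 4.1 Fast, full context: ${full_context}")
```

Under these assumptions, filling the entire 2-million-token window once costs roughly $0.40 at the listed rates.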

The LLM-OS-Models team dominates today’s trending board with three Gemma-4-based terminal SFT models, each reaching roughly 1,100 downloads within hours:

  • LLM-OS-Models/gemma-4-E2B-Terminal-SFT-Native-Liquid-1Epoch — 1,102 downloads, terminal agent fine-tune
  • LLM-OS-Models/gemma-4-E2B-it-Terminal-SFT-Native-Liquid-2Epoch — 1,095 downloads, 3 likes
  • LLM-OS-Models/gemma-4-E2B-it-Terminal-SFT-Native-Liquid-1Epoch — 1,103 downloads

Other notable uploads include Kanawut/mask2former (134 downloads, segmentation model), Koras1k/ast-finetuned-gtzan (audio classification), ElioChampaney/NOVA-50M (a small language model), sundaycoil/ec2-auto-manager, and juergengunz/fluxer (4 likes).

The trend is clear: fine-tuned agent models and lightweight domain-specific models are driving the most engagement.

🚀 New Models on OpenRouter

Two new models appeared on OpenRouter:

  • Claude Opus 4.7 (Fast) by Anthropic — 1,000,000 token context, prompt pricing at $0.00003/1K tokens. This is OptiLLM-optimized, offering Opus-level quality with dramatically reduced latency.
  • Perceptron Mk1 by Perceptron — 32,768 token context, an astonishing $0.00000015/1K tokens (about $0.00015 per million tokens). This is among the cheapest models ever listed on OpenRouter.
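
Prices this small are easier to compare after a unit conversion. The sketch below converts the two per-1K prompt prices quoted above into per-token and per-million-token figures. The helper names are hypothetical, and the prices are taken as listed in this digest.

```python
from decimal import Decimal


def per_token(price_per_1k: str) -> Decimal:
    """Dollars per single token, given a dollars-per-1K-tokens price."""
    return Decimal(price_per_1k) / 1000


def per_million(price_per_1k: str) -> Decimal:
    """Dollars per 1,000,000 tokens, given a dollars-per-1K-tokens price."""
    return Decimal(price_per_1k) * 1000


# Prompt prices as listed above, in dollars per 1K tokens.
PERCEPTRON_MK1 = "0.00000015"
OPUS_47_FAST = "0.00003"

print("Perceptron Mk1 per token:  ", per_token(PERCEPTRON_MK1))
print("Perceptron Mk1 per million:", per_million(PERCEPTRON_MK1))
print("Opus 4.7 (Fast) vs Mk1:    ", Decimal(OPUS_47_FAST) / Decimal(PERCEPTRON_MK1))
```

At the listed rates, Perceptron Mk1 works out to $0.00015 per million tokens, and the Claude Opus 4.7 (Fast) prompt price is 200 times higher.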

Five AI/ML repositories caught the community’s attention:

  1. strukto-ai/mirage ⭐2,054 — A unified virtual filesystem for AI agents, written in TypeScript. It gives AI agents a structured way to navigate and manipulate files.
  2. yaojingang/yao-open-prompts ⭐1,822 — A comprehensive Chinese AI prompt library covering work, study, content, marketing, and daily life scenarios.
  3. huangserva/3DCellForge ⭐1,698 — AI-powered interactive 3D cell generation and exploration studio, built with JavaScript.
  4. lightseekorg/tokenspeed ⭐972 — A speed-of-light LLM inference engine written in Python, pushing the boundaries of inference performance.
  5. alchaincyf/huashu-md-html ⭐479 — A bidirectional markdown↔HTML pipeline wrapping markitdown, Pandoc, and more.

Four takeaways from today’s activity:

  1. Agent-First Fine-Tuning: The Gemma-4 terminal SFT models signal that the community is rapidly moving toward specialized agent fine-tunes — models trained specifically for tool use and terminal interaction.
  2. Ultra-Low-Cost Inference: Perceptron Mk1 at $0.00000015/1K tokens and Grok 4.1 Fast at $0.0002/1K prompt tokens represent a new frontier in affordable AI. The race to the bottom on pricing continues.
  3. Open-Source Agent Infrastructure: With mirage building a virtual filesystem for AI agents and tokenspeed optimizing inference, the open-source ecosystem is maturing the full stack for autonomous AI agents.
  4. Bilingual and Multimodal Expansion: Yao Open Prompts (Chinese) and 3DCellForge (bio-visualization) demonstrate AI’s expanding reach across languages and scientific domains.