Hacker News: yiyingzhang

New comment by yiyingzhang in "MicroVMs: Run isolated sandboxes with full lifecycle control"

yiyingzhang — Fri, 26 Jun 2026 16:46:00 +0000

How's this different from Firecracker?

New comment by yiyingzhang in "A game where you're an OS and have to manage processes, memory and I/O events"

yiyingzhang — Fri, 26 Jun 2026 02:53:03 +0000

This is cool! I may introduce it to the undergrad OS course I teach at UCSD. Does it have memory hierarchy?

New comment by yiyingzhang in "NSF slashes research programs to support new tech initiative, insiders say"

yiyingzhang — Thu, 25 Jun 2026 20:12:14 +0000

This is a step away from university-driven research toward giving money to industry. FWIW, NSF's prior industry investments have rarely yielded any true impact. Most successful startups spun out of universities go directly for VC funding, esp. in CS.

New comment by yiyingzhang in "You can't unit test for taste"

yiyingzhang — Thu, 25 Jun 2026 20:04:57 +0000

Hasn't this been true since the beginning of software development? AI hasn't changed that yet.

New comment by yiyingzhang in "Why eval startups fail (2025)"

yiyingzhang — Thu, 25 Jun 2026 17:24:17 +0000

"Safety evals are an exception I believe eval startups can work when they're targeting safety benchmarks specifically. Researchers who want to work on safety evals tend to be ideologically opposed to working on capabilities, which means they don't migrate to post-training or applications due to monetary incentives."

This is quite interesting. Seems more relevant in 2026.

Decoupling Compute and Memory for Async GPUs

yiyingzhang — Thu, 25 Jun 2026 17:08:20 +0000

Cool open-source project that introduces a new programming model for decoupling compute and memory on NVIDIA GPUs that support asynchronous memory operations (e.g., Hopper). 12% perf improvement over SOTA and 67% less kernel code.

Paper: "VDCores: Resource Decoupled Programming and Execution for Asynchronous GPU" arXiv:2605.03190

Comments URL: https://news.ycombinator.com/item?id=48676372

Points: 8

# Comments: 2

New comment by yiyingzhang in "OpenAI unveils its first custom chip, built by Broadcom"

yiyingzhang — Thu, 25 Jun 2026 17:00:21 +0000

Is this another Cerebras? FWIW, it took Cerebras many years to finally get a handle on the yield and cooling problems. I wonder if they just hired a bunch of people from Cerebras.

Show HN: GPT-5 available for free on Gensee

yiyingzhang — Thu, 07 Aug 2025 20:44:17 +0000

TL;DR: we just made GPT-5 available for free on Gensee, a developer-oriented AI Agent optimization and deployment platform: https://platform.gensee.ai/

This is a crazy week with a bunch of model releases: gpt-oss, Claude-Opus-4.1, and now today's GPT-5. It may feel impossible for agent developers to keep up, with all the manual migrating, re-testing, and analyzing.

We built Gensee to solve exactly this problem. Gensee lets you see the immediate impact of a new model on your already built agents and workflows. Here’s how it works:

- Instant Model Swapping: Have an agent running on GPT-4o? With one click, you can clone it and swap the underlying model to the GPT-5 family. No code changes, no re-deploying.

- Automated A/B Testing & Analysis: Run your test cases against both versions of your agent simultaneously. Gensee gives you a side-by-side comparison of outputs, latency, and cost, so you can immediately see if GPT-5 improves quality or breaks your existing prompts and tool functions.

- Smart Routing for Optimization: Gensee automatically selects the best combination of models for any given task in your agent to optimize for quality, cost, or speed.

- Pre-built Agents: You can also grab one of our pre-built agents and immediately test it across the entire spectrum of new models to see how they compare.
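
As a rough illustration of the side-by-side comparison described above, here is a minimal Python sketch. It is not Gensee's implementation or API: `compare_agents`, `agent_old`, and `agent_new` are hypothetical names, the two agents are stand-in functions rather than real model calls, and the runs happen one after another rather than simultaneously. Cost tracking is left out.

```python
import time
from typing import Callable, Dict, List


def compare_agents(
    agent_a: Callable[[str], str],
    agent_b: Callable[[str], str],
    test_cases: List[str],
) -> List[Dict[str, object]]:
    """Run each test case through both agents; record output and latency."""
    rows = []
    for case in test_cases:
        row: Dict[str, object] = {"input": case}
        for label, agent in (("a", agent_a), ("b", agent_b)):
            start = time.perf_counter()
            output = agent(case)
            row[f"output_{label}"] = output
            row[f"latency_{label}_s"] = time.perf_counter() - start
        # Flag cases where the two versions disagree.
        row["same_output"] = row["output_a"] == row["output_b"]
        rows.append(row)
    return rows


# Hypothetical stand-in agents; a real agent would call a model here.
def agent_old(prompt: str) -> str:
    return prompt.upper()


def agent_new(prompt: str) -> str:
    return prompt.upper().strip()


if __name__ == "__main__":
    for row in compare_agents(agent_old, agent_new, ["plan a trip ", "hello"]):
        print(row)
```

Rows where `same_output` is false are the ones worth inspecting by hand after a model swap.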

The goal is to eliminate the engineering overhead of model evaluation so you can spend your time building, not just updating.

We'd love for you to try it out and give us feedback, especially if you have an existing project you want to benchmark against GPT-5.

Join our Discord: https://discord.gg/qQr6SVW4

Comments URL: https://news.ycombinator.com/item?id=44830108

Points: 6

# Comments: 0

Show HN: Free access and one-click swap to GPT-OSS and Claude-Opus-4.1 on Gensee

yiyingzhang — Wed, 06 Aug 2025 16:48:51 +0000

We've made gpt-oss and Claude-Opus-4.1 available to use for free on Gensee: https://gensee.ai. With Gensee, you can seamlessly upgrade your AI agents to stay current:

- Swap your current models for these new models (or any other supported models) with one click.

- Automatically discover the optimal combination of models for your AI agents based on your preferred metrics, whether it's cost, speed, or quality.

Also, a quick test with a Grade-7 math problem that most previous models fail to answer correctly: Opus-4.1 gets it partially right (the correct answer is A; Opus-4.1 says it is not sure between A and D).

"Some birds, including Ha, Long, Nha, and Trang, are perching on four parallel wires. There are 10 birds perched above Ha. There are 25 birds perched above Long. There are five birds perched below Nha. There are two birds perched below Trang. The number of birds perched above Trang is a multiple of the number of birds perched below her. How many birds in total are perched on the four wires? (A) 27 (B) 30 (C) 32 (D) 37 (E) 40"

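The stated answer (A) can be checked by brute force. The Python sketch below is not from the original post. It assumes the wires are stacked top to bottom, that birds on the same wire count as neither above nor below one another, and that every wire holds at least one bird. All function names are made up for illustration.

```python
def compositions(max_total):
    """Yield every (w1, w2, w3, w4), top wire first, with each wire
    holding at least one bird and at most max_total birds overall."""
    for w1 in range(1, max_total - 2):
        for w2 in range(1, max_total - w1 - 1):
            for w3 in range(1, max_total - w1 - w2):
                for w4 in range(1, max_total - w1 - w2 - w3 + 1):
                    yield (w1, w2, w3, w4)


def above_below(wires):
    """Return one (birds above, birds below) pair per wire."""
    total = sum(wires)
    pairs = []
    above = 0
    for count in wires:
        pairs.append((above, total - above - count))
        above += count
    return pairs


def is_consistent(wires):
    """Check that some wire can host each of the four named birds."""
    pairs = above_below(wires)
    ha = any(a == 10 for a, _ in pairs)
    long_ = any(a == 25 for a, _ in pairs)
    nha = any(b == 5 for _, b in pairs)
    # Trang: two birds below, and a positive even number above.
    trang = any(b == 2 and a > 0 and a % 2 == 0 for a, b in pairs)
    return ha and long_ and nha and trang


def solve(max_total=40):
    """Search all arrangements up to the largest answer choice (40)."""
    return [w for w in compositions(max_total) if is_consistent(w)]


if __name__ == "__main__":
    for wires in solve():
        print(wires, "total:", sum(wires))
```

Under these assumptions the search finds a single arrangement, with 10, 12, 3, and 2 birds from top to bottom, for a total of 27, which matches choice (A).
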
Comments URL: https://news.ycombinator.com/item?id=44814491

Points: 3

# Comments: 0

Show HN: Optimize and launch a travel-planning AI application in minutes

yiyingzhang — Tue, 05 Aug 2025 18:04:17 +0000

We're the creators of Gensee, a platform we built to help developers quickly productionize their AI agents and workflows.

To show how Gensee works, we created a new end-to-end demo (https://www.youtube.com/watch?v=AXIX9LgN4mU) where we build and launch a travel planner AI application: https://demo.gensee.ai/travel-planner. The web app uses two agents: one generates a travel plan from user requirements and is built on CamelAI's multi-agent society; the other answers follow-up questions with an LLM and web search and uses no framework (pure Python). We've also open-sourced the travel planner app itself: https://github.com/GenseeAI/Trip-planner-demo.

Here's the process we show:

- DEPLOY: We start with the agent's source code in the GitHub repo and deploy it to Gensee directly using the repo url.

- TEST & ANALYZE: To evaluate the agent, Gensee automatically generates test cases customized to the agent. We can then inspect the full execution trace for each test run (including LLM and tool call inputs/outputs) and manually swap models/tools.

- METRICS: Next, we can instruct Gensee to automatically generate metrics (e.g., "does the generated plan include all requested cities?"). These metrics use LLM-as-a-Judge internally. There are also two objective metrics: dollar cost and execution latency.

- OPTIMIZE: We then select our desired metrics and run Gensee’s automated optimization process, which experiments with different models and tools to find the setup that maximizes quality, minimizes cost, or minimizes latency.

- LAUNCH & AUTOSCALE: Once we're happy with the optimized agent, Gensee provides a production-ready API endpoint that we can integrate directly into our web application. We can also download the Gensee-optimized source code and do more offline tuning. Once launched, the agent is autoscaled on Gensee as requests arrive. Gensee is the only party you pay, since Gensee covers all model and tool call costs internally.
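
To make the OPTIMIZE step concrete, here is a minimal Python sketch of choosing among candidate setups by a single objective. It is only an illustration of the idea, not how Gensee's optimizer works: the `Candidate` type, the `pick_best` function, the configuration names, and all numbers are hypothetical placeholders rather than measurements.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str         # hypothetical configuration label
    quality: float    # judged score in [0, 1], higher is better
    cost_usd: float   # dollar cost per run, lower is better
    latency_s: float  # seconds per run, lower is better


def pick_best(candidates: List[Candidate], objective: str) -> Candidate:
    """Return the candidate that is best for one objective."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    if objective == "quality":
        return max(candidates, key=lambda c: c.quality)
    if objective == "cost":
        return min(candidates, key=lambda c: c.cost_usd)
    if objective == "latency":
        return min(candidates, key=lambda c: c.latency_s)
    raise ValueError(f"unknown objective: {objective}")


# Placeholder numbers for illustration only, not measurements.
CANDIDATES = [
    Candidate("config-1", quality=0.9, cost_usd=0.05, latency_s=8.0),
    Candidate("config-2", quality=0.8, cost_usd=0.01, latency_s=3.0),
    Candidate("config-3", quality=0.7, cost_usd=0.02, latency_s=1.5),
]

if __name__ == "__main__":
    for objective in ("quality", "cost", "latency"):
        print(objective, "->", pick_best(CANDIDATES, objective).name)
```

A real optimizer would also have to trade objectives off against each other; this sketch picks a winner for one objective at a time.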

We are trying to build the "AgentOps" tooling that we hope can be useful to all agent developers and beyond.

We would be grateful for the community's honest feedback!

You can try it here: https://platform.gensee.ai. We're providing $10 in FREE credits every month. Thanks for checking it out!

Comments URL: https://news.ycombinator.com/item?id=44801837

Points: 4

# Comments: 0

Show HN: Gensee – Free AI Agent Optimization and Deployment

yiyingzhang — Thu, 31 Jul 2025 20:06:57 +0000

I'm the co-founder of GenseeAI (https://www.gensee.ai). We've recently launched the public beta of Gensee, an AI agent/workflow platform aimed at developers and small teams.

Here’s what I’ve heard again and again: it's gotten much easier to build a proof-of-concept AI agent, but turning that prototype into a high-quality, scalable, and cost-effective product is still a massive chore, involving endless trial-and-error with prompts, models, tools, testing, analysis, etc.

We built Gensee to automate that "last mile" from prototype to production.

Here’s how it works:

- You provide Gensee with a GitHub link to your project, a Docker image, or a zipped package of your agent source. The agent can be written in any framework or without a framework, as long as it's in Python. No code modification or annotation needed.

- We handle input/output identification, model/tool call identification, test case generation, metrics generation, testing, automated optimization, server provisioning, containerization, tool/model calling, and endpoint creation to get it live as an API.

- You see detailed evaluation results with customized metrics and test cases, all fully automated.

- We optimize your agent automatically to achieve better quality, cost, and/or latency. You can download our optimized agent code, all transparent.

- You can choose any optimized or original agent configuration to serve. Simply copy the API endpoint into the frontend code that calls the agent.

To support fellow developers, we give every new user 500 free monthly credits, enough to cover one to two agent deployments and optimizations plus initial model and tool usage costs. If your usage grows, it becomes a cost-efficient pay-as-you-go service that scales with you.

We're still in beta and would love to get your feedback. Do you prefer no-code agent generation instead of source code uploading? Should Gensee also run your frontend and other code in addition to agents? Any other optimization goals you have? Any key missing features?