<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: xiaofei_</title><link>https://news.ycombinator.com/user?id=xiaofei_</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 18 Apr 2026 06:13:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=xiaofei_" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by xiaofei_ in "Ingesting PDFs and why Gemini 2.0 changes everything"]]></title><description><![CDATA[
<p>How is their API priced? I checked a few months ago and remember it being expensive.</p>
]]></description><pubDate>Mon, 17 Feb 2025 18:07:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=43081688</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=43081688</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43081688</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: Generate structured insights from a website URL"]]></title><description><![CDATA[
<p>I used the URL of your blog post (<a href="https://breckyunits.com/30000hours.html" rel="nofollow">https://breckyunits.com/30000hours.html</a>) and ran it through a tool with the query "What is it about?" It generated a summary of your article, which you can check out here: <a href="https://drive.google.com/file/d/1WGqg2yg-waFlUcJZ6nZBSS9eViDjE9LX/view?usp=drive_link" rel="nofollow">https://drive.google.com/file/d/1WGqg2yg-waFlUcJZ6nZBSS9eViD...</a>. Hope you find it useful!<p>By the way, great blog content!</p>
]]></description><pubDate>Sun, 15 Sep 2024 18:43:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=41549403</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=41549403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41549403</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: Best Practices for Using Structured Output from LLMs"]]></title><description><![CDATA[
<p>I appreciate you taking the time to read the article and share your thoughts. Your arguments consider many perspectives and dimensions. Please allow me to address each point:<p>> <i>You’re comparing apples to oranges - structured output (a capability) with structured output + CoT (a technique), saying that structured output isn’t good for reasoning. Well, it’s not supposed to “reason,” and you didn’t apply CoT to it!</i><p>The goal of our evaluation is to address the original OpenAI JSON mode statement: <a href="https://openai.com/index/introducing-structured-outputs-in-the-api/#separating-a-final-answer-from-supporting-reasoning-or-additional-commentary" rel="nofollow">https://openai.com/index/introducing-structured-outputs-in-t...</a> (see the section “Separating a final answer from supporting reasoning or additional commentary”). It illustrates structured output used as CoT reasoning steps, which is the source of the confusion and the basis for both the research paper and our evaluation work. Our findings indicate that structured output is indeed not effective for reasoning (i.e., don’t trust the answer even if you specify a reasoning field).<p>> <i>Why would you use any other temperature than 0 when you are asserting the correctness of the data extraction and the “reasoning” of the LLM? You don’t want variation.</i><p>Good question. The example we provided involves not only correct arithmetic but also accurate interpretation of specific conditions (e.g., recognizing that the first 29 hours are charged at one rate, with additional hours at a higher rate). While setting the temperature to 0 ensures consistent, predictable results by choosing the most likely next word or token, we wanted to explore how the model handles variations and uncertainties. Note that all models were set with a temperature of 1.0 in the comparison. A consistent output in a multistep setup suggests a robust reasoning process we can trust. 
In contrast, the JSON mode with reasoning field (i.e., reasoning_steps in the structured-output-reasoning-cot pipeline, as detailed in the <i>Chain-of-Thought Reasoning</i> section of <a href="https://colab.research.google.com/github/instill-ai/cookbook/blob/main/examples/Generating_structured_outputs.ipynb" rel="nofollow">https://colab.research.google.com/github/instill-ai/cookbook...</a>) did not show similar reliability.<p>> <i>Why are you using the LLM to do math? If the data was extracted correctly (with structured output or function calling), let it write the formula and evaluate it. The new API is just a nicer built-in way to extract structured data. Previously (still valid), you had to use function calling and pass it a “returnResult” function that had its payload typed to your expected schema. This is one of the most powerful and effective tools we have to work with LLMs. If used properly, we shouldn’t avoid it just because it doesn’t “reason” as well.</i><p>Function calling is outside the scope of our current exploration. Our focus, inspired by the paper <i>"Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models"</i> (<a href="https://arxiv.org/abs/2408.02442v1" rel="nofollow">https://arxiv.org/abs/2408.02442v1</a>), is on maintaining the model’s accuracy while producing structured outputs. As far as I know, functions need to be pre-defined for function calling with OpenAI’s LLMs. For example, to get the correct salary calculation, you would need to pre-define the relevant salary calculation functions. Our experiment focuses on evaluating the model’s reasoning ability without relying on external tools. By the way, I’d be interested in seeing your full code (the code in your "./api") if you’re willing to share.<p>We also observed some intriguing results with the multi-step pipeline. 
In your gist, it appears that models like GPT-4o-mini or GPT-3.5-turbo didn’t produce accurate answers consistently. However, in our experiment, we achieved correct results even with these less powerful models (see video <a href="https://drive.google.com/file/d/19NZjZ8LZRazInImcm27XjperBMt-Oi1I/view?usp=sharing" rel="nofollow">https://drive.google.com/file/d/19NZjZ8LZRazInImcm27XjperBMt...</a>):<p><pre><code>  - GPT-3.5 for reasoning
  - GPT-4o-mini for structured outputs (note that only GPT-4o related models support structured outputs)</code></pre></p>
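For reference, the piecewise pay rule discussed above (the first 29 hours at a base rate, additional hours at a higher rate) can be checked deterministically; here is a minimal sketch with illustrative rates, since the benchmark's exact numbers aren't quoted in this thread:

```python
def pay(hours: float, base_rate: float,
        overtime_multiplier: float = 1.5, threshold: float = 29) -> float:
    """Piecewise pay: the first `threshold` hours at base_rate, the rest at
    base_rate * overtime_multiplier. Rates and multiplier are illustrative,
    not the benchmark's actual numbers."""
    regular = min(hours, threshold) * base_rate
    extra = max(hours - threshold, 0) * base_rate * overtime_multiplier
    return regular + extra

# A correct reasoning chain has to branch on the 29-hour threshold:
print(pay(25, 10))  # 250.0: all regular hours
print(pay(40, 10))  # 455.0: 290 regular + 165 overtime
```

A model's reasoning_steps field should reproduce exactly this branching; an answer that applies a single flat rate is the kind of silent failure we saw with structured output alone.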
]]></description><pubDate>Mon, 02 Sep 2024 01:02:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=41421644</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=41421644</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41421644</guid></item><item><title><![CDATA[Show HN: Best Practices for Using Structured Output from LLMs]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.instill.tech/blog/llm-structured-outputs">https://www.instill.tech/blog/llm-structured-outputs</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41419624">https://news.ycombinator.com/item?id=41419624</a></p>
<p>Points: 21</p>
<p># Comments: 4</p>
]]></description><pubDate>Sun, 01 Sep 2024 19:29:50 +0000</pubDate><link>https://www.instill.tech/blog/llm-structured-outputs</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=41419624</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41419624</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: Open-source infra for building embedded data pipelines"]]></title><description><![CDATA[
<p>Thanks for the explanation. Does that mean HubSpot, as the data source, will itself have to maintain these native data pipelines?</p>
]]></description><pubDate>Sat, 03 Sep 2022 20:43:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=32705887</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32705887</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32705887</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: Open-source infra for building embedded data pipelines"]]></title><description><![CDATA[
<p>Hey, interesting work.
I wonder how you compare pipebird with data tools like Airbyte?</p>
]]></description><pubDate>Fri, 02 Sep 2022 23:55:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=32697067</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32697067</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32697067</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: We made an open-source visual data ETL"]]></title><description><![CDATA[
<p>Another VDP author here!<p>Please check out the VDP documentation: <a href="http://go.instill.tech/4evat9" rel="nofollow">http://go.instill.tech/4evat9</a>. We are happy to answer any questions. Thanks!</p>
]]></description><pubDate>Thu, 25 Aug 2022 19:28:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=32598765</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32598765</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32598765</guid></item><item><title><![CDATA[New comment by xiaofei_ in "Show HN: Pornpen.ai – AI-Generated Porn"]]></title><description><![CDATA[
<p>Any reason why some results show the butt in the belly area?</p>
]]></description><pubDate>Thu, 25 Aug 2022 17:23:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=32596935</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32596935</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32596935</guid></item><item><title><![CDATA[Show HN: VDP – open-source unstructured visual data ETL]]></title><description><![CDATA[
<p>Hi HN - We're Ping-Lin and Xiaofei from Instill AI (<a href="https://www.instill.tech" rel="nofollow">https://www.instill.tech</a>). We're building VDP (<a href="https://github.com/instill-ai/vdp" rel="nofollow">https://github.com/instill-ai/vdp</a>), an open-source ETL tool for unstructured visual data.<p>When people say they are data-driven, most of the time it means they are driven by structured data. I will cut the part where we cite reports claiming that 80% of data are unstructured. The reality is that unstructured data are more difficult to analyse, and not many companies know how, or have the resources, to deal with them.<p>Before starting Instill AI, we were in a smart video startup dealing with large volumes of visual data every day. Back then, the concept of MLOps was pretty new (2014), and every ML company was exploring and building its own stack. We built a battle-proven Vision AI system in-house and had the system running in production for years.<p>What we learnt from the journey was: 1) buy vs. build: unaffordably high inference cost was the main barrier keeping us from adopting an off-the-shelf solution like Google Vertex AI or Amazon SageMaker, so we went for the "build" route. The truth was that the resources we spent on building and maintaining the system were unexpectedly huge, time- and money-wise; 2) the Vision AI system we built could actually be modularised and generalised to apply to other industry sectors.<p>We reckon what we experienced is a common phenomenon in the industry, and we can help solve the problem. That's why we decided to build VDP, an <i>open-source, general and modularised ETL infrastructure for unstructured visual data</i>, for a broader community.<p>Many brilliant MLOps platforms/tools providing computer vision solutions have emerged in the last few years. 
Most of the tools are built from a model-centric perspective and fall into the following categories:<p>- general ML platforms for model training, experiment tracking, model deployment, etc.<p>- platforms that serve a specific vertical, such as e-commerce and manufacturing.<p>- platforms that focus on a single component of MLOps, such as data labelling, dataset preparation, and model serving.<p>VDP is built from a <i>data-driven</i> perspective. Although the computer vision model is the most critical component in a visual data ETL pipeline, the ultimate goal of VDP is to streamline the end-to-end visual data flow, with the transform component being able to flexibly import computer vision models from different sources.<p>Today, the early version of VDP supports two sources and all Airbyte destination connectors, and it can import computer vision models from various sources including Local, GitHub, DVC, ArtiVC and Hugging Face. Setting up a VDP pipeline is fairly easy via its low-code API and no-code Console. Please take a look at the tutorial: <a href="https://www.instill.tech/docs/tutorials/build-an-async-det-pipeline" rel="nofollow">https://www.instill.tech/docs/tutorials/build-an-async-det-p...</a>.<p>VDP can run locally with Docker Compose. We're working on Kubernetes integration and a fully managed version in Instill Cloud.<p>We aim to build VDP as the single point of visual data integration, so users can sync visual data from anywhere into centralised warehouses or applications and focus on gaining insights across all data sources, just as the modern data stack handles structured data.<p>Operation-wise, VDP resources will be managed in a declarative way to fuse them better with the modern cloud-native context. The API-first and microservice design has opened all sorts of possibilities for VDP.<p>Thanks for reading, HN! We are first-time open-source project maintainers. There's definitely a lot to learn! 
Let us know what you think in the comments.<p>VDP links:<p>[1] GitHub: <a href="https://github.com/instill-ai/vdp" rel="nofollow">https://github.com/instill-ai/vdp</a><p>[2] Documentation: <a href="https://www.instill.tech/docs" rel="nofollow">https://www.instill.tech/docs</a><p>[3] Demo: <a href="https://demo.instill.tech" rel="nofollow">https://demo.instill.tech</a></p>
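Conceptually, the pipeline shape described above (extract visual data from a source, transform it with an imported vision model, load structured results into a destination connector) can be sketched in a few lines. This is a generic illustration of the ETL loop, not VDP's actual API:

```python
from typing import Any, Callable, Iterable

def run_pipeline(extract: Callable[[], Iterable[Any]],
                 transform: Callable[[Any], Any],
                 load: Callable[[Any], None]) -> int:
    """Generic ETL loop: pull items from a source, run the model
    (transform), and push structured results to a destination.
    Returns the number of items processed."""
    count = 0
    for item in extract():
        load(transform(item))
        count += 1
    return count

# Toy usage: "images" in, detection-like records out. The transform
# here is a stand-in for a real computer vision model.
results: list = []
n = run_pipeline(
    extract=lambda: ["img-1.jpg", "img-2.jpg"],
    transform=lambda img: {"file": img, "objects": ["cat"]},
    load=results.append,
)
```

VDP's contribution is making each of those three stages a swappable, declaratively configured component (source connectors, model imports, Airbyte destinations) rather than hand-written glue like this.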
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=32546589">https://news.ycombinator.com/item?id=32546589</a></p>
<p>Points: 11</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 22 Aug 2022 02:17:55 +0000</pubDate><link>https://github.com/instill-ai/vdp</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32546589</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32546589</guid></item><item><title><![CDATA[Open-source no-/low-code unstructured visual data ETL tool – VDP]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/instill-ai/vdp">https://github.com/instill-ai/vdp</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=32317126">https://news.ycombinator.com/item?id=32317126</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 02 Aug 2022 10:53:14 +0000</pubDate><link>https://github.com/instill-ai/vdp</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32317126</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32317126</guid></item><item><title><![CDATA[Generative art of six overlapping squares exploring vision and language]]></title><description><![CDATA[
<p>Article URL: <a href="https://drib.net/homage">https://drib.net/homage</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=32076938">https://news.ycombinator.com/item?id=32076938</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 12 Jul 2022 23:05:41 +0000</pubDate><link>https://drib.net/homage</link><dc:creator>xiaofei_</dc:creator><comments>https://news.ycombinator.com/item?id=32076938</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=32076938</guid></item></channel></rss>