Hacker News: acaciabengo

New comment by acaciabengo in "Launch HN: Transload (YC P26) – Measuring freight items with CCTV"

acaciabengo — Wed, 10 Jun 2026 19:42:51 +0000

Interesting work. Is this something that SAM-3D or its background work would apply to?

Show HN: QuantTakeoff – Construction PDFs to takeoff and 3D scene

acaciabengo — Sat, 16 May 2026 23:09:04 +0000

I built QuantTakeoff and am releasing v1.0 for validation. Input a construction PDF and get back a takeoff report with wall lengths, areas, and door/window counts and sizes, plus a 3D GLB of the building at real-world scale, all in under ~10 mins.

The pain it solves: It reduces the time an estimator spends tracing out elements, by hand or in software, and extracting reports.

Stack: Ensemble of computer vision tools (95%), VLM OCR (5%)

Hard parts that are working:

- Auto-finding the plan page in a 200-sheet bid set (most stacks make you point at the right sheet manually). I am still manually annotating a new dataset for the new models.

- Holding up on noisy as-builts and hand-marked sheets that break OCR-first pipelines.

- Post-CV processing to optimize for usability.

- Pixel-accurate measurement of elements, including the size / width of doors and windows.
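
To make the pixel-measurement point concrete, here is a minimal sketch of how a length measured in pixels could be converted to real-world units once the sheet's drawing scale is known. The function names, the 150 DPI rasterisation, and the 1:100 scale are assumptions for illustration only, not QuantTakeoff's actual code.

```python
import math

MM_PER_INCH = 25.4


def metres_per_pixel(dpi, scale_denominator):
    """Real-world metres covered by one pixel of a rasterised plan sheet.

    dpi: resolution used to rasterise the PDF page (assumed known).
    scale_denominator: 100 for a 1:100 drawing (assumed already extracted).
    """
    paper_mm_per_pixel = MM_PER_INCH / dpi
    return paper_mm_per_pixel * scale_denominator / 1000.0


def wall_length_m(start_px, end_px, m_per_px):
    """Length in metres of a wall segment given its pixel endpoints."""
    dx = end_px[0] - start_px[0]
    dy = end_px[1] - start_px[1]
    return math.hypot(dx, dy) * m_per_px


# Hypothetical sheet: rendered at 150 DPI, drawn at 1:100.
m_per_px = metres_per_pixel(dpi=150, scale_denominator=100)
length = wall_length_m((100, 200), (400, 600), m_per_px)  # 500 px long
print(f"{length:.2f} m")
```

The same factor applies to door and window widths, so any error in the detected scale carries through to every measurement in the report.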

Demo: https://youtu.be/fVy7tDFqR98

Particularly want feedback on: Real-world estimators' views on what could be better for the practice. What would make you use this, or what is missing?

P.S. The live demo has been removed to manage compute costs, after a few enthusiasts burned through the HF Spaces bill.

Comments URL: https://news.ycombinator.com/item?id=48164588

Points: 1

# Comments: 0

New comment by acaciabengo in "Ask HN: Who wants to be hired? (May 2026)"

acaciabengo — Wed, 06 May 2026 21:12:51 +0000

Location: Kampala, Uganda | Remote: Yes | Relocate: No

Tech: Python, PyTorch, Detectron2, OpenCV, Swin/ViT, BERT, Hugging Face, Ruby on Rails, FastAPI, PostgreSQL, Airflow, RabbitMQ, Docker, GCP.

Email: acaciabengo@gmail.com

Resume:

https://drive.google.com/file/d/1oOeY0tsJ7Ujx2fk3m5BkSfBhto2...

GitHub / HF: acaciabengo

ML + Software Engineer, 11+ yrs. MSCS (ML) at Georgia Tech. Ex-founder/CTO of a telecom/fintech platform (60M+ SMS, 5M+ USSD lottery tickets, LTV/churn models that cut marketing ~50%).

Recent:

- plan_to_3d — Floor plan PDFs → interactive 3D GLB. Mask R-CNN + Swin-T/FPN in Detectron2 for wall/door/window segmentation; Shapely vectorization, trimesh extrusion with opening clipping, OCR-based scale, Gradio + Three.js viewer.

- NLP distress/moderation models for 13K users; Airflow pipelines on 1M+ msgs/month; Rails backend unifying Twilio/FB/YouTube (220K+ interactions).

- CensorX — Open-sourced DistilBERT/ViT moderation models on HF (97% / 92% precision).

Looking for applied CV / multimodal ML, or senior ML+backend roles where shipping end-to-end matters.

Ask HN: Advice on Solo Launching

acaciabengo — Fri, 03 Apr 2026 15:56:35 +0000

I have built a pipeline that takes construction documents as PDFs and extracts the floor plans.

It extracts BIM data from the plans at pixel accuracy, producing data on walls, floors, windows, and doors.

It also generates 3D impressions and a brief 10s 3D video of the floor plans.

I have had multiple discussions about building this within a company already solving the same problem, but have not closed a deal yet, and I am considering taking it to market as a solo founder.

I am outside the UK and USA and was seeking a co-founder for market access and future fundraising.

What is the best advice on how to go to market as a sole founder?

Comments URL: https://news.ycombinator.com/item?id=47628282

Points: 6

# Comments: 5

New comment by acaciabengo in "Ask HN: Who wants to be hired? (March 2026)"

acaciabengo — Wed, 04 Mar 2026 19:41:34 +0000

Location: Kampala, Uganda

Remote: Yes

Willing to relocate: No

Technologies: Python, PyTorch, TensorFlow, OpenCV, Detectron2, Ruby on Rails, Docker, GCP, SQL.

Resume: https://drive.google.com/file/d/1G8Rzgb7a2kS8myjnJqhxdALUA2d...

Email: acaciabengo@gmail.com

Summary: Machine Learning Engineer & Software Engineer with 11+ years of experience, MSCS at Georgia Tech (Machine Learning).

Recent work includes a 2D→3D architectural reconstruction pipeline (Detectron2 + Swin Transformers), large-scale predictive modelling for gaming platforms (risk, churn, LTV), and NLP systems processing 1M+ messages/month for a US health-tech company.

New comment by acaciabengo in "Show HN: Emotional photoreal AI humans at $0.06 / min"

acaciabengo — Thu, 26 Feb 2026 20:09:00 +0000

This is great and exciting. I happened to be doing some research on building memory-efficient diffusion models. I have not yet built the demo, but I am looking at a mix of architectures from several papers (IMTalker, SageAttention, FlashVSR, and Sparse VideoGen), with the intention of reducing memory to about 8GB.

The plan was to swap out FlashAttention and also add an audio driver; SVG (Sparse VideoGen) could have improved things further. At 60FPS, I think you are already doing this.

Great work.

Show HN: Turning 2D floor plans into 3D-ready JSON with Detectron2

acaciabengo — Thu, 26 Feb 2026 18:39:58 +0000

Hey HN,

For the past few weekends, I have been working on a computer vision pipeline to solve a specific PropTech problem: turning messy, highly occluded 2D floor plans into clean, structured data for 3D extrusion. I originally built this as a demo for a firm.

The Stack & Architecture: I built an instance segmentation pipeline that relies strictly on pixel-perfect masking to extract the geometry.

The Backbone: Swin Transformer + Detectron2 + OpenCV

Training: Trained on 1024x1024 images using an RTX 4090.

Inference: Runs on CPU in < 10s.

Demo Performance: 67.1% AP50 for instance segmentation masks, and 38.2% AP across the strict 0.50:0.95 IoU thresholds.
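
For readers unfamiliar with the two numbers, here is a small sketch of what separates AP50 from AP over 0.50:0.95. It assumes the COCO convention of ten IoU thresholds in steps of 0.05, and it only shows the matching criterion, not a full AP computation (there is no precision-recall integration). The helper names and the toy masks are made up for illustration.

```python
def mask_iou(pred, truth):
    """Intersection-over-union of two binary masks (equal-sized lists of 0/1 rows)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    return inter / union if union else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Toy wall: ground truth is 2 pixels wide, the prediction is 3 pixels wide.
truth = [[1, 1, 0, 0] for _ in range(4)]
pred = [[1, 1, 1, 0] for _ in range(4)]

iou = mask_iou(pred, truth)
print(round(iou, 3), thresholds_passed(iou))
```

A mask that is one pixel too wide on a thin wall still counts as a match at the 0.50 threshold but fails the stricter ones, which is one reason thin structures tend to score much lower on the averaged metric than on AP50.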

Why I'm posting: A lot of virtual staging and architectural startups have beautiful Three.js rendering engines, but still rely on manual data entry to build the base geometry. I built this specifically as an extraction engine to sit underneath those UIs.

If you are in PropTech or building a product that could benefit from embedding this model under the hood, I would love to chat.

Comments URL: https://news.ycombinator.com/item?id=47170196

Points: 2

# Comments: 0

Instance segmentation model that extracts 3D geometry from 2D floor plans

acaciabengo — Fri, 20 Feb 2026 19:17:57 +0000

Hey HN,

I am an ML Engineer and a full-stack software engineer. For the past few weekends, I have been working on a pipeline to solve a PropTech problem: turning messy, highly occluded 2D floor plans into clean, structured data for 3D extrusion. I originally demoed it for a firm hiring for the role.

The Problem: If you try to use standard object detection (bounding boxes) or basic OCR (tested Qwen, DeepSeek) on architectural plans, it fails instantly. Walls intersect, and door swings and dimension lines heavily occlude the actual structures.

The Stack & Architecture: I built an instance segmentation pipeline that relies strictly on pixel-perfect masking to pull the geometry.

The Backbone: Swin Transformer + Detectron2.

Training: Trained on 1024x1024 images with an RTX 4090.

Inference: Runs on CPU in <10s.

Output: Clean JSON

Demo Performance: 67.1% AP50 for instance segmentation masks, and a 38.2% AP across the strict 0.50:0.95 IoU thresholds.

Vector Clean-up (The JSON Payload): 3D engines don't want pixel masks; they want math. The pipeline passes the raw predictions through Shapely to run boolean unions on intersecting walls, outputting clean, mathematically sound 2D polygons in a structured JSON payload.
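
A minimal sketch of that clean-up step, assuming the wall masks have already been vectorized into Shapely polygons. The function name `walls_to_payload`, the `"walls"` key, and the GeoJSON-style geometry are illustrative assumptions, not the schema the actual pipeline emits.

```python
import json

from shapely.geometry import box, mapping
from shapely.ops import unary_union


def walls_to_payload(wall_polygons):
    """Merge overlapping wall polygons and return a JSON-serialisable dict.

    The "walls" key and GeoJSON-style geometry are illustrative choices,
    not the original pipeline's schema.
    """
    merged = unary_union(wall_polygons)
    # Disjoint walls come back as a MultiPolygon; normalise to a list of polygons.
    parts = list(merged.geoms) if merged.geom_type == "MultiPolygon" else [merged]
    return {"walls": [mapping(part) for part in parts]}


# Two walls meeting at a corner; they overlap in a 1 x 1 square.
horizontal = box(0, 0, 10, 1)  # area 10
vertical = box(9, 0, 10, 8)    # area 8
payload = walls_to_payload([horizontal, vertical])
print(json.dumps(payload))
```

Because the union removes the doubled-up corner, a downstream extruder receives one L-shaped footprint rather than two overlapping boxes.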

Why I am posting: A lot of virtual staging and architectural startups have beautiful Three.js rendering engines, but they still rely on human data entry to get the base data. I built this specifically as an extraction engine to sit underneath those UIs.

If you are in PropTech or are building a product that could benefit from embedding this model, I would love to chat.

Comments URL: https://news.ycombinator.com/item?id=47092547

Points: 2

# Comments: 0

New comment by acaciabengo in "Ask HN: Who wants to be hired? (February 2026)"

acaciabengo — Tue, 03 Feb 2026 11:11:02 +0000

Location: Kampala, Uganda

Remote: Yes

Willing to relocate: No

Technologies: Python, PyTorch, TensorFlow, OpenCV, Detectron2, Ruby on Rails, NLP (Transformers, ViT), Docker, GCP, SQL.

Resume: https://drive.google.com/file/d/1G8Rzgb7a2kS8myjnJqhxdALUA2d...

Email: acaciabengo@gmail.com

LinkedIn: https://linkedin.com/in/acaciabengo

HuggingFace: https://huggingface.co/acaciabengo

Description: I am a Senior Machine Learning & Software Engineer with 11+ years of experience and a current MSCS student at Georgia Tech. I have worked remotely for US-based companies for the last 3+ years and specialize in bridging the gap between robust backend engineering (Ruby on Rails) and production-grade ML models.

Key Projects & Experience: • Computer Vision (2D to 3D): Currently building a pipeline to convert 2D architectural floor plans into 3D models using Image Segmentation (Detectron2) and Swin Transformers.

• ML for Gaming: Engineered predictive algorithms for high-volume sports betting and lottery platforms, including models for risk management, user segmentation, Churn Prediction and LTV forecasting.

• NLP at Scale: Architected Deep Learning models for a US-based health tech organization that reduced moderation time by 80% and processed over 1 million messages monthly.

• Content Moderation: Developed CensorX, a multimodal NSFW detection tool using Vision Transformers (ViT) and DistilBERT.

Note on Hiring: I am hireable through a Canadian company (B2B/Contract) or via an Employer of Record (e.g., Globalization Partners), allowing for frictionless onboarding for North American entities.

New comment by acaciabengo in "Open source NSFW detection (ViT and DistilBERT) with 99% AUC"

acaciabengo — Sun, 18 Jan 2026 15:18:02 +0000

I have been working on CensorX, a set of multimodal content moderation models. It grew out of a personal project where I built content moderation into a Discord bot.

I have open-sourced the fine-tuned models on Hugging Face and am looking for feedback on false positives/negatives in real-world scenarios.

The main exploration has been ablations based on freezing certain layers of the transformers. More work could be explored by tuning other parameters and expanding the datasets.

The Models:

• Image (ViT-B/16): Fine-tuned Vision Transformer achieving 91.9% Accuracy and 0.99 AUC.
  o Link: https://huggingface.co/acaciabengo/nsfw_image_detection

• Text (DistilBERT): Binary classifier trained on ~200k samples.
  o Focus: Optimized for low-latency inference (<100ms) to fit into real-time chat streams.
  o Link: https://huggingface.co/acaciabengo/nsfw_text_detection

How to try it:

1. Self-Host (Free): You can pull the weights directly from Hugging Face and run them in your own container.

2. Managed API (Freemium): I have deployed these exact models as a high-availability API on RapidAPI. There is a free tier for testing.

I am very interested in feedback on:

• Performance
• Access to larger datasets
• Shared experience from people who have handled similar tasks

Thank you.

Open source NSFW detection (ViT and DistilBERT) with 99% AUC

acaciabengo — Sun, 18 Jan 2026 15:18:02 +0000

Article URL: https://huggingface.co/acaciabengo

Comments URL: https://news.ycombinator.com/item?id=46668474

Points: 2

# Comments: 1