Hacker News: dplavery92

New comment by dplavery92 in "Tencent Hunyuan-Large"

dplavery92 — Tue, 05 Nov 2024 21:43:57 +0000

The title of Tencent's paper [0] as well as their homepage for the model [1] each use the term "Open-Source" in the title, so I think they are making the claim.

[0] https://arxiv.org/pdf/2411.02265 [1] https://llm.hunyuan.tencent.com/

New comment by dplavery92 in "C++ proposal: There are exactly 8 bits in a byte"

dplavery92 — Thu, 17 Oct 2024 23:56:53 +0000

Eight is a nice power of two.

New comment by dplavery92 in "Kalman Filter Explained Simply"

dplavery92 — Mon, 12 Feb 2024 21:02:29 +0000

You can also construct multiple hypothesis trackers from multiple Kalman Filters, but there is a little more machinery. For example, Interacting Multiple Models (IMM) trackers may use Kalman Filters or Particle Filters, and a lot of the foundational work by Bar-Shalom and others focuses on Kalman Filters.

New comment by dplavery92 in "Kalman Filter Explained Simply"

dplavery92 — Mon, 12 Feb 2024 20:55:53 +0000

The Kalman filter has a family of generalizations in the Extended Kalman Filter (EKF) and Unscented Kalman Filter (UKF), both of which handle nonlinear dynamics and measurement models.

Also common in robotics applications is the Particle Filter, which uses a Monte Carlo approximation of the uncertainty in the state, rather than enforcing a (Gaussian) distribution, as in the traditional Kalman filter. This can be useful when the mechanics are highly nonlinear and/or your measurement uncertainties are, well, very non-Gaussian. Sebastian Thrun (a Stanford robotics professor, previously at CMU, in the DARPA "Grand Challenge" days of self-driving cars) made an early Udacity course that covers Particle Filters.
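
To make the Monte Carlo idea concrete, here is a minimal sketch of a bootstrap particle filter tracking a 1D position. The motion model, noise levels, particle count, and function names are all illustrative assumptions rather than anything from the course mentioned above, and this toy problem is linear-Gaussian, so a plain Kalman filter would also handle it. The point is only that nothing in the loop requires a Gaussian.

```python
import math
import random


def gaussian_likelihood(residual, sigma):
    """Unnormalized Gaussian likelihood of a measurement residual."""
    return math.exp(-0.5 * (residual / sigma) ** 2)


def particle_filter_step(particles, control, measurement, rng,
                         process_sigma=0.5, meas_sigma=1.0):
    """One predict/update/resample cycle of a bootstrap particle filter."""
    # Predict: push every particle through the motion model plus noise.
    predicted = [p + control + rng.gauss(0.0, process_sigma) for p in particles]
    # Update: weight each particle by how well it explains the measurement.
    # Any likelihood function could be swapped in here (assumed Gaussian).
    weights = [gaussian_likelihood(measurement - p, meas_sigma) for p in predicted]
    if sum(weights) == 0.0:
        # Every particle is far from the measurement; fall back to uniform.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def run_demo(steps=30, n_particles=500, seed=0):
    """Track a target that moves one unit per step; return (truth, estimate)."""
    rng = random.Random(seed)
    true_x = 0.0
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    for _ in range(steps):
        true_x += 1.0
        z = true_x + rng.gauss(0.0, 1.0)  # noisy position measurement
        particles = particle_filter_step(particles, 1.0, z, rng)
    estimate = sum(particles) / len(particles)
    return true_x, estimate


true_x, estimate = run_demo()
```

The state estimate is just the mean of the particle cloud; a multimodal belief would show up as several clusters of particles, which a single Gaussian cannot represent.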

New comment by dplavery92 in "Simulating fluids, fire, and smoke in real-time"

dplavery92 — Tue, 19 Dec 2023 21:27:14 +0000

I was encountering the same problem on my Intel MBP and, per another one of the comments here, found that switching from Chrome to Safari lets me view the whole page smoothly, without my CPU utilization spiking or my fans spinning up.

New comment by dplavery92 in "OpenAI's board has fired Sam Altman"

dplavery92 — Fri, 17 Nov 2023 21:41:42 +0000

I don't think anyone in this thread knows what happened, but since we're in a thread speculating why the CEO of the leading AI company was suddenly sacked, the possibility of an unacceptable interpersonal scandal isn't any more outlandish than others' suggestions of fraud, legal trouble for OpenAI, or foundering financials. The suggestion here is simply that Altman having done something "big and dangerous" is not a foregone conclusion.

In the words of Brandt, "well, Dude, we just don't know."

New comment by dplavery92 in "UHZ1: NASA telescopes discover record-breaking black hole"

dplavery92 — Mon, 06 Nov 2023 22:07:27 +0000

Correct, the article places UHZ1 at 13.2 billion light-years away, so roughly 500 million years (0.5 Gy) into our 13.7-billion-year-old universe.
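
The subtraction is easy to check, and working in millions of years helps avoid unit slips. This sketch assumes the quoted distance can be read as a light-travel time; both figures are the ones given in the comment.

```python
# Figures from the comment above, expressed in millions of years (My).
UNIVERSE_AGE_MY = 13_700        # 13.7 billion years
LIGHT_TRAVEL_TIME_MY = 13_200   # 13.2 billion years (assumed light-travel time)

# Age of the universe when the light we see now left UHZ1.
age_at_emission_my = UNIVERSE_AGE_MY - LIGHT_TRAVEL_TIME_MY
print(f"{age_at_emission_my} million years after the Big Bang")
```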

New comment by dplavery92 in "Mars has a layer of molten rock inside"

dplavery92 — Fri, 27 Oct 2023 00:24:25 +0000

https://en.wikipedia.org/wiki/Earthrise

New comment by dplavery92 in "Medieval staircases were not built going clockwise for the defender's advantage"

dplavery92 — Mon, 09 Oct 2023 21:17:43 +0000

From the captioned art in the article: "Siege, from the Peterborough Psalter, early 14th century, via the KBR Museum, Belgium. Yes, those defenders are all women."

New comment by dplavery92 in "A non-mathematical introduction to Kalman filters for programmers"

dplavery92 — Wed, 02 Aug 2023 23:27:10 +0000

I think a great place to start is https://www.bzarg.com/p/how-a-kalman-filter-works-in-picture...

Unlike the OP article, it does make use of the math formalism for Kalman filters, but it is a relatively gentle introduction that does a very good job visualizing and explaining the intuition of each term. I have gotten positive feedback (no pun intended!) from interns or junior hires using this resource to familiarize themselves with the topic.
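
For a feel of what "each term" does, here is a minimal sketch of a scalar Kalman filter estimating a constant value from noisy readings. The noise variances, the readings, and the function name are made up for illustration and do not come from the linked article.

```python
def kalman_step(x, p, z, q=0.01, r=1.0):
    """One predict/update cycle for a scalar, constant-state Kalman filter.

    x, p: prior state estimate and its variance
    z:    new measurement
    q, r: process and measurement noise variances (illustrative values)
    """
    # Predict: the model says the state stays put, so only uncertainty grows.
    x_pred = x
    p_pred = p + q
    # Kalman gain: how much to trust the measurement versus the prediction.
    k = p_pred / (p_pred + r)
    # Update: move the estimate toward the measurement by the gain,
    # and shrink the variance accordingly.
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, k


x, p = 0.0, 100.0  # deliberately vague initial guess
for z in [5.2, 4.9, 5.1, 5.0, 4.8]:
    x, p, k = kalman_step(x, p, z)
```

With a large initial variance the first gain is close to 1, so the filter nearly adopts the first measurement outright; as the variance shrinks, later measurements nudge the estimate less and less.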

If you are making a deeper study and are ready to dive into a textbook that more thoroughly explores theory and application, there is a book by Gibbs[1] that I have used in the past and that is well regarded in some segments of industry that rely on these techniques for state estimation and GNC.

[1] https://onlinelibrary.wiley.com/doi/book/10.1002/97804708900...

New comment by dplavery92 in "Like diffusion but faster: The Paella model for fast image generation"

dplavery92 — Tue, 27 Jun 2023 21:07:16 +0000

From Sections 3 and 4 of the VQGAN paper[1] upon which this work is built: "To generate images in the megapixel regime, we ... have to work patch-wise and crop images to restrict the length of [the quantized encoding vector] s to a maximally feasible size during training. To sample images, we then use the transformer in a sliding-window manner as illustrated in Fig.3." ... "The sliding window approach introduced in Sec.3.2 enables image synthesis beyond a resolution of 256×256 pixels."

From the Paella paper[2]: "Our proposal builds on the two-stage paradigm introduced by Esser et al. and consists of a Vector-quantized Generative Adversarial Network (VQGAN) for projecting the high dimensional images into a lower-dimensional latent space... [w]e use a pretrained VQGAN with an f=4 compression and a base resolution of 256×256×3, mapping the image to a latent resolution of 64×64 indices." After training, in describing their token predictor architecture: "Our architecture consists of a U-Net-style encoder-decoder structure based on residual blocks, employing convolutional[sic] and attention in both, the encoder and decoder pathways."
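
The latent size in that quote follows directly from the compression factor. A quick sketch of the arithmetic, assuming square images and a side length divisible by f (the helper name is hypothetical):

```python
def latent_grid(image_size, compression):
    """Side length and token count of the latent grid for a square image."""
    if image_size % compression != 0:
        raise ValueError("image size must be divisible by the compression factor")
    side = image_size // compression
    return side, side * side


# 256x256 image with f=4 compression, as in the quoted passage.
side, n_tokens = latent_grid(256, 4)
print(side, n_tokens)  # 64 4096
```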

U-Net, of course, is a convolutional neural network architecture [3]. The "down" and "up" encoder/decoder blocks in the Paella code are batch-normed CNN layers [4].

[1] https://arxiv.org/pdf/2012.09841.pdf [2] https://arxiv.org/pdf/2211.07292.pdf [3] https://arxiv.org/abs/1505.04597 [4] https://github.com/dome272/Paella/blob/main/src/modules.py#L...

New comment by dplavery92 in "Like diffusion but faster: The Paella model for fast image generation"

dplavery92 — Tue, 27 Jun 2023 00:33:52 +0000

Transformers are not forced to use a specific input (or output) shape; the original ViT paper demonstrates interpolating positional embeddings to run inference on arbitrary image shapes.
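
As a rough illustration of what "interpolating positional embeddings" means, here is a pure-Python sketch that bilinearly resizes a grid of embedding vectors to a new grid shape. The function name, the nested-list layout, and the corner-aligned sampling are assumptions made for this example; real code would normally do this on tensors with a library resize and may use a different interpolation mode.

```python
def interpolate_pos_embed(grid, new_h, new_w):
    """Bilinearly resize an (h, w, dim) nested-list grid of embeddings."""
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])

    def source_coord(i, new_n, old_n):
        # Map output index i onto the input grid so the corners line up.
        return 0.0 if new_n == 1 else i * (old_n - 1) / (new_n - 1)

    out = []
    for i in range(new_h):
        y = source_coord(i, new_h, old_h)
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = source_coord(j, new_w, old_w)
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            fx = x - x0
            # Blend the four surrounding embeddings, one dimension at a time.
            row.append([
                (1 - fy) * ((1 - fx) * grid[y0][x0][d] + fx * grid[y0][x1][d])
                + fy * ((1 - fx) * grid[y1][x0][d] + fx * grid[y1][x1][d])
                for d in range(dim)
            ])
        out.append(row)
    return out


# A 2x2 grid of 1-dimensional "embeddings", resized to 3x3.
small = [[[0.0], [2.0]], [[4.0], [6.0]]]
large = interpolate_pos_embed(small, 3, 3)
```

The learned embeddings at the original grid points are preserved at the corners, and new patch positions get a blend of their neighbors, so the rest of the transformer can run unchanged on a longer token sequence.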

New comment by dplavery92 in "Like diffusion but faster: The Paella model for fast image generation"

dplavery92 — Mon, 26 Jun 2023 22:34:56 +0000

Presumably a transformer model or similar that uses positional encodings for the tokens could do that, but the U-Net decoder here uses a fixed-shape output and learns relationships between tokens (and sizes of image features) based on the positions of those tokens in a fixed-size vector. You could still apply this process convolutionally and slide the entire network around to generate an image that is an arbitrary multiple of the token size, but image content in one area of the image will only be "aware" of image content at a fixed-size neighborhood (e.g. 256x256).
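
A small sketch of the sliding-window bookkeeping, along one image axis. The window size, stride, and helper names are illustrative assumptions, not taken from the Paella code; the second helper shows which fixed-size windows can "see" a given pixel.

```python
def window_origins(full_size, window, stride):
    """Start offsets of fixed-size windows covering an axis of length full_size."""
    if window > full_size:
        raise ValueError("window is larger than the image")
    origins = list(range(0, full_size - window + 1, stride))
    # Add a final window flush with the edge if the stride left a gap.
    if origins[-1] + window < full_size:
        origins.append(full_size - window)
    return origins


def windows_covering(pixel, full_size, window, stride):
    """Windows along one axis whose fixed-size view contains the given pixel."""
    return [o for o in window_origins(full_size, window, stride)
            if o <= pixel < o + window]


# A 1024-pixel axis covered by 256-pixel windows with 50% overlap.
origins = window_origins(1024, 256, 128)
covering = windows_covering(600, 1024, 256, 128)
```

Here pixel 600 is only ever processed together with pixels 384 through 767, which is the fixed-size neighborhood limitation described above.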

New comment by dplavery92 in "AI Canon"

dplavery92 — Thu, 25 May 2023 18:30:29 +0000

Eh, it's a little tricky. A lot of research marketed under the "AI" umbrella would be categorized under cs.LG (https://arxiv.org/list/cs.LG/recent), cs.CV (https://arxiv.org/list/cs.CV/recent), cs.CL (https://arxiv.org/list/cs.CL/recent), and to a lesser degree cs.NE (https://arxiv.org/list/cs.NE/recent). Oh, and of course, cs.AI (https://arxiv.org/list/cs.AI/recent). Not every one of those areas has grown monotonically, but the growth in CV and CL especially has been explosive over the last ten years.

New comment by dplavery92 in "Translating Akkadian clay tablets with ChatGPT?"

dplavery92 — Mon, 15 May 2023 23:08:42 +0000

Alternatively, "The Entertainment" in Infinite Jest.

New comment by dplavery92 in "Llama.cpp 30B runs with only 6GB of RAM now"

dplavery92 — Fri, 31 Mar 2023 22:09:17 +0000

Sure, but when one 12 GB GPU costs ~$800 new (e.g. for the 3080 LHR), "a couple of dozens" of them is a big barrier to entry for the hobbyist, student, or freelancer. And cloud computing offers an alternative route, but, as stated, distribution introduces a new engineering task, and the month-to-month bills for the compute nodes you are using can still add up surprisingly quickly.
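
For a sense of scale, a back-of-the-envelope sketch using the approximate price above and reading "a couple of dozens" as two dozen cards (that reading is an assumption):

```python
GPU_PRICE_USD = 800    # approximate price quoted above for one 12 GB card
GPU_VRAM_GB = 12
GPU_COUNT = 2 * 12     # assumed reading of "a couple of dozens"

total_usd = GPU_PRICE_USD * GPU_COUNT
total_vram_gb = GPU_VRAM_GB * GPU_COUNT
print(f"{GPU_COUNT} GPUs: ${total_usd:,} for {total_vram_gb} GB of VRAM")
```

That figure covers the cards alone, before any host machines, power, or cooling.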

New comment by dplavery92 in "C++ Neural Network in a Weekend (2020)"

dplavery92 — Wed, 01 Feb 2023 04:44:07 +0000

NNs are potentially very powerful arbitrary function approximators, but you have very limited control (or, arguably, insight) into the precise nature of the solutions their optimization arrives at. Because of that, they've been especially well suited to problems in vision and NLP where we have basic intuition about the phenomenology but can't practically manage a formal description of that intuition (and enumerating that description is probably not of great intellectual interest): what, in pixel space, makes a cat a cat or a dog a dog? What, in patterns of natural words, indicates sarcasm or positive/negative sentiment?

They also get tons of use in results-oriented modeling of lots of other statistics questions in structured data (home prices, resource allocation, voter turnouts, etc.) but in this luddite's opinion, these sorts of applications tend to be pretty fraught if they trade a deeper understanding of the data phenomenology for the convenience of the model training paradigm.

New comment by dplavery92 in "US Department of Energy: Fusion Ignition Achieved"

dplavery92 — Wed, 14 Dec 2022 01:23:20 +0000

Be that as it may, a number of positions at LLNL, including many of those affiliated with NIF, require that the candidate be a US person and be eligible for a DOE security clearance. Eligibility for a security clearance does not hinge strictly on being a US person, but a number of national-security-related positions may require not only the clearance but also that the candidate be a US person (or outright forbid foreign nationals).

New comment by dplavery92 in "US Department of Energy: Fusion Ignition Achieved"

dplavery92 — Wed, 14 Dec 2022 01:16:58 +0000

This is not quite correct. LLNL is a Federally Funded Research & Development Center (FFRDC) which is owned, as a facility, by the government, but managed and staffed by a non-profit contracting organization called Lawrence Livermore National Security, LLC (LLNS) under a contract funded by DOE/NNSA. The board of LLNS is made up of representatives from universities (California + TAMU), other scientific non-profits (Battelle Memorial Institute), and private nuclear ventures (e.g. Bechtel). LLNS pays, with very few exceptions, staff salaries at LLNL, and they are not beholden to the government civilian pay schedule.

https://www.llnl.gov/about/management-sponsors

New comment by dplavery92 in "Demo of =GPT3() as a spreadsheet feature"

dplavery92 — Thu, 03 Nov 2022 00:33:13 +0000

For what it's worth, I'm also very bad at plotting graphs with any kind of accuracy, which is why I use plotting software instead of doing it by hand.

I get the feeling that my visual system and the language I use are, respectively, pretty bad at processing and conveying precise information from a plot (beyond simple descriptors like "A is larger than B" or "f(x) has a maximum"). I guess I would find it mildly surprising if any Vision-Language model were able to perform those tasks very well, because the representations in question seem pretty poorly suited.

I get that popular diffusion models for image generation are doing a bad job composing concepts in a scene and keeping relationships constant over the image--even if Stable Diffusion could write in human script, it's a bad bet that the contents of a legend would match a pie chart that it drew. But other Vision-Language models, designed for image captioning or visual question answering, rather than generating diverse, stylistic images, are pretty good at that compositional information (up to, again, the "simple descriptions" level of granularity I mentioned before.)