HN HRCB
+0.28 Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU (github.com)
387 points by xaskasdf 3 days ago | 102 comments on HN | Mild positive Product · v3.4 · 2026-02-24
Article Heatmap
Preamble: +0.25
Article 1: +0.20 (Freedom, Equality, Brotherhood)
Article 2: +0.20 (Non-Discrimination)
Article 3: ND (Life, Liberty, Security)
Article 4: ND (No Slavery)
Article 5: ND (No Torture)
Article 6: ND (Legal Personhood)
Article 7: ND (Equality Before Law)
Article 8: ND (Right to Remedy)
Article 9: ND (No Arbitrary Detention)
Article 10: ND (Fair Hearing)
Article 11: ND (Presumption of Innocence)
Article 12: +0.07 (Privacy)
Article 13: +0.30 (Freedom of Movement)
Article 14: ND (Asylum)
Article 15: ND (Nationality)
Article 16: ND (Marriage & Family)
Article 17: 0.00 (Property)
Article 18: ND (Freedom of Thought)
Article 19: +0.57 (Freedom of Expression)
Article 20: +0.20 (Assembly & Association)
Article 21: ND (Political Participation)
Article 22: ND (Social Security)
Article 23: ND (Work & Equal Pay)
Article 24: ND (Rest & Leisure)
Article 25: +0.40 (Standard of Living)
Article 26: +0.35 (Education)
Article 27: +0.57 (Cultural Participation)
Article 28: +0.15 (Social & International Order)
Article 29: ND (Duties to Community)
Article 30: +0.10 (No Destruction of Rights)
ND = No Data
Aggregates
Weighted Mean: +0.28 | Unweighted Mean: +0.26
Max: +0.57 (Article 19) | Min: 0.00 (Article 17)
Signal: 13 | No Data: 18
Confidence: 26% | Volatility: 0.17 (Medium)
Negative: 0 | Channels: E 0.5, S 0.5
SETL: -0.05 (Structural-dominant)
Evidence: High: 2 | Medium: 10 | Low: 1 | No Data: 18
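
As a rough cross-check, several of the aggregates can be recomputed from the per-article scores in the heatmap. The sketch below is illustrative only: the dictionary layout and variable names are hypothetical, and reading Volatility as a population standard deviation is an assumption, since the page does not state how it is computed. The Weighted Mean is left out because the weights are not given on the page.

```python
from statistics import mean, pstdev

# Scores copied from the heatmap; None stands for "ND" (No Data).
# Keys are a hypothetical layout: "P" is the Preamble, integers are
# UDHR article numbers.
scores = {
    "P": 0.25, 1: 0.20, 2: 0.20, 3: None, 4: None, 5: None, 6: None,
    7: None, 8: None, 9: None, 10: None, 11: None, 12: 0.07, 13: 0.30,
    14: None, 15: None, 16: None, 17: 0.00, 18: None, 19: 0.57, 20: 0.20,
    21: None, 22: None, 23: None, 24: None, 25: 0.40, 26: 0.35, 27: 0.57,
    28: 0.15, 29: None, 30: 0.10,
}

# Split scored entries ("Signal") from entries without data.
signal = {k: v for k, v in scores.items() if v is not None}
no_data = [k for k, v in scores.items() if v is None]
values = list(signal.values())

summary = {
    "signal": len(signal),
    "no_data": len(no_data),
    "unweighted_mean": round(mean(values), 2),
    # On a tie, max() returns the first key in insertion order.
    "max_article": max(signal, key=signal.get),
    "min_article": min(signal, key=signal.get),
    # Assumption: volatility treated as a population standard deviation.
    "spread": round(pstdev(values), 2),
}
print(summary)
```

Under these assumptions the counts, unweighted mean, max and min agree with the figures listed in the Aggregates block, and the population standard deviation rounds to the reported Volatility value.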
Theme Radar
Foundation: 0.22 (3 articles)
Security: 0.00 (0 articles)
Legal: 0.00 (0 articles)
Privacy & Movement: 0.18 (2 articles)
Personal: 0.00 (1 article)
Expression: 0.39 (2 articles)
Economic & Social: 0.40 (1 article)
Cultural: 0.46 (2 articles)
Order & Duties: 0.13 (2 articles)
Domain Context Profile
Element | Modifier | Affects | Note
Privacy | +0.10 | Article 12 | GitHub has standard privacy controls and policies protecting user data and discussion content from unauthorized access.
Terms of Service | +0.05 | Article 1, Article 2 | GitHub ToS establish baseline equal treatment of users without discrimination, though enforcement depends on implementation.
Accessibility | +0.15 | Article 25, Article 26 | Observable accessibility features including keyboard navigation, ARIA support, and responsive design promote equitable access to platform functionality.
Mission | none listed | none listed | GitHub's public mission emphasizes open collaboration and global access to development tools, indirectly supporting knowledge-sharing rights.
Editorial Code | +0.08 | Article 19, Article 27 | GitHub community guidelines establish standards for respectful discussion and protect user expression within community contexts.
Ownership | -0.05 | Article 17 | GitHub retains platform control; user-generated content ownership is subject to platform terms, creating conditional rather than absolute intellectual property rights.
Access Model | +0.12 | Article 19, Article 27 | Public discussion board model enables open participation and knowledge dissemination without gatekeeping, supporting freedom of expression and information access.
Ad/Tracking | -0.08 | Article 12 | GitHub's feature flags and analytics tracking create potential privacy concerns; behavioral data collection may infringe on privacy of thought.
HN Discussion 16 top-level comments
randomtoast 2026-02-21 22:57 UTC
0.2 tok/s is fine for experimentation, but it is not interactive in any meaningful sense. For many use cases, a well-quantized 8B or 13B that stays resident will simply deliver a better latency-quality tradeoff.
throwaway2027 2026-02-21 23:14 UTC
Didn't DirectX add an API for loading assets directly to GPU memory? Would that work?
jauntywundrkind 2026-02-21 23:32 UTC
Could be neat to see what giving the 8b like 6gb ram instead of 10gb does. Something in-between, where you still need NVMe, but not like the 3x ratio of the 70b model on 23GB.

Nice work. PCI-P2P (GPU-Direct (tm)) is such great stuff. Cool to see!

rl3 2026-02-21 23:42 UTC
Nice. I've been looking at doing something similar, more on the order of running a 1T model with less than half the available VRAM.

One workup indicated it was theoretically possible to modify a piece of SGLang's routing layer to support JIT predict-ahead expert swaps from Gen5 NVMe storage straight into GPU memory.

I'm hoping that proves true. The setup relies on NVIDIA Dynamo, so NIXL primitives are available to support that.

Curious if anyone's tried this already.

Wuzado 2026-02-22 00:07 UTC
I wonder - could this be used for multi-tier MoE? Eg. active + most used in VRAM, often used in RAM and less used in NVMe?
exabrial 2026-02-22 00:46 UTC
I feel like we need an entirely new type of silicon for LLMs. Something completely focused on bandwidth and storage probably at the sacrifice of raw computation power.
jacquesm 2026-02-22 01:38 UTC
This is an interesting area for experiments. I suspect that in the longer term model optimization (knowing which bits you can leave out without affecting the functioning of the model) will become the dominant area of research just like it did with compression algorithms because effectively a model is a lossy compression scheme.

And that's good because that increases democratization of AI away from the silos that are being created.

01100011 2026-02-22 02:35 UTC
Yeah, GPUdirect should allow you to dma straight to a storage device.

I wonder... what if the m.2 storage was actually DRAM? You probably don't need persistence for spilling a model off the GPU. How would it fare vs just adding more host memory? The m.2 ram would be less flexible, but would keep the system ram free for the CPU.

civicsquid 2026-02-22 05:30 UTC
Really cool. I'm wondering: what background did you need to be able to think of the question that resulted in this project?

I know you said you're involved in some retrogaming and were experimenting, but as someone who works in a world where hardware is pretty heavily abstracted away, even if I got into retrogaming I don't know that I'd consider that there may be a systems improvement lying around. Beyond the creative aspect, it feels like there is some systems and hardware background that helped put the idea together (and I'd be interested to go learn some of that systems/hardware knowledge myself).

Aurornis 2026-02-22 06:03 UTC
Cool project. Can you provide more details about your DKMS patching process for consumer GPUs? This would be fun to try out, but I’d need some more details on that patch process first.
timzaman 2026-02-22 07:04 UTC
Umm sorry but the CPU can easily keep up shuttling around to/from your NVMe. Especially ancient gen3 PCIe. Not sure why you'd do this.
sylware 2026-02-22 09:01 UTC
Isn't that linux DMA buf?
spwa4 2026-02-22 10:16 UTC
I've often wondered about doing this with extreme compression. What if you did extreme compression + decompression on the GPU? Because you're leaving a lot of compute unused.
stuaxo 2026-02-22 11:34 UTC
Interesting. Can AMD GPUs do direct io like this?
7777777phil 2026-02-22 12:08 UTC
Cool hack but 0.5 tok/s on 70B when a 7B does 30+ on the same card. NVIDIA's own research says 40-70% of agentic tasks could run on sub-10B models and the quality gap has closed fast.
davideom0414 2026-02-22 12:50 UTC
Really interesting experiment, I should have done this before. Do you have numbers on effective throughput vs PCIe theoretical bandwidth? I’m curious whether this is primarily latency-bound or bandwidth-bound in practice. Can someone tell me?
Score Breakdown
+0.25
Preamble
Medium Practice
Editorial: ND | Structural: +0.25 | SETL: ND | Combined: ND | Context Modifier: ND

Repository structure and GitHub platform support international collaboration and development without explicit discrimination. No editorial content directly addressing preamble values.

+0.20
Article 1 Freedom, Equality, Brotherhood
Medium Practice
Editorial: ND | Structural: +0.15 | SETL: ND | Combined: ND | Context Modifier: ND

GitHub ToS establish equal treatment baseline. Repository itself contains no editorial on equal dignity or rights.

+0.20
Article 2 Non-Discrimination
Medium Practice
Editorial: ND | Structural: +0.15 | SETL: ND | Combined: ND | Context Modifier: ND

GitHub ToS include non-discrimination provisions. No on-page evidence of discrimination in repository access or participation.

ND
Article 3 Life, Liberty, Security

No observable content addressing right to life, liberty, personal security on this repository page.

ND
Article 4 No Slavery

No observable content addressing slavery or servitude on this repository page.

ND
Article 5 No Torture

No observable content addressing torture or cruel treatment on this repository page.

ND
Article 6 Legal Personhood

No observable content addressing right to recognition before law on this repository page.

ND
Article 7 Equality Before Law

No observable content addressing equal protection of law on this repository page.

ND
Article 8 Right to Remedy

No observable content addressing right to effective remedy on this repository page.

ND
Article 9 No Arbitrary Detention

No observable content addressing freedom from arbitrary arrest on this repository page.

ND
Article 10 Fair Hearing

No observable content addressing fair trial and impartial hearing on this repository page.

ND
Article 11 Presumption of Innocence

No observable content addressing presumption of innocence on this repository page.

+0.07
Article 12 Privacy
Medium Practice
Editorial: ND | Structural: +0.05 | SETL: ND | Combined: ND | Context Modifier: ND

GitHub privacy protections apply; however, feature flag and analytics tracking present on page create modest privacy concerns. Net effect slightly positive due to privacy controls outweighing tracking.

+0.30
Article 13 Freedom of Movement
Medium Practice
Editorial: ND | Structural: +0.30 | SETL: ND | Combined: ND | Context Modifier: ND

Repository is publicly accessible without geographic restriction; users can freely move within and across the platform globally for collaboration purposes.

ND
Article 14 Asylum

No observable content addressing asylum or refuge rights on this repository page.

ND
Article 15 Nationality

No observable content addressing right to nationality on this repository page.

ND
Article 16 Marriage & Family

No observable content addressing family rights or marriage on this repository page.

0.00
Article 17 Property
Medium Practice
Editorial: ND | Structural: +0.05 | SETL: ND | Combined: ND | Context Modifier: ND

Repository code is visible and attributable to author. However, GitHub retains platform control and content ownership is conditional, limiting absolute property rights protection.

ND
Article 18 Freedom of Thought

No observable content addressing freedom of thought, conscience, religion on this repository page.

+0.57
Article 19 Freedom of Expression
High Advocacy Framing Practice Coverage
Editorial: +0.35 | Structural: +0.40 | SETL: -0.14 | Combined: ND | Context Modifier: ND

Repository demonstrates freedom of expression: open-source code publicly shared, comments and collaboration enabled, no gatekeeping of discussion. GitHub's access model and editorial guidelines support expression. Public repository enables information dissemination without censorship.

+0.20
Article 20 Assembly & Association
Medium Practice
Editorial: ND | Structural: +0.20 | SETL: ND | Combined: ND | Context Modifier: ND

Repository allows discussions and collaborative participation; no observable prohibition on peaceful assembly or association. GitHub's community features enable collective action around technical projects.

ND
Article 21 Political Participation

No observable content addressing political participation or voting on this repository page.

ND
Article 22 Social Security

No observable content addressing social security or welfare rights on this repository page.

ND
Article 23 Work & Equal Pay

No observable content addressing work or employment rights on this repository page.

ND
Article 24 Rest & Leisure

No observable content addressing rest and leisure rights on this repository page.

+0.40
Article 25 Standard of Living
Medium Practice
Editorial: ND | Structural: +0.25 | SETL: ND | Combined: ND | Context Modifier: ND

GitHub's accessible design (keyboard navigation, ARIA, responsive) supports equitable access to this technical resource. Repository contains developmental tools that may support standard of living when used broadly.

+0.35
Article 26 Education
Medium Framing Practice
Editorial: +0.15 | Structural: +0.25 | SETL: -0.16 | Combined: ND | Context Modifier: ND

Repository description and code demonstrate technical education and knowledge sharing. GitHub's public model supports education access. Accessibility features enable learning for diverse users.

+0.57
Article 27 Cultural Participation
High Advocacy Framing Practice Coverage
Editorial: +0.40 | Structural: +0.35 | SETL: +0.14 | Combined: ND | Context Modifier: ND

Repository explicitly shares intellectual property (software) openly; code is accessible and reusable. Open-source licensing model (implied by public repository) enables participation in scientific/cultural life. GitHub editorial guidelines and access model support creative expression and benefit-sharing.

+0.15
Article 28 Social & International Order
Medium Practice
Editorial: ND | Structural: +0.15 | SETL: ND | Combined: ND | Context Modifier: ND

GitHub infrastructure provides foundational support for social and international order enabling UDHR rights. Repository participates in global technical commons. No observable impediment to rights realization.

ND
Article 29 Duties to Community

No observable content addressing duties to community or limitations on rights on this repository page.

+0.10
Article 30 No Destruction of Rights
Low Practice
Editorial: ND | Structural: +0.10 | SETL: ND | Combined: ND | Context Modifier: ND

No activity or content promoting destruction of UDHR rights observed. Repository exists within legal framework protecting rights; no hostile interpretation detected.

build fc56cf0+0q5s · 2026-02-25 01:32 UTC