+0.03 A distributed queue in a single JSON file on object storage (turbopuffer.com)
155 points by Sirupsen 3 days ago | 48 comments on HN | Neutral Editorial · v3.4 · 2026-02-25
Article Heatmap
Preamble: +0.13
Article 1 (Freedom, Equality, Brotherhood): +0.05
Article 2 (Non-Discrimination): -0.13
Article 3 (Life, Liberty, Security): 0.00
Article 4 (No Slavery): 0.00
Article 5 (No Torture): 0.00
Article 6 (Legal Personhood): 0.00
Article 7 (Equality Before Law): 0.00
Article 8 (Right to Remedy): 0.00
Article 9 (No Arbitrary Detention): 0.00
Article 10 (Fair Hearing): 0.00
Article 11 (Presumption of Innocence): 0.00
Article 12 (Privacy): -0.22
Article 13 (Freedom of Movement): +0.05
Article 14 (Asylum): 0.00
Article 15 (Nationality): 0.00
Article 16 (Marriage & Family): 0.00
Article 17 (Property): 0.00
Article 18 (Freedom of Thought): 0.00
Article 19 (Freedom of Expression): +0.34
Article 20 (Assembly & Association): +0.07
Article 21 (Political Participation): -0.13
Article 22 (Social Security): 0.00
Article 23 (Work & Equal Pay): 0.00
Article 24 (Rest & Leisure): 0.00
Article 25 (Standard of Living): 0.00
Article 26 (Education): +0.24
Article 27 (Cultural Participation): +0.20
Article 28 (Social & International Order): 0.00
Article 29 (Duties to Community): +0.08
Article 30 (No Destruction of Rights): 0.00
Aggregates
Weighted Mean: +0.03
Unweighted Mean: +0.02
Max: +0.34 (Article 19)
Min: -0.22 (Article 12)
Signal: 31
No Data: 0
Confidence: 30%
Volatility: 0.10 (Low)
Negative: 3
Channels: E 0.6, S 0.4
SETL: +0.11 (Editorial-dominant)
Evidence: High: 1 Medium: 6 Low: 24 No Data: 0
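The unweighted figures in the aggregates can be checked directly against the heatmap scores. The sketch below is only an illustration: the function name `summarize` and the dictionary layout are assumptions, and it deliberately does not attempt the weighted mean, because the page does not state the weights.

```python
# Non-zero scores copied from the heatmap; every other article scores 0.00.
NONZERO_SCORES = {
    "Preamble": 0.13,
    "Article 1": 0.05,
    "Article 2": -0.13,
    "Article 12": -0.22,
    "Article 13": 0.05,
    "Article 19": 0.34,
    "Article 20": 0.07,
    "Article 21": -0.13,
    "Article 26": 0.24,
    "Article 27": 0.20,
    "Article 29": 0.08,
}
TOTAL_SECTIONS = 31  # Preamble plus Articles 1 to 30


def summarize(scores, total_sections):
    """Recompute the unweighted aggregates from per-section scores."""
    return {
        # Zero-scored sections add nothing to the sum but count in the divisor.
        "unweighted_mean": round(sum(scores.values()) / total_sections, 2),
        "max": max(scores, key=scores.get),
        "min": min(scores, key=scores.get),
        "negative": sum(1 for value in scores.values() if value < 0),
    }


summary = summarize(NONZERO_SCORES, TOTAL_SECTIONS)
```

Under these assumptions the result agrees with the reported unweighted mean (+0.02), max (Article 19), min (Article 12), and negative count (3).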
Theme Radar
Foundation: 0.01 (3 articles)
Security: 0.00 (3 articles)
Legal: 0.00 (6 articles)
Privacy & Movement: -0.04 (4 articles)
Personal: 0.00 (3 articles)
Expression: 0.10 (3 articles)
Economic & Social: 0.00 (4 articles)
Cultural: 0.22 (2 articles)
Order & Duties: 0.03 (3 articles)
Domain Context Profile
Element | Modifier | Affects | Note
Privacy | none | none | No privacy policy or data handling practices observable on this technical blog post.
Terms of Service | none | none | No terms of service observable on this blog post URL.
Accessibility | -0.05 | Article 2, Article 21 | Heavy reliance on obfuscated JavaScript and embedded scripts may impede accessibility. Chart data requires JavaScript execution.
Mission | none | none | No explicit mission statement observable on this technical documentation page.
Editorial Code | +0.08 | Article 19 | Technical blog demonstrates transparent engineering documentation and educational intent around distributed systems design.
Ownership | none | none | No ownership transparency information observable on this blog post.
Access Model | +0.05 | Article 19 | Blog content appears publicly accessible without paywall or registration barrier observed in HTML.
Ad/Tracking | -0.12 | Article 12 | Embedded protection scripts (KPSDK) and challenge mechanisms suggest bot detection and potential tracking infrastructure, creating uncertainty around user privacy boundaries.
HN Discussion 14 top-level · 0 replies
soletta 2026-02-24 10:42 UTC
The usual path an engineer takes is to take a complex and slow system and reengineer it into something simple, fast, and wrong. But as far as I can tell from the description in the blog, this actually works at scale! This feels like a free lunch and I’m wondering what the tradeoff is.
jamescun 2026-02-24 10:48 UTC
This post touches on a realisation I made a while ago, just how far you can get with the guarantees and trade-offs of object storage.

What actually _needs_ to be in the database? I've never gone as far as building a job queue on top of object storage, but have been involved in building surprisingly consistent and reliable systems with object storage.

dewey 2026-02-24 10:53 UTC
Depending on who hosts your object storage this seems like it could get much more expensive than using a queue table in your database? But I'm also aware that this is a blog post of an object storage company.
Normal_gaussian 2026-02-24 11:10 UTC
The original graph appears to simply show the blocking issue of their previous synchronisation mechanism; 10 min to process an item down to 6 min. Any central system would seem to resolve this for them.

In any organisation it's good to make choices for simplicity rather than small optimisations - you're optimising maintenance, incident resolution, and development.

Typically I have a small pg server for these things. It'll work out slightly more expensive than this setup for one action, yet it will cope with so much more - extending to all kinds of other queues and config management - with simple management, off the shelf diagnostics etc.

While the object store is neat, there is a confluence of factors which make it great and simple for this workload, that may not extend to others. 200ms latency is a lot for other workloads, 5GB/s doesn't leave a lot of headroom, etc. And I don't want to be asked to diagnose transient issues with this.

So I'm torn. It's simple to deploy and configure from a fresh deployment PoV. Yet it wouldn't be accepted into any deployment I have worked on.

pjc50 2026-02-24 11:17 UTC
Several things going on here:

- concurrency is very hard

- .. but object storage "solves" most of that for you, handing you a set of semantics which work reliably

- single file throughput sucks hilariously badly

- .. because 1Gb is ridiculously large for an atomic unit

- (this whole thing resembles a project I did a decade ago for transactional consistency on TFAT on Flash, except that somehow managed faster commit times despite running on a 400MHz MIPS CPU. Edit: maybe I should try to remember how that worked and write it up for HN)

- therefore, all of the actual work is shifted to the broker. The broker is just periodically committing its state in case it crashes

- it's not clear whether the broker ACKs requests before they're in durable storage? Is it possible to lose requests in flight anyway?

- there's a great design for a message queue system between multiple nodes that aims for at least once delivery, and has existed for decades, while maintaining high throughput: SMTP. Actually, there's a whole bunch of message queue systems?
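The question in the list above, whether the broker ACKs before or after the write is durable, can be made concrete with a small sketch. This is not the article's implementation: `FakeObject`, `GroupCommitBroker`, and their methods are hypothetical names, and the store is an in-memory stand-in for one object with conditional writes. The sketch shows one possible ordering, in which callers are acknowledged only after the batched write succeeds.

```python
import json


class Conflict(Exception):
    """The object changed between our read and our conditional write."""


class FakeObject:
    """Hypothetical in-memory stand-in for a single object with CAS writes."""

    def __init__(self, body):
        self.body = body
        self.version = 1

    def read(self):
        return self.body, self.version

    def write_if_version(self, body, expected_version):
        if expected_version != self.version:
            raise Conflict(f"expected v{expected_version}, found v{self.version}")
        self.body = body
        self.version += 1


class GroupCommitBroker:
    """Buffers submissions and commits them in one conditional write."""

    def __init__(self, store):
        self.store = store
        self.pending = []  # (job, ack_callback) pairs that are not yet durable

    def submit(self, job, on_ack):
        # Nothing is acknowledged here: the job only lives in broker memory.
        self.pending.append((job, on_ack))

    def flush(self):
        """Commit every pending job in a single write, then ACK each caller."""
        if not self.pending:
            return 0
        batch, self.pending = self.pending, []
        body, version = self.store.read()
        queue = json.loads(body)
        queue["jobs"].extend(job for job, _ in batch)
        try:
            self.store.write_if_version(json.dumps(queue), version)
        except Conflict:
            # Not durable: put the batch back so a later flush can retry.
            self.pending = batch + self.pending
            raise
        for _, on_ack in batch:
            on_ack()  # ACK only after the write succeeded
        return len(batch)


store = FakeObject('{"jobs": []}')
broker = GroupCommitBroker(store)
acked = []
broker.submit("a", lambda: acked.append("a"))
broker.submit("b", lambda: acked.append("b"))
acked_before_flush = list(acked)
committed = broker.flush()
```

With this ordering, a broker crash before `flush` loses only requests that were never acknowledged, so clients that time out can resubmit. A broker that ACKs in `submit` would be faster but could lose acknowledged requests.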

isoprophlex 2026-02-24 11:29 UTC
Is this reinventing a few redis features with an object storage for persistence?
jstrong 2026-02-24 12:03 UTC
that's A choice.
loevborg 2026-02-24 13:56 UTC
Love this writeup. There's so much interesting stuff you can build on top of Object Storage + compare-and-swap. You learn a lot about distributed systems this way.

I'd love to see a full sample implementation based on s3 + ecs - just to study how it works.

staticassertion 2026-02-24 14:09 UTC
Yeah, I mean, I think we're all basically doing this now, right? I wouldn't choose this design, but I think something similar to DeltaLake can be simplified down for tons of use cases. Manifest with CAS + buffered objects to S3, maybe compaction if you intend to do lots of reads. It's not hard to put it together.

You can achieve stupidly fast read/write operations if you do this right with a system that is shockingly simple to reason about.

> Step 4: queue.json with an HA brokered group commit
> The broker is stateless, so it's easy and inexpensive to move. And if we end up with more than one broker at a time? That's fine: CAS ensures correctness even with two brokers.

TBH this is the part that I think is tricky. Just resolving this in a way that doesn't end up with tons of clients wasting time talking to a broker that buffers their writes, pushes them, then always fails. I solved this at one point with token fencing and then decided it wasn't worth it and I just use a single instance to manage all writes. I'd again point to DeltaLake for the "good" design here, which is to have multiple manifests and only serialize compaction, which also unlocks parallel writers.

The other hard part is data deletion. For the queue it looks deadly simple since it's one file, but if you want to ramp up your scale and get multiple writers or manage indexes (also in S3) then deletion becomes something you have to slip into compaction. Again, I had it at one point and backed it out because it was painful.

But I have 40k writes per second working just fine for my setup, so I'm not worrying. I'd suggest others basically punt as hard as possible on this. If you need more writes, start up a separate index with its own partition for its own separate set of data, or do naive sharding.
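The comment mentions token fencing without describing the design that was used, so the following is only a sketch of the general fencing-token pattern. All names (`TokenIssuer`, `FencedStore`, `StaleToken`) are hypothetical. A coordinator hands out increasing tokens, and the store refuses any write that carries a token older than the newest one it has seen.

```python
class StaleToken(Exception):
    """A writer with an outdated token tried to write."""


class TokenIssuer:
    """Hands out strictly increasing tokens, one per newly elected writer."""

    def __init__(self):
        self._last = 0

    def next_token(self):
        self._last += 1
        return self._last


class FencedStore:
    """Accepts a write only if its token is at least the highest seen so far."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        if token < self.highest_token:
            # An older writer woke up after a newer one took over.
            raise StaleToken(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.value = value


issuer = TokenIssuer()
store = FencedStore()

old_broker = issuer.next_token()  # first broker is elected
new_broker = issuer.next_token()  # a replacement is elected later

store.write(new_broker, "state from new broker")
try:
    store.write(old_broker, "state from old broker")
    old_write_rejected = False
except StaleToken:
    old_write_rejected = True
```

The old broker fails fast on its first write after being replaced, which is the property that keeps clients from queueing behind a writer that can never commit. The cost is the extra coordinator that issues tokens, which may be why a single writer instance can be the simpler choice.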

motoboi 2026-02-24 14:20 UTC
By typography alone I can now tell turbopuffer is written in zig.
salil999 2026-02-24 14:40 UTC
Reminds me of WarpStream: https://www.warpstream.com

Similar idea but you have the power of S3 scale (if you really need it). For context, I do not work at WS. My company switched to it recently and we've seen great improvements over traditional Kafka.

talentedtumor 2026-02-24 16:16 UTC
Does this suffer from the ABA problem, or does object storage solve that for you by e.g. refusing to accept writes where content has changed between the read and write?
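A small sketch can illustrate the mechanism the question points at. It assumes a store whose conditional write compares a generation counter that increases on every write. `FakeObjectStore`, `put_if_match`, and `push` are hypothetical names, not a real object storage API. Under that assumption, a write based on a stale read is rejected even when the content has since returned to the same bytes.

```python
import json


class PreconditionFailed(Exception):
    """The caller's generation no longer matches the stored one."""


class FakeObjectStore:
    """Hypothetical in-memory stand-in for one object with conditional writes."""

    def __init__(self, body):
        self._body = body
        self._generation = 1

    def get(self):
        return self._body, self._generation

    def put_if_match(self, body, expected_generation):
        if expected_generation != self._generation:
            raise PreconditionFailed(
                f"expected {expected_generation}, found {self._generation}"
            )
        self._body = body
        self._generation += 1  # changes on every write, even to old content
        return self._generation


def push(store, job, max_attempts=5):
    """Append a job with a read, modify, conditional-write retry loop."""
    for _ in range(max_attempts):
        body, generation = store.get()
        queue = json.loads(body)
        queue["jobs"].append(job)
        try:
            return store.put_if_match(json.dumps(queue), generation)
        except PreconditionFailed:
            continue  # someone else wrote first; re-read and try again
    raise RuntimeError("gave up after repeated conflicts")


store = FakeObjectStore('{"jobs": []}')
first_body, first_generation = store.get()           # slow writer reads A
store.put_if_match('{"jobs": ["x"]}', first_generation)   # A -> B
store.put_if_match('{"jobs": []}', first_generation + 1)  # B -> A again
same_content = store.get()[0] == first_body

try:
    store.put_if_match('{"jobs": ["late"]}', first_generation)
    stale_write_rejected = False
except PreconditionFailed:
    stale_write_rejected = True

new_generation = push(store, "job-1")
```

Whether a given store behaves this way depends on what its comparison token is. A counter that increases on every write distinguishes the second A from the first, while a token derived purely from the content would not.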
up2isomorphism 2026-02-24 16:49 UTC
The window has already passed for this kind of opportunity, since there are dozens of people all doing the same thing. Also, abusing object storage is not very fun.
hinkley 2026-02-24 19:57 UTC
For performance reasons we needed a set of assets on all copies of a service. We were using consul for the task management, which is effectively a tree of data that’s tantamount to a json file (in fact we usually pull trees of data as a json file).

Among other problems I knew the next thing we were going to have to do was autoscaling and the system we had for call and response was a mess in that respect. Unanswered questions were: How do you know when all agents have succeeded, how do you avoid overwriting your peers’ data, and what do you do with agents that existed yesterday and don’t today?

I ended up rewriting all of the state management data so that each field had one writer and one or more readers. It also allowed me to move the last live service call for another service and decommission it. Instead of having an admin service you just called one of the peers at random and elected it leader for the duration of that operation. I also arranged the data so the leader could watch the parent key for the roll call and avoid needing to poll.

Each time a task was created the leader would do a service discovery call to get a headcount and then wait for everyone to set a success or failure state. Some of these state transitions were idempotent, so if you reissued a task you didn’t need to delete the old results. Everyone who already completed it would noop, and the ones that failed or the new servers that joined the cluster would finish up. If there was a delete operation later then the data would be purged from the data set and the agents, and a subsequent call would be considered new.

Long story short, your CS program should have distributed computing classes because this shit is hard to work out from first principles when you don’t know what the principles even are.
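The single-writer-per-field layout and the idempotent reissue described in this comment can be sketched as follows. This is an illustration under assumptions, not the commenter's system: `TaskBoard` is a hypothetical in-memory stand-in for the key/value tree, and the live agent list stands in for the service discovery call.

```python
class TaskBoard:
    """Hypothetical stand-in for the key/value tree.

    Each (task_id, agent_id) field is written only by that agent,
    so agents never overwrite each other's results.
    """

    def __init__(self):
        self.results = {}  # task_id -> {agent_id: "success" | "failure"}

    def report(self, task_id, agent_id, state):
        self.results.setdefault(task_id, {})[agent_id] = state


def run_task(board, task_id, agent_id, work):
    """Run `work` once per agent; a reissued task is a no-op after success."""
    if board.results.get(task_id, {}).get(agent_id) == "success":
        return "noop"
    try:
        work()
        state = "success"
    except Exception:
        state = "failure"
    board.report(task_id, agent_id, state)
    return state


def is_complete(board, task_id, live_agents):
    """Leader's check: every agent currently alive has reported success.

    Agents that no longer exist are not in `live_agents`, so stale
    entries from departed agents are simply ignored.
    """
    states = board.results.get(task_id, {})
    return all(states.get(agent) == "success" for agent in live_agents)


board = TaskBoard()
first_a = run_task(board, "t1", "a", lambda: None)
first_b = run_task(board, "t1", "b", lambda: 1 / 0)  # this agent fails
complete_after_first_round = is_complete(board, "t1", ["a", "b"])

# Reissue the same task: "a" does nothing, "b" retries and succeeds.
second_a = run_task(board, "t1", "a", lambda: 1 / 0)
second_b = run_task(board, "t1", "b", lambda: None)
complete_after_reissue = is_complete(board, "t1", ["a", "b"])
```

Because each field has exactly one writer, no compare-and-swap is needed on the result fields. The only coordination left is the leader's read of the whole set.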

Score Breakdown
+0.13
Preamble Preamble
Medium F:engineering-education A:technical-transparency
Editorial
+0.15
Structural
+0.10
SETL
+0.09
Combined
ND
Context Modifier
ND

Educational content on distributed systems promotes knowledge transfer and technical understanding. Preamble-level dignity framing minimal but implicit in open technical discussion.

+0.05
Article 1 Freedom, Equality, Brotherhood
Low P:public-access
Editorial
ND
Structural
+0.05
SETL
ND
Combined
ND
Context Modifier
ND

Public access to technical knowledge without registration barrier supports equality principle. No direct editorial framing of equality.

-0.13
Article 2 Non-Discrimination
Medium P:accessibility-barrier
Editorial
ND
Structural
-0.08
SETL
ND
Combined
ND
Context Modifier
ND

JavaScript-heavy page with obfuscated protection scripts creates accessibility barriers. Bot detection/KPSDK may discriminate based on device capability. Domain modifier applied for accessibility concerns.

0.00
Article 3 Life, Liberty, Security
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding right to life on technical blog. No structural mechanisms impacting this right.

0.00
Article 4 No Slavery
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding slavery or servitude on technical documentation.

0.00
Article 5 No Torture
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding torture or cruel treatment on technical blog.

0.00
Article 6 Legal Personhood
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding legal personhood on technical documentation.

0.00
Article 7 Equality Before Law
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding equal protection before law on blog content.

0.00
Article 8 Right to Remedy
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding legal remedy on technical documentation.

0.00
Article 9 No Arbitrary Detention
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding arrest or detention on technical blog.

0.00
Article 10 Fair Hearing
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding fair trial on technical documentation.

0.00
Article 11 Presumption of Innocence
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding criminal retrospectivity on blog.

-0.22
Article 12 Privacy
Medium P:tracking-infrastructure P:challenge-mechanisms
Editorial
-0.05
Structural
-0.15
SETL
+0.12
Combined
ND
Context Modifier
ND

Embedded KPSDK bot detection and challenge mechanisms create privacy and surveillance concerns. XMLHttpRequest and fetch patching adds implicit tracking headers (x-is-human, x-path, x-method). Domain modifier reflects ad_tracking concern regarding privacy uncertainty.

+0.05
Article 13 Freedom of Movement
Low P:public-access
Editorial
ND
Structural
+0.05
SETL
ND
Combined
ND
Context Modifier
ND

Content appears freely accessible without geographic restriction barriers observed. Mild positive for freedom of movement in information access.

0.00
Article 14 Asylum
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding asylum or refuge on technical blog.

0.00
Article 15 Nationality
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding nationality on technical documentation.

0.00
Article 16 Marriage & Family
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding marriage or family on blog content.

0.00
Article 17 Property
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding property rights on technical documentation.

0.00
Article 18 Freedom of Thought
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding conscience or religion on blog.

+0.34
Article 19 Freedom of Expression
High A:technical-education F:transparent-engineering P:public-access P:no-paywall
Editorial
+0.25
Structural
+0.15
SETL
+0.16
Combined
ND
Context Modifier
ND

Strong positive signals for freedom of expression and information. Technical blog provides detailed engineering documentation with educational intent. Public-facing without registration or paywall barriers. Editorial code modifier (0.08) and access_model modifier (0.05) support freedom of expression. Content demonstrates commitment to transparent knowledge sharing about distributed systems.

+0.07
Article 20 Assembly & Association
Low A:peaceful-assembly-knowledge
Editorial
+0.10
Structural
+0.05
SETL
+0.07
Combined
ND
Context Modifier
ND

Public technical content enables knowledge communities to assemble and coordinate. Open documentation supports peaceful collective learning around distributed systems.

-0.13
Article 21 Political Participation
Medium P:accessibility-barrier P:javascript-requirement
Editorial
ND
Structural
-0.08
SETL
ND
Combined
ND
Context Modifier
ND

JavaScript-heavy site with obfuscated protection code and chart data requiring script execution creates barriers to participation. Bot detection mechanisms may exclude legitimate users with accessibility needs. Domain modifier applied for accessibility concerns affecting participation rights.

0.00
Article 22 Social Security
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding social security or welfare on technical blog.

0.00
Article 23 Work & Equal Pay
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding work or employment on technical documentation.

0.00
Article 24 Rest & Leisure
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding rest or leisure on blog.

0.00
Article 25 Standard of Living
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding food, clothing, housing on technical documentation.

+0.24
Article 26 Education
Medium A:technical-education F:knowledge-accessibility
Editorial
+0.20
Structural
+0.10
SETL
+0.14
Combined
ND
Context Modifier
ND

Technical education content promotes learning and development. Detailed engineering documentation supports human capability development. Editorial code transparency (0.08) supports educational purpose. Public access enables broader learning opportunity.

+0.20
Article 27 Cultural Participation
Medium A:scientific-sharing F:engineering-transparency
Editorial
+0.15
Structural
+0.10
SETL
+0.09
Combined
ND
Context Modifier
ND

Blog represents participation in scientific and technical culture. Engineering documentation shared openly demonstrates commitment to cultural participation in technology field. Educational intent supports broader access to scientific advancement.

0.00
Article 28 Social & International Order
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding international order or Article 28 on technical blog.

+0.08
Article 29 Duties to Community
Low F:community-benefit A:shared-infrastructure
Editorial
+0.10
Structural
+0.05
SETL
+0.07
Combined
ND
Context Modifier
ND

Technical documentation sharing benefits engineering community. Discussion of queue implementation serves broader purpose of scalable infrastructure. Mild positive for community development through knowledge sharing.

0.00
Article 30 No Destruction of Rights
Low
Editorial
ND
Structural
0.00
SETL
ND
Combined
ND
Context Modifier
ND

No observable signals regarding interpretation limitations on technical blog.

build f581ea9+b3nz · 2026-02-25 03:04 UTC