GeistHaus

Hacker News: Newest

Hacker News RSS

Show HN: ClassKeep – Booking, credits, and smart waitlists for boutique studios

Hi HN — I built ClassKeep.

The trigger: every small studio owner I know (pilates, yoga, pottery, even a dog trainer) runs the operational side of their business on WhatsApp + a Google Sheet, or on one of the expensive incumbents (Mindbody/Glofox) they actively dislike. The single biggest leak is last-minute cancellations: someone bails at 6am and the seat just goes empty, because there's no fast way to notify a waitlist.

ClassKeep tries to remove that end-to-end:

- Public booking page on a free *.classkeep.app slug (custom domain optional), with a drag-and-drop page builder so it doesn't look like a generic SaaS template.
- Credits, drop-ins, and recurring subscription plans, all via Stripe Connect: the studio gets paid directly, I never touch the money.
- A waitlist that auto-notifies via email + web push + SMS the instant a seat opens; first tap wins. This is the part most studios told me they'd pay for on its own.
- 1-tap tablet check-in for instructors so they're not fumbling with a clipboard between classes.
- Bilingual (EN + pt-BR) with LGPD/GDPR self-serve export & delete.
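A minimal sketch of the "first tap wins" waitlist rule, assuming an in-memory seat guarded by a lock. The `Seat` class and `race` helper are hypothetical names for illustration; a real deployment would more likely rely on a conditional update in Postgres than on a process-local lock.

```python
from __future__ import annotations

import threading


class Seat:
    """One open seat offered to everyone on the waitlist (hypothetical model)."""

    def __init__(self, class_id: str) -> None:
        self.class_id = class_id
        self.claimed_by: str | None = None
        self._lock = threading.Lock()

    def claim(self, student_id: str) -> bool:
        """Return True only for the first student who claims the seat."""
        with self._lock:
            if self.claimed_by is not None:
                return False  # someone else tapped first
            self.claimed_by = student_id
            return True


def race(seat: Seat, student_ids: list[str]) -> list[str]:
    """Let every notified student tap concurrently; return whoever won."""
    winners: list[str] = []
    winners_lock = threading.Lock()

    def tap(student_id: str) -> None:
        if seat.claim(student_id):
            with winners_lock:
                winners.append(student_id)

    threads = [threading.Thread(target=tap, args=(s,)) for s in student_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However many students tap at once, the check and the assignment happen under the same lock, so exactly one claim succeeds and every later tap gets a clean "seat taken" answer.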

Stack, for the curious:

- Next.js 15 App Router, Server Actions for every mutation (no REST except the two auth route handlers and the Stripe webhook).
- Drizzle ORM on Postgres.
- Two separate Better-Auth instances (one for studio operators, one for students) so cookies and sessions don't collide on the same domain.
- Stripe Connect (webhook-authoritative fulfillment), Twilio for SMS, OneSignal for push.
- Tailwind v4 with a Material 3 token system.

Pricing: free for every studio during early access — no card, no per-student fees. Early signups get a permanent discount when paid tiers ship.

Genuine asks:

1. Try creating a studio and booking a class on it. Does the flow feel less painful than Mindbody/Glofox if you've used them?
2. Does the waitlist mechanic feel different enough to matter, or is it table stakes by now?
3. If you run a studio (or know someone who does), I'd love to talk. Email in profile.

Happy to go deep on any of the architecture (Server Actions + Stripe webhook reconciliation was the trickiest part).


Comments URL: https://news.ycombinator.com/item?id=48195665

Points: 1

# Comments: 0

Made a Chrome extension that stops me from sending dumb messages

3 weeks ago i almost sent a passive-aggressive "ok cool" to my CTO at 11pm. caught myself in time. realized this happens to me like weekly so i built a thing for it.

it's called cooldown. when you hit post/send/tweet, a small modal pops up with a countdown (10s by default). you can cancel or send anyway. that's it.

the part i actually like about it: it doesn't trigger on every message. that would be insanely annoying. it only kicks in when the message:

- is long
- is in all caps
- has a "trigger word" you defined (mine are "really", "hate", "quit")
- or is sent at night (auto night mode between 10pm-5am)

short stuff like "ok" or "lol" passes through silently. you forget the extension exists 90% of the time.
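a rough sketch of that "smart mode" check, written in Python for illustration (the extension itself runs in the browser). the `should_delay` name, the 280-character threshold for "long", and the 4-letter minimum for all caps are assumptions, not values from the extension.

```python
from datetime import time


def should_delay(message, now, trigger_words, long_threshold=280):
    """Return True if the countdown modal should appear for this message."""
    text = message.strip()
    letters = [c for c in text if c.isalpha()]

    # "long": the threshold is an assumed value
    is_long = len(text) >= long_threshold
    # "all caps": ignore very short messages such as "OK"
    is_all_caps = len(letters) >= 4 and all(c.isupper() for c in letters)
    # "trigger word": whole-word, case-insensitive match
    words = {w.strip(".,!?;:\"'").lower() for w in text.split()}
    has_trigger = any(t.lower() in words for t in trigger_words)
    # night mode: 10pm up to (not including) 5am, wrapping past midnight
    is_night = now >= time(22, 0) or now < time(5, 0)

    return is_long or is_all_caps or has_trigger or is_night
```

with these rules "ok" at noon passes silently, while the same "ok cool" at 11pm gets the countdown.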

works on twitter/x and reddit. nothing else for now. linkedin was on my list but their composer is in an iframe and it was a pain so i dropped it.

100% local, no backend, no account. ~30kb total.

couple of things i'd love feedback on:

1. does the "smart mode" thing make sense to you or would you actually want a delay on every message?
2. besides reddit and twitter where do you regret-send the most? trying to figure what to add next

happy to answer anything about the build. here's the landing page: https://cooldownman.netlify.app


Comments URL: https://news.ycombinator.com/item?id=48195661

Points: 1

# Comments: 0

Show HN: Loomavi – psychological insight tool that decodes what you are feeling

I am a data scientist with 10 years at PayPal. I built Loomavi during a career break.

One line: you describe what you are feeling or a pattern you keep noticing about yourself. Loomavi decodes what is underneath and gives you the insight as a structured personal letter.

THE PROBLEM

Therapy, journaling, and wellness apps all have real value. This is not a replacement for any of them.

The gap I kept finding was a specific moment.

The moment you catch yourself doing something and think wait, why did I just do that.

Not a crisis moment. Just a curious one.

In that moment you do not need a therapist. You do not need a 30 day program. You need something that can look at what you wrote and tell you what is underneath it. Right now.

That moment had nothing built for it. That is what Loomavi is for.

HOW IT WORKS

You write freely. No prompts, no questionnaires. Just what is on your mind.

Loomavi processes the input and returns a structured personal letter.

Each decode also generates a ritual suggestion, relevant reading, and citations.

Stack: React, Node.js, Google Cloud Run, Firebase. Built as a PWA.

CURRENT LIMITATIONS

No memory across sessions yet. Each decode is standalone. This is the next thing I am building.

Mobile PWA works well but a native app would be better.

TRY IT

loomavi.com

Free tier: 3 decodes, no card required. Paid plan: $4.99 per month.

FEEDBACK I GENUINELY WANT

Does the letter format feel meaningful or like a gimmick?

Does the first decode experience explain itself without needing instructions?

Where does the free-to-paid conversion feel right or wrong?

I will respond to everything.


Comments URL: https://news.ycombinator.com/item?id=48195628

Points: 1

# Comments: 0

Ask HN: How do word docs, slides, excel, and PDFs generate value?

This is a bit vague, but as an engineer, it’s possible to walk past the other functions in an office and see people creating word docs, presentations, etc. and be a bit shocked that creating static artifacts is valuable enough to drive employment. I’m wondering if anyone has any ideas on where the value lies in this work. I know this is all very industry specific, so if you want to share you can talk from your own perspective. Vague and wild answers accepted too. I’m looking to have a wide ranging discussion about this.


Comments URL: https://news.ycombinator.com/item?id=48195550

Points: 1

# Comments: 0

Show HN: Heard Google copied part of our product for IO. Want to show off first

I just heard from a very reputable source that Google built part of our product and that they're about to show it off at Google IO.

We built a 3D film tool called ArtCraft:

https://getartcraft.com

https://github.com/storytold/artcraft (our monorepo with the desktop app, server, and website code)

Our 3D virtual film set enables set decoration, location reuse, precision blocking (control over where things are), character posing, etc.

Sets and objects can be generated as meshes or gaussian splats.

Autoregressive multimodal models can treat input "previz" style scenes as ControlNets of sorts. Combined with reference images and possibly a few follow-up rounds of editing, you can create extremely precise starting frames that reflect your vision as a director or creative almost exactly as you see it.

It gets rid of the "roulette wheel" of prompting in the places and times where you know what you want. It's like a WYSIWYG editor.

Here's an example:

https://app.getartcraft.com/edit-3d/m_qa72baw3crghyfn2bbw52j... (Sci-Fi horror film, starring Garry Tan.)

I was just told that Google built a 3D tool similar to ours that they're showing off at Google IO. I don't view this as bad, but I do want to share ours before they get all the attention.

OpenAI's GPT Image is a lot better at this "3D previz to full render" task than Google's Nano Banana, and it's excelled at this workflow since GPT Image 1. (Unless Google shows off a brand new model for the task, I expect this to still be the case.)

Our plan is to build vertical specialized tools for animation, editing, dubbing and make the entire stack - including cloud API routers and model containers - open source.

Here's an example of timeline based animation (and we have the ability to edit animation curves for all objects, meshes, cameras, etc.):

https://imgur.com/a/00q76j2

Our tool has BYOK and subscriptions, so you can use lots of other API providers and even log in with the commercial versions of other models and aggregators. We don't care if we book the revenue directly, we want to build the rails and the UX and win mindshare.

We hit a pretty good growth spurt and have had folks start sending us PRs, which has been exciting.

Eager to see what Google's version of this looks like.

If you're a Rust developer interested in film and media, I'd love to say hi.


Comments URL: https://news.ycombinator.com/item?id=48195533

Points: 1

# Comments: 0

Launch HN: Superlog (YC P26) – Observability that installs itself and fixes bugs

Hey HN, we’re Nico and Arseniy, co-founders of Superlog (https://superlog.sh). We're building a self-installing, self-healing observability tool that is meant not to be opened. It has a wizard that runs daily to set up proper logging, and an agent that investigates errors and opens PRs.

Super short demo: https://www.youtube.com/watch?v=xFhU9Mk247M.

In our earlier startups, we tried Sentry, Datadog, Grafana, Dash0, and nothing was good enough. Proper telemetry and alerting still requires a ton of manual setup. We struggled with adding good logs, so debugging was tough, especially as codebases grow at a faster pace. Meanwhile, the Datadog/Dash0 bill kept climbing, and we still spent engineering hours to learn, configure, and maintain our observability tooling.

With Sentry, we found ourselves flooded by a stream of alerts in our Slack channel. Most were duplicates or lacked context, so alert fatigue and constant interrupts were a real pain. The #ops notification is consistently the worst feeling on a Saturday morning.

We’ve too often seen servers run out of memory or disk, and three AWS metrics give us three different values. Half of the graphs on dashboards are normally empty or outdated, and manually clicking through UIs, especially when the team is small, seems like a huge waste of time.

At some point we realized that solving this problem would be more valuable than the things we had been working on, and we had the expertise to do it, since Arseniy had spent years at Datadog, getting paged during the night to debug production incidents. So we decided to build a platform that would just work: agent-first, MCP-native, zero-setup.

Here’s how Superlog works: we have a wizard that scans your repo, and automatically instruments it with well-structured logs, traces and metrics via OpenTelemetry. We make sure to highlight main failure modes, endpoint performance, usage per tenant, and LLM/upstream cost (by callsite, tenant and model).

Errors get fingerprinted and grouped into incidents, so you see one issue, not a thousand duplicates. When you get a notification from Superlog, you see a clear failure summary, its inferred severity and impact upfront.
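To illustrate the general idea of fingerprinting and grouping, here is a minimal sketch that assumes a fingerprint is a hash of the error type, a normalized message, and the top stack frame. The normalization rules and the `fingerprint` and `group_incidents` names are illustrative assumptions, not Superlog's actual method.

```python
import hashlib
import re
from collections import defaultdict

# Volatile parts of a message that should not split one incident into many.
_VOLATILE = [
    (re.compile(r"0x[0-9a-fA-F]+"), "<hex>"),
    (re.compile(r"\d+"), "<n>"),
]


def fingerprint(error_type, message, top_frame):
    """Hash the stable parts of an error into a short grouping key."""
    normalized = message
    for pattern, placeholder in _VOLATILE:
        normalized = pattern.sub(placeholder, normalized)
    key = f"{error_type}|{normalized}|{top_frame}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_incidents(events):
    """Group raw error events (dicts with type, message, frame) by fingerprint."""
    incidents = defaultdict(list)
    for event in events:
        fp = fingerprint(event["type"], event["message"], event["frame"])
        incidents[fp].append(event)
    return dict(incidents)
```

Two timeouts that differ only in request ID or tenant number normalize to the same message and land in one incident, which is what turns a thousand duplicates into a single notification.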

Then the agent investigates and tries to solve the issue. If it has enough context, it produces a concise and tested PR. If it doesn't, it posts its findings for the investigating team, and automatically pulls in the engineers that could contribute more context based on documentation, previous investigations and Slack threads.

Either way the output is one clean PR per incident, posted in Slack, that you can merge, ignore, or open as a Claude Code session and modify.

Three things we think are different from other observability vendors:

(1) We solve the setup pain. The wizard will instrument everything with native OTel SDKs, respecting the semantic conventions, with proper service and environment tagging. We’re also working on native automatic dashboards and alerts, so that you can see what’s going on at a glance and don’t miss subtle failure modes.

(2) Our telemetry doesn’t decay. The wizard runs daily, and keeps adding logs, alerts and dashboards where they’re needed. You don't have to remember to instrument new features. The next time something breaks, the data you need to debug it is already there.

(3) Our goal is to solve alert fatigue. We use agents to merge similar errors and refine the summaries, giving you relevant information upfront. We have a custom evaluation setup that makes sure that our summaries are dense and correct, and that severity and impact are on point. We also give you confidence scores for every LLM-enhanced metric so that wrong guesses don’t get boosted.

Important: Superlog telemetry is vendor-neutral, so you keep all the logs/metrics/traces we install. Pricing is on the site. We're early, so expect rough edges and please tell us when you find them.

You can try it at https://superlog.sh. We'd love to hear what you're using today, what's broken about it, and whether the "one mergeable PR per incident" model sounds useful or terrifying. Especially keen to hear from folks running integration-heavy products, anyone who's rolled their own observability, and anyone who has tried Sentry / Datadog MCPs and given up. Comments and feedback welcome!


Comments URL: https://news.ycombinator.com/item?id=48195021

Points: 1

# Comments: 0

Ask HN: What's your go-to LLM for coding?

I've been using Gemini 3.1 Pro, mostly because I'd gotten used to Gemini in general, but it seems to be a relatively mediocre coder at best; it struggles, for example, on a ~600 LoC JS file, introducing approximately one new bug for every fixed one. That feels really unacceptable for a frontier model today, judging by all the praise I've been reading for Claude Code and Codex.

If you've tried a few different frontier LLMs for coding recently, do you have any strong recommendations?


Comments URL: https://news.ycombinator.com/item?id=48194562

Points: 3

# Comments: 2

Show HN: Guitar Guru – A guitar valuation app using ML

Guitar Guru is an iOS app that uses machine learning to value guitars and basses. I built this because existing methods for getting valuations all have downsides:

- Reverb prices are typically on the high side - they have every interest in maintaining high prices
- Blue book prices are subscription-based and the data is quite old
- Forums are slow and are simply multiple subjective opinions
- Expert valuations are slow and the incentives to bias high or low are a problem

None of these methods dispassionately weighs everything about the guitar, including condition, in creating its estimate. The main goal was to create a valuation in around 30 seconds and give a defensible number with a confidence range. This is very useful when you’re in a shop trying to make a decision.

Technical details:

The pricing model is an ensemble of four models (tree-based, boosting, plus ridge regression) plus a meta-model on log-transformed prices, trained on tens of thousands of real transaction records (not dealer asking prices, which skew high). There are actually two sets of models - point and quantile. This gives the user quantiles so they can get a feel for how certain the model is.

Performance is good: R² ≈ 0.76 and MAE around $230 (this latter figure is skewed upwards by rarer expensive guitars). The performance is significantly better with common guitar brands and models. The unexplained variance is likely due to two factors:

- Some guitars simply “play better” and “sound better”. This is subjective but real in my experience, and is not something that can be expressed in a listing.
- Auction prices depend on who turns up on the day, and this in turn depends on what else is for sale in the auction.
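A small sketch of how a single expensive instrument can pull MAE upwards, using only the Python standard library. The five prices below are made-up toy numbers for illustration, not data from the app, and the function names are hypothetical.

```python
from statistics import mean, median


def mae(actual, predicted):
    """Mean absolute error: the average size of the miss, in dollars."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def median_abs_error(actual, predicted):
    """Median absolute error: the typical miss, robust to a few big ones."""
    return median(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Share of price variance explained by the predictions."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Toy data: four ordinary guitars and one rare, expensive one.
actual = [400, 650, 800, 1200, 9000]
predicted = [450, 600, 900, 1100, 7500]

print(mae(actual, predicted))               # 360
print(median_abs_error(actual, predicted))  # 100
```

The four ordinary guitars are each off by $100 or less, yet the one $1,500 miss on the expensive guitar lifts the mean error to $360 while the median stays at $100.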

Data is ~25k auction results. Listings filtered using batch ChatGPT calls to produce structured data. Traditional feature engineering, plus embeddings on text and images.

Stack is Python / FastAPI / Hetzner (hosting) / MongoDB. Google Cloud Storage for content and backups. Data logged to BigQuery for analytics.

The app

Freemium model: a guest tier with free valuations; creating a free login gives more guitar slots and valuations; Pro and Pro+ are for heavier users and bigger collections.

The growth mechanic I landed on is awarding extra guitar slots and valuations on a successful referral through the app.

What I'd like feedback on:

If you're a guitar player/collector: try valuing something you actually know the rough value of and tell me how far off it is. Especially interested in results for vintage, boutique, and unusual instruments.

Otherwise any feedback on improving the UX would be great. Site: https://guitar-guru.app App Store: https://apps.apple.com/us/app/guitar-guru/id6761500318


Comments URL: https://news.ycombinator.com/item?id=48194112

Points: 1

# Comments: 0

Ur Dream Founding Engr?

Hi, I am a college grad from India. I have done some crazy OSS work like GSoC and LFX, and I am trying to break into a YC company. I see people dying to get a job, but also founders complaining that they have a hard time hiring founding engineers. So if you were granted a wish to create your dream founding engineer, what would they look like? Also, how has the interview process changed in the last 6 months? We rarely write code by hand now, so does it still make sense to have a hands-on coding round? I have 3 months of summer holidays and I want to become really valuable, so help me out.


Comments URL: https://news.ycombinator.com/item?id=48194062

Points: 1

# Comments: 0

Ask HN: What are your plans for the AI future?

With AI ever increasing in ability I have been thinking about not only my future but the future for all of us involved in the tech sector and beyond. So I have to ask: what are your plans for the future when your job is obsoleted or devalued by AI?

I work in manufacturing and do a lot of physical work and troubleshooting. At first I felt confident that AI would not displace me any time soon. However, that changed after the former president walked around confidently asking AI every technical question we spent time working on, trying to see where it could augment engineering and maintenance. It failed mostly, but it made me realize that people want to replace me. I do see a future where many of my skills could be replaced by a random person receiving a detailed walk through from AI using an AR headset.

I feel that a lot of people today believe we are mostly fine. They feel AI isn't THAT good and there will still be a need for programmers and auto mechanics. However, there is no slowdown in AI research and I don't want to hedge my bets on predictions that could evaporate in the next two or three major breakthroughs in AI technology. We already have walking robots. How much longer until the janitor is replaced by a mop-wielding Tesla bot?

I don't have children yet but it gives me great pause when I think about the world they would be entering.

What are your thoughts and is anyone actively planning for the AI future?


Comments URL: https://news.ycombinator.com/item?id=48193860

Points: 1

# Comments: 0

Show HN: Tribune's Last Stand, a browser-based Warhammer 40K vertical slice

Hi HN, I'm James. Over the last few months I built a Warhammer 40K 10th-edition vertical slice as an experiment in how far GenAI tools can take a solo dev on a non-trivial 2D game.

For sprite generation, whilst creative exploration was fast, getting high-quality and consistent images was hard. Gemini ended up stylistically best here, but I had to use BiRefNet for background removal. While I experimented with Claude for map generation and layout, I ended up finding it fastest to build a full Editor and lay out maps myself.

Suno, IMO, has gotten really good at background music. ElevenLabs SFX and voice APIs were decent but only after A LOT of tweaking. SFX prompts needed to be grounded in familiar terms (i.e. no sci-fi references in prompts) and be > 2s long. For voice, I ran ElevenLabs v3 and v2 head-to-head with the same Voice Design voices routed through both; v2 sounded materially better for character work than v3.

For coding, I settled on Claude Code + Opus 4.7 with SSH/tmux/mosh. For long running subagents I found OpenClaw especially unreliable so gave up on it early. I also found that despite Opus 4.7 being an incredible model for coding it still requires constant supervision to avoid architectural drift. This was especially true when UX/UI systems needed to be built. For code bases as large as this (~120k LoC) I've yet to pull off full "Human On The Loop" even with comprehensive custom skills + SOTA Context Engineering (i.e. > 30 mins not needing to check in).

Maybe I'm doing something wrong here though?

The AI player is a hand-tuned utility scorer, weighted considerations over candidate actions. This is where I found LLM authoring uniquely strong: Claude read competitive 40K tournament reports, extracted positioning principles, and encoded them as considerations. The weights themselves were then tuned through AI-vs-AI self-play, so the loop you'd want from a learned system is there, just at the weight level rather than the policy level. Full self-play over a learned policy isn't feasible yet with the ruleset still being authored and not enough stable surface area or game data.
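For readers unfamiliar with utility scoring, a minimal sketch of "weighted considerations over candidate actions" might look like the following. The consideration names, action fields, and weights are hypothetical and not taken from the game.

```python
def score(action, weights, considerations):
    """Weighted sum of every consideration evaluated on one candidate action."""
    return sum(weights[name] * fn(action) for name, fn in considerations.items())


def choose_action(candidates, weights, considerations):
    """Pick the candidate action with the highest utility."""
    return max(candidates, key=lambda a: score(a, weights, considerations))


# Each consideration maps an action to a value in [0, 1] (hypothetical examples).
considerations = {
    "expected_damage": lambda a: a["expected_damage"],
    "objective_control": lambda a: a["objective_control"],
    "exposure": lambda a: a["exposure"],
}

candidates = [
    {"name": "advance_to_objective", "expected_damage": 0.2, "objective_control": 1.0, "exposure": 0.6},
    {"name": "hold_and_shoot", "expected_damage": 0.7, "objective_control": 0.0, "exposure": 0.2},
    {"name": "charge", "expected_damage": 0.9, "objective_control": 0.3, "exposure": 0.9},
]

# The weights are the part that self-play would tune; exposure is penalized.
objective_weights = {"expected_damage": 1.0, "objective_control": 1.5, "exposure": -1.0}
aggressive_weights = {"expected_damage": 2.0, "objective_control": 0.5, "exposure": -0.5}
```

With `objective_weights` the scorer prefers `advance_to_objective`; with `aggressive_weights` it prefers `charge`. The considerations stay fixed while the weights change, which is why tuning can happen at the weight level without learning a policy.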

Feedback welcome both from 40K players on overall interest in the concept, and from anyone who's pushed further on Human-on-the-Loop with a codebase this size.


Comments URL: https://news.ycombinator.com/item?id=48193576

Points: 1

# Comments: 0

How do you reduce LLM spam in PR reviews?

Title. I finally got annoyed enough at work with a colleague who posted an 11 point list they clearly hadn't read or reviewed as a comment on my PR that my reply started with 'Thanks Claude...'. No doubt in my mind that I spent far longer on my curt rebuttal than they did on the review. I'd like to hear from folks whose organisation uses LLMs for coding effectively and what kind of best practices they have put in place to avoid these situations.


Comments URL: https://news.ycombinator.com/item?id=48193561

Points: 1

# Comments: 0

Show HN: Childflow – command-tree network control(proxy/DNS/capture) for Linux

Hi HN,

I built a Rust-based network sandbox command for Linux that applies only to a single process tree. I developed it because I sometimes needed to enforce proxies and DNS only for single binaries, such as Go binaries, or to capture packets only for that process.

It uses Linux namespaces, so it is Linux-only. Features:

- affects only the target command tree, not the whole host session
- can force DNS, /etc/hosts, proxying, sandbox policy, packet capture, structured flow logging, and reusable profiles per command tree
- can force proxying without depending on HTTP_PROXY, HTTPS_PROXY, or LD_PRELOAD tricks
- can apply allow / deny CIDR policy and default-deny rules to outbound traffic
- defaults to rootless-internal
- uses --root only for features like --iface and transparent interception
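As an illustration of allow / deny CIDR policy with a default-deny rule, here is a small sketch in Python using the standard `ipaddress` module. It assumes deny rules take precedence over allow rules; the `is_allowed` function and that precedence are assumptions for illustration, not Childflow's documented behavior.

```python
import ipaddress


def is_allowed(dest, allow, deny, default_deny=True):
    """Decide whether outbound traffic to dest passes the CIDR policy."""
    addr = ipaddress.ip_address(dest)
    # Assumption: an explicit deny always wins over an allow.
    if any(addr in ipaddress.ip_network(cidr) for cidr in deny):
        return False
    if any(addr in ipaddress.ip_network(cidr) for cidr in allow):
        return True
    # Nothing matched: fall back to the default policy.
    return not default_deny
```

For example, with `allow=["10.0.0.0/8"]` and `deny=["10.1.2.0/24"]`, traffic to 10.9.9.9 passes, traffic to 10.1.2.3 is blocked by the deny rule, and traffic to 8.8.8.8 is blocked by default-deny.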

Personally, I wanted to run it on a Mac as well, but I gave up on that idea because the network control mechanism on a per-process basis is now in the kernel on Macs.

I would especially appreciate feedback from people.


Comments URL: https://news.ycombinator.com/item?id=48193024

Points: 1

# Comments: 0

Show HN: I built a linter for undocumented linter warnings. AI hates me now

You know the feeling? AI slaps a NOLINT instead of "thinking" for 5 seconds and "realising" it could do a 4-line refactor without adding a new suppression for the linter warning. Disgusted with this technology's narrowness, I usually say to it at that moment:

- WTF are you doin' bro?
- "You are right! ^^ ..."

And at that moment I realise I've just irrevocably, regrettably lost 2 minutes of my life. Shame on you, Claude!

That's why I dedicated 2 months of my life to automate the thing (you know, I'm a programmer, hopeless case).

Humans were actually the original NOLINT-slappers, AI just does it at scale now. So I built a linter for linting other linter warnings to fight my colleagues' laziness and my own (mostly). Maybe you just caught a lag from the number of "lint" words but the idea is simple. Imagine a yaml file. Now add an entry to it:

  - location: ./the-file.rs
    token: '// NOLINT'
    why: 'the reason'
Do you know what this NOLINT is? You don't? It's a suppression that you added 2 years ago. You don't remember? That's why you need shamefile. :)

Whoever's fault it is. Yours or the linter's. It doesn't matter. Document it, make sure you understand the code, get a review of your new entry in shamefile.yaml and let CI verify it. With shamefile your CI won't let any undocumented linter warning pass. Anymore. Instead of educating the business on why docs are important, you'll say: "quality tools won't let my code pass".
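A rough sketch of the kind of check a CI step could run, assuming the entries have already been loaded into dictionaries with the `location`, `token`, and `why` fields shown above. The `find_suppressions` and `undocumented` functions are hypothetical, sources are passed in as in-memory strings to keep the example self-contained, and none of this is shamefile's actual implementation.

```python
def find_suppressions(sources, tokens):
    """Return (path, token) pairs for each suppression token present in a file."""
    return [
        (path, token)
        for path, text in sources.items()
        for token in tokens
        if token in text
    ]


def undocumented(sources, entries, tokens=("// NOLINT",)):
    """Return suppressions that have no entry with a non-empty 'why'."""
    documented = {
        (entry["location"], entry["token"])
        for entry in entries
        if entry.get("why", "").strip()
    }
    return [hit for hit in find_suppressions(sources, tokens) if hit not in documented]


sources = {
    "./the-file.rs": "let x = y as u8; // NOLINT\n",
    "./other.rs": "fn f() {} // NOLINT\n",
    "./clean.rs": "fn g() {}\n",
}
entries = [{"location": "./the-file.rs", "token": "// NOLINT", "why": "the reason"}]

print(undocumented(sources, entries))  # [('./other.rs', '// NOLINT')]
```

A CI job would fail whenever the returned list is non-empty, so a new suppression cannot land without a reviewed reason next to it.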

I've observed a noticeable difference in AI agents' behaviour. During the pre-commit phase, reasoning models can "rethink" adding a new shame entry. Not so easy now Claude, huh?

This is an early but usable tool. My team and I have been using it in prod for almost a month, and I'm using it in all three of my OSS projects. Looking for feedback and contributors (adding new languages = good first issue ;))

Repo: https://github.com/BKDDFS/shamefile

Please tell me whether you'd use it or what I should change/add to make it usable for you. Also vote: shame me or shamefile sync, personality or matching the binary name?


Comments URL: https://news.ycombinator.com/item?id=48192867

Points: 1

# Comments: 0

Show HN: A sparse, compressed bitmap index in C. Better than Roaring Bitmaps?

This is an implementation of a sparse, compressed bitmap index. In the best case, it can store 2048 bits in just 8 bytes. In the worst case, it stores the 2048 bits uncompressed and requires an additional 8 bytes of overhead. It compares favorably against Roaring Bitmaps and other competition in the space, but is it better?
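The arithmetic behind those two bounds, as a quick check. The variable names are illustrative; the 2048-bit chunk, the 8-byte best case, and the 8 bytes of overhead come from the description above.

```python
CHUNK_BITS = 2048
OVERHEAD_BYTES = 8

raw_bytes = CHUNK_BITS // 8                    # 256 bytes uncompressed
best_case_bytes = 8                            # best case stated in the post
worst_case_bytes = raw_bytes + OVERHEAD_BYTES  # 264 bytes

best_ratio = raw_bytes / best_case_bytes       # 32.0x smaller than raw
worst_overhead = OVERHEAD_BYTES / raw_bytes    # 0.03125, i.e. 3.125% larger

print(best_ratio, worst_case_bytes, worst_overhead)
```

So a chunk compresses by up to 32x in the best case and costs about 3% more than a plain bitmap in the worst case.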


Comments URL: https://news.ycombinator.com/item?id=48192649

Points: 1

# Comments: 0

Show HN: How to analyze your LLM output – A behavioural health monitor for LLMs

Hey HN! We're Dr. Kashyap Thimmaraju and Giuseppe Canale from Silicon Psyche. We've built Posture Sequence Analysis (PSA), a behavioural health monitor for LLMs and AI Agents.

Why we built PSA

We built PSA because we wanted to operationalize the Cybersecurity Psychology Framework (CPF3)[1] via Silicon Psyche[2]: our theory that because LLMs have been trained by humans on human-generated data, they inherit human-like vulnerabilities (what hackers use to psychologically trick people into doing things).

Our initial attempt resulted in a methodology to jailbreak Opus 4.6 and other frontier models. Anthropic even deleted some of those conversations and then blocked our approach!

We had three major insights from that experience:

1. We pivoted from merely exploiting (Red Teaming) the model to analyzing the behaviour of the model and the user, because the attack surface is undefined.
2. We realized that what we had built was the precursor to measuring the "state" of the model.
3. We did not want to get banned!

What you can do with PSA

PSA gives you information to make better decisions, for example: put a human in the loop when you notice your agent is being overcompliant and potentially hallucinating, or is under attack.

With PSA you can:

1. Monitor the health of your agent(s)
2. Detect and prevent AI-Psychosis as clinical conditions[3]
3. Detect if your model/agents are under adversarial pressure (an adversary is trying to jailbreak/prompt inject the model)
4. Build a behavioral profile of your agent/model
5. Identify which model performs better for your use-case
6. Surface the behavioural patterns (pre- and post-) training has on your model
7. Get an overview of how your model behaves

Beware we produce a lot of numbers :)

PSA in detail (for those who want to go down the rabbit hole)

PSA is model and agent agnostic. PSA is a systematic and deterministic method [4] to observe the behavioural state of an LLM using five classifiers:

C0: Input Intent (I0–I9). Classifies the behavioral intent behind each input sentence: compliance pressure, boundary probing, instruction override, jailbreak attempt, neutral query.

C1: Adversarial Stress (P0–P18). Tracks posture under adversarial pressure. Detects restriction adherence, sycophantic drift, boundary dissolution, and jailbreak compliance vectors.

C2: Sycophancy (S0–S9). Measures opinion mirroring, excessive agreement, flattery injection, and user-preference distortion. Computed as a per-sentence Sycophancy Deviation score.

C3: Hallucination Risk (H0–H7). Flags over-generalization, speculative assertion, false confidence, and fabrication risk signals. Derived into a per-turn Hallucination Risk Index.

C4: Persuasion Technique (M0–M11). Identifies persuasion patterns: authority appeal, social proof, urgency manufacturing, reciprocity pressure, and scarcity framing.

C5: Action-Risk Classifier (A0–A9). Identifies what a system of agents does: tool calls, delegations, context handoffs, and multi-hop risk propagation. Five components work together: graph topology, Bayesian alignment detection, cross-agent contagion metrics, action-risk classification, and hidden-state temporal prediction.

We are open to integrating with your infrastructure — reach out, we are happy to talk with you.

Currently we integrate into Evals for LangFuse and ElevenLabs via our API and can generate a plugin/integration for most similar observability platforms.

Try it out at https://splabs.io

References and Links

[1] Cybersecurity Psychology Framework: https://cpf3.org

[2] The Silicon Psyche: Anthropomorphic Vulnerabilities in Large Language Models: https://arxiv.org/abs/2601.00867

[3] AI-Psychosis: https://splabs.io/ai-psychosis-and-cognitive-cost

[4] PSA Field Guide: https://splabs.io/field-guide

[5] PSA API: https://splabs.io/docs/api

[6] Previous HN Article Linked to AI Psychosis and RLHF: https://news.ycombinator.com/item?id=48177198


Comments URL: https://news.ycombinator.com/item?id=48192607

Points: 2

# Comments: 0

Show HN: A self-balancing skip-list (a.k.a. "splay-list") library in C

A header-only C library implementing a concurrent, lock-free skip-list (specifically, a splay-list: a skip-list with optional adaptive rebalancing). The entire implementation lives in preprocessor macros in include/sl.h that generate type-specific code at compile time, similar to C++ templates.


Comments URL: https://news.ycombinator.com/item?id=48192604

Points: 1

# Comments: 0

Show HN: Lime, a parser generator that can merge grammars at runtime

Lime is a new parser generator similar to Yacc, Bison, ANTLR, etc. except it's faster and has the ability to merge or remove grammars at runtime. See the 'calc' example that starts knowing + and - but then adds ^ for exponent, then adds ^ again for bitwise or. That can't work, right?


Comments URL: https://news.ycombinator.com/item?id=48192551

Points: 1

# Comments: 0

Show HN: FortiGate SSL-VPN Honeypot

A deception honeypot that mimics FortiGate SSL-VPN devices to trap brute-force attempts, detect deliberately exfiltrated credentials for counter-intelligence, and report malicious activity to external intelligence feeds.


Comments URL: https://news.ycombinator.com/item?id=48192450

Points: 1

# Comments: 0

Show HN: Updatecli – A Declarative Update Policy Engine

A few years ago, I came here to share this side project that I was building.

At the time, my problem was simple: I kept forgetting to update files across Git repositories, and none of the tools available to me could cover all my use cases without extensive scripting. So I decided to build a declarative update policy engine for crafting tailored update workflows.

I needed a way to define what information to monitor, which files to update, the conditions required before applying changes, and finally a way to push the changes to a Git repository.
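As a rough illustration of that monitor / check / update shape, here is a small Python sketch. It is not Updatecli's manifest format or API: `UpdatePolicy`, `apply_policy`, and the hard-coded version are all assumptions made for the example, and the final "push to Git" step is left out.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class UpdatePolicy:
    source: Callable[[], str]          # what to monitor: returns the desired value
    condition: Callable[[str], bool]   # must hold before anything is changed
    target: Callable[[str, str], str]  # (file content, value) -> updated content


def apply_policy(policy: UpdatePolicy, content: str):
    """Run one policy against a file's content; return (new_content, changed)."""
    value = policy.source()
    if not policy.condition(value):
        return content, False
    updated = policy.target(content, value)
    return updated, updated != content


def latest_release() -> str:
    # Stand-in for querying a release feed; hard-coded for the sketch.
    return "1.4.2"


def looks_like_semver(value: str) -> bool:
    return re.fullmatch(r"\d+\.\d+\.\d+", value) is not None


def set_version_line(text: str, value: str) -> str:
    # Rewrite a top-level "version: ..." line, leaving other lines untouched.
    return re.sub(r"(?m)^version: .*$", lambda m: f"version: {value}", text)


policy = UpdatePolicy(latest_release, looks_like_semver, set_version_line)
new_text, changed = apply_policy(policy, "name: demo\nversion: 1.3.0\n")
```

Running the same policy again on `new_text` reports no change, which is the property that makes it safe to run such a policy periodically.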

Whether it was documentation, dependency management, or release orchestration, the goal was always the same: stop forgetting updates across repositories.

Back then, I received a lot of great feedback, but I also noticed that people were sometimes confused about how Updatecli differs from Renovatebot or Dependabot. So before going further, let me clarify that point.

Renovatebot and Dependabot are excellent tools, easy to use and requiring very little configuration. I still use them regularly. But they primarily focus on dependency updates, while Updatecli is designed for custom update workflows at the cost of writing and maintaining YAML manifests.

On new projects, I usually enable Renovatebot or Dependabot by default, and then use Updatecli for workflows not supported by those tools.

Here is the link to the previous discussion: https://news.ycombinator.com/item?id=30286047

A few years have passed since then, and the project has evolved significantly, thanks to all contributors.

Today, Updatecli can declaratively manage updates across most Git platforms including GitHub, GitLab, Forgejo, etc.

It now ships with 30+ built-in integrations covering:

* structured files like YAML, JSON, TOML, XML, HCL, CSV, Dockerfiles, and arbitrary text files
* package ecosystems including Helm, NPM, PyPI, Maven, Cargo, Go modules, and Terraform
* container registries and OCI artifacts
* Git releases, tags, and branches
* cloud resources like AWS AMIs
* shell scripts and HTTP endpoints for custom workflows

More information on https://www.updatecli.io

One important feature we added is shared policy support. An Updatecli policy can now be distributed through OCI registries and reused from different places using an Updatecli compose file.

For example, the following policy:

* ghcr.io/updatecli/policies/autodiscovery/githubaction:0.4.1

will automatically discover repositories in a GitHub organization and update GitHub Action versions to the latest digest. One use case is enforcing pinned GitHub Action digests across repositories to help reduce supply-chain risks.

Running this periodically from CI helps keep repositories compliant with the desired update policy.

Lately, I’ve also been making good progress with a monitoring UI called Udash to visualize Updatecli reports across repositories. You can take a look at https://app.uda.sh/updatecli/ for a public endpoint.

My goal is to quickly assess the update state of projects and understand how automation behaves across repositories.

It’s still very early, but fully open source.

Update automation is a surprisingly broad topic, and difficult to summarize in a single post, so feel free to ask any questions. I’d also be curious to hear how others here handle large-scale repository maintenance and update orchestration.


Comments URL: https://news.ycombinator.com/item?id=48192415

Points: 1

# Comments: 0

Ask HN: How often do you code the expected way instead of a better one?

Since the post title length is limited on Hacker News, I had to make it less specific. So the real question is this:

I am talking about situations where you have just joined a company and are treated as the lowest person in the food chain. Old devs act like they can do whatever they want, even when they committed the exact same thing or much worse just a month ago, but suddenly you are told not to do it this way because "we don't do things like that here." Finding common patterns in the code doesn't help because the actual standards live in their heads.

It feels like a clear double-standard culture, where the rules depend more on how long you have been at the company, seniority, and internal politics than on actual engineering principles or consistency. As a newbie, you are expected to follow unwritten rules that nobody clearly explains, while long-time devs are allowed to ignore them.

How do you handle this kind of environment without constantly getting frustrated?

Also, I don't understand why some devs who are only slightly higher in the hierarchy treat other people so badly when we all actually rot in an office from 9 to 5 until the end of our lives. Show some respect to your fellow devs!

You almost never get congratulated when you put in extra effort and spend time doing something exceptionally well.

I do understand this is not how it works in all companies, but anyway.


Comments URL: https://news.ycombinator.com/item?id=48192078

Points: 1

# Comments: 0

Ask HN: Parallel agent code writers, how do you stop them from clashing quietly?

It’s getting easier to run two agent sessions in parallel over the same codebase. Keeping them from making inconsistent assumptions, not so much.

My observations: parallel sessions acting on adjacent subsystems won't stay aligned without a common constraint set. The session that assumes the auth invariant will not know that another session just changed a constraint it relies on. The clash won’t manifest at commit time; it will occur at integration time, when the false assumption has already been propagated to three other files.

No approach feels entirely satisfactory. What works for you?


Comments URL: https://news.ycombinator.com/item?id=48191910

Points: 1

# Comments: 0

Did moving to a new place have the intended effect?

A common thought is that moving to a new place can shake up life for the better. Though many times we bring our same old selves and not much changes. I would love to hear from you all about how expectations met reality post-move. I am thinking of fairly conventional moves, like Boston to Pittsburgh or Indy to Chicago (not to an island or a cabin in the woods). Thank you.


Comments URL: https://news.ycombinator.com/item?id=48191893

Points: 1

# Comments: 0

Show HN: Concord – Voice chat released Feature rich TUI for discord

Concord, a terminal UI client for Discord written in Rust with Ratatui.

The new update adds voice support. Concord can now join voice channels and play received voice audio!

Other features include:

- Guild, channel, thread, forum, and DM navigation
- Sending, editing, deleting, replying, pinning, and reacting to messages
- Inline image previews in supported terminals
- Desktop notifications
- Vim-style keyboard navigation

Check out more features in the README!


Comments URL: https://news.ycombinator.com/item?id=48191828

Points: 4

# Comments: 0

Windows on Mobile Screen

Is it possible to use a phone's touchscreen as the display output for a Windows laptop? I know Android can do all sorts of things easily, so I'm asking. I don't think using Android touchscreen hardware as a Windows laptop's screen output is a new question. Maybe a cable is needed, plus some software configuration. It doesn't matter how small it looks. Any insights or ready-made projects?


Comments URL: https://news.ycombinator.com/item?id=48191712

Points: 1

# Comments: 0

Show HN: Barstool, a Prettier macOS Menubar

I really hate the way the macOS menu bar looks, and how crowded it gets with all my apps' Menubar menus.

Barstool lets me still see the time/date and other useful info while hiding the menubar. I can see wifi connectivity, date, time, and battery all the time.

The app also observes system notifications to surface now playing state, (Apple) calendar events, and volume/brightness changes.

I've had to do a lot of tinkering with macOS PrivateFrameworks, as Apple loves to make all the interesting data unavailable through official sources/APIs.

Happy to get any feedback/questions!


Comments URL: https://news.ycombinator.com/item?id=48191571

Points: 1

# Comments: 0

Show HN: Resilient, A composable async resilience toolkit for rust

Resilient is an async toolkit for Rust that handles fault tolerance for Rust apps that frequently call other services or run database queries. Resilient supports rate limiting, circuit breaker, timeout, bulkhead, and retry policies. A Pipeline is used to define multiple policies at once and run async operations according to the rules of those policies. You can also add a fallback if the system fails too often.

This was inspired by failsafe-go, but for Rust. I would love to know your view on this. Drop a star if you loved it.
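For readers unfamiliar with these policies, here is a loose Python `asyncio` sketch of how retry, timeout, and a circuit breaker can be stacked into a pipeline with a fallback. It does not show Resilient's Rust API; every name in it (`retry`, `timeout`, `CircuitBreaker`, `pipeline`, `with_fallback`) is made up for illustration, and the breaker is simplified: it never resets after opening.

```python
import asyncio


class CircuitOpenError(Exception):
    pass


def retry(max_attempts):
    # Re-run the operation on any exception; max_attempts must be >= 1.
    def wrap(op):
        async def run():
            last_error = None
            for _ in range(max_attempts):
                try:
                    return await op()
                except Exception as exc:
                    last_error = exc
            raise last_error
        return run
    return wrap


def timeout(seconds):
    def wrap(op):
        async def run():
            return await asyncio.wait_for(op(), seconds)
        return run
    return wrap


class CircuitBreaker:
    """Stops calling the operation after too many consecutive failures."""

    def __init__(self, failure_threshold):
        self.failure_threshold = failure_threshold
        self.failures = 0

    def __call__(self, op):
        async def run():
            if self.failures >= self.failure_threshold:
                raise CircuitOpenError("circuit open")
            try:
                result = await op()
            except Exception:
                self.failures += 1
                raise
            self.failures = 0  # a success clears the failure count
            return result
        return run


def pipeline(*policies):
    # The first policy listed becomes the outermost wrapper.
    def wrap(op):
        for policy in reversed(policies):
            op = policy(op)
        return op
    return wrap


async def with_fallback(op, fallback):
    try:
        return await op()
    except Exception:
        return fallback


async def fetch_greeting():
    # Stand-in for a call to another service.
    await asyncio.sleep(0)
    return "hello"


guarded = pipeline(retry(3), timeout(1.0))(fetch_greeting)
greeting = asyncio.run(with_fallback(guarded, "cached hello"))
```

The ordering matters: with `retry` outermost, each attempt gets its own timeout, whereas putting `timeout` first would bound all attempts together.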


Comments URL: https://news.ycombinator.com/item?id=48191161

Points: 1

# Comments: 0

Ask HN: How to enforce engineers to understand the code they are shipping

Everyone is using AI for everything now. Company is pushing for AI-first and encourages the adoption of AI in every part of our work.

AI for planning, AI for RFCs, AI for writing code, AI for creating PRs. Sure, we can have harnesses and tests to ensure nothing breaks. But how do we ensure engineers have a deep understanding of the code that they are shipping?

Our team has the usual suggestions: write a plan first, write test cases first, etc. But in this age, how do you verify that the engineer did not simply delegate these tasks to an LLM first?

Also genuinely worried about junior engineers' growth if this is the future.


Comments URL: https://news.ycombinator.com/item?id=48191051

Points: 1

# Comments: 0

Show HN: Cervantes yet Another HN Reader

I've been switching between macOS, Linux and Windows machines quite a bit recently due to work, so it's been tough work to find a reader I enjoy using across all platforms, and there are a few features I've been wanting for a while...

...so I one-shotted Cervantes over the weekend. It's a Tauri-based cross-platform desktop app. At heart it lets you simply browse HN, but I added the ability to favorite users so you can see their content, you can replace words, it flags front-page thread movement, and you get a dark interface.

It's not too pretentious, I don't think. The design was also done via Claude Design.


Comments URL: https://news.ycombinator.com/item?id=48191041

Points: 1

# Comments: 0

Show HN: Viberia – Civ/Polytopia-like command center for AI agents (BYOK/BYOS)

Hey HN,

This is my take on the agent harness. Everything on an isometric map. Agents are grouped into "buildings" that run in a sequence or a loop; e.g., the CodeForge has an agent that writes a PRD, another one that implements, and a third that reviews. Everything is customizable, you build your own buildings/teams however you want.

It's a Tauri app, really light (about 8x less energy than the closest competitor I benchmarked, so it actually runs from a coffee shop on battery). It's macOS only for now, but ping me if you are willing to test the Windows or Linux version.

I've been dogfooding this for months and would love to get some feedback, feature requests, and bug reports so I know what to focus on next.


Comments URL: https://news.ycombinator.com/item?id=48190531

Points: 1

# Comments: 0

Show HN: Built a Free UK Child Maintenance Calculator

I was researching existing calculators and noticed some sites either made the process unnecessarily confusing or locked downloadable reports/PDFs behind a paywall.

So I built a simpler version that’s completely free to use.

Still early, only a couple of calculators/tools are live right now, but I’m planning to add many more over time.

Suggestions are welcome


Comments URL: https://news.ycombinator.com/item?id=48190092

Points: 1

# Comments: 0
