<?xml version="1.0" encoding="UTF-8"?><feed xmlns="http://www.w3.org/2005/Atom">
  <title>Serious Computer Business</title>
  <id>https://octet-stream.net/b/scb/</id>
  <updated>2026-05-13T00:02:38Z</updated>
  <link href="https://octet-stream.net/b/scb/"></link>
  <author>
    <name>Thomas Karpiniec</name>
    <email>tom.karpiniec@outlook.com</email>
  </author>
  <entry>
    <title>MIE Soft Mode</title>
    <updated>2026-01-30T10:27:53+11:00</updated>
    <id>urn:uuid:988cc006-feed-4609-9d60-c035f976506f</id>
    <content type="html">&lt;p&gt;&#xA;I &lt;a href=&#34;/b/scb/2026-01-14-difficulties-enabling-apples-mie.html&#34;&gt;wrote previously&lt;/a&gt; that I was having difficulty making Apple&#39;s Memory Integrity Enforcement feature do what it says on the tin. After getting some help from Eskimo &lt;a href=&#34;https://developer.apple.com/forums/thread/811668&#34;&gt;on the developer forums&lt;/a&gt; I&#39;m pleased to report that it does work. Somewhat. In its current form it&#39;s not as impressive as I hoped, and I feel like there are still some bugs lurking.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;a href=&#34;https://octet-stream.net/assets/mie/xcode-mte-sync-fault.png&#34;&gt;&lt;img src=&#34;https://octet-stream.net/assets/mie/xcode-mte-sync-fault.png&#34; style=&#34;width:100%&#34;&gt;&lt;/a&gt;&#xA;&#xA;&lt;p&gt;&#xA;The thing that tripped me up most is that we&#39;re not allowed to use &#34;hard mode&#34; yet. This means it&#39;s impossible to reproduce exactly what was demoed in Apple&#39;s video. It turns out this was called out in the &lt;a href=&#34;https://developer.apple.com/documentation/xcode-release-notes/xcode-26_1-release-notes&#34;&gt;Xcode 26.1.1 release notes&lt;/a&gt;, which I guess I should be reading if I want to play with brand new features:&#xA;&lt;/p&gt;&#xA;&#xA;&lt;blockquote&gt;&#xA;When enabling Hardware Memory Tagging under Enhanced Security (Capabilities editor -&gt; Enhanced Security -&gt; Memory Safety -&gt; Enable Hardware Memory Tagging), all applications will currently run under Soft Mode irrespective of the Soft Mode for Memory Tagging option.&#xA;&lt;/blockquote&gt;&#xA;&#xA;&lt;p&gt;&#xA;What this means in normal English is that, for now, MIE will not actually protect anything in your app unless you&#39;re running under the Xcode debugger with a special setting enabled. It will observe the memory corruption, allow it to proceed, and log a fake crash report with the backtrace. 
This is not completely terrible&amp;mdash;as a developer you will at least be able to find out when corruption is occurring so you can put out a fix. Sadly, we&#39;re not yet at the promise of &#34;your phone would sooner terminate the app than allow memory corruption to occur&#34;, which is what you actually want for protection from malicious parties. Watch this space, I guess.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Even this new information doesn&#39;t fully explain what I saw a couple of weeks ago, however. When I was launching the app outside the debugger I had tonnes of crash reports accumulating on my phone but I didn&#39;t see any MTE ones. This was partly my oversight: once I had followed Eskimo&#39;s instructions to produce a valid simulated crash and knew exactly what it looked like, I trawled through carefully one at a time and found that I had previously triggered soft mode in two cases, in amongst a deluge of regular memory corruption crashes where it seemed MTE wasn&#39;t active at all. I&#39;m not surprised I was confused. When I went back to my original app yesterday, and soft mode worked, I hadn&#39;t changed a single thing. My macOS, iOS and Xcode are still on the same versions.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;So, who knows? Maybe there&#39;s an intermittent bug registering the entitlements properly. In any case I think I&#39;m done playing with MIE for now. It&#39;s a neat feature. I hope everyone finds lots of bugs with it, and we get true hard mode soon.&#xA;&lt;/p&gt;&#xA;&#xA;</content>
    <link href="https://octet-stream.net/b/scb/2026-01-30-mie-soft-mode.html" rel="alternate"></link>
    <author>
      <name>Thomas Karpiniec</name>
      <email>tom.karpiniec@outlook.com</email>
    </author>
  </entry>
  <entry>
    <title>Standard Search: a search engine for standard.site posts</title>
    <updated>2026-01-31T15:03:39+11:00</updated>
    <id>urn:uuid:025a42a5-69ce-4087-867b-ca1e62f57e85</id>
    <content type="html">&lt;p&gt;&#xA;The other day I put a search engine online which attempts to index all articles in the Atmosphere conforming to the &lt;a href=&#34;https://standard.site/&#34;&gt;standard.site&lt;/a&gt; lexicon: &lt;a href=&#34;https://standard-search.octet-stream.net/&#34;&gt;Standard Search&lt;/a&gt;&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;In practice this mostly covers people blogging on sites like &lt;a href=&#34;https://pckt.blog/&#34;&gt;pckt.blog&lt;/a&gt; or &lt;a href=&#34;https://leaflet.pub/&#34;&gt;leaflet.pub&lt;/a&gt; but also independent websites like mine that have gone out of their way to integrate with ATProto. This is rapidly growing in popularity&amp;mdash;when I first launched the search engine a few days ago there were around 3900 indexed documents and today there are 4122.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;This was a relatively easy job as far as search engines go. Between the ATProto firehose and the relay&#39;s &lt;a href=&#34;https://docs.bsky.app/docs/api/com-atproto-sync-list-repos-by-collection&#34;&gt;ability to list all repos with a particular collection&lt;/a&gt;, there is no great technical hurdle to find all the standard.site records that exist on the network and the process is entirely automated by &lt;a href=&#34;https://github.com/bluesky-social/indigo/tree/main/cmd/tap&#34;&gt;tap&lt;/a&gt;. This means that no crawling is required for discovery.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Tap is a standalone service that&#39;s designed to be consumed over an HTTP API, which means you can use it from basically any programming environment. Since I&#39;m a fan of Rust I previously built the library &lt;a href=&#34;https://crates.io/crates/tapped&#34;&gt;tapped&lt;/a&gt;. 
This hides the fiddly details of running tap, making requests and parsing the JSON events.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;The search and indexing part wasn&#39;t particularly tough either because I could use &lt;a href=&#34;https://crates.io/crates/tantivy&#34;&gt;tantivy&lt;/a&gt;. This is basically a pure Rust equivalent of Lucene and it works incredibly well. It even handles generating the text snippets that highlight where the keywords appear. For documents that don&#39;t include any content in the AT record I scrape the HTML and run it through &lt;a href=&#34;https://crates.io/crates/dom_smoothie&#34;&gt;dom_smoothie&lt;/a&gt; to get the text. The existing AT blogging platforms all seem to have &lt;code&gt;plaintext&lt;/code&gt; fields in their &lt;code&gt;content&lt;/code&gt; data so I extract and index those.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;The main part that doesn&#39;t come off-the-shelf is verification. Standard.site has a &lt;a href=&#34;https://standard.site/#verification&#34;&gt;two-part verification check&lt;/a&gt;. The AT record points to the page&#39;s URL, and the HTML at that URL needs to have a &lt;code&gt;&amp;lt;link&amp;gt;&lt;/code&gt; tag pointing back to that same AT record to prove ownership. There&#39;s also a publication record which is verified against a &lt;code&gt;.well-known&lt;/code&gt; URL. Therefore, to guard against forged records, a bit of scraping is required for each discovered post. Of course, at this early stage nobody (to my knowledge) is actually trying to forge anything. Verification failures mostly occur because someone didn&#39;t realise that they needed to add &lt;code&gt;&amp;lt;link&amp;gt;&lt;/code&gt; tags. (If you want to check your own implementation I built &lt;a href=&#34;https://site-validator.fly.dev/&#34;&gt;an online validator&lt;/a&gt;.)&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Then there is the matter of validating the standard.site records themselves. 
Technically I don&#39;t need to do this: if I&#39;m able to parse out the fields I need and the scraping checks succeed, then that would be enough to index the documents. However, in the spirit of encouraging correctness the search engine is actually pretty pedantic. It uses &lt;a href=&#34;https://crates.io/crates/jacquard-lexicon&#34;&gt;jacquard-lexicon&lt;/a&gt; to validate each record, confirming that it matches the lexicon both structurally (having the right fields) and in terms of constraints. In practice these constraints are sometimes not observed, particularly the maximum number of graphemes in a post&#39;s description field (300). Documents that don&#39;t conform exactly are skipped from indexing.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;One unknown at this stage is when there will be problems with spam or otherwise problematic content showing up in the index. I speculate that strict verification might keep a lid on this for a while. It&#39;s a little bit fiddly to get it right on your own and if a spammer is using a hosted platform they&#39;re likely to get kicked off anyway. I&#39;ll decide what to do if and when that happens.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;One mildly embarrassing thing: when I announced this I didn&#39;t realise that there already was a search engine for standard.site posts: &lt;a href=&#34;https://pub-search.waow.tech/&#34;&gt;pub-search.waow.tech&lt;/a&gt; by &lt;a href=&#34;https://bsky.app/profile/zzstoatzz.io&#34;&gt;@zzstoatzz.io&lt;/a&gt;. That&#39;s no reason not to have a go at making my own but it might have been nice of me to acknowledge that it wasn&#39;t a new concept.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Now that it&#39;s there, there are more things I could play with when I have time: language detection and filtering, vector search, things like that. We&#39;ll see what happens.&#xA;&lt;/p&gt;&#xA;&#xA;</content>
    <link href="https://octet-stream.net/b/scb/2026-01-31-standard-search-for-standard-site-posts.html" rel="alternate"></link>
    <author>
      <name>Thomas Karpiniec</name>
      <email>tom.karpiniec@outlook.com</email>
    </author>
  </entry>
  <entry>
    <title>Promoting use of fine-grained PATs</title>
    <updated>2026-04-11T06:08:51Z</updated>
    <id>urn:uuid:00337a02-b7ee-433b-b0e8-ad125637978d</id>
    <content type="html">&lt;p&gt;&#xA;Software development is becoming an increasingly risky business. Supply chain attacks are more frequent than ever, and those of us using agentic LLMs run the risk that one will automatically add a dependency that contains malware. We&#39;re perfectly capable of doing that without LLMs of course (see typosquatting) but it&#39;s riskier when an LLM can add the dependency, download it, and run it, all before you get a chance to realise what it&#39;s doing.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;As a result many of us are making an effort to isolate our development environments. This means both isolation between projects and isolation from our &#34;main computer&#34; that we&#39;re using to do the work, which probably contains most of our personal data.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;If you run your development in a container or on a remote SSH server there are two things to worry about if malware takes over: the code itself if your repo is private or proprietary, and the SSH key or token that you&#39;re using to communicate with the git remote. The most popular choice is an SSH key, which can be stored with varying levels of security. In practice, in many situations malware will be able to make use of that key, either by exfiltrating an unencrypted key or by using the SSH agent.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;The trouble with a service like GitHub is that your SSH keys are user-scoped. If someone gets your key then they have access to all of your repos, along with all of those shared with you by other users and organisations. 
If you&#39;re trying to isolate your environment it&#39;s kind of upsetting if the GitHub authentication in that environment grants a wider scope of permissions than it needs to.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;GitHub does have three features for narrower permissions, but they all have their own issues.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;ol&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens&#34;&gt;Fine-grained personal access tokens&lt;/a&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://docs.github.com/en/authentication/connecting-to-github-with-ssh/managing-deploy-keys#deploy-keys&#34;&gt;SSH Deploy Keys&lt;/a&gt;&#xA;&lt;li&gt;&lt;a href=&#34;https://github.com/features/codespaces&#34;&gt;Codespaces&lt;/a&gt;&#xA;&lt;/ol&gt;&#xA;&#xA;&lt;p&gt;&#xA;Codespaces are convenient for some projects but they&#39;re costly and the organisation that owns the repo has to pay for them. SSH Deploy Keys are a repo-level mechanism not designed for regular development workflows. Some people use them for this purpose but it&#39;s going against the grain.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Fine-grained PATs are great because you can set exactly which repos and capabilities are permitted for that token, but there is a wrinkle. If you&#39;re working on another organisation&#39;s repo then by default you can&#39;t create a PAT on your own&amp;mdash;you have to submit a request that an org admin needs to review and approve. The good news is that there&#39;s an option to turn that off, which means that developers are free to create tokens with the narrowest required permissions for each situation.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;If you&#39;re a developer, consider asking for the ability to use PATs freely. 
If you&#39;re an organisation, consider changing the setting from the default so that contributors are able to use PATs proactively and independently.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;I think it would be neat if GitHub allowed you to restrict specific SSH keys to particular organisations or repos. For personal use, honestly my favourite thing is to use Codespaces. It feels a little bit silly not using the computational power of my laptop but it automatically creates repo-scoped tokens so it&#39;s a pretty excellent level of isolation with very little effort.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;a href=&#34;https://octet-stream.net/assets/pats.png&#34;&gt;&lt;img src=&#34;https://octet-stream.net/assets/pats.png&#34;&gt;&lt;/a&gt;&#xA;&#xA;</content>
    <link href="https://octet-stream.net/b/scb/2026-04-11-promoting-use-of-fine-grained-pats.html" rel="alternate"></link>
    <author>
      <name>Thomas Karpiniec</name>
      <email>tom.karpiniec@outlook.com</email>
    </author>
  </entry>
  <entry>
    <title>Yes, yes, I&#39;m one of those AI users now</title>
    <updated>2026-05-08T01:40:06Z</updated>
    <id>urn:uuid:b06e4b81-de75-44cf-8b21-bb34c1fc08a9</id>
    <content type="html">&lt;p&gt;&#xA;It&#39;s been a weird year. Last May I was working full-time for a client when they announced that they wanted to try Claude Code. Suddenly, I was furnished with an account and encouragement to try it out. I typed some clumsy prompts into Sonnet 3.5 and it had a clumsy go at writing Rust. It wasn&#39;t that great for my typical development work but I found a few use cases where it could solve annoying or tedious problems.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Fast forward to now. New models are extremely capable and my job is infused with AI every day. Whether I&#39;m writing code, debugging, understanding a new codebase, or scripting an unfamiliar toolchain, the robot helps. I&#39;ve been doing this kind of work for a while. I can state confidently that I&#39;m at least three times faster for similar or better technical output. The benefits flow directly to my clients&amp;mdash;higher quality for fewer billable hours.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;I&#39;m sharing this because I owe this blog an update. The last time I discussed AI directly was early last year and it would be fair to describe me as a hater. I predicted some of the controversies around &lt;a href=&#34;/b/scb/2025-04-21-ai-a-fork-in-the-road-for-open-source.html&#34;&gt;open source contributions&lt;/a&gt; and &lt;a href=&#34;/b/scb/2025-04-30-they-paved-paradise-and-put-up-a-bunch-of-slop.html&#34;&gt;preserving human communities online&lt;/a&gt; and I was pretty bummed about the inevitability of it all. I assumed I would be part of the tribe writing code the old way. Not for any particular reason&amp;mdash;just to stick it to the people who messed up those things that I like. Clearly, that&#39;s not how things played out.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Thing is, I like being paid to write software. 
No sane business is going to pay me to spend two days typing out a Rust module that a frontier LLM can whip up in five minutes, even if it takes me half an hour to review. There&#39;s a great deal of work and judgment around that coding, which is why I still have a job, but it&#39;s crystal clear that if I rejected LLMs entirely I&#39;d be less employable. I&#39;m a consultant, which puts me at the pointy end of these discussions. If I want to get paid I need to meet the market. For now this remains more attractive than becoming a plumber.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Unfortunately, it appears that AI is going to be pretty disruptive to society. Copyright concerns are being waved away and our leaders are clearly more interested in having AI than not. I live in a democracy, flawed as it is, and the outcome is what it is. Regulations around misinformation and economic support for affected people will matter a great deal. Immiserating myself in protest will not, although it might get upvotes on some corners of the internet.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Note that I&#39;m only talking about employment or contracting here. If you&#39;re writing and selling software by yourself then you can take as much time as you want, given sufficient cash flow. If you&#39;re writing hobby open source software then anything goes&amp;mdash;write it however you like, and police contributions however you like. No arguments from me.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Of course, the LLMs won&#39;t stop here. Right now an experienced developer provides the big picture thinking and a great deal of supervision. Even among those who&#39;ve taken up AI there&#39;s a widespread view that &#34;it&#39;s just another tool&#34; or &#34;there&#39;ll always need to be someone who really understands the computers&#34;. It&#39;s both comforting, and the most appropriate way to use this tech today with its current capabilities. 
I also suspect it&#39;s wrong in the long term.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Models are already quite smart. Given the right prompt and context they can do wild things. As long as my work is primarily looking at a screen and typing things on a keyboard the LLMs are still coming for me.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Make hay while the sun shines, I guess?&#xA;&lt;/p&gt;&#xA;&#xA;</content>
    <link href="https://octet-stream.net/b/scb/2026-05-08-yes-yes-im-one-of-those-ai-users-now.html" rel="alternate"></link>
    <author>
      <name>Thomas Karpiniec</name>
      <email>tom.karpiniec@outlook.com</email>
    </author>
  </entry>
  <entry>
    <title>Mythos and Legends</title>
    <updated>2026-05-12T11:23:29Z</updated>
    <id>urn:uuid:6ac5f9dc-deac-4b58-9f50-4fad8bfba5a0</id>
    <content type="html">&lt;p&gt;&#xA;A recent pastime for me has been reading the reports coming out from &lt;a href=&#34;https://www.anthropic.com/glasswing&#34;&gt;Project Glasswing&lt;/a&gt;. As you&#39;re probably aware by now, that&#39;s the scheme where Anthropic is permitting various companies and open source projects to scan their code for security vulnerabilities using the fancy new model called Mythos.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;This all kicked off with &lt;a href=&#34;https://red.anthropic.com/2026/mythos-preview/&#34;&gt;a detailed post&lt;/a&gt; from Anthropic&#39;s own security researchers. They suggested that by evolving and improving its general capabilities, Mythos became dramatically better than previous models at locating and stringing together bugs to create working exploits. One of their proudest scalps was a 17-year-old RCE bug in FreeBSD&#39;s NFS implementation, and Mythos was apparently also adept at browser exploitation. Despite the overall sombre tone, they still found an opportunity to link to James Mickens&#39; &lt;a href=&#34;https://www.usenix.org/system/files/1311_05-08_mickens.pdf&#34;&gt;&lt;em&gt;The Night Watch&lt;/em&gt;&lt;/a&gt;, a piece that deserves to be linked more often.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Recently I&#39;ve been reading more LinkedIn (I know, I know) and it&#39;s been pretty funny watching this play out. First there was the marvelling and FUD concerning the new capabilities, then others came out to trumpet that it was pure marketing. This second group was reaching peak intensity when Mozilla published.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;&lt;a href=&#34;https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/&#34;&gt;271 vulnerabilities identified by Mythos&lt;/a&gt; and fixed. 
They later posted &lt;a href=&#34;https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/&#34;&gt;more details about their process&lt;/a&gt; and also updated their count to 423 security bugs in April (achieved via various means). Remember that Tor Browser is based on Firefox? Probably a good thing if that one&#39;s secure. Thanks, Anthropic.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;&lt;a href=&#34;https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier&#34;&gt;AISLE came out with a provocative post&lt;/a&gt; that they were able to find some of the same vulnerabilities using less capable models. This was interesting, but much of the trick was isolating the relevant buggy part so that the model wouldn&#39;t be distracted by everything else. I&#39;ll come back to this in a minute.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;&lt;a href=&#34;https://www.wolfssl.com/how-claude-mythos-preview-helped-harden-wolfssl/&#34;&gt;wolfSSL also made an announcement&lt;/a&gt;&amp;mdash;8 new CVEs.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;blockquote&gt;&#xA;So, is Mythos as good as the hype? On our codebase, yes.&#xA;&lt;/blockquote&gt;&#xA;&#xA;&lt;p&gt;&#xA;Now curl has had the chance to get checked out by Mythos. &lt;a href=&#34;https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/&#34;&gt;Apparently it has found&lt;/a&gt; one low-severity vulnerability (not memory safety) and around 20 non-security bugs.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;Daniel Stenberg&#39;s comments about AI are pretty interesting in general. 
He&#39;s been through a lot in recent months&amp;mdash;AI slop on curl&#39;s bug bounty program eventually causing him to &lt;a href=&#34;https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-bug-bounty/&#34;&gt;shut it down&lt;/a&gt;, then later reporting that &lt;a href=&#34;https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/&#34;&gt;the security reports are getting quite good actually&lt;/a&gt; (due to AI), and now the Mythos thing.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;He said this about the latest work:&#xA;&lt;/p&gt;&#xA;&#xA;&lt;blockquote&gt;&#xA;My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.&#xA;&lt;/blockquote&gt;&#xA;&#xA;&lt;p&gt;&#xA;I feel he&#39;s being modest about the absence of bugs&lt;a id=&#34;footnote-1-ref&#34; href=&#34;#footnote-1&#34;&gt;[1]&lt;/a&gt;, or at least avoiding hubris. Curl is a heavily scrutinised codebase that&#39;s already had significant attention from researchers assisted by AI. To me, the more likely explanation is that there&#39;s not much left to find. Amusingly, Daniel has &lt;a href=&#34;https://daniel.haxx.se/blog/2026/04/30/approaching-zero-bugs/&#34;&gt;a recent blog post stating exactly the opposite&lt;/a&gt;, that there are probably still plenty of bugs to come. You can all come and laugh at me in a few months if there&#39;s a raft of CVEs that Mythos didn&#39;t identify. It would only be fair. I&#39;m still going to make this prediction, however, for two reasons.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;The first is Mozilla&#39;s experience. They said that they found 22 security bugs with Opus 4.6 and then 271 with Mythos. 
If the model was only incrementally better at finding vulnerabilities, that&#39;s not the kind of result you would expect.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;The other is AISLE&#39;s finding. If Opus 4.6 (or 4.7) is okay at finding vulnerabilities when targeted at particular vulnerable code, then it&#39;s likely that this has already been happening. If there are hundreds of researchers using publicly-available models and pointing them at different parts of curl&#39;s code in varying levels of detail, it&#39;s logical that they&#39;ve found much of what Mythos could, except with more effort involved.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;As you would expect, all the reports I&#39;ve seen so far have come from open source projects. They don&#39;t mind talking about it. Sure, it&#39;s mildly embarrassing when memory safety bugs keep showing up in memory-unsafe languages but we&#39;re used to that. Best to just fix them and keep on keeping on.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;For the commercial entities who are also part of Glasswing the calculus is a little different. I expect that few companies want to crow about the number of security bugs they had lurking in their code until Anthropic swept in to help. Still, I hope they find lots and lots of things and fix them all. The big tech companies are responsible for the security and privacy of most people on this planet and it&#39;s what their users deserve. Once the vulnerabilities are fixed, I hold out hope that Microsoft will spend some Mythos tokens taking care of the other bugs. A man can dream. And maybe some of them will find the courage to post something publicly.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;p&gt;&#xA;What&#39;s most interesting to me is this: if we have all this neato AI tech that can find vulnerabilities, what&#39;s the path to getting this capability running against all my PRs? Will it be expensive? My hope is that this task can be optimised without having to reach for a huge general model like Mythos. 
Focusing analysis on diff contents will save tokens but sometimes it&#39;s the interactions with other parts of the codebase that bite you. Whoever gets this right at an economical price point is probably going to make a lot of money.&#xA;&lt;/p&gt;&#xA;&#xA;&lt;hr&gt;&#xA;&#xA;&lt;ol&gt;&#xA;&lt;li id=&#34;footnote-1&#34;&gt;I said as much in a comment on HN. Part of my comment was relayed anonymously to Mastodon. Daniel&#39;s &lt;a href=&#34;https://mastodon.social/@bagder/116556422302882208&#34;&gt;response&lt;/a&gt; was possibly either roasting me, or riffing on it. I&#39;m just going to assume riffing. &lt;a href=&#34;#footnote-1-ref&#34;&gt;↩︎&lt;/a&gt;&#xA;&lt;/li&gt;&#xA;&lt;/ol&gt;&#xA;</content>
    <link href="https://octet-stream.net/b/scb/2026-05-12-mythos-and-legends.html" rel="alternate"></link>
    <author>
      <name>Thomas Karpiniec</name>
      <email>tom.karpiniec@outlook.com</email>
    </author>
  </entry>
</feed>