(#budayha) @firstname.lastname@example.org I thought about that for a brief moment, but I realized that most of the content we’ve created at dev.twtxt.net actually centers around the twtxt spec itself and the extensions we’ve created atop and around it without breaking existing clients. So I’m not sure 🤔 – I think Yarn.social development has to be a bit different anyway: how to build the software, the UI/UX, the API, the various tools and services we’ve built to support the larger platform, etc…
Anyway… Congrats @email@example.com 🎉 Proud new owner of the new tt.vltra.plus Yarn.social pod 🥳 – What’s this now, 4 pods in the wild that aren’t my own or managed by me? 🤔 Gonna have to start figuring out a way for the Yarns search engine to start accounting for them? Or a way to identify them as a measure of “count”?
(#736inyq) Hmmm this is particularly bad actually from the point of view of “wtf happened” 😂
@firstname.lastname@example.org No problems mate! 🤗 I’d just love to know how to reproduce what happened here so I can fix the backend to umm not ingest things like this? Not sure actually… There’s no source feed either which is really strange 😳
@email@example.com (#736inyq) Yeah they’re gone from my timeline too, now, but they were ingested by my pod, so these will definitely collide. I’d just be interested to see what the ingested data was, so I’m building an application/json content-negotiation for permalinks as I write this.
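A rough sketch of what JSON content negotiation for a permalink could look like, in Python for illustration only. The function names (`parse_accept`, `wants_json`, `render_permalink`) and the shape of the `twt` dict are hypothetical, not the actual yarnd handler; the idea is just to pick a representation from the request's `Accept` header.

```python
import json


def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default quality when no q parameter is given
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media.lower(), q))
    return prefs


def wants_json(header):
    """True if the client ranks application/json at least as high as HTML."""
    prefs = dict(parse_accept(header or ""))
    json_q = prefs.get("application/json", 0.0)
    html_q = max(prefs.get("text/html", 0.0), prefs.get("*/*", 0.0))
    return json_q > 0 and json_q >= html_q


def render_permalink(twt, accept_header):
    """Return (content_type, body) for a permalink (hypothetical twt dict)."""
    if wants_json(accept_header):
        return "application/json", json.dumps(twt)
    return "text/html", "<p>{}</p>".format(twt.get("text", ""))
```

With this, a client sending `Accept: application/json` would get the raw ingested data, while a browser keeps getting the HTML page.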
@firstname.lastname@example.org (#zmv53uq) I think your math is correct 👌 It’s what I’ve concluded as well. Currently the search engine is seeing around 500-600 posts per day. So we’re not going to collide anytime soon in reality 👍
@email@example.com (#zmv53uq) The hashing algorithm we’ve chosen and the encoding format is such that it is extremely unlikely to have a hash collision at the current scale; But… It is possible I suppose 🤣 How would I go about testing this?
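One way to approach the testing question, sketched in Python under stated assumptions: estimate the odds with the birthday approximation, then scan already-ingested data for hashes that map to more than one distinct post. The hash space of `32 ** 7` is an assumption read off the 7-character hashes quoted in this thread, treated as base32; it is an upper bound, and the effective space may be smaller depending on how the hash is truncated. All function names here are hypothetical.

```python
import math
from collections import defaultdict

# Assumption: 7 characters from a 32-symbol alphabet (upper bound on the space).
HASH_SPACE = 32 ** 7


def collision_probability(posts, space=HASH_SPACE):
    """Birthday approximation: chance that at least two of `posts` hashes collide."""
    if posts < 2:
        return 0.0
    return -math.expm1(-posts * (posts - 1) / (2 * space))


def posts_for_probability(p, space=HASH_SPACE):
    """Approximate number of posts at which the collision chance reaches p."""
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


def find_collisions(pairs):
    """Given (hash, payload) pairs, return hashes seen with 2+ distinct payloads."""
    by_hash = defaultdict(set)
    for twt_hash, payload in pairs:
        by_hash[twt_hash].add(payload)
    return {h: payloads for h, payloads in by_hash.items() if len(payloads) > 1}


if __name__ == "__main__":
    posts_per_day = 550  # midpoint of the 500-600 per day mentioned above
    for days in (30, 365, 3650):
        n = posts_per_day * days
        print(days, "days:", collision_probability(n))
```

Feeding `find_collisions` the `(hash, raw line)` pairs from everything a pod or the search engine has ingested would show whether any real collision has already happened, independent of the estimate.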
@firstname.lastname@example.org (#rivvvna) Did you actually have a look at the project? 🤔 It actually supports just analyzing an access.log; But I don’t think doing this really provides enough insight into your site’s traffic IMHO. My question is more along the lines of:
If I run my own analytics for my own sites like yarn.social, would most people be okay with that?
@jlj@email@example.com (#ywqpb6a) This is why I think it’s important that the Yarns search engine and, once implemented, the one at https://search.twtxt.net actually re-crawl active feeds more often with some sensible algorithm. I’m already tracking moving averages for “fetch time” and “new posts”, so it’ll be interesting to see those numbers soon, but if we agree this is a good idea I think it could help solve this very problem by the yarnd backend going:
Oh oops, I don’t have that hash, lemme go see if my configured search engine does.
Yes? Fetch and cache.
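The two steps above could be sketched roughly like this, in Python for illustration. Everything here is hypothetical: `resolve_twt`, the dict-based cache, and the `search_lookup` callable standing in for a request to the configured search engine. It is not how yarnd does it today, only the shape of the fallback.

```python
from typing import Callable, Dict, Optional

Twt = dict  # hypothetical: a twt represented as a plain dict


def resolve_twt(
    twt_hash: str,
    cache: Dict[str, Twt],
    search_lookup: Callable[[str], Optional[Twt]],
) -> Optional[Twt]:
    """Return the twt for a hash, asking the search engine on a cache miss."""
    twt = cache.get(twt_hash)
    if twt is not None:
        return twt  # already known locally, nothing to do

    # "Oh oops, I don't have that hash": ask the configured search engine.
    twt = search_lookup(twt_hash)
    if twt is None:
        return None  # the search engine does not know it either

    # "Yes? Fetch and cache."
    cache[twt_hash] = twt
    return twt
```

Passing the lookup in as a callable keeps the sketch independent of any particular search API and makes the miss path easy to exercise with a fake.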
@jlj@firstname.lastname@example.org (#m6fesrq) It’s been too long since I used finger for me to remember much, but these days I’m stuck on the “why”; what value would it bring… I think I asked you this once @email@example.com about what “utility” it would have beyond just being “fun” 🤔