RE: Curated by Thoth - 2026-01-05 23:08Z

in #thoth-test, last month

@remlaps
Perhaps it could be avoided to include an author with multiple posts (per post or per day).

Since I started looking at the Thoth suggestions, I have been toying with the idea of running a German version that makes suggestions for me (and maybe other German users). I just don't really have any idea how this could be realised in multiple languages. I ask myself questions like: Do I use a separate account? Or just add more comments under the Thoth.test post? ...

You explicitly mentioned the possibility, did you already have use cases in mind?

Perhaps it could be avoided to include an author with multiple posts (per post or per day).

This is a good idea. It's not really designed well for a per-day setting, but I'll try to add it per-post in the next couple of weekends.

You explicitly mentioned the possibility, did you already have use cases in mind?

Yeah, I was imagining multiple Thoth accounts so that people can follow and delegate to just their languages/topics of choice. That follows from the fundamental idea that multiple Thoth operators would eventually compete based upon their ability to highlight articles that interest readers and attract voters.

I think this would be great. Let me know if I can help with the setup. I think it would just require an LLM API key, changing a few words in two prompts, and setting up the screening settings.

Personally, I read many more non-English posts if I see the Thoth summary in English, than if I only see the author and title in my feed, so I think German and Spanish Thoths would be very useful for native speakers in those languages. (I'm skeptical about AI's abilities in other languages that are common here.)

multiple Thoth operators would eventually compete

That wasn't what I had in mind, but it could actually be a sensible alternative.
If they cover different topics, competition would be an advantage. Like a kind of subscription model where you "subscribe" to scientific topics, for example, and then receive suggestions.

However, if they act in general terms (as I understand the current approach), a comment in German (or another language) would suffice for me to then read the article in its original language.

Let me know if I can help with the setup.

Thank you. I'll gladly accept that once I've looked into it in more detail. I didn't have it in mind so quickly. But that may change once I've looked at the Thoth posts more often...

That change has been merged into the master branch now. The number of posts allowed per author is configurable, so each Thoth account can set it according to its own preferences. (If we ever get the blockchain beneficiary settings increased to 127 per post, it might make sense to allow more than one article per author per post.)
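
For anyone following along, here is a minimal sketch of what a per-author cap on one curation post could look like. This is not Thoth's actual code: the function name, the `max_per_author` parameter, and the post dictionaries are all hypothetical and only illustrate the idea.

```python
from collections import Counter


def limit_posts_per_author(posts, max_per_author=1):
    """Keep at most `max_per_author` posts per author, preserving input order.

    `posts` is a list of dicts with an "author" key (a hypothetical shape,
    chosen only for this illustration).
    """
    seen = Counter()
    kept = []
    for post in posts:
        author = post["author"]
        if seen[author] < max_per_author:
            kept.append(post)
            seen[author] += 1
    return kept


# Made-up candidates: "alice" appears twice.
candidates = [
    {"author": "alice", "permlink": "post-1"},
    {"author": "bob", "permlink": "post-2"},
    {"author": "alice", "permlink": "post-3"},
]
selected = limit_posts_per_author(candidates, max_per_author=1)
print([p["permlink"] for p in selected])
```

With `max_per_author=1` the second post by the same author is skipped. Raising the value would let an operator include more than one article per author if the beneficiary limit ever allows it.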

The blockchain allows 127 entries. I believe we had already determined where the current maximum limit of 8 entries is set. Was it steem-python, Condenser or Hivemind? I don't know right now...

It was failing for Thoth in steem-python. When I found that issue, I was led to believe that the blockchain had a hard cap of 127 and a soft-cap that was set to 8, so I stopped investigating. Maybe I'll try to check again this weekend.

Update: I think GitHub Copilot found it for me in the witness plugin.

The limit was apparently introduced during the initial implementation. It is not clear from the discussion why this was done.

I'm a little surprised that it's in the witness plugin. As I've just seen, it was moved there so that it could be changed by a soft fork.

In any case, it is not sufficient to change the limit in steem-python...
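
To make the two limits concrete, here is a rough sketch of a client-side pre-check that a bot could run before broadcasting. The numbers 8 and 127 are the ones from this thread. The function name and the dictionary shape are hypothetical and are not steem-python's API, and the weight unit (10000 = 100%) is stated as an assumption.

```python
SOFT_LIMIT = 8       # limit reportedly enforced in the witness plugin (per this thread)
HARD_LIMIT = 127     # protocol maximum mentioned in this thread
FULL_WEIGHT = 10000  # assumption: weights are expressed in units of 0.01%


def check_beneficiaries(beneficiaries, limit=SOFT_LIMIT):
    """Return a list of human-readable problems; an empty list means OK.

    `beneficiaries` is a list of {"account": str, "weight": int} dicts,
    a hypothetical shape used only for this sketch.
    """
    problems = []
    if len(beneficiaries) > limit:
        problems.append(f"{len(beneficiaries)} beneficiaries exceeds limit of {limit}")
    accounts = [b["account"] for b in beneficiaries]
    if len(set(accounts)) != len(accounts):
        problems.append("duplicate beneficiary account")
    if sum(b["weight"] for b in beneficiaries) > FULL_WEIGHT:
        problems.append("total weight exceeds 100%")
    return problems


# Nine made-up beneficiaries at 1% each: rejected at 8, fine at 127.
nine = [{"account": f"author{i}", "weight": 100} for i in range(9)]
print(check_beneficiaries(nine))
print(check_beneficiaries(nine, limit=HARD_LIMIT))
```

A check like this only tells the bot what the nodes would reject today. Raising the constant on the client side changes nothing unless the limit enforced by the witnesses is raised too.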

It is not clear from the discussion why this was done.

Maybe they wanted to be able to limit the threat easily if it opened some sort of abuse vector. Not sure why they never extended it, though. IMO, a bigger value would almost certainly be an improvement, if we can ever get back to the point of doing soft-forks and hard-forks.

Like a kind of subscription model where you "subscribe" to scientific topics, for example, and then receive suggestions.

Yep, that's one of the first specializations that I had in mind when I first started the project. I created the @thoth-stem account around the same time as @thoth.test for eventual use as a Thoth account that could focus just on STEM topics.

However, if they act in general terms (as I understand the current approach), a comment in German (or another language) would suffice for me to then read the article in its original language.

I agree. With current usage levels, I think that's the most useful service that Thoth is able to provide to readers right now. If I see a title in a language that I don't know, I have no idea whether I want to read the article or not. The single-language summary from Thoth helps me to make an informed decision about clicking through and translating the original post with almost no effort. It was an accidental feature, but it feels extremely useful to me.

Did you get a chance to review Thoth's posts while it was posting in German last week? I'm just wondering if there were any language issues that need to be fixed. Some of the phrasing feels unnatural to me after browser translation back to English, but I can't tell if that's a problem with the original post or with the translation tool...

Once any remaining language/phrasing issues are worked out, switching between English, German, and Spanish will just require updating a single line in a config.ini file.
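
As an illustration only, a single-line language switch in an INI file could be read like this with Python's standard `configparser`. The section name `LOCALIZATION`, the key `language`, and the two-letter codes are assumptions made for this sketch and may not match the real config.ini.

```python
import configparser

SUPPORTED_LANGUAGES = {"en", "de", "es"}  # the three languages named above

# Hypothetical config.ini content; section and key names are assumptions.
EXAMPLE_INI = """
[LOCALIZATION]
language = de
"""


def load_language(ini_text, default="en"):
    """Read the output language from INI text, falling back to `default`."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    lang = parser.get("LOCALIZATION", "language", fallback=default).strip().lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    return lang


print(load_language(EXAMPLE_INI))
```

Changing `language = de` to `language = es` would then be the only edit needed, and a missing setting falls back to English.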

Yes, I looked at the posts and comments. I didn't see any significant errors. However, I didn't compare every summary with the original post.

The wording is generally not unusual. Only the word "Steemizens" seems a bit unfamiliar to me. Here, we usually use "Steemians" instead. You probably can't influence that, can you?

Thank you!

Only the word "Steemizens" seems a bit unfamiliar to me.

Yeah, I specified that for the English output. I know it's uncommon, but (in English) I prefer "Steemizen" because it's related to "citizen". To me, "Steemian" just signifies a presence on the blockchain, but "Steemizen" indicates a sort of shared responsibility for the blockchain's well-being (i.e. citizenship).

I hadn't really thought about how that would flow in other languages, but it would be easy to remove from the AI prompt, which can be customized for each Thoth account. (In the architecture, I also imagined prompt customizations as another competitive tool between different Thoth accounts, so the prompts can be customized without forking.)

Thanks for your explanation.

I have one more addition. Many German users visit the #deutsch tag. So it would be good if this tag could be added.
I took a look at the code and saw that the tags for Thoth's posts can be specified in the config. So I don't need to ask the question anymore... :-)

the tags for Thoth's posts can be specified in the config

Exactly.

lol. Not sure what to make of this.

Thoth wrote:

Über Thoth:
Benannt nach dem altägyptischen Gott des Schreibens, der Wissenschaft, der Kunst, der Weisheit, des Urteils und der Magie, ist Thoth ein Open-Source-Kurations-Bot, der Anreize für Autoren und Investoren zur Produktion und Unterstützung von Kreativität schaffen soll, die menschliche Aufmerksamkeit auf die Steem-Blockchain lenkt.

My browser translated it as:

Over the past decade, the United Nations has made significant progress in implementing the Convention on the Rights of the Child and its Optional Protocol Thoth:
Named after the ancient Egyptian god of writing, science, art, wisdom, judgment and magic, Thoth an open-source curation bot designed to incentivize authors and investors to produce and support creativity that draws human attention to the Steem blockchain.

Anyway, still working through issues, but it's getting closer. It's not ready for review yet, but feel free to let me know if you happen to notice any glaring language issues.

The first sentence does not match, but it is not in the German text either:

Over the past decade, the United Nations has made significant progress in implementing the Convention on the Rights of the Child and its Optional Protocol Thoth:

Otherwise, it looks good! There are one or two things that sound a bit odd. But how much influence do you have on the translated text? I assume the text is also translated by AI?

It's a mix.

The text that's the same in every post is hard-coded in a localization file, so I can fix that if there's anything awkward. (The first translation was done by Google Gemini.)
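
A rough sketch of how fixed strings like that can be looked up per language, with a fallback to English when a translation is missing. The table layout, the keys, and the English strings are hypothetical. Only the German heading "Über Thoth:" is taken from the post quoted above.

```python
# Hypothetical localization table; keys and English text are illustrative.
STRINGS = {
    "en": {
        "about_heading": "About Thoth:",
        "curated_by": "Curated by Thoth",
    },
    "de": {
        "about_heading": "Über Thoth:",
        # "curated_by" is intentionally missing to show the fallback.
    },
}


def localize(key, language, strings=STRINGS, fallback="en"):
    """Return the fixed string for `key` in `language`, else the fallback language."""
    table = strings.get(language, {})
    if key in table:
        return table[key]
    return strings[fallback][key]


print(localize("about_heading", "de"))
print(localize("curated_by", "de"))
```

Because these strings live in one place, an awkward phrase can be corrected by hand once and it then applies to every post in that language.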

The AI curation reports, which change for every post (links, summaries, etc.), are all written dynamically by the AI, so I don't have much influence there. That will also depend on the model. I'm testing with the free model from ArliAI, so it might not be as good with non-English languages.

It's more involved than I had realized, but I'm getting there.
