RE: Making SPAM fighting on Steem into a game of skill

in #steemexclusive, 2 years ago

One thing I am concerned about here is that whether a post is abusive is not a black-and-white issue. For instance, you and I have both discussed someone on this blockchain committing borderline abuse for months, and milking the system of thousands of SBD in rewards, but that actor's posts are so realistic (likely written by AI) that whether or not it's abuse is not 100 percent clear. So my question is: what will the rules of this game be for judging whether something is abusive? If a post is worth a lot more than a user thinks it should be, is that abuse? I get that by having input from multiple users, you will have almost a checks-and-balances system for detecting abuse, but that is not useful unless the definition of abuse is widely agreed upon.

For instance, you and I have both discussed someone on this blockchain committing borderline abuse for months, and milking the system of thousands of SBD in rewards, but that actor's posts are so realistic (likely written by AI) that whether or not it's abuse is not 100 percent clear.

I'd be interested in knowing which user you're referring to - it sounds a lot like a few I've caught in the past that were translating content from Chinese or Russian sources. Their content was perfect. Consistently posting something deep and insightful every single day - the kind of quality that an individual would not be able to think of on a daily basis, devoid of any personal touches at all.

I asked @remlaps to respond, but we've noticed a number of accounts that have grossed nearly 100,000 SBD with content that is posted daily, but is not interactive whatsoever, and makes sense, but is not incredibly meaningful. The same large stakeholder votes for each of these accounts, and we are suspicious that they are using AI to generate the articles, and milking the rewards using their stake.

I don't want to type the users in a way that's searchable, but some of them are listed in screenshots here: Using TrufflePig for potential abuse detection - Human curation needed - Multiple $1,000s in rewards at stake. Others are easily identifiable in the recurring "Today's Truffle Picks" reports from @trufflepig (before it stopped running).

Using data from steemdb.io, in December or so, I built a PowerBI report on 14 of the accounts that were identified by trufflepig, and it appears that the accounts collected about 91k SBD in rewards from March 2021 through January 2022. (No idea how many other similar accounts might be out there.)

image.png

I was thinking the posts were AI-generated (more recently I've been seeing ads for something called Jarvis), but I guess they could also be language translations.

image.png

This is the most interesting post. If you find it, put the URL into https://www.steemcryptic.me/ and see what the original version was. This is the only post that I came across that was significantly different and I've never spent the time working out why. Maybe I'll spend that time now - not that much can be done against this kind of upvoting power.

That is interesting. I searched Google for a deleted series of words inside quotes and didn't find any matches. The reason for the changes isn't obvious to me.

The timing of their first post and the facts that the recovery account is blocktrades and they were originally posting from SteemPeak reinforce my suspicion about where the funds might be getting directed, although it's mainly just speculation. (I was posting from SteemPeak around that time, too, so these are very weak lines of evidence. ;-)

On a separate note, we also have the brute force method... roughly 950 SP and 110 SBD in the last week.

image.png

And another interesting post...

image.png

Especially this paragraph that was deleted from the end of the post:

You might have to let go of things that you love and that you have been attracted to and you might have to let go of things that you know you will never be able to experience again. You might have to let go of things that you have been used to and you might have to let go of things that you will never be able to experience again. You might have to let go of a lot of things that you have been used to and you might have to let go of many things that you have been attracted to. You might have to let go of a lot of things and you might have to let go of a lot of people and you might have to let go of all the people and things that you know that you are supposed to have

Another rare mistake. What do you make of the same sentence being repeated 4 times? It definitely suggests some kind of automation.

Yeah, the sentences are not quite exactly the same, but it's like a "Mad Lib" where someone/something just inserted new clauses into the same sentence structure four times in a row.
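To put a rough number on that "Mad Lib" effect, here is a minimal sketch that counts repeated word n-grams in a piece of text. The function names, the n-gram length of 6, and the thresholds are my own assumptions for illustration, not anything used by the people in this thread, and a high repetition score is only a hint of automation, not proof. The sample is the first two sentences of the deleted paragraph quoted above.

```python
import re
from collections import Counter


def _ngrams(text, n):
    """Lower-case the text, split it into words, and return all word n-grams."""
    words = re.findall(r"[a-z']+", text.lower())
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def repeated_ngrams(text, n=6, min_count=3):
    """Return the n-grams that occur at least min_count times, with their counts."""
    counts = Counter(_ngrams(text, n))
    return {" ".join(g): c for g, c in counts.items() if c >= min_count}


def repetition_ratio(text, n=6):
    """Fraction of n-gram positions whose n-gram appears more than once."""
    grams = _ngrams(text, n)
    if not grams:
        return 0.0
    counts = Counter(grams)
    return sum(1 for g in grams if counts[g] > 1) / len(grams)


sample = (
    "You might have to let go of things that you love and that you have been "
    "attracted to and you might have to let go of things that you know you "
    "will never be able to experience again. You might have to let go of "
    "things that you have been used to and you might have to let go of "
    "things that you will never be able to experience again."
)

if __name__ == "__main__":
    for gram, count in repeated_ngrams(sample, n=6, min_count=4).items():
        print(count, gram)
    print(round(repetition_ratio(sample), 2))
```

Ordinary human prose of the same length would normally show few or no six-word sequences repeated this often, which is why the ratio could serve as a cheap first filter before a person looks at the post.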

My gut feeling is still that it's being crafted by GPT-2 or GPT-3. Especially because of the timing of the early posts. I seem to remember a number of Steem posts in the 2019/2020 time frame where people were experimenting with those tools. If I recall correctly, some of the people involved in those conversations later joined the Witness War hostilities on the Hive side.

Maybe near the beginning a human was proof-reading and fixing the worst parts, but as time went on and they continued to go unnoticed, perhaps they didn't need to bother with edits any more?

This might be useful, but I'm not sure how to interpret it yet. Found it linked from here and tried it out with a couple paragraphs from the top post on that same account. To my eye, the sample below looks similar in coloring to "machine*: unicorn text (GPT2 large)." Also, I note that there's no purple at all.

image.png

Here are a couple paragraphs from my own post, so I know they were written by a human.

image.png

I guess the more green and yellow you see in comparison to red and purple, the more likely it is that an AI wrote it. That article was published in 2019, so maybe there's a better tool available by now.

This is fascinating - I didn't know this existed and it's incredibly interesting. I'm probably going to get sucked into it now and lose my day.

That would definitely explain why the articles appear to make sense but at the same time, they don't. One of the ones I read seemed to have a random thread that was totally unrelated to the main thread. In a coherent but totally illogical sense. I think you're right in that their posts are being generated in this way - it almost appears obvious now that my eyes have been opened!

These settings seem like they might be decent for distinguishing between human and whatever they're using... Now if I just had 8 or 10 million SP worth of downvote strength. ;-)

GPT-2(?)

image.png

Human

image.png

I still don't really know how to read those histograms. It would be nice if it could all be boiled down to a single number, instead of needing to take a SWAG from the coloring.
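One way the coloring might be boiled down to a single number is sketched below. It assumes the tool colors each word by how highly a language model ranked it among its predictions, with green for the top 10, yellow for the top 100, red for the top 1000 and purple for everything else. Those cut-offs are an assumption and should be checked against the tool's own legend. The sketch does not run any model: it takes a list of per-word ranks as input, and how to export those ranks from the tool is not shown.

```python
from collections import Counter

# Assumed rank cut-offs for each colour (rank 1 = the model's top prediction).
# Adjust these to match the legend of whichever tool produced the ranks.
BUCKETS = [(10, "green"), (100, "yellow"), (1000, "red")]


def colour_for_rank(rank):
    """Map a 1-based prediction rank to its colour bucket."""
    for limit, colour in BUCKETS:
        if rank <= limit:
            return colour
    return "purple"


def colour_counts(ranks):
    """Count how many words fall into each colour bucket."""
    counts = Counter(colour_for_rank(r) for r in ranks)
    return {c: counts.get(c, 0) for c in ("green", "yellow", "red", "purple")}


def predictability_score(ranks):
    """Share of words in the green or yellow buckets, between 0.0 and 1.0."""
    if not ranks:
        raise ValueError("need at least one rank")
    counts = colour_counts(ranks)
    return (counts["green"] + counts["yellow"]) / len(ranks)


if __name__ == "__main__":
    # Made-up ranks purely to show the call; not measurements from any post.
    illustrative_ranks = [1, 3, 12, 250, 5000]
    print(colour_counts(illustrative_ranks))
    print(predictability_score(illustrative_ranks))
```

A score closer to 1.0 would mean more green and yellow, which matches the rule of thumb above that machine text tends to stay within the model's most likely words. Any threshold for calling a post suspicious would still have to be calibrated on posts known to be written by people.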

Ah, I know the lot. I investigated them some time ago and couldn't find the original source. There are often clues within edited posts (i.e. why did they edit it) and there were very few posts that had been edited (which is suspicious in itself). What I did notice is that one of the edits was correcting some grammar - so somebody who (I believe) is fluent in English had read it after it had been published.

The other edits often removed a quotation mark (") as the first and last character of the post. So whatever tool had been used to translate or create the post wraps it in quotes.
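That quote-wrapping clue is easy to check mechanically. The sketch below assumes you already have the text of a post before and after an edit (fetching the edit history is not shown), and the function names are hypothetical. Treating curly quotes the same as the straight " character is my own assumption.

```python
# Straight double quote plus curly variants (the curly ones are an assumption).
QUOTE_CHARS = '"\u201c\u201d'


def is_wrapped_in_quotes(body):
    """True if the post body starts and ends with a quotation mark."""
    text = body.strip()
    return len(text) >= 2 and text[0] in QUOTE_CHARS and text[-1] in QUOTE_CHARS


def edit_only_strips_quotes(before, after):
    """True if the edit did nothing except remove the wrapping quotation marks."""
    text = before.strip()
    return is_wrapped_in_quotes(text) and text[1:-1].strip() == after.strip()


if __name__ == "__main__":
    original = '"Letting go is hard."'
    edited = "Letting go is hard."
    print(is_wrapped_in_quotes(original))
    print(edit_only_strips_quotes(original, edited))
```

Run across an account's edit history, a check like this would surface the posts where the only change was stripping the wrapper, which are the ones worth reading by hand.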

Looks like I looked into them on 12th April last year - 12 accounts at the time.

I have no doubt though that it's not authentic content and believe that this account holds the key -

image.png

The content was genuine at the start, written by a guy from Pakistan and then it changed completely - like it was being run by somebody completely different.

Ultimately, the final decision lands with the person or team who casts the downvote. But that is similar to the problem that the original game intended to solve. Say it showed you a picture with a cat and a chair. Should the keyword be "cat" or "chair" or both or neither? There's no clear rule on that, but the game relied on player input to inform its decision. The rules for identifying abuse are always going to be fuzzy, but (hopefully) if enough eyes look at it, you can reach a consensus.

the final decision lands with the person or team who casts the downvote.

This doesn't always have to be the case right? We could have a DAO kind of system since this is a downvote trail. Maybe a simple DApp where users can see the flagged posts and based on the common sentiment, the person who initiates the downvote can decide whether to vote or not. This would kind of be similar to the SPS but on a separate DApp.

Your thoughts?

Yeah, that's getting more complicated, but I think you're right. This is connecting it back to the idea about quorum sensing that I mentioned in the opening paragraph.

Is that even possible? To have someone just mark a post and, once quorum is reached, have a downvote initiated? If I'm not wrong, every transaction on Steem expires after 1 hour. If a transaction is not broadcast within that 1-hour window, it would fail, right? So, this is where it becomes very tricky.

I don't think it would be possible on-chain. The "vote broadcasting" would have to happen off-chain, through a web site or some other protocol. The only part that would be on-chain would be issuing the downvotes.

Even if it were possible, I think you'd want to keep it off-chain anyway, in order to prevent retaliation.
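A minimal sketch of what that off-chain piece might look like: flags are collected in memory by some web service, and only when enough distinct accounts have flagged a post does the service tell the caller to issue the on-chain downvote. The class name, the quorum of 3, and the account names are all hypothetical, and the actual vote broadcast is deliberately left out. Because nothing is signed until quorum is reached, the transaction expiry mentioned above never comes into play.

```python
from collections import defaultdict


class FlagTally:
    """Off-chain tally of abuse flags, keyed by post identifier."""

    def __init__(self, quorum):
        if quorum < 1:
            raise ValueError("quorum must be at least 1")
        self.quorum = quorum
        self._flags = defaultdict(set)  # post id -> accounts that flagged it
        self._triggered = set()         # posts already handed off for a downvote

    def flag(self, post_id, account):
        """Record a flag. Returns True exactly once, when quorum is first reached."""
        self._flags[post_id].add(account)  # a set ignores repeat flags
        if post_id in self._triggered:
            return False
        if len(self._flags[post_id]) >= self.quorum:
            self._triggered.add(post_id)
            return True
        return False

    def flag_count(self, post_id):
        """Number of distinct accounts that have flagged the post."""
        return len(self._flags.get(post_id, ()))


if __name__ == "__main__":
    tally = FlagTally(quorum=3)
    for account in ["alice", "bob", "alice", "carol"]:
        if tally.flag("@someauthor/some-post", account):
            # This is where the downvote would be signed and broadcast.
            print("quorum reached, issue downvote")
```

Counting distinct accounts rather than raw flags is a design choice: it stops one user from reaching quorum alone, though a real version would also need some defence against a single person running many accounts.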

Oh, and I forgot. This is a good example:

For instance, you and I have both discussed someone on this blockchain committing borderline abuse for months, and milking the system of thousands of SBD in rewards, but that actor's posts are so realistic (likely written by AI) that whether or not it's abuse is not 100 percent clear

From what I can see, it's actually somewhere around one hundred thousand SBD over the course of the last 10 months (and maybe more that I'm not aware of). It would be interesting to see if a game like this would find that actor, like @trufflepig did (accidentally, it seems).
