HF20 Exploratory Data Analysis on Proposed PayOut Changes
The upcoming HF20 proposes changes to how pay-outs are rewarded. The proposed changes will affect posts that receive votes in the first 15 minutes.
Repository
https://github.com/steemit/steem
Currently, rewards from votes cast in the first 30 min are split between author and curator on a curve. The curator's share increases from the first minute: 0% at 0 min, rising to 100% from 30 min onwards. Under the proposed changes, for votes made in the first 15 min the author will no longer get these rewards; instead they will be redistributed between all posts with pay-outs. The reason given for this change is to discourage self-voting.
This has led to a post by @tcpolymath and a response by @davemccoy where the discussion turns to how the rewards are redistributed. From these conversations the concerns are that these changes will redistribute the rewards to those that are already earning a lot, and the rich will just get richer.
I would suggest that you have a read of both posts and the conversations that take place in the comments:
https://steemit.com/steem/@tcpolymath/this-post-will-exist-in-fifteen-minutes
https://steemit.com/steemit/@davemccoy/new-ad-slogan-steemit-where-the-rich-get-richer
It is because of these posts that I decided to take a look at the data.
Aim of Analysis
This is an exploratory analysis. The aim was to get an indication of the level of votes placed on posts within the first 15 and 30 min, to see how much of a problem early voting is. I also wanted to try to establish who votes early and get an idea of how much this drains from the rewards pool. From here I wanted to use this information to draw a conclusion on whether HF20 will make the rich richer or have any impact at all.
As this was an exploratory analysis, only a small sample of data has been taken. Exploratory analysis allows you to use visualizations to get an understanding of data. It does not aim to give a detailed review or accurate calculations. Exploratory analyses are often used to spot things in the data that would require further analysis.
I have taken data for 2nd June, and the sample is 130K posts that received 468K votes.
So as not to distract from the findings, details of the data queries can be found at the bottom of this report. However, within the analysis you will find the steps and methodology taken.
The Analysis
First, I wanted to look at the distribution of vote timings in periods of 30 minutes. To do this I took the vote time less the post time to get the duration and grouped this into bins of 30 minutes.
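The binning step above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DAX used for the report: the function names and the sample timestamps are made up, and a vote's delay is assumed to be the vote time minus the post creation time.

```python
from collections import Counter
from datetime import datetime


def bin_index(post_time, vote_time, bin_minutes=30):
    """Return the 0-based bin for the delay between posting and voting."""
    delay_seconds = (vote_time - post_time).total_seconds()
    return int(delay_seconds // (bin_minutes * 60))


def bin_shares(pairs, bin_minutes=30):
    """Map each bin index to its share of all votes."""
    counts = Counter(bin_index(p, v, bin_minutes) for p, v in pairs)
    total = sum(counts.values())
    return {b: n / total for b, n in sorted(counts.items())}


# Hypothetical sample: (post created, vote cast) pairs, not real data.
posted = datetime(2018, 6, 2, 12, 0)
sample = [
    (posted, datetime(2018, 6, 2, 12, 5)),
    (posted, datetime(2018, 6, 2, 12, 29)),
    (posted, datetime(2018, 6, 2, 12, 45)),
    (posted, datetime(2018, 6, 2, 14, 10)),
]
shares = bin_shares(sample)  # bin 0 covers the first 30 minutes
```

Passing `bin_minutes=15` gives the 15 minute grouping used later in the analysis.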
Below you can see a rather skewed graph as 223K votes were cast in the first 30 min of posting. The skewness is so great the chart is hardly readable.
So, to get a better understanding I created a pie chart showing the % of votes made in each 30 min time period after the post was made. The first surprise was a limitation in Power BI: it seems that 48 data points in a pie chart require too many colours, and some brackets, such as 30 – 60 min, are showing up white.
However, what is clear from this is that 47.74% of votes are placed within the first 30 minutes of posting.
What about self-votes, what % of self-votes are cast in the first 30 min?
Wow, almost 84% of self-votes are made in the first 30 min. That is rather interesting and sheds a little light on the reasons why Steemit Inc. has proposed these changes.
But the changes are only on votes made in the first 15 minutes, so to gauge this, I changed the bin size for grouping from 30 to 15. Again, the histogram was skewed too far to be read properly, so I prepared pie charts, which still had the limitation above but gave me a clearer picture.
27% of votes were cast in the first 15 min
And if we look at self-votes 78% of self-votes are made in the first 15 min.
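As a rough sketch of how these two percentages can be derived, the snippet below treats a vote as a self-vote when the voter and the post author are the same account. The record layout (`voter`, `author`, `delay_minutes`) and the sample rows are assumptions for illustration, not the SteemSQL schema or the real sample.

```python
def early_share(votes, cutoff_minutes=15, self_only=False):
    """Fraction of votes (optionally only self-votes) cast before the cutoff."""
    selected = [v for v in votes if not self_only or v["voter"] == v["author"]]
    if not selected:
        return 0.0
    early = sum(1 for v in selected if v["delay_minutes"] < cutoff_minutes)
    return early / len(selected)


# Hypothetical votes; delay_minutes is vote time minus post time.
votes = [
    {"voter": "alice", "author": "alice", "delay_minutes": 0.5},
    {"voter": "alice", "author": "alice", "delay_minutes": 40},
    {"voter": "bob", "author": "alice", "delay_minutes": 10},
    {"voter": "carol", "author": "bob", "delay_minutes": 90},
]

all_early = early_share(votes)                   # share of all votes in the first 15 min
self_early = early_share(votes, self_only=True)  # share of self-votes in the first 15 min
```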
Next, I wanted to see who votes in the first 15 min and with what weight. First, I sorted the data by the number of votes in the first 15 Minutes
Then by the Steempower controlled by the account
But this does not really give me an indication of the effect on the vote values or rewards pool, and from the data taken it would not be possible to calculate the actual worth of each vote. So, I decided to try to work out an approximate value for all votes given in the first 15 min.
As I have the number of votes in the first 15 min, the Steem Power controlled by each of the accounts and the average weight, I can use these to calculate the effective Steem Power for the votes cast in the first 15 min.
No of votes * Steempower * average weight
Using this effective steem power I can now plot this by pie chart and see who contributes the most in vote value on votes made in the first 15 minutes.
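A minimal sketch of the effective Steem Power calculation and the resulting shares is shown below. The account names and figures are invented for illustration; only the formula (number of votes × Steem Power × average weight) comes from the text above.

```python
def effective_sp(early_votes, steem_power, avg_weight):
    """No of votes * Steem Power * average weight, as in the formula above."""
    return early_votes * steem_power * avg_weight


# Hypothetical accounts, not taken from the sample data.
accounts = {
    "whale": {"early_votes": 5, "steem_power": 1_000_000, "avg_weight": 1.0},
    "dolphin": {"early_votes": 40, "steem_power": 10_000, "avg_weight": 0.5},
    "minnow": {"early_votes": 200, "steem_power": 500, "avg_weight": 1.0},
}

eff = {name: effective_sp(**a) for name, a in accounts.items()}
total = sum(eff.values())
# Each account's slice of the pie chart.
share = {name: value / total for name, value in eff.items()}
```

In this made-up example the account with the fewest early votes still dominates the pie, because its Steem Power outweighs the vote counts of the smaller accounts.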
It is worth pointing out that the value of a vote is dependent on many factors, including the voting power. Without the voting power at the time of each vote, any calculations here are very general.
For example, ranchorelaxo contributed 4.94% of the SP used for votes in the first 15 min. This account voted 5 times in the first 15 min at 100% power, with a current vote worth of $103, so its early votes were worth about $515. If 4.94% of the vote value in the first 15 min was worth $515, then 100% would be worth roughly $10,400.
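The extrapolation above is simple proportional scaling. The sketch below just reproduces the arithmetic with the figures quoted in the text; the function name is made up, and it assumes every vote was cast at the quoted vote worth.

```python
def extrapolate_total(votes, vote_worth_usd, share_of_effective_sp):
    """Scale one account's early-vote value up to the whole sample."""
    account_value = votes * vote_worth_usd
    return account_value, account_value / share_of_effective_sp


# 5 votes at $103 each, representing 4.94% of the effective SP.
account_value, total_value = extrapolate_total(5, 103, 0.0494)
```

Here `account_value` is $515 and `total_value` comes out a little above $10,400, in line with the rounded figure in the text.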
It has been said that this will be redistributed between posts that have no votes in 15 min. From the sample data there were 130K posts, of which 29K had a vote in the first 15 minutes. That is 22% of all posts.
Here are the top authors in May
Making an assumption that the top authors will not vote in the first 15 minutes we can see from the % column how much of the distribution each author would receive.
However, as it currently stands many of the top paid authors have votes in the first 15 minutes.
Conclusion
27% of votes are cast in the first 15 minutes, and 78% of self-votes are also made in this time. It would make sense for a change in the code to penalise these early votes so that the system does not favour the self-voter.
However, I noted from looking at the data that there are many bots involved in the early voting, including paid voting bots. It would be easy for the bot owners to change their code so that the vote takes place after the first 15 minutes.
What we could see happen is some of the older bots not being updated and fading out, but to be honest I don’t see much of this happening.
I would say a considerable number of the votes given in the first 15 min are also auto votes, for which the setting can be changed if and when the HF takes place.
We are currently looking at around $10.5K a day in direct vote worth, so about $315K a month. Based on May's posts this would be about 10% of total post pay-outs. Given that many of the votes are auto/bot votes and easily changed, this will leave the manual voters who are not aware of the changes. Steemit could easily implement something in the UI that warns people who try to vote before the 15 min mark of the consequences, with the aim of education.
The concern in the posts from @tcpolymath and @davemccoy that the rich will get richer is not really substantiated. Looking at the data, it’s the rich voting early who have the most impact on the values in the first place. Now if they were to continue to vote within 15 min after the HF, then it’s the poor that will benefit.
Given that most of the bots and auto votes will be updated and votes moved past the 15 min cut-off, I don’t see this HF change making any real impact on either side of the fence.
An exploratory analysis would normally highlight things to investigate further, but I would not take this analysis any further, because the main players involved will quickly change their voting when HF20 comes in, so further analysis of this historical data does not make much sense.
Data Queries
I use Power BI to connect to SteemSQL using the M language. The following code was used to extract and transform the data.
Votes query
let
    // Pull the votes for the sample date from SteemSQL
    Source = Sql.Database("vip.steemsql.com", "DBsteem", [Query="select *#(lf)from txvotes [NOLOCK]#(lf)where timestamp = CONVERT(DATE,'2018-07-02')"]),
    // Split the timestamp into separate date and time columns
    #"Split Column by Delimiter" = Table.SplitColumn(Table.TransformColumnTypes(Source, {{"timestamp", type text}}, "en-IE"), "timestamp", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"timestamp.1", "timestamp.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"timestamp.1", type date}, {"timestamp.2", type time}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"timestamp.1", "date"}, {"timestamp.2", "time"}}),
    // weight is stored in basis points, so 10000 = 100%
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "% weight", each [weight]/10000),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"% weight", Percentage.Type}})
in
    #"Changed Type1"
Posts query
let
    // Pull author, permlink and creation time for the sample date
    Source = Sql.Database("vip.steemsql.com", "DBsteem", [Query="select author, permlink, created#(lf)from comments [NOLOCK]#(lf)where created = CONVERT(DATE,'2018-07-02')"]),
    // Split the created timestamp into separate date and time columns
    #"Split Column by Delimiter" = Table.SplitColumn(Table.TransformColumnTypes(Source, {{"created", type text}}, "en-IE"), "created", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"created.1", "created.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"created.1", type date}, {"created.2", type time}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"created.1", "date"}, {"created.2", "time"}}),
    // Keep one row per permlink
    #"Removed Duplicates" = Table.Distinct(#"Renamed Columns", {"permlink"})
in
    #"Removed Duplicates"
To get the SP controlled, I connected to the accounts table using:
let
    Source = Sql.Database("vip.steemsql.com", "DBsteem", [Query="select name, vesting_shares, delegated_vesting_shares, received_vesting_shares#(lf)from accounts [NOLOCK]#(lf)"]),
    // Strip the "VESTS" unit so the columns can be typed as numbers
    #"Replaced Value" = Table.ReplaceValue(Source,"VESTS","",Replacer.ReplaceText,{"vesting_shares", "delegated_vesting_shares", "received_vesting_shares"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",{{"vesting_shares", type number}, {"delegated_vesting_shares", type number}, {"received_vesting_shares", type number}}),
    // Vests controlled = own vests - vests delegated out + vests received
    #"Added Custom" = Table.AddColumn(#"Changed Type", "total vests", each [vesting_shares]-[delegated_vesting_shares]+[received_vesting_shares]),
    // Convert vests to Steem Power using a fixed factor
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "SteemPower", each [total vests]*.000492)
in
    #"Added Custom1"
From here I created relationships in the model and used DAX for calculations.
That may be a reasonably accurate reflection of Dave's concern, but it's only a tiny part of mine, for what it's worth. Mine mostly reflect on the fact that voters will have a choice on what to vote for based on how much of the rewards they want to go to the pool. Which can't be determined based on current voting behavior.
If I have a choice to vote for a post which has no early votes - and therefore all of my vote's value will go to the post - or a vote which has some - so some of my vote's value will be redistributed to the wealthy - then I have to make an evaluation of whether a post is worth so much to me that I want to give up some of my stake rewards and have them redistributed.
An early vote essentially turns a small percentage of the post into a declined-rewards post, without that being made clear in the interface to any of the later potential voters.
Several unintentional behaviors will arise from this. One is that authors will be motivated to prevent upvotes in the first fifteen minutes, which can't be a good thing for the system or for its image to new users.
Another is that it will be possible to aggressively remove rewards from a post (and discourage future votes by sophisticated voters) by upvoting it immediately. Which is highly entertaining but I can't imagine it's good system design.
Hey @tcpolymath
Here's a tip for your valuable feedback! @Utopian-io loves and incentivises informative comments.
Yup, he deserves it: clear explanation, in-depth analysis.
He saw the possible bad effects, and understood and explained how this HF20 fixes nothing at all except giving less to the tinies and more to the big boys.
I have been following along the discussion as well. I'm not too fond of the 'rich get richer' phrase because it only tells half of the story. The poor also get more. I agree the values can't be compared in absolute terms but still, I find it a one-sided representation of the facts as I understand them. The rich, i.e. the ones that have invested more, that took risks earlier (or just happened to be the lucky first miners) will get more than those that have not invested at all. That sounds reasonable. If it would be the inverse I don't think it would be fair for the rich either.
I do agree with you on the fact that it just is a complex solution that will probably lead to even less understanding among users. The 30 minute window was invented to circumvent bots, and you need to be quite seasoned to understand how it works and what effects it has on your earnings. It's not user friendly at all. Still, people and tools adapted to bypass it and maximize their profits anyway. To some extent the same will happen again. The smart and rich will adapt to benefit more.
I'm not sure about the unintentional behaviors. I think those only apply if you vote really very consciously. I just vote for what I like, I don't look how parameters influence my curation rewards. In that sense an incorrect post payout value won't prevent or motivate me to (not) vote.
I'm also not sure how correct your assumption about this declined reward is. Will it be included in the post reward and be subtracted afterwards or won't it be shown on the interface. They could just keep showing 0 SBD the first 15 minutes but show a number of votes. I don't know, just an idea.
To put it short. I don't agree with your conclusion but I do agree on the fact that it is a bad fix of a previous bad fix for other reasons.
would love to know what reasons you think make it a bad fix?
I do not think it is a bad fix, I just don't think it will work. If it worked it would be good, because then there is a chance the rewards pool could grow at a greater rate. But as you say, people will just adapt, and that's why I think it will fail as a fix.
Well, you say it yourself, it won't work. That makes it a bad fix. I don't mean to say it will make things worse. But if a fix doesn't fix what it is supposed to fix that makes it a bad fix in my opinion. Just apply the same reasoning when you bring your broken car to the garage. The garage says it is fixed. You get it back and it still doesn't work. I'm guessing you won't consider that a good fix.
You certainly have the right to waste some of your voting power by voting naively if that makes sense to you. But you should be aware that you're wasting some of the voting power of everyone who votes behind you as well.
@tcpolymath thank you for the clarification on your concerns. I am very interested in the intentional and unintentional behaviors that will arise. People will try to 'maximize profit' and find ways to work within/around the system to do so.
I often vote early and don't worry about curation rewards. I do this because if I like a post, I know I will forget to go back and will get lost in other posts and discussions. But I see the funds going back to the rewards pool as a good thing, because it gives the possibility of growing the pool at a faster rate. To me, as Steemit becomes more mainstream, this could be a big factor for new investors and so only a good thing.
Enjoy making the people you vote on unhappy, then.
This makes zero sense to me. Can you unpack what you're talking about?
@tcpolymath doesn't voting early simply mean that the author gets more and the voter themselves get less? How does that make people unhappy?
That's what it means right now. After HF20 if it goes through as proposed, the author will be capped at 75% and the extra portion will no longer go to the author but be treated as if it was never used at all.
Part of what this means is that an early voter will take a big chunk of the pool of money that goes to curation away from the post. Posts that have no early votes will have 25% of their payout go to curation, while posts with early votes will have less. This will lead later voters to prefer voting on posts with no early votes, and therefore authors will want to make sure their posts don't have any.
where are you getting that? When anyone other than the author votes a post earlier, it still will increase the % of the post payout that goes to the author. The hardfork is talking about removing the incentive to self vote. It is only the curation reward that an author gets currently for self voting before 30 minutes that is going to be returned to the reward pool under the new hardfork.
Ok thanks.
You do understand all this is only for the self voter right? Another user who comes along and upvotes a post earlier is actually helping the author... and this hardfork is not changing that...
This is incorrect. It applies to all of the curation rewards that are currently going to the author, who will now be capped at 75%. See https://github.com/steemit/steem/issues/1874
Ah thanks - the blog posting was not worded very well on the steemit blog - it seemed like they were talking about just making it so you don't earn curation from self vote, and returning that specifically to reward pool. Which seems like the best solution - early votes by others (not the author) should still increase the % of total post payout that goes to author, while curation reward from an author's own early self vote should go to reward pool. I still strongly disagree with your proposal to just return all of it to the curation reward pool of the post in question, as that would incentivize other users to vote on posting that has large self votes to get a share of that extra curation $ - we don't want to encourage people to upvote posting that already has huge self votes. That is still incentivizing self voting.
Then bot votes and dummy account votes would still count and the change would be essentially meaningless.
That's not my actual proposal, by the way, it's just one of several simplified methods that would serve better than what they've implemented. Mine is here, and avoids that problem.
Not all funds in the rewards pool are used in a day; there is a constant supply being added and taken away. If the rewards pool is not drained so much every day and there is a larger pool, I believe this will have an impact in the future on investors. Investors = (or should =) better steem price = better for everyone, including the people I vote for and don't vote for.
And no I do not like to see people unhappy.
I'm reasonably certain that the amount of rewards distributed over a given time cannot be changed by how anyone votes. (Outside of the weird edge-case where no posts have any upvotes at all.) "Returned to the pool" doesn't stick more money into it to be spent later, it redistributes that money across the existing votes.
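The redistribution claim in this comment can be illustrated with a toy model. It assumes a fixed pool paid out in proportion to each post's rshares, which is a deliberate simplification and not the actual Steem reward calculation; the pool size, post names and rshares are invented.

```python
def payouts(pool, rshares):
    """Toy model: split a fixed pool in proportion to each post's rshares."""
    total = sum(rshares.values())
    return {post: pool * r / total for post, r in rshares.items()}


pool = 1000.0  # hypothetical amount to distribute

before = payouts(pool, {"post_a": 50, "post_b": 30, "post_c": 20})

# Assume an early vote means 10 of post_a's rshares no longer count.
after = payouts(pool, {"post_a": 40, "post_b": 30, "post_c": 20})
```

Under this assumption the total paid out is unchanged: the portion removed from `post_a` shows up as larger pay-outs on `post_b` and `post_c` rather than as money saved for later.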
This hits the nail on the head exactly @tcpolymath
Isn't this what the self-votes at 1 minute are doing at present with regards to the curation that can be earned by others voting on the post?
I tend to avoid voting on posts voted in the first minute, particularly by larger voters.
It is, but right now authors like that because the extra money goes to them. The change will make it so authors dislike it, and will want to discourage people from upvoting them early.
You're the example case here: right now, authors are fine with your vote being discouraged, because of the way the extra curation is distributed. But in HF20, the fact that you are discouraged from voting will mean something to them, because they get the same from every vote no matter when it's cast. So getting a minute-zero vote and not yours is worth less to them than getting that same vote at minute fifteen and also yours later.
Which means that they'll want every vote to come in at fifteen or later if at all possible.
It will also discourage the REAL curators, as their VP will be spent on the reward pool instead of on the authors they curate because the content was of high quality.
Glad I found you @paulag. Saw you referenced in a post and am looking forward to going through your analyses.
I had some points regarding this HF change however I realise I need to do some research before adding them here. Nonetheless, thanks for the deep analyses. You've got my 2 cents worth :)
And you get my 14 cts for your comment and all other positive interaction in this post's comments ;-)
@paulag is worth every minute of your time
Thank you jefpatat that is very kind 🙏🏿
Thanks a lot for the analysis, I was just wondering how to get some numbers concerning the impact of the change after reading the posts you mention yesterday.
For me, as you say, a lot of people will quickly adapt their voting behaviour. I know I will, but being dictated what I can or can not vote (I won't vote anything who had a vote in the first 15 minutes, if the change happens of course) is bad. And the goal is the voter giving a reward to the author, why should a fraction of my vote go to people I didn't upvote ? If I wanted that, I would have upvoted them. Curation is the same thing, but it's still about a post, it's not distributed across all top-paying posts, I'm giving a fraction of my vote to someone who upvoted the post before me and so potentially helped me discover the post.
It's quite simple, so why is Steemit Inc. messing with this principle ?
And I agree with @tcpolymath that not only voters will change their behaviours, authors will too to prevent early votes.
But I see some flaws in Steemit Inc. reflection about this :
I would love to know the position of big accounts : @utopian-io, @busy.org, etc... Do they want a fraction of their vote to be given back to top authors that have no link to the project ?
In the current system, a vote at post creation time is fully credited to the author with no curation rewards for the voter. In HF20, the author would end up with "only" 75% of this vote and still no curation rewards for the voter. I think the voting "sweet spot" will move from the current 20-25 mins towards 10-12 with HF20, but curation wise it's the same game.
It's super easy to create another account, delegate all your SP to there and let that one vote for you. There are reasons where a self-vote is OK. Forbidding it will not solve the problem IMO.
For the self-vote, OK, I spoke a bit too fast and it's clear that self-voters and circle-voters will be able to find a solution to continue their business anyway.
For the rewards, yes, it will change the optimal time for voting but still, if I understood correctly, if a post is voted 15 minutes after its publication, a part of the curation rewards (all votes that follow) you receive now will go to a pool then distributed to top-paid posts.
Maybe I didn't understand everything, but that's this principle I'm speaking about. To get a part on all curation rewards (all votes) on every post, they could upvote everything in that 15-minute window.
We probably mean the same thing, but just to be sure:
For both HF19 and HF20, 75% of the vote value goes directly to the author, independent of the voting time.
In HF19, an additional share of the remaining 25% goes to the author if the vote is cast within the first 30 minutes. This has a linear decrease, with the full 25% going to the author for votes at post creation time and the full 25% to all curators for votes at or after 30 minutes.
In HF20, the author only gets the 75%. The remaining 25% either remains in the reward pool for votes at post creation time (i.e. is not paid out), or goes to all curators for votes after 15 mins. This curator share also increases linearly from 0 to 15 mins.
Examples:
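A minimal sketch of the two splits as described in this comment is given below. It is an illustration only, not the blockchain code: the function names are made up, the ramps are assumed to be exactly linear, and the curator portion is treated as a single pot rather than being shared out between individual curators.

```python
def split_vote(value, minutes, window, surplus_to_author, author_base=0.75):
    """Split a vote's value into (author, curators, unpaid)."""
    # Curator share ramps linearly from 0 at post creation to 1 at `window`.
    curator_fraction = min(max(minutes / window, 0.0), 1.0)
    curation_pot = value * (1 - author_base)
    curators = curation_pot * curator_fraction
    surplus = curation_pot - curators
    author = value * author_base + (surplus if surplus_to_author else 0.0)
    unpaid = 0.0 if surplus_to_author else surplus
    return author, curators, unpaid


def hf19(value, minutes):
    """HF19 as described: 30 min window, early surplus goes to the author."""
    return split_vote(value, minutes, window=30, surplus_to_author=True)


def hf20(value, minutes):
    """HF20 as described: 15 min window, early surplus is not paid out."""
    return split_vote(value, minutes, window=15, surplus_to_author=False)


# A $10 vote cast 7.5 minutes after post creation.
hf19_split = hf19(10, 7.5)
hf20_split = hf20(10, 7.5)
```

For a $10 vote at 7.5 minutes, the HF19 sketch gives the author $9.375 and curators $0.625, while the HF20 sketch gives the author $7.50, curators $1.25, and leaves $1.25 unpaid.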
We mean the same thing. But I'm really looking to understand better the new system, and I thank you for your explanation.
But there's still (for me) some grey areas. For example when you say :
Shouldn't it be :
My question (and I think that's what I understood from tc's post) is whether votes following this kind of vote (at 7.5 minutes) will increase the value "redirected to the reward pool" (like with curation, when a fraction of the value of your vote goes to previous voters) - the value redirected to the reward pool during the first 15 minutes will be considered as a "previous vote" getting curation rewards for all votes after the 15 minutes.
I might be wrong, as I said, I'm still trying to grasp the principle of this change.
Not really, because the remaining $1.25 is simply not taken from the reward pool to be paid out, so it doesn't go anywhere. This is a strange situation, because the pending payout of such a post will show a value of $10, but only $8.75 will be paid out to authors and curators.
This is correct. The vote will not be eligible for curation reward if it's done immediately after post creation, but for any time later, also before 15 minutes, this vote will also earn curation rewards.
OK, thanks a lot, I'm beginning to understand the whole situation better.
So, to sum up :
If there's a vote before 15 minutes, a fraction of the rshares (linear between 0 and 15 minutes) won't go to the post (so increasing the payout value of all other posts - same reward divided by fewer rshares), and the fraction of curation rewards of this "ghost" vote value (generated by all future votes) will also not be attributed to anyone (once again increasing the value of all other posts).
Have I finally understood the thing or not ? :D
So what it looks like will happen is that everyone's bots get adjusted to 15 minutes, and everything stays about the same.
Appreciate the analysis, I suspected this whole thing would be a big pocket of hot air, but now I'm pretty certain it will achieve nothing.
thank you for the numbers to these HF discussion...
As a conclusion I would say it’s not the right decision since the changes will not solve the problem.
maybe SP should have a logarithmic voting power influence...
I see a lot of bid bots in the graphs and that is understandable. A lot of bots had no minimum post age to accept bids. So, people post according to the bid bots' vote rounds and send last-minute bids to deny curation rewards to the bid bots. As the bot votes between 1-5 minutes, they can keep a large portion of the curator SP for themselves. They create lots of garbage posts just to exploit bots for their own gains.
So, HF 20 changes may reduce this behavior and encourage curators to curate more. I am hopeful these changes will bring good to the community, if not there is always a chance of HF 21. :)
I don't think these changes are too bad myself. As I mentioned in the post, calculating the worth of a vote involves a lot of factors. Just because the money goes back to the rewards pool does not mean that it gets paid out proportionally to the higher earners. In fact it could aid the growth of the rewards pool, which I believe will become a factor in attracting new investors.
I am very new here, but it seems to me that we as newcomers are at a real disadvantage. I'm probably wrong due to not understanding how everything works, but I do see a lot of self-votes, which I think is OK. But my question is........ could a video be made to explain all of this in simple terms for people that are new here?
I'm not that new and this stuff is still way over my head
I feel ya!
I believe you are misreading the proposed change - it is only the curation reward an author currently gets for self-voting early that will be returned to the pool. That is what the latest steemit blog update says, that is what the previous update said. Where are you seeing that all early votes will return $ to the pool? Anyone else (other than the author) who votes early is foregoing a portion of their curation reward, which just means the author gets more payout. EDIT - sorry I see the github now which explains it better than the steemit blog posts. You seem to be correct.
The rewards pool and its distribution are very complicated to understand, and I don't fully get it. I think it's time I looked at the code to try to get a better understanding.
It seems that the entire method of delegating value based on likes, no matter how refined will simply encourage bots RATHER THAN social communication.
Content is going to be created regardless, and the intent is to draw more people to the platform making more posts, writing comments, etc...
Instead of drawing value from likes, a better system would be based on how much activity a person's content draws. This would self-balance between those that make 50 posts a day where 49 of them are ignored and the person who writes 1 post a day but draws the same attention.
Especially when you look at non-news related content... Tutorials, books, video clips, etc, where often they won't even draw attention or become viral until sometimes months later..