SteemOps: Extracting and Analyzing Key Operations in Steemit Blockchain-based Social Media Platform - ACM

( April, 2021; Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy )

Researchers from Beijing Jiaotong University, IBM, and the University of Pittsburgh have created a public dataset of Steem transactions that occurred between March, 2016 and November, 2019 (block 1 to block 38,641,150). The researchers suggest that this data can be used for: (i) blockchain system analysis; and (ii) social network analysis. Under blockchain system analysis, they suggest analysis of decentralization, cryptocurrency transfers, and performance benchmarks. Under social network analysis, they suggest researchers can study user behavior, the curation mechanism, and bot detection. The benefit of all this is that the data is well-structured and available to researchers without any special blockchain knowledge.

Abstract

Advancements in distributed ledger technologies are driving the rise of blockchain-based social media platforms such as Steemit, where users interact with each other in similar ways as conventional social networks. These platforms are autonomously managed by users using decentralized consensus protocols in a cryptocurrency ecosystem. The deep integration of social networks and blockchains in these platforms provides potential for numerous cross-domain research studies that are of interest to both the research communities. However, it is challenging to process and analyze large volumes of raw Steemit data as it requires specialized skills in both software engineering and blockchain systems and involves substantial efforts in extracting and filtering various types of operations. To tackle this challenge, we collect over 38 million blocks generated in Steemit during a 45 month time period from 2016/03 to 2019/11 and extract ten key types of operations performed by the users. The results generate SteemOps, a new dataset that organizes more than 900 million operations from Steemit into three sub-datasets namely (i) social-network operation dataset (SOD), (ii) witness-election operation dataset (WOD) and (iii) value-transfer operation dataset (VOD). We describe the dataset schema and its usage in detail and outline possible future research studies using SteemOps. SteemOps is designed to facilitate future research aimed at providing deeper insights on emerging blockchain-based social media platforms.
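To make the extraction step in the abstract a little more concrete, here is a minimal Python sketch of how operations might be filtered out of raw blocks and grouped into the three sub-datasets. This is only an illustration, not the authors' pipeline. The abstract does not list the ten operation types, so the mapping below is an assumption built from standard Steem operation names, and the simplified block layout (a `block_num` field plus transactions holding `[name, payload]` operation pairs) and the function name `extract_key_operations` are hypothetical as well.

```python
from collections import defaultdict

# ASSUMPTION: the abstract does not enumerate the ten key operation types.
# These are standard Steem operation names, mapped here for illustration only.
SUBDATASET_BY_OP = {
    "comment": "SOD",
    "vote": "SOD",
    "custom_json": "SOD",
    "witness_update": "WOD",
    "account_witness_vote": "WOD",
    "account_witness_proxy": "WOD",
    "transfer": "VOD",
    "transfer_to_vesting": "VOD",
    "withdraw_vesting": "VOD",
    "delegate_vesting_shares": "VOD",
}


def extract_key_operations(blocks):
    """Group key operations by sub-dataset and drop everything else.

    ASSUMPTION: each block is a dict with a "block_num" and a list of
    "transactions", each holding "operations" as [name, payload] pairs.
    """
    grouped = defaultdict(list)
    for block in blocks:
        for tx in block.get("transactions", []):
            for op_name, op_payload in tx.get("operations", []):
                target = SUBDATASET_BY_OP.get(op_name)
                if target is None:
                    continue  # not one of the key operation types
                grouped[target].append(
                    {
                        "block_num": block["block_num"],
                        "type": op_name,
                        "data": op_payload,
                    }
                )
    return dict(grouped)


if __name__ == "__main__":
    sample_blocks = [
        {
            "block_num": 1,
            "transactions": [
                {
                    "operations": [
                        ["vote", {"voter": "alice", "author": "bob"}],
                        ["account_create", {"creator": "alice"}],
                        ["transfer", {"from": "alice", "to": "bob"}],
                    ]
                }
            ],
        }
    ]
    for name, ops in extract_key_operations(sample_blocks).items():
        print(name, len(ops))
```

Keeping the mapping in a single dictionary means the filter and the grouping share one lookup, so anything outside the chosen operation types is simply skipped.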

Read the rest from Proceedings of the Eleventh ACM Conference on Data and Application Security and Privacy: SteemOps: Extracting and Analyzing Key Operations in Steemit Blockchain-based Social Media Platform

100% of this post's author rewards are being directed to @penny4thoughts for distribution to authors of relevant and engaging comments. Please join the discussion below in order to be considered for a share of the liquid rewards when the post pays out.

Check the #penny4thoughts tag to find other active conversations.
 3 years ago 

Will this be something similar to the blockchain data graph report that you usually do weekly, or is it something more in-depth?

My report is pulled straight from the blockchain, which has advantages and disadvantages. It also incorporates market price information from CoinGecko. The SteemOps dataset is different because it's a static copy of blockchain data that people can download and use for their own analysis.

Pro:

  • Well-defined structure makes it easier for people without blockchain knowledge
  • Should be faster, since lots of unnecessary data is filtered out

Con:

  • Information ends in 2019
  • Only a subset of blockchain data is available via SteemOps

This reminds me that I haven't published that report in a while. I'll try to remember to do it on Friday.

 3 years ago 

I will be open to reading your report. I haven't seen one from you since I discovered your account, so I think you haven't made one in a long time. I scrolled through your old posts and found this. Is it one of the reports? 'Cause it looks like a good report to me.

Yep. That's the one. I'm still collecting the numbers, but I just haven't had time on Friday evenings in a while to sit down and publish a post. Hopefully, I should be able to do it this week.

 3 years ago 

Great. I will look forward to reading it.

 3 years ago 

This is impressive. The research is indirectly putting Steemit on the map. Many people would want to experience how it operates, and it could possibly attract more investors to the Steem ecosystem. I tried downloading the full PDF of the research but it seems premium membership is required. I guess I will watch the free supplementary videos later.

You can download the PDF for free from arXiv, here - https://arxiv.org/pdf/2102.00177.pdf.

 3 years ago 

Thanks. I have downloaded it. I am happy there are only 6 pages. Research papers are notorious for their length. With 6 pages, I will be done in no time once I start going through them.

I agree about the page length. This one is actually a pretty quick read. The previous one that I linked to has something like 22 pages, and I still haven't finished it.

 3 years ago 

I can understand... With those small font sizes and 22 pages, reading it may even feel like a punishment. In such situations, note-taking may be required to keep track of the bigger picture of the publication.

I finally made it through, but it took a while. It was an interesting article. My favorite idea was to add a capability for users to (positively) flag comments as "constructive" in order to raise the constructive commentary above the trolls.

One thing that is really bizarre to me is that survey respondents seemed quite comfortable with submitting to censorship.

That's too big of an ask for a typical user, though. Automating it would likely not be too difficult, and would get the job done without depending on users to do the work.

Probably true, but I think that automating it might also turn out to be pretty challenging. AI isn't great at dealing with nuance yet.

Maybe a hybrid model, where you put the option there but recognize that some people will be better at it (and more willing to participate) than others, so you augment user input with AI.

 3 years ago 

flag comments as "constructive" in order to raise the constructive commentary above the trolls.

This is an excellent idea. Quite often, people use the words troll and constructive criticism interchangeably, but they are two different things. These labels will help them stand apart.
Also, censorship on a decentralized network seems out of place to me.

I find the structuring of the data excellent, as it takes into account the levels of interpretation of those who benefit. Thank you.