Tag Archives: Mastodon

On Scraping Mastodon

2020-01-27 Blogghoran

Mastodon was scraped, again. It was not the first time it had happened, and it probably won’t be the last. This time it was for research, not just for archiving, which we had encountered in the past. The actual scraping happened in 2018, but the research was only recently published, which is why we’re talking about it now.

Background:

The research article, “Mastodon Content Warnings: Inappropriate Contents in a Microblogging Platform”, was written by authors from the Computer Science Department, University of Milan. The same group of people have previously published another research article related to Mastodon, “The Footprints of a “Mastodon”: How a Decentralized Architecture Influences Online Social Relationships”. In their previous paper they also had a lot of misunderstandings of the technology as well as the culture of Mastodon.

While it is tempting to do a complete analysis of the research, in this post I will point out a few issues with it, both from a technical perspective and an ethical one. In doing so I will reference and quote a few sections. However, it will not be a full analysis of all of the paper.

They wrote that they hashed the usernames, but included the URI of the posts in their database, which has the username in it. — Screenshot from Mastodon

The research papers both contained datasets: the first focused on metadata, and the dataset for this latest one could be matched against the previous one, even though it was “anonymized”. However, it was brought to my attention that their anonymization was pointless, because the username was still in the URI.
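To make the problem concrete, here is a minimal sketch of why hashing the username does not help when the post URI is kept. The hash function (SHA-256), the record layout, the server name and the two URI shapes are assumptions for illustration; the paper's actual dataset format is not reproduced here.

```python
import hashlib
from typing import Optional
from urllib.parse import urlparse


def hash_username(username: str) -> str:
    """Stand-in for the dataset's hashing step (SHA-256 is an assumption)."""
    return hashlib.sha256(username.encode("utf-8")).hexdigest()


def username_from_uri(uri: str) -> Optional[str]:
    """Pull the account name back out of a post URI.

    Assumes URIs shaped like https://<server>/users/<name>/statuses/<id>
    or https://<server>/@<name>/<id>.
    """
    parts = urlparse(uri).path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "users":
        return parts[1]
    if parts and parts[0].startswith("@"):
        return parts[0][1:]
    return None


# Hypothetical "anonymized" record: hashed user, but the URI is left intact.
record = {
    "user": hash_username("alice"),
    "uri": "https://example.social/users/alice/statuses/1",
}

recovered = username_from_uri(record["uri"])
# The recovered name hashes to the stored value, so the hash hides nothing.
matches = hash_username(recovered) == record["user"]
```

Anyone holding the dataset could run the equivalent of `username_from_uri` over every row, which is why dropping or rewriting the URI would have been needed as well.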

The 2nd dataset, for the latest research paper, has been removed from online access with the comment:

“Deaccessioned Reason: Legal issue or Data Usage Agreement Many entries in the datasets do not fulfill the law about personal data release since they allow identification of personal information.”

Does this mean that they did not take any of these things into account when they wrote the paper to begin with? If we look at their ethical and legal considerations we can see that they half-considered it, and I would argue missed the mark. The way most people were talking about it, it seemed like they had not made any ethical or legal considerations in their research at all. Reading them, I realized that they probably would’ve been better off if they had written the legal consideration first, and then had that inform the ethical consideration.

Legal and Ethical Considerations

In the legal consideration, they said that from what they had gathered they had not found anything in the ToS (Terms of Service) of the standard agreement, bundled in with a Mastodon installation, indicating that they were breaking it by doing this gathering of data. I would like to argue that there may be ethical considerations about not technically breaking any legal barriers. What do I mean when I say this? I’m trying to convey that the legal considerations could have also had ethical concerns. As the saying goes: just because you can do something doesn’t mean you should.

In the legal section they also write:

“In the terms of service and privacy policy the gathering and the usage of public available data is never explicitly mentioned, consequently our data collection seems to be complaint with the policy of the instance.”

I can understand that if a legal document does not explicitly mention something you may feel like you have free rein. Stating that nothing is explicitly mentioned may indicate that there’s something implicit that they chose to ignore. However, they do not elaborate. If they had followed the legal considerations up with the ethical considerations, maybe they could have discussed the ethical implications of the decision they made there.

Further, they do recognize that each instance has the ability to adopt its own Terms of Service (ToS), but then seem not to have followed through and actually checked whether any of these 300-something servers had added their own ToS. I feel like there’s a clear disregard for the possibility of there being other ToS. There is no indication that they checked even a certain percentage (say 10%) of the listed servers and their ToS, which could have shown whether a clear “majority” used the standard ToS, and they could then have recognized what differences do exist. I feel like there was simply an assumption rather than actual research done for this part.
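A spot check like the one suggested above does not take much. The following is only a sketch of the idea: the server names, the ToS texts and the function name `sample_custom_tos` are all hypothetical, and fetching the real ToS pages is left out entirely.

```python
import random


def sample_custom_tos(tos_by_server, default_tos, fraction=0.10, seed=0):
    """Sample a fraction of servers and report which ones changed the default ToS.

    tos_by_server: dict mapping server name -> ToS text (assumed already collected).
    Returns (sampled_servers, servers_with_custom_tos).
    """
    def normalize(text):
        # Ignore case and whitespace differences when comparing texts.
        return " ".join(text.split()).lower()

    servers = sorted(tos_by_server)
    k = max(1, round(len(servers) * fraction))
    sample = random.Random(seed).sample(servers, k)
    custom = [s for s in sample
              if normalize(tos_by_server[s]) != normalize(default_tos)]
    return sample, custom


# Hypothetical data: 300 servers, every tenth one has added its own rule.
DEFAULT = "Standard terms of service."
servers = {
    f"server{i}.example": DEFAULT if i % 10 else DEFAULT + " No scraping."
    for i in range(300)
}
sample, custom = sample_custom_tos(servers, DEFAULT)
share_custom = len(custom) / len(sample)
```

Even a rough share like `share_custom` would have turned the assumption into something the paper could report and discuss.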

Did they make any ethical considerations? It seems to mostly reflect the collection methodology, rather than answering any ethical questions, such as:

Would the users of Mastodon want to / expect to have their data scraped?
Would it be better to ask servers/users if they would want to participate in the research?
Is this research actually Computer Science research, or should it be a social studies research paper, taking into consideration such ETHICAL questions?
Should Computer Science have mandatory ethics courses?

Credit where credit is due: The last question is lifted from several people on the fediverse who’ve asked it before this research paper was published, and continued to ask after it was published.

I think the biggest issue here is that these researchers do not seem to understand some of the culture on Mastodon (no, there’s not only one culture, but there are some which come to mind for me) and have some basic misconceptions about the community and software, which made it hard for them to come to any useful ethical considerations. Would they have allowed themselves to come to the conclusion that they should not publish their paper? Probably not.

Technically the Content Warning

While there are two research papers available to me, I only want to focus on the misconceptions in this research paper: “Mastodon Content Warnings: Inappropriate Contents in a Microblogging Platform”. I believe that their entire conclusion is way off because they simply misinterpreted how a feature is used on the servers.

In their methodology they described how they interpreted the technical “sensitive” field in the metadata:

“each toot provides the fields related to the inappropriate-ness of its content, namely the entries “sensitive”, “content”,“spoiler-text” and “language”. The boolean field “’sensitive” indicates whether or not the author of the toot thinks that the content is appropriate. If the toot is inappropriate, the field is set up to “True” and the field “spoiler-text” would contain a brief and publicly available description of the content.” (Sic)

Correction: The sensitive tag is set when someone adds a Content Warning to their post. The sensitive tag says nothing about the actual content, or what the person thought about it when they did so (I’ll elaborate on what Content Warnings mean culturally on Mastodon further down).

However, they had interpreted the technical function of content warnings correctly, in these first two sentences:

“By clicking on the “CW” button, a user can enter a short summary of what the “body” of her post contains, namely a spoiler-text, and the full content of her toot. Automatically, the system marks this toot as “sensitive” and only shows the spoiler-text in all the timelines. (…)”

The next part was unfortunately where one of the misinterpretations of the data happened:

“(…) We exploit this latter feature to build our released dataset. This way the toots are labelled by the users, and we assume that they are aware of the policy of the instance and aware of what is appropriate or not for their community.”

This section emphasizes that they believe that the Content Warning is only used to mark content as sensitive if it’s inappropriate and does not belong on the server. Correction: If the content does not belong on the server, the user is most likely going to be banned.

This point was a reiteration of the previous statement in the methodology:

“Here we describe the collection methodology of the two main elements of our dataset: i) the instance meta-data and ii) the local timelines of all the instances which allow toots written in English.

…

Specifically, we are interested in the full description of each instance and the list of allowed topics. From our viewpoint, these two fields contain the information related to the context which makes a post inappropriate or not.”

The misinterpretations seem to stem from assumptions, rather than research, about how the technology is used, what the “sensitive” tag actually means, and how it’s used on the more than 300 servers in the dataset. This leads me to the cultural and social misinterpretation.

The Social Construct of the Content Warning

I believe that the biggest issue is that this research was in computer science, without any social science involved, with no consideration to the social part of social media. I’ve already noted that their assumption and interpretation is incorrect, so how are the Content Warnings used?

While I only have the empirical evidence from the servers I’m connected with, I’m still going to go out and say that: Content Warnings are in fact not used for content we do not believe belongs in our communities.

Rather, Content Warnings can be used in many ways. One way to describe it is simply as a subject line, similar to email. In some cases we will talk about more sensitive subjects, like addictions, drugs, war, news, politics. This is not to hide the content, but rather to offer the people reading it a chance to decide if they want to open it or not. If today is a day where reading about US Politics would just drain all my energy, I can choose to not open it.

We can also use it for other things that may be slightly sensitive to some, like food, meat, sex, nudity, private matters, or venting (of emotions). It’s also common to use it for posts about money, house-hunting, mental and physical health, very positive emotions and very negative emotions. In some cases it offers us a chance to unburden ourselves, without dumping those emotions onto someone who is not given a fair chance to prepare themselves for it.

There are other fantastic uses for Content Warnings. One which is especially dear to the community’s heart is as a setup for a joke. Sometimes the same CW will circulate in a meme-like fashion, and contain things that make us giggle. Another common one is as spoiler warnings for movies or TV series, or even books or other readings. You can then use the headline to tell everyone which TV series you’re about to talk about, and also denote which episode. This was great towards the last year of Game of Thrones for example, when a lot of people would be talking about it the day of the new episode.
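The uses described above can be put next to the paper's reading of the “sensitive” flag in a small sketch. The toots below are invented examples, and the field names `sensitive` and `spoiler_text` follow the Mastodon API's spelling rather than the paper's “spoiler-text”; the function `label_as_paper` is my paraphrase of their assumption, not their code.

```python
from collections import Counter

# Hypothetical toots, each with a Content Warning of a different kind.
toots = [
    {"sensitive": True, "spoiler_text": "GoT S8E3 spoilers", "content": "That ending!"},
    {"sensitive": True, "spoiler_text": "food", "content": "Made lasagna tonight."},
    {"sensitive": True, "spoiler_text": "US politics", "content": "About the election..."},
    {"sensitive": True, "spoiler_text": "what do you call a fake noodle?", "content": "An impasta."},
    {"sensitive": False, "spoiler_text": "", "content": "Good morning fediverse!"},
]


def label_as_paper(toot):
    """The paper's assumed reading: flag set means 'inappropriate'."""
    return "inappropriate" if toot["sensitive"] else "appropriate"


def cw_subjects(toots):
    """What the flag actually travels with: the subject line the author chose."""
    return [t["spoiler_text"] for t in toots if t["sensitive"]]


labels = Counter(label_as_paper(t) for t in toots)
subjects = cw_subjects(toots)
```

Under the paper's reading, four of these five toots count as “inappropriate”, while `subjects` shows a spoiler, a dinner, a news topic and a joke setup.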

So, to emphasize, we do not post Content Warnings because we believe the subject is inappropriate; we just want to offer the reader of the post the chance to give informed consent. And informed consent is something which I believe the authors of the research could take a lesson from.

This article was supported by my patrons. If you enjoyed it and would like me to be able to write more of them, feel free to head over to my patreon page and pledge your support!
Alternatively, check out my support page for more info.

Writing

Mastodon Account Migration Turned Malicious

2020-01-25 Blogghoran

A new feature was added to Mastodon, and that was account migration. I moved accounts from mastodon.social to elekk.xyz 1.5 years ago and I was wondering what my followers would look like if I migrated them over. Prepare for an epic mistake, a realization of the unintended consequences of the migration feature, and what we can try to learn from it.

I did not reflect on this until after I saw how many accounts I was forcing to follow my “new” account this morning while I was pressing accept on them. I had started the process last night mostly thinking “it can’t be that much can it”. Watching all the follow requests (because my current account is private) made me realize that I was forcing a lot of accounts, who may have chosen to unfollow me or not follow my new account, to follow me. In light of that I decided to use this opportunity to write a blog post about it, to actually share my findings and thoughts about the whole process.

Important things to note: Mastodon.social does not seem to purge old accounts, at any point. (I don’t remember if this is a software issue that it’s not available, but the admin of mastodon.social is also the main developer of the software.) I think maybe because people use it as a backup, which makes sense. But I still strongly believe that they could send out a warning email after 3-6-12 months saying that the account will be removed if they don’t log in within say 3 months from that email. This would mean that I probably wouldn’t still have access to my old account, which I don’t use and haven’t really used for 1.5 years, and I would not be able to do what I just did.

What is Account Migration on Mastodon?

Simply put, it’s a built in feature, where on your new account you tell it that you’ve moved from another account.

In a few steps: Account A is my previous account, Account B is my new account. Account B sets “I have migrated from Account A” in its settings, which allows followers to be migrated in. On Account A you start the migration by saying that you’ve now moved to Account B. As soon as you start the process, Account B’s server will probably chug for a bit as it starts processing the requests from Account A’s server (I’m unsure exactly how it does this).
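The two-sided handshake in those steps can be sketched as a toy model. This is not how Mastodon implements it; the class, the field names and the `migrate` function are made up for illustration, and the real work happens between servers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Account:
    name: str
    locked: bool = False                                   # private account: follows need approval
    aliases: set = field(default_factory=set)              # accounts this one says it moved from
    followers: set = field(default_factory=set)
    follow_requests: set = field(default_factory=set)
    moved_to: Optional[str] = None


def migrate(old: Account, new: Account) -> None:
    """Move followers from old to new, but only if new has claimed old first."""
    if old.name not in new.aliases:
        raise PermissionError(f"{new.name} has not listed {old.name} as a previous account")
    old.moved_to = new.name
    for follower in sorted(old.followers):
        if follower in new.followers:
            continue  # already following the new account
        if new.locked:
            new.follow_requests.add(follower)  # owner has to press accept on each one
        else:
            new.followers.add(follower)


# Hypothetical accounts: A is the old one, B is the new, private one.
account_a = Account("a@old.example", followers={"x@one.example", "y@two.example"})
account_b = Account("b@new.example", locked=True, followers={"y@two.example"})
account_b.aliases.add(account_a.name)   # step 1: B says "I have migrated from A"
migrate(account_a, account_b)           # step 2: A says "I have moved to B"
```

Note that nothing in this model asks the followers themselves, which is exactly the property the rest of this post is about.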

Account Migration is Good

Being able to migrate your account is great. And a few days ago I noticed it when someone wrote “I guess I’m posting here now” and I was already following them. I was first a bit confused, and then realized that they had used the migration tool, and I was pleased to see that I was following that new account immediately and not missing out.

Recently, I had caught someone’s “This is my last post from this account, if you haven’t already you should follow my new account”, I can’t remember if it was on Twitter or Mastodon. But it happened, and I was happy they reminded us, because sometimes you just don’t see those things.

These are some obvious cases where it’s good as we’re able to keep following people we’ve chosen to follow. But what happens if we chose to not follow their new account, or later unfollowed it, and they do this migration late?

Malicious by Mistake

What I ended up doing last night is definitely malicious use of this current feature, even if this was never my intent when I started the migration. I was midway through accepting followers when I realized that this was turning into a very malicious use of this feature. I want to apologize for that; I’m sorry that I ended up doing this, and following through with it. When I realized, I still followed through because I felt it was an important part of the process, and would yield useful data as well as allow me to process what was happening, and what to write in this post about it.

It is possible that you saw the link to this post when I posted it after having accepted the follow requests that you did not make. Because as soon as it hit me, I realized I needed to not post anything until I had finished off with this blog post, so you could read about what happened.

Lessons to Learn

I will definitely not do something like this again, where I migrate followers from a very old account. Why did I not just stop immediately when I realized? I think I was mechanically just going through the process, and doing so allowed me to figure out all the things that were wrong with it. And being halfway through, my brain kept insisting that we finish, because there’s no other way to remove follow requests.

Do migrate your followers, when you’re on an active account. But do not migrate your followers from an account that has been inactive for 1.5 years.

Suggested Improvements

There’s a few improvements that I’ve been mulling over with regards to the follower and follows management of Mastodon, and in some regard they also apply here.

Do not migrate followers that are other migrated accounts (I’m not 100% sure if it does this, but it seemed like it may have.)
Offer option to only migrate mutual followers.
Allow Account C (the follower account) to receive a follow suggestion or request instead of doing it “seamlessly”
Do not allow accounts that have been inactive for 6-12 months to migrate their followers.
Do not migrate inactive accounts (maybe allow user to set a time frame, 3 – 6 – 12 months.)
Allow private accounts to mass reject follow requests (because right now I’m stuck with a long queue.)
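A few of these suggestions amount to filtering the follower list before anything is moved. Here is a rough sketch of that filter; the `Follower` fields, the handles and the 180-day default are hypothetical, and Mastodon itself does not expose anything with these names.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Follower:
    handle: str
    last_active: date
    is_mutual: bool = False   # I follow them back
    has_moved: bool = False   # the follower account has itself migrated away


def eligible_for_migration(followers, today, max_inactive_days=180, mutual_only=False):
    """Return the handles that would be carried over under the suggested rules."""
    cutoff = today - timedelta(days=max_inactive_days)
    keep = []
    for f in followers:
        if f.has_moved:
            continue          # do not migrate followers that are migrated accounts
        if f.last_active < cutoff:
            continue          # do not migrate inactive accounts
        if mutual_only and not f.is_mutual:
            continue          # optional: only migrate mutual followers
        keep.append(f.handle)
    return keep


today = date(2020, 1, 25)
followers = [
    Follower("a@one.example", date(2020, 1, 20), is_mutual=True),
    Follower("b@two.example", date(2018, 7, 1)),
    Follower("c@three.example", date(2020, 1, 1), has_moved=True),
    Follower("d@four.example", date(2019, 12, 1)),
]
default_result = eligible_for_migration(followers, today)
mutuals_result = eligible_for_migration(followers, today, mutual_only=True)
```

The time frame is a parameter on purpose, so that a user could pick 3, 6 or 12 months as suggested in the list.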

Do you have any suggestions for improvements?


Musings

On Mastodon and Nazis

2019-07-12 Blogghoran

For the past 2 years Mastodon has been promoted as a place without Nazis. Anyone familiar with social media technology knows that it’s not necessarily possible to entirely make such a promise, especially with a network which allows users to set up their own village to invite their friends.

The Fediverse is the interconnected villages of decentralized alternatives of popular social networks such as YouTube (PeerTube), Twitter (Mastodon), Facebook (Hubzilla), SoundCloud (FunkWhale), Instagram (Pixelfed), to mention a few.
It isn’t immune to Nazis, but offers the tools to everyday users and local leaders (administrators and moderators) to protect their village from them. On Twitter you can report, and block, but then you have to sit around and wait for that content to maybe get removed or maybe not. On Mastodon you get the chance to join a village where you know that the admin has made a promise to you that Nazis, racists, homophobes etc. aren’t welcome there. If your admin doesn’t fulfill this promise you have the power to move to a different village. With Twitter you simply can’t do that.

Nazis on the Fediverse: Gab

On the 4th of July, a big group of Nazis migrated into their own little village: Gab.com. They used Mastodon’s software to run the village. Gab has been a home to Nazis for a very long time, and anyone who’s been keeping an eye on the social networks that keep popping up knew that their policies would welcome a lot of dangerous people. Gab the Social Network actively encourages people to harm other people, and lets people run loose with harassment, all in the name of Free Speech. They have also been directly linked to a mass shooting. Yes, we could argue that mass shooters have been on Twitter and Facebook too, because duh, they’re social networks. The major difference is that this place has become a breeding ground for these kinds of ideas, and they are actively encouraged.

The Vice Article

This migration into the Fediverse by these racists and Nazis caught the interest of VICE, who wrote an article now proclaiming that Mastodon “the nazi-free alternative to Twitter, is now home to the biggest far right social network”.

This is incorrect. While Gab has made their home in the Fediverse, they are not the biggest instance. The Vice article used the user count numbers displayed on fediverse.network to decide that Gab was the largest instance on the fediverse.

A list of the top 5 instances by user count on the fediverse (list of instances sorted by user count)

The marked instance in 3rd place, is the Mastodon Flagship instance. The instance in 2nd place is pawoo.net which is a Japanese equivalent to DeviantArt.

How can an instance so new have so many users?

995391 users. Here’s the tricky part, they don’t. Not really. Basically what they did was migrate all the existing accounts from Gab. Simply just importing all existing accounts, including suspended and inactive ones, all old beta accounts from 2016 (because as far as I know they have not actually cleared any of those old accounts). So this number, while it sounds incredibly big doesn’t translate to much in activity:

Comparatively they are not nearly as high up, but still fairly big. There are a few ways to spoof and fake numbers that show up for these stats. The below screenshot was taken just a few moments ago (and less than an hour after the above ones), here banana.dog is on the top of this list:

Eugen (creator of Mastodon) points out himself that:

Gargron commenting on Active User count numbers being removed from Gab. — toot by Eugen about Gab removing Monthly Active Users

“Gab already removed the Monthly Active User counter from their frontpage (a default Mastodon feature). That’s easier than faking active user numbers I suppose” — Eugen

Their public timeline is also filled with spam posts, for accounts which haven’t been suspended, and even if those accounts were suspended they would still count as a body for the user count.
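The gap between the headline user count and actual activity can be shown with a small sketch. The account records, the 30-day window and the function name are assumptions for illustration; they are not how fediverse.network or Mastodon compute their statistics.

```python
from datetime import date, timedelta


def user_counts(accounts, today, window_days=30):
    """Return (total registered accounts, accounts active within the window).

    Suspended accounts and accounts that never logged in still count
    towards the total, but not towards the active number.
    """
    cutoff = today - timedelta(days=window_days)
    total = len(accounts)
    active = sum(
        1 for a in accounts
        if not a["suspended"]
        and a["last_active"] is not None
        and a["last_active"] >= cutoff
    )
    return total, active


# Hypothetical imported accounts.
accounts = [
    {"name": "regular", "suspended": False, "last_active": date(2019, 7, 10)},
    {"name": "old_beta", "suspended": False, "last_active": date(2016, 9, 1)},
    {"name": "spammer", "suspended": True, "last_active": date(2019, 7, 11)},
    {"name": "never_logged_in", "suspended": False, "last_active": None},
]
total, active = user_counts(accounts, today=date(2019, 7, 12))
```

Here four accounts are counted as users while only one of them is active, which is the kind of difference a bulk import of old accounts produces.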

Is the Fediverse riddled with Nazis now?

No, it’s not, unless you join a village which actively wants to communicate with them. First, let me cover how Gab migrated to the fediverse, and what that means for communication. Simply put, Gab installed a radio station (ActivityPub), by making a copy of the Mastodon software and making it their own. This means that they can now call all the other villages if they so please. Or at least attempt to call the other villages. A major part of the Fediverse and Mastodon servers prepared by preemptively blocking gab.com, before they officially joined on the 4th of July. By blocking them, we’re effectively not listening to their radio station.

Unfortunately, because the radio waves are publicly available, they are still able to listen in on us, and “interact” with our radio shows (Public Posts) on their side of the fediverse, even if we refuse to listen to them (by blocking them). This is a flaw in the current design of the Mastodon software, and to some degree of ActivityPub (the radio waves). There are a lot of people on here who are working on the software, or are at least interested in it, and who are working on different ways to deal with this issue, and hopefully we’ll be coming up with even more creative solutions in the future.
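The one-sidedness of a block can be sketched as a toy model of two villages. The class and method names are invented for this example and say nothing about how Mastodon or ActivityPub are implemented; the point is only that refusing incoming posts does not stop anyone from reading what is public.

```python
class Village:
    """Toy model of a server that can block other servers by domain."""

    def __init__(self, domain):
        self.domain = domain
        self.blocked = set()
        self.public_posts = []
        self.inbox = []

    def publish(self, text):
        """Post something publicly."""
        self.public_posts.append(text)

    def receive(self, sender_domain, text):
        """Accept an incoming post unless the sender's domain is blocked."""
        if sender_domain in self.blocked:
            return False
        self.inbox.append((sender_domain, text))
        return True

    def read_public(self):
        """Public posts are readable by anyone, blocked or not."""
        return list(self.public_posts)


ours = Village("village.example")
ours.blocked.add("blocked.example")

delivered = ours.receive("blocked.example", "hello?")   # we are not listening
ours.publish("A public post")
overheard = ours.read_public()                          # but they can still read this
```

In this model the block protects the inbox only; `overheard` is available to the blocked side all the same.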

To use Eugen’s own words. Mastodon still takes a hard line against Nazis, and its fairly new covenant enforces that by deciding which servers JoinMastodon.org will advertise. If you don’t follow the covenant you won’t be featured; if you’re a racist / Nazi instance you won’t be featured.

On top of that there have been massive efforts between instance (village) admins to organize against this influx of racist and Nazi users. There are even app developers who have decided to block gab.com users from connecting through their apps (e.g. Tusky and Sengi; full disclosure: I merged the feature to block Gab via Tusky, as I work for that app). And users are actively sharing lists of Fascist-harbouring instances that they have blocked.

We are still here, and we’re still fighting Nazis and by no means welcoming them into our midst.


Musings

Mastodon, compassion vs Facebook, in your face.

2018-11-17 Blogghoran

After having spent a good 1.5 years on Mastodon, I feel like I just get bombarded with crap on Facebook that I don’t want to see / not comfortable with seeing.

Why? It’s not because my friends are bad people, it’s because Facebook doesn’t offer a way for my friends to add content warnings which protect the images.

On Mastodon, while it has its flaws, you can choose to put up a warning for what your content contains.

You can use this for Trigger Warnings, Sensitive Subjects, Food, and even SPOILERS for movies/series. Or just put your nerdy discussions behind it, and let people opt in to see it.

People will only see it if they click through, and it’s such a different experience. Even though I almost always click through, I find that when I’m prepared it’s a lot easier to deal with.

It allows the people posting to be cognizant about what they put out there, and how it presents to other people. It makes a lot of the people on the platform a lot more compassionate, than I’ll ever see here on Facebook unfortunately.

ForkTogether

The first #ForkTogether meeting, and what went wrong

2018-07-10 Blogghoran

On the 30th of June, we had the first Fork Off Together meeting, for which the goal is to fork off from the Mastodon project. The idea had been simmering for a while, and the required logistics was a lot bigger than one person could do on their own, yet, I tried to do it on my own.

Let me explain: I was not doing it on my own per se, but rather I was doing a lot of the preparations for this one meeting alone. Even though I had two people that I worked fairly closely with, at one point my head just got too tired to properly communicate with others about what help I needed. So it more or less got easier to “just do it myself”, or ask my live-in boyfriend for help, as I could point and grunt at things when words wouldn’t come out properly.

So, what went wrong with the first meeting?

To start off, over all it was a good experience, but we definitely had some teaching moments which we seemed to, as a group, react well to.

However, I want to start by pointing out what went wrong from my side, i.e. what I could have done better or differently. This isn’t about placing blame, but rather a reflection on why I did it the way I did.

So, my initial idea was based on something that I had experienced and learnt when I was active in the Pirate Party here in Sweden between 2009 and 2014. The organization had a way of working which is common (from my understanding) for certain types of organizations, namely the type that has a lot of smaller organizations under the same umbrella, e.g. political orgs or youth organizations that want to try and get funding for their work locally.

Having this kind of meeting, is a way to make it easy to start up one of those new small orgs, only requiring 3-5 people, and being youth orgs it meant that they could get a little money from the government. This also doubles as a means to encourage youth to get engaged in activities which will in the long run keep them too busy and away from crime, (but don’t quote me on this, this is just my general understanding of the concept).

What I tried to do was leverage that knowledge I had, to have our own startup meeting, and jeebus I had to try really hard to not accidentally use that term.

In the political org case, it was easy to adopt the same bylaws, CoC, operational plan etc. because we were all part of the same organization. This was definitely my first mistake. Unfortunately I didn’t mentally connect the dots until the actual meeting, and I couldn’t have done it differently at the time.

I need to highlight here that accepting the bylaws and other documents during the first meeting of this kind was based on it being a sub-association of a bigger org. There wasn’t supposed to be a need to do too much with the bylaws, and if there was, it would’ve been done before the meeting.

In my foggy mind I didn’t get this out in time and worded correctly. Heck, I had even said “no we won’t draft bylaws” before I realized the translation in full. I could have, and probably should have, checked myself when I realized that that document translated to bylaws. But I didn’t.

Another member of the meeting wrote some good reflections about this type of meeting too.

If that was my first mistake, what was my second one?

I thought that I could distance myself from responsibility and active choices by leveraging that I was just inviting people to a meeting. I didn’t want to make decisions for us, and this was the only way I knew how.

This isn’t as much a mistake as it is paradoxical. Mostly because either way I make decisions and it becomes a really weird situation. Especially if I couldn’t get all the info out of my head as fast as the questions came my way.

About 14 days in I fell apart entirely, and made a public spectacle which didn’t reflect well on me and possibly ended up harming the project; I pushed away some people I really wanted in on the ground floor.

I could make excuses, and I could try to explain myself, but it won’t change anything. However, what I can do is recognize that I did screw up and that I can do better in the future. I understand my why, and that means that I can take preventive measures.

So, what preventive measures can I take in the future?

A big one, delegate. While we were 3 people working together in the early days, the same people who’ve also rejected any direct involvement in management, or interim-committee or the committee / board for the first year of this project, I did a bulk of the work and had trouble getting stuff out of my head.

When I felt like I was about to entirely break, the incident referenced above, I should’ve let go right there and just set up a Discord server and invited everyone, and continued to contribute to the group in their process of preparing a meeting together etc.

But at the same time, if I hadn’t done the meeting the way I did, I would not have learnt the lessons I did, so this is a double edged sword, imo.

So, what went right?

I took my time, and worked through it slowly. I built small road maps for myself to guide me along the way and asked for help when I felt stuck.

I need to remind you all that the survey blew up way bigger than I had ever expected. I think by the end we had almost 200 responses to the survey, and over 120 saying “let’s do this”. [link to the shared data on June 11th]

I couldn’t have planned for that, but when it happened I tried to baby-step my way through it.

The meeting, even though it was long and had its issues, was also pretty damn fantastic. The way I had translated Swedish meeting formalities to a Discord server turned out to work pretty well, and once people got the hang of it they seemed to appreciate the somewhat rigid structure.

I hope that, using this experience, I can create a template for hosting a first meeting when a group of people want to start an org together, and maybe I can help someone else avoid some of the problems that we encountered. Because there’s some solid structure here that definitely can be reused. That said, I will be publishing a separate post about the actual meeting structure and how to set it up.