Tech reporter reveals first look at Twitter's long-awaited "Safety Mode"

Jun 28, 2021 9:38 AM EDT

Reverse engineer Jane Manchun Wong managed to activate Twitter's long-awaited "Safety Mode" and block spammy followers instantly.

By Anisha Hoppe

San Francisco, California - Reverse engineer Jane Manchun Wong managed to activate Twitter's long-awaited "Safety Mode" before its release. Her followers helped her in an experiment by posting bullying comments – and managed to get themselves automatically blocked in the process.

Twitter is close to rolling out an anti-trolling feature that will block abusive commenters from contacting particular accounts for a full week (stock image). © 123RF/Tomasz Śmigla

Jane Manchun Wong has often been able to reveal the inner workings of various platforms by digging into their code and discovering what companies have planned, often before those plans are even announced.

She's done it again with Twitter, managing to uncover and activate the long-rumored "Safety Mode" feature over the weekend.

The company hinted back in February that a feature was in the works that would protect people from spam, rude slurs, and other trolling behavior. Wong says a user pointed out to her that they had been blocked from commenting on her account, which led to the discovery that she had somehow activated the yet-to-be-released Safety Mode.

To test the theory, various followers replied with words or slurs that would typically get users banned, just to see what would happen. Not only were their comments filtered, but 40 of them were also blocked from commenting on Wong's Tweets for a full seven days.

Her followers had to resort to alternative accounts to be able to directly message her, and she then had to manually unblock each user.

The new feature is going to need some controls

Twitter is making a lot of changes, and it's worth checking your settings in the coming weeks to make sure your followers haven't gotten themselves accidentally blocked (stock image). © 123RF/bloomua

Wong says that some of the comments that got her followers blocked weren't particularly offensive. Other platforms like Instagram have released features to fight online harassment through filters and keyword lists. Twitter will likely need to follow suit and provide users with a customizable word list, as some are more lax than others in their opinions of what constitutes trolling.

Twitter does currently provide a "muted words" list, but this only prevents you from getting notifications about Tweets containing those words.

The platform says that its algorithms block up to 50% of abusive replies before many users even read them, let alone report them.

However, should you notice that you're not seeing replies from your usual commenters, be on the lookout for the addition of the new safety features in your Twitter settings, just in case they've been blocked without warning.

Cover photo: 123RF/Tomasz Śmigla