Twitter To Depend On Automation For Moderating Harmful Content

Elon Musk’s Twitter is relying heavily on automation to moderate content, eliminating certain manual reviews and favoring restrictions on distribution over removing content outright, according to the company’s new head of trust and safety, who spoke to Reuters.

Twitter is also restricting abuse-prone hashtags and search results in areas such as child exploitation, regardless of the potential impact on “benign uses” of those terms, according to Ella Irwin, Twitter Vice President of Trust and Safety Product.

“The biggest thing that’s changed is the team is fully empowered to move fast and be as aggressive as possible,” Irwin said, in what was the first interview given by a Twitter executive since Musk’s acquisition of the social media company in late October.

Her remarks come as researchers report an increase in hate speech on the social media service following Musk’s announcement of an amnesty for accounts suspended under the previous leadership of the company that had not broken the law or engaged in “egregious spam.”

Since Musk slashed half of Twitter’s staff and issued an ultimatum to work long hours, which resulted in the loss of hundreds more employees, the company has faced pointed questions about its ability and willingness to moderate harmful and illegal content.

Furthermore, advertisers, Twitter’s primary revenue source, have abandoned the platform due to concerns about brand safety.

Musk promised “significant reinforcement of content moderation and protection of free expression” in a meeting with French President Emmanuel Macron on Friday.

Irwin stated that Musk encouraged the team to be less concerned with how their actions would impact user growth or revenue, stating that safety was the company’s top priority. “He emphasizes it every day, multiple times a day,” she explained.

The safety strategy

According to former employees familiar with the work, the changes Irwin described reflect, at least in part, an acceleration of changes that had already been planned since last year around Twitter’s handling of hateful conduct and other policy violations.

One approach, encapsulated in the industry mantra “freedom of speech, not freedom of reach,” entails keeping certain tweets that violate the company’s policies up but preventing them from appearing in places such as the home timeline and search.

Before Musk’s acquisition, Twitter had long used such “visibility filtering” tools to combat misinformation and had already incorporated them into its official hateful conduct policy. The approach allows for more free expression while reducing the potential harms associated with viral abusive content.

According to the Center for Countering Digital Hate, the number of tweets containing hateful content on Twitter increased sharply in the week before Musk tweeted on Nov. 23 that impressions, or views, of hateful speech were declining, one example of researchers pointing to the prevalence of such content even as Musk touts a reduction in its visibility.

Tweets containing anti-Black slurs were triple the number seen in the month before Musk took over, while tweets containing a gay slur were up 31%, according to the researchers.

Irwin, who joined Twitter in June and previously worked in safety roles at companies including Google, disputed claims that the company lacked the resources or willingness to protect the platform.

She stated that layoffs had no significant impact on full-time employees or contractors working in the company’s “Health” divisions, including “critical areas” such as child safety and content moderation.

According to two sources familiar with the layoffs, more than half of the Health engineering unit was laid off. Irwin did not immediately respond to a request for comment on the assertion, but previously denied that layoffs had a significant impact on the Health team.

She also stated that the number of people working on child safety had not changed since the acquisition, and that the team’s product manager was still in place. Irwin stated that Twitter backfilled some positions for people who left the company, but she declined to provide specific figures for the turnover.

She claimed Musk was more focused on using automation, arguing that the company had previously erred by relying on time- and labor-intensive human reviews of harmful content.

“He’s encouraged the team to take more risks, move fast, get the platform safe,” she said.

In terms of child safety, Irwin stated that Twitter had shifted toward automatically removing tweets reported by trusted figures who have a track record of accurately flagging harmful posts.

Carolina Christofoletti, a threat intelligence researcher at TRM Labs who specializes in child sexual abuse content, said Twitter has recently removed some content as quickly as 30 seconds after she reports it, without acknowledging receipt of her report or confirming its decision.

In an interview on Thursday, Irwin stated that Twitter, in collaboration with cybersecurity firm Ghost Data, removed approximately 44,000 accounts involved in child safety violations.

Twitter is also restricting hashtags and search results that are frequently associated with abuse, such as searches for “teen” pornography.

She claimed that previous concerns about the impact of such restrictions on permitted uses of the terms were no longer valid.

The use of “trusted reporters” was “something we’ve discussed in the past at Twitter,” according to Irwin, “but there was some hesitancy and frankly just some delay.”

“I think we now have the ability to actually move forward with things like that,” she said.

(Adapted from Reuters)
