How Bots Broke the FCC's Public Comment System

The FCC's net neutrality public comment period was overrun with bots, making it all but impossible for any one voice to be heard. That's not how this is supposed to work.

On a single day in late May, hundreds of thousands of public comments poured into the Federal Communications Commission regarding its plans to roll back net neutrality protections. A week and a half later, on June 3, hundreds of thousands more followed. The spikes weren't the voices of pro-net neutrality Americans worried about what will happen if the FCC allows internet service providers to block and throttle content whenever they choose. In fact, they weren't really voices at all.

According to multiple researchers, more than one million of the record 22 million comments the FCC received were from bots that used natural language generation to artificially amplify the call to repeal net neutrality protections. That number may represent only a fraction of the actual bot submissions. The New York Attorney General's office is currently investigating their source.

But while reports so far have focused on bad actors flooding the FCC with phony content, some of those same techniques also allowed legitimate groups, like the Electronic Frontier Foundation, to tell their members to click a button and send an auto-generated—albeit earnest—comment to the FCC, creating a groundswell of activism among actual humans. The result: A net neutrality comment period that garnered more input from the public than all previous comment periods across all government agencies—combined.

“It makes it easier for people to speak out, but much more difficult for them to be heard,” says Zach Schloss, an account manager at FiscalNote, a government relationship management company that’s been analyzing the FCC’s comments.

Now, as the commission attempts to sift through this unprecedented abundance of comments, discerning the legitimate from the bots could prove an insurmountable task.

Bots on Both Sides

The net neutrality comment debacle illustrates a central challenge of managing open platforms in an age of automation. Bots are overtaking the very system that’s supposed to give consumers a say in the rules that govern them, but weeding them out may jeopardize legitimate comments.

It’s a conflict platforms like Facebook and Twitter also face, as they work to eradicate fake or spammy activity on their platforms. Except unlike those companies, the FCC and other government agencies are bound by law to give the public a chance to participate in the rulemaking process. They're also required to consider “the relevant matter presented” in those public comments. When bots dominate the system, they drown out those relevant comments. And as language generation tools grow more sophisticated, they become harder to weed out. For a government legally required to hear out its constituents, this confusion is a brewing crisis.

“The current state of the art in natural language generation is fairly robust and genuine-sounding,” says Vlad Eidelman, FiscalNote’s vice president of research. The company analyzes the entire history of public comments to help business clients predict new changes to government regulation. “You could generate a lot of comments that would seem legitimate, feel legitimate, and come from legitimate email addresses, but would not be representative of the public voice.”

FiscalNote analyzed all 22 million net neutrality comments and found a number of suspicious patterns among them. For starters, there was the historic volume. There was also the fact that so many comments came in on just two days: May 23 and June 3.

Those abnormalities alone weren't enough to conclude that the comments were fake. To determine that, FiscalNote’s researchers used natural language processing techniques to cluster the comments into groups. They divided them by sentiment—whether they were for or against net neutrality. They separated out comments that were identical or nearly identical, judging them to be form letters, which advocacy groups often prompt their members to submit. They also analyzed comments that touched on the same themes without duplicating the text exactly, to find similarities in their structure and word usage.
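To make that grouping step concrete, here is a minimal sketch, not FiscalNote's actual pipeline, of how near-identical form letters can be bucketed together. The corpus, similarity threshold, and greedy matching below are all illustrative assumptions, and a pairwise scan like this would need real indexing to cope with 22 million comments.

```python
# A minimal sketch (not FiscalNote's pipeline) of grouping near-identical
# form letters: comments whose text closely matches an earlier comment get
# placed in the same bucket. The corpus and 0.9 threshold are illustrative.
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't split groups."""
    return " ".join(text.lower().split())


def group_near_duplicates(comments: list[str], threshold: float = 0.9) -> list[list[int]]:
    """Greedily bucket comments by character-level similarity to each bucket's first member."""
    buckets: list[list[int]] = []  # each bucket holds indices into `comments`
    for i, text in enumerate(comments):
        for bucket in buckets:
            exemplar = normalize(comments[bucket[0]])
            if SequenceMatcher(None, normalize(text), exemplar).ratio() >= threshold:
                bucket.append(i)
                break
        else:
            buckets.append([i])  # no close match found: start a new bucket
    return buckets


comments = [
    "I urge you to rescind the previous administration's plan to regulate the web.",
    "I urge you to rescind the previous administration's plan to regulate the Internet.",
    "Net neutrality protects consumers and small businesses and should be preserved.",
]
print(group_near_duplicates(comments))  # near-identical comments share a bucket: [[0, 1], [2]]
```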

What they found were hundreds of thousands of comments with identical sentence and paragraph structure that used different words to communicate the same message. Think of it as a Mad Libs guide to influencing the regulatory environment. Every comment could be produced by picking a word or phrase from a couple dozen options, and stringing them all together to create a paragraph.

For instance: Swap the word “regulate” for “control” in this sentence, and you’ve got two unique sentences.

I advocate Ajit Pai to rescind The previous administration's plan to control the web.

I advocate Ajit Pai to rescind The previous administration's plan to regulate the web.

Swap out “the web” for “the Internet,” and you’ve got another.

I advocate Ajit Pai to rescind The previous administration's plan to control the Internet.

The bots behind the comment assault bundled together these sentence variations to form short comments. Each one was distinct, but they all included 35 phrases arranged in the same order, with up to 25 synonymous words and phrases filling in each blank. FiscalNote found hundreds of thousands of comments that fit this pattern, but there are 4.5 septillion possible combinations of words and phrases that the bot could have chosen from to draft those comments.
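One way to see why that pattern gives the game away: if you can reconstruct the synonym sets the generator drew from, mapping every variant back to a single canonical term collapses all of those "unique" comments onto one skeleton. The synonym groups and sentences below are illustrative stand-ins, not FiscalNote's actual slot lists or code.

```python
# A toy illustration of template fingerprinting: hypothetical synonym groups
# collapse synonym-swapped comments onto the same canonical skeleton,
# exposing their shared origin.
SYNONYM_GROUPS = [
    {"control", "regulate", "take over"},
    {"the web", "the internet", "broadband"},
    {"advocate", "urge", "ask"},
]


def canonicalize(comment: str) -> str:
    """Replace every known variant with the first member of its synonym group."""
    text = comment.lower()
    for group in SYNONYM_GROUPS:
        canonical = sorted(group)[0]
        for variant in sorted(group, key=len, reverse=True):  # longest variants first
            text = text.replace(variant, canonical)
    return text


a = "I advocate Ajit Pai to rescind the previous administration's plan to control the web."
b = "I urge Ajit Pai to rescind the previous administration's plan to regulate the Internet."
print(canonicalize(a) == canonicalize(b))  # True: two "unique" comments, one template
```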


And that's just one of the patterns FiscalNote detected. The researchers found another series of comments, these in favor of net neutrality, that followed similar patterns. This time, though, they were connected with the Electronic Frontier Foundation, which created a website called dearfcc.org. The site asked people to submit a comment to the FCC opposing the repeal, and auto-generated a message for them. Those auto-populated comments varied from user to user.

For instance, just one paragraph from that comment contained the following possible options, according to FiscalNote's research:

The FCC should [reject|throw out] Chairman Ajit Pai’s [plan|proposal] to [give|hand] the government-subsidized [telecom giants|ISP monopolies] like [Comcast, AT&T, and Verizon|AT&T, Comcast, and Verizon|Comcast, Verizon, and AT&T|AT&T, Verizon, and Comcast|Verizon, AT&T, and Comcast|Verizon, Comcast, and AT&T] [free rein|the authority to|the legal cover to] [engage in data discrimination|throttle whatever they please|create Internet fast lanes] stripping [consumers|users|Internet users] of the [necessary|meaningful|vital] [access and privacy|privacy and access] [protections|rules|safeguards] we [fought for|demanded|worked for] and [won just two years ago|just recently won|so recently won].
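A small sketch makes the mechanics concrete: a template with pick-one-option slots gets expanded into a fresh-looking comment for each visitor. The abridged fragment below is an illustration of that idea, not dearfcc.org's actual template or code.

```python
# A small sketch of expanding a pick-one-option template into a comment.
# The [a|b|c] syntax mirrors the bracketed notation above; the template text
# is an abridged illustration, not dearfcc.org's actual wording.
import random
import re

SLOT = re.compile(r"\[([^\]]+)\]")  # matches a slot such as [reject|throw out]


def expand(template: str, rng: random.Random) -> str:
    """Replace each [a|b|c] slot with one randomly chosen option."""
    return SLOT.sub(lambda m: rng.choice(m.group(1).split("|")), template)


TEMPLATE = (
    "The FCC should [reject|throw out] Chairman Ajit Pai's [plan|proposal] to "
    "[give|hand] [telecom giants|ISP monopolies] [free rein|the legal cover] to "
    "[throttle whatever they please|create Internet fast lanes]."
)

rng = random.Random(0)  # fixed seed so the output is reproducible
for _ in range(3):
    print(expand(TEMPLATE, rng))  # three distinct comments, one shared skeleton
```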

In the EFF's case, automated tools helped real people get their messages across more efficiently, a far cry from the bots that generated at least a million fake comments. Still, the result helped overwhelm the FCC's comment system. “People have been pointing out the illegal or nefarious use of bots to spam the FCC,” says Eidelman. “But automation cuts both ways.”

Starting From Scratch

That’s what makes this such a dicey issue for the FCC—and all the other government agencies required to take public comments. It’s a relatively new one, too. Back in 2015, when the FCC passed its net neutrality protections and opened its electronic comment filing system, the biggest concern was managing capacity, says Gigi Sohn, a former advisor to then-FCC chairman Tom Wheeler. According to Sohn, there were some “back office conversations” about whether the FCC ought to delineate what is and is not a legitimate comment, but, she says, “It never got moved forward.”

She acknowledges, however, that fixing the problem isn’t as easy as a system upgrade. “Having seen it from the inside, I think they have to start from square one,” Sohn says.

That could mean, for example, that the FCC institutes some kind of two-step authentication system to ensure commenters are real people. FiscalNote, meanwhile, is working on a tool that would score each comment based on how likely the FCC is to take it into serious consideration. Called a "gravitas score," it's based on the company's analysis of decades of public comments. FiscalNote looked into what it takes for a public comment to get a shout-out in the FCC's final rule, and found that, more often than not, only comments that include a serious legal argument or are affiliated with a known entity, like a big business or academic institution, make their way in. By that measure, a comment's gravitas score would be higher if, say, it was written by Verizon's general counsel.

"Our hypothesis is that agencies pay attention to those comments much much more than any individual submitter," Eidelman says. Creating some kind of hierarchy would at least help the agency sift through the 22 million comments—a number that makes it impossible for the FCC to actually vet each one.

Of course, such a system would present its own problems. For starters, it would be easy enough for bad actors to game once they understand what it takes to catch the FCC's attention. But there's a more fundamental issue at play. It may be true that the FCC weighs lengthy comments submitted by lawyers and businesses more heavily than short comments written by the public, but that's not how the system is supposed to work, says Sohn.

"So, if it’s not written by expensive lawyers, it’s not worth a damn?" she says. "Just because something’s short doesn't meant it doesn't have value."

Not only does this approach limit the impact that ordinary citizens can have, but because the government is obligated to consider all "relevant matter," Sohn says, it puts Chairman Pai on shaky legal footing as he moves forward with rule-making. "Ignoring the short comments entirely makes his case in court more vulnerable," she says. "There are real questions about the integrity of the docket that can and will be used against him in court."

When the Administrative Procedure Act became law in 1946, requiring government agencies to accept public comments, a world in which bots wreaked havoc on the rule of law was the stuff of science fiction. Today, it's a reality that the FCC can no longer ignore.