Spot biased websites and fake news with the help of AI

Artificial Intelligence (i2tutorials.com)

In the Internet age, fake news is overwhelming and deeply confusing. Facebook was once mired in it: not only was the platform accused of influencing the outcome of the US presidential election, it also faced heavy fines from the German government.

Even the BBC, which is known for its credibility, has not escaped. For example, the Twitter account of the BBC Northampton sub-station once posted: "Breaking News: President Trump is injured in the arm by gunfire #Inauguration" — falsely claiming that President Trump had been shot in the arm after his inauguration.

Fake news in China is just as full of tricks: with skilled use of Photoshop, even WeChat conversations can be forged, such as a doctored screenshot that once set the Chinese Internet abuzz.

AI system: building multi-dimensional feature vectors for detection

On October 4th, the Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory (CSAIL) announced on its official website that, together with researchers at the Qatar Computing Research Institute, it has built an AI system that can assess the factuality of news sources and their political bias. The work will be presented at the end of this month at the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Brussels, Belgium.

Using this AI system, the researchers created an open-source dataset of more than 1,000 news sources annotated with "factuality" and "bias" scores — reportedly the largest such dataset by number of news sources.

What is novel about the AI system is its broad contextual understanding of the outlets it evaluates. Rather than extracting features (the variables a machine-learning model is trained on) from news articles alone, it also draws on Wikipedia, social media, and even the structure of a source's URL and its web-traffic data to determine credibility.

The system trains a support vector machine (SVM) to assess factuality and bias. Factuality is classified as low, medium, or high; political leaning is classified as extreme-left, left, center-left, center, center-right, right, or extreme-right.

According to the team, the system needs to examine only about 150 articles to determine whether a news source is reliable. It achieves 65% accuracy in classifying a source's factuality as high, medium, or low, and roughly 70% accuracy in classifying its political leaning as left, right, or neutral.
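The classification step described above can be sketched with an off-the-shelf SVM. This is a minimal illustration, not the authors' implementation: the feature values, feature meanings, and training data below are invented for the example, and the real system builds its vectors from articles, Wikipedia, Twitter, URL structure, and traffic data.

```python
# Illustrative sketch: classify news sources by "factuality" with an SVM.
# All feature values and labels here are made up for demonstration.
import numpy as np
from sklearn.svm import SVC

# Each row describes one news source; columns are hypothetical aggregated
# features (e.g. sentiment score, complexity, has_wikipedia_page, log Alexa rank).
X = np.array([
    [0.2, 0.7, 1.0, 5.1],
    [0.9, 0.3, 0.0, 8.4],
    [0.5, 0.5, 1.0, 6.0],
    [0.8, 0.2, 0.0, 9.0],
])
y = ["high", "low", "medium", "low"]  # factuality labels per source

clf = SVC(kernel="linear")
clf.fit(X, y)

# Predict the factuality class of an unseen source.
print(clf.predict([[0.3, 0.6, 1.0, 5.5]]))
```

In the real system the same idea is applied twice: one classifier for the three factuality classes and one for the political-leaning classes.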

For a given article, the AI system evaluates the copy and headline along several dimensions, analyzing not only the structure, sentiment, and engagement of the article (in this case, the number of shares, reactions, and comments on Facebook) but also its topic, complexity, bias, and morality. It computes a score for each feature and then averages the scores over a set of articles from the same source.
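The score-then-average step could look something like the sketch below. The dimension names loosely follow those mentioned in the text, but the scoring function is a stand-in: in the real system each score would come from a trained model, not from fields already present in the data.

```python
# Illustrative sketch: score each article on several dimensions, then
# average across a source's articles to get one profile per source.
from statistics import mean

FEATURES = ["structure", "sentiment", "engagement",
            "topic", "complexity", "bias"]

def score_article(article: dict) -> dict:
    # Stand-in scorer: just reads precomputed values. Real scorers
    # would be models applied to the article text and metadata.
    return {f: article.get(f, 0.0) for f in FEATURES}

def profile_source(articles: list[dict]) -> dict:
    # Average each feature's score over all of a source's articles.
    scores = [score_article(a) for a in articles]
    return {f: mean(s[f] for s in scores) for f in FEATURES}

articles = [
    {"structure": 0.8, "sentiment": 0.2, "engagement": 0.5,
     "topic": 0.6, "complexity": 0.7, "bias": 0.1},
    {"structure": 0.6, "sentiment": 0.4, "engagement": 0.3,
     "topic": 0.4, "complexity": 0.5, "bias": 0.3},
]
print(profile_source(articles))
```

The resulting per-source profile is what would then be fed into a classifier alongside the Wikipedia, Twitter, URL, and traffic features.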

Wikipedia and Twitter also feed into the AI system's predictive model. As the researchers note, the absence of a Wikipedia page may indicate that a website is untrustworthy, while an existing page may mention that the outlet in question is satirical or explicitly left-leaning. They also point out that a source without a verified Twitter account, or one using a newly created account whose profile is not clearly labeled, is less likely to be trustworthy.

The model's last two vectors are URL structure and web traffic: the system can detect URLs that attempt to imitate a trusted news source (for example, "foxnews.co"), and it consults a site's Alexa rank, which is calculated from total page views.
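One simple way to catch look-alike URLs of the kind mentioned above is string similarity against a list of known outlets. This is a sketch under assumptions: the trusted-domain list and the 0.85 similarity threshold are invented for the example, and the paper does not describe its exact method.

```python
# Illustrative sketch: flag domains that closely imitate a known trusted
# outlet (e.g. "foxnews.co" vs "foxnews.com"). Trusted list and threshold
# are assumptions for this example.
from difflib import SequenceMatcher
from urllib.parse import urlparse

TRUSTED_DOMAINS = ["foxnews.com", "bbc.co.uk", "nytimes.com"]

def looks_like_imitation(url: str, threshold: float = 0.85) -> bool:
    domain = urlparse(url).netloc.lower().removeprefix("www.")
    for trusted in TRUSTED_DOMAINS:
        if domain == trusted:
            return False  # exact match: it IS the trusted site
        ratio = SequenceMatcher(None, domain, trusted).ratio()
        if ratio >= threshold:
            return True   # near-match: likely a look-alike domain
    return False

print(looks_like_imitation("http://foxnews.co/article"))   # True
print(looks_like_imitation("https://www.foxnews.com/us"))  # False
```

A production system would likely combine this with signals such as domain age and registration data rather than rely on string similarity alone.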

The team trained the AI system on 1,066 news sources from the Media Bias/Fact Check (MBFC) website, using the site's collected accuracy and bias data to manually label each source. To build the dataset, the researchers collected 10-100 articles from each site (94,814 articles in total).

As the researchers take pains to note in their report, not every feature is an effective predictor of factual accuracy or political bias. For example, some websites without a Wikipedia page or Twitter profile nevertheless publish fair and trustworthy information, and news sources with top Alexa rankings are not always fairer or more factual than lower-traffic sources.

The researchers also made an interesting discovery: articles from fake-news sites are more likely to use exaggerated and emotional language, while left-leaning media are more likely to refer to "fairness" and "reciprocity." Meanwhile, publications with longer Wikipedia pages are generally more credible, as are URLs containing few special characters and complex subdirectories.

In the future, the team intends to explore whether the AI system can be adapted to other languages (it is currently trained only on English) and whether it can be trained to detect bias in specific domains.

They also plan to launch an app that automatically responds to news items with articles from "across the political spectrum."

"If a website has published fake news before, it is likely to do so again," said Ramy Baly, the paper's first author and a postdoctoral associate. "By automatically scraping data from these websites, we hope our system can help identify which sites are likely to do this in the first place."

Of course, they are not the only institutions that try to combat fake news through artificial intelligence.

New Delhi-based startup MetaFact uses NLP algorithms to flag misinformation and bias in news stories and social-media posts, and the SaaS platform AdVerify.ai launched a beta version last year that analyzes misinformation, malware, and other problematic content, cross-referencing a regularly updated database of thousands of fake and legitimate news items.

As mentioned above, Facebook was once mired in fake news. It has begun experimenting with artificial-intelligence tools to identify false stories, and recently acquired the London-based startup Bloomsbury AI to help it identify and eliminate fake news.

Will fake news be eliminated?

However, some experts doubt that artificial intelligence is up to the job. Dean Pomerleau, a scientist at Carnegie Mellon University's Robotics Institute, told The Verge that artificial intelligence lacks the nuanced understanding of language needed to identify lies and false statements.

"Our initial goal was to create a system to answer the question 'Is this fake news or not?'" he said. "But we quickly realized that machine learning was not up to the task."

Human fact-checkers, however, do not necessarily do better than AI. This year, Google suspended the "Fact Check" label that once appeared in the Google News section after conservative media accused Google of showing bias against them.

Whether the final answer to identifying fake news and personal bias turns out to be AI systems, human labor, or a combination of the two, the day when fake news is completely eliminated will not arrive anytime soon.

According to consulting firm Gartner, if current trends hold, by 2022 most people in developed countries will see more false information than true information.