Hoaxy (beta)

Visualize the spread of claims and fact checking.

Frequently Asked Questions About Hoaxy

Q: What is Hoaxy?
A: Hoaxy visualizes the spread of articles online. Articles can be found on Twitter, or in a corpus of claims and related fact checking.
Q: How does it work?
A: The Hoaxy corpus tracks the social sharing of links to stories published by two types of websites: (1) Low-credibility sources that often publish inaccurate, unverified, or satirical claims according to lists compiled and published by reputable news and fact-checking organizations. (2) Independent fact-checking organizations, such as snopes.com, politifact.com, and factcheck.org, that routinely fact check unverified claims.
Q: What is a “claim”? Who decides what is true or not?
A: We do not decide what is true or false. Low-credibility sources often publish false news, hoaxes, rumors, conspiracy theories, and satire, but may also publish accurate reports. Therefore not all claims you can visualize on Hoaxy are false, nor can we track all false stories. We aren’t even saying that the fact checkers are 100% correct all the time. You can use the Hoaxy tool to observe how unverified stories and the fact checking of those stories spread on public social media. We welcome users to click on links to fact-checking sites to see what they’ve found in their research, but it’s up to you to evaluate the evidence about a claim and its rebuttals.
Q: Do you have an editorial team?
A: No. Hoaxy tracks claims and fact checks automatically, 24/7. We do not read the contents of the articles we track. This is why we cannot establish whether a claim is accurate, nor whether a particular claim was verified by a particular fact check.
Q: What does the visualization show?
A: Hoaxy visualizes two aspects of the spread of claims and fact checking: temporal trends and diffusion networks. Temporal trends plot the cumulative number of Twitter shares over time. The user can zoom in on any time interval. Diffusion networks display how claims spread from person to person. Each node is a Twitter account, and two nodes are connected if a link to a story is passed between those two accounts via retweets, replies, quotes, or mentions. The color of a connection indicates the type of information shared: a claim or a fact check. Clicking on an edge reveals the tweet(s) and the link to the shared story; clicking on a node reveals claims shared by the corresponding user. The network may be pruned for performance.
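The two views described above can be sketched in a few lines of Python. This is only an illustration, not Hoaxy's actual code: the record layout (timestamp, source account, target account, article type) and the function names `cumulative_timeline` and `diffusion_edges` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical share records: (timestamp, from_account, to_account, kind).
# The field layout is an assumption for this sketch, not Hoaxy's schema.
shares = [
    ("2018-01-01T10:00:00", "alice", "bob", "claim"),
    ("2018-01-01T10:05:00", "bob", "carol", "claim"),
    ("2018-01-01T11:30:00", "dave", "alice", "fact_check"),
    ("2018-01-02T09:00:00", "alice", "bob", "claim"),
]

def cumulative_timeline(records, kind):
    """Return (timestamp, cumulative share count) pairs for one article type."""
    times = sorted(datetime.fromisoformat(t) for t, _, _, k in records if k == kind)
    return [(t, count) for count, t in enumerate(times, start=1)]

def diffusion_edges(records):
    """Count shares per (from_account, to_account, kind) edge."""
    edges = defaultdict(int)
    for _, src, dst, kind in records:
        edges[(src, dst, kind)] += 1
    return dict(edges)

claim_timeline = cumulative_timeline(shares, "claim")
edges = diffusion_edges(shares)
```

Here `claim_timeline` is the series a temporal-trend chart would plot, and each key in `edges` corresponds to one colored connection between two accounts, with the count available as an edge weight.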
Q: What is the bot score and how is it calculated?
A: One can think of the bot score as the likely level of automation of an account, where a score of 5 may indicate a large amount of automation and a score of 0 may indicate little to no automation. Bot scores are calculated using a machine learning algorithm trained to classify the level of automation an account exhibits. More information about this topic can be found in the Botometer FAQ.
Q: What if I see some bot scores that are wrong? How can I help?
A: Social bot detection is a hard problem. We are constantly improving our tool’s accuracy, but there will be accounts that our tool fails to classify. You can assist us in making more accurate classifications. You might recognize your own account. Or you might have information that allows you to recognize some other account as most likely human or bot. In these cases, you can provide feedback on those accounts. Do this by clicking on the account (node) and then the Feedback button. Feedback helps us better distinguish between humans, bots, and everything in between, so your help is greatly appreciated.
Q: What are Trending News, Popular Claims, and Fact-Checks in the landing page panel?
A: The landing page panel provides shortcuts to search for new and relevant articles. Trending News are top and breaking headlines for the United States. Popular Claims are the articles from low-credibility sources most tweeted in the last month. Similarly, Popular Fact-Checks are the most tweeted articles in the last month published by fact-checking organizations.
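As a rough illustration of how a "most tweeted in the last month" ranking can be computed, the sketch below counts tweets per article URL inside a time window. The 30-day window, the `(timestamp, url)` tuple layout, and the function name `most_tweeted` are assumptions for this example and are not taken from the Hoaxy code.

```python
from collections import Counter
from datetime import datetime, timedelta

def most_tweeted(tweets, now, days=30, top_n=3):
    """Rank article URLs by the number of tweets in the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(url for ts, url in tweets if cutoff <= ts <= now)
    return counts.most_common(top_n)

# Hypothetical sample data: (tweet timestamp, linked article URL).
now = datetime(2018, 6, 30)
tweets = [
    (datetime(2018, 6, 29), "http://example.com/a"),
    (datetime(2018, 6, 28), "http://example.com/a"),
    (datetime(2018, 6, 20), "http://example.com/b"),
    (datetime(2018, 4, 1), "http://example.com/b"),  # outside the window
    (datetime(2018, 4, 2), "http://example.com/b"),  # outside the window
]
popular = most_tweeted(tweets, now)
```

Tweets older than the cutoff are ignored, so an article that was widely shared months ago does not outrank one that is being shared now.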
Q: What is the difference between Hoaxy search and Twitter search?
A: There are two search modes. The Hoaxy search finds claims and related fact checking in a corpus of articles from low-credibility and fact-checking sources. This mode leverages the Hoaxy API to retrieve relevant articles, accounts, and tweets. The Twitter search lets users track articles from any source posted on Twitter during the last 7 days. This mode uses the Twitter Search API to retrieve relevant, popular, or mixed tweets matching your search query. It is compatible with all advanced search operators.
Q: How do you match claims to fact-checks?
A: We use search engine technology (think of Google) to retrieve claims and fact checks. The user enters a query and we match it against our index of claims to find relevant articles. We perform the same procedure to find fact checks matching the query. The user can select claims and fact-checking articles to be visualized.
Q: How does Hoaxy track the spread of articles?
A: We collect public tweets that include links to stories. We then fetch the page linked in each tweet and store its URL and text in our corpus, together with the tweet. When the user submits a query in Hoaxy search mode, we retrieve the most relevant or most recent matching articles (claims and fact checks) and select all the tweets that link to them.
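The query step described above can be sketched as follows. This is a simplified stand-in: it ranks articles by plain keyword overlap, whereas a real search engine uses a proper full-text index and relevance scoring. The data layout and the names `tokenize`, `match_articles`, and `tweets_for` are hypothetical.

```python
import re

# Hypothetical corpus: article URL -> stored page text and article type.
articles = {
    "http://example.com/a": {"text": "Moon landing was staged, sources say", "kind": "claim"},
    "http://example.org/b": {"text": "Fact check: the moon landing was real", "kind": "fact_check"},
    "http://example.com/c": {"text": "Local team wins championship", "kind": "claim"},
}

# Hypothetical collected tweets, each linking to one article.
tweets = [
    {"id": 1, "user": "alice", "url": "http://example.com/a"},
    {"id": 2, "user": "bob", "url": "http://example.org/b"},
    {"id": 3, "user": "carol", "url": "http://example.com/a"},
    {"id": 4, "user": "dave", "url": "http://example.com/c"},
]

def tokenize(text):
    """Split text into a set of lowercase alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match_articles(query, corpus, top_n=10):
    """Rank articles by how many query terms they contain (toy relevance score)."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(doc["text"])), url) for url, doc in corpus.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [url for _, url in ranked[:top_n]]

def tweets_for(urls, all_tweets):
    """Select every tweet that links to one of the matched articles."""
    wanted = set(urls)
    return [t for t in all_tweets if t["url"] in wanted]

matched = match_articles("moon landing", articles)
selected = tweets_for(matched, tweets)
```

With this sample data the query matches one claim and one fact check, and `selected` holds the three tweets that link to them; the unrelated article and its tweet are left out.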
Q: What is the source of social media data?
A: At the moment we only collect data from Twitter.
Q: How do I add a Hoaxy story to my own website?
A: Click the “Embed” button at the bottom of the middle navigation menu. Then copy the code in the popup and paste it into the body of your site’s HTML code, or into the platform you are using as directed by that platform. Your site or platform will then display a widget that should look similar to the one in the popup.
Q: Can I download the results of a story on my own computer?
A: Yes! Click the “Export” button, found at the bottom of the middle navigation menu. This should download, or ask for your permission to download, a comma-separated values (CSV) file. Each row in the file represents a connection in the visualization and can be thought of as a tweet. The columns correspond to the features of the tweet, e.g., who posted it, who was mentioned, when the tweet was published, etc.
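Once downloaded, a file of this kind can be read with Python's standard `csv` module. The sketch below is only an example: the column names (`from_user`, `to_user`, `tweet_id`, `created_at`) are placeholders, so check the header row of your own export and adjust them. An in-memory sample stands in for the downloaded file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; the real export's column names may differ.
sample = """from_user,to_user,tweet_id,created_at
alice,bob,1,2018-01-01T10:00:00
bob,carol,2,2018-01-01T10:05:00
alice,carol,3,2018-01-02T09:00:00
"""

def load_edges(csv_file):
    """Read every row (one connection per row) into a list of dicts."""
    return list(csv.DictReader(csv_file))

def out_degree(rows, source_col="from_user"):
    """Count how many connections originate from each account."""
    return Counter(row[source_col] for row in rows)

rows = load_edges(io.StringIO(sample))
degrees = out_degree(rows)
```

To use a real export, replace `io.StringIO(sample)` with a file opened via `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.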
Q: Do you access any private conversations?
A: No, we only access public tweets.
Q: Do you provide an API to the data you collect?
A: Yes, check out the free Hoaxy API available on Mashape.
Q: Can I cite Hoaxy in my work?
A: Yes, if you use Hoaxy for your work then please cite the following articles:

[1] Chengcheng Shao, Pik-Mai Hui, Lei Wang, Xinwen Jiang, Alessandro Flammini, Filippo Menczer, Giovanni Luca Ciampaglia (2018). Anatomy of an online misinformation network. PLOS ONE, e0196087. https://doi.org/10.1371/journal.pone.0196087

[2] Chengcheng Shao, Giovanni Luca Ciampaglia, Onur Varol, Alessandro Flammini, and Filippo Menczer (2017). The spread of misinformation by social bots. Technical Report arXiv 1707.07592. https://arxiv.org/abs/1707.07592

[3] Chengcheng Shao, Giovanni Luca Ciampaglia, Alessandro Flammini, and Filippo Menczer (2016). Hoaxy: A Platform for Tracking Online Misinformation. In Proceedings of the 25th International Conference Companion on World Wide Web (WWW '16 Companion), pp. 745-750. http://dx.doi.org/10.1145/2872518.2890098

Q: What technology does Hoaxy use?
A: Hoaxy is written primarily in Python. On the back-end we use Apache Lucene (for full-text indexing and retrieval), Scrapy (for web crawling), Apache Tika (for metadata extraction), RSS (for feed aggregation), PostgreSQL (for data indexing and storage), and SQLAlchemy (for object-relational mapping). On the front-end we use JavaScript, Bootstrap, NV.D3 (for the chart), and Sigma-js (for the network). We collect data from Twitter using the Filter API. Top trending articles in the dashboard of the landing page are powered by the News API.
Q: Is Hoaxy open source?
A: Yes. This makes it possible for colleagues, fact checkers, and reporters around the world to deploy versions of the tool to map the spread of claims from their own sources, in their own countries and languages. The code is available in two repositories: a backend and a frontend. Please contact us to let us know if you are using our code.
Q: Why am I asked to log in using my Twitter account?
A: We ask you to log in so that we can retrieve search results from the Twitter API on your behalf. We also seek your permission to connect to the Twitter API if you want to refresh the bot scores in the visualization; this access is required to fetch the data used to recompute the scores on your behalf. We do not store your Twitter personal information, nor do we use any permissions or data to do anything beyond what is necessary to provide the Hoaxy service. More information can be found in the Botometer FAQ.
Q: Who are the Hoaxy developers?
A: Hoaxy is a joint project of the Indiana University Network Science Institute (IUNI) and the Center for Complex Networks and Systems Research (CNetS). Filippo Menczer, Alessandro Flammini, and Giovanni Luca Ciampaglia coordinate the project. Other past or current team members include Chengcheng Shao, Mihai Avram, Ben Serrette, Valentin Pentchev, Lei Wang, Gregory Maus, Liang Chen, Onur Varol, Clayton Davis, and Kaicheng Yang. We are members of the First Draft Academic Partner Network. The project is supported by a Knight Prototype grant from the Democracy Fund.
Q: How can I contact the Hoaxy team?
A: The best way to contact the team is by using the contact information found at the OSoMe website. You can also tweet us at @TruthyAtIndiana but we cannot promise to monitor Twitter at all times.