Veracity: The Most Important V in Big Data

One Glitch could lead to Financial Meltdown

Introduction

You may not know it, but there is a whole army of automated agents (bots) that are not launching DDoS attacks on Web sites or spreading malware. Instead, they are out to harvest information in order to make money for someone. They continually mine the Web for trusted sources of information on companies, and then automatically analyse that information for sentiment. At one time the information was fed to human analysts, but there is now too much of it for people to handle, so automated software agents are increasingly making decisions on the stock market.

So if good things are said about a company, an analyst will be advised to purchase its stock, and if bad things are said, to bail out of it. Sometimes, too, the information-gathering agents are programmed to trade the stock automatically. So we get to the fourth V of Big Data … veracity … and probably the most important one: the trustworthiness of the data.

And so it happened in 2016, when automated agents were primed to sell the UK Pound automatically. They picked up bad news related to Brexit, which led to a 10% drop in value. With the UK already feeling the strain on its currency (and dropping to the 6th strongest economy in the world), it is not good news for the country that bad news triggers even more bad news. This can lead to a downward spiral, where bad data propagates through the system.

Bad news from a trusted source … causes a fall in the value of currency … which causes bad news in reporting the fall … which causes a fall in the value of currency … and so on … and within hours a currency could have crashed. In electronics, this type of scenario is known as positive feedback, and we see it in audio systems, where a small signal gets amplified each time around the loop until the amplified signal ends up swamping the original.
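The loop described above can be sketched in a few lines of Python. This is only a toy model, not how any real trading system works: the function name, the idea of a single "loop gain", and the simple adding-up of percentage falls are all assumptions made for illustration. A gain above 1 means each round of bad news causes a bigger fall than the one before it (positive feedback), while a gain below 1 means the shock dies away.

```python
def simulate_feedback(initial_drop_pct, loop_gain, rounds):
    """Toy model of a news/price feedback loop (hypothetical, for illustration).

    initial_drop_pct: the first fall in value caused by the original bad news.
    loop_gain: how much larger (or smaller) each follow-on fall is than the last.
    Returns the cumulative fall, in percentage points, after each round.
    """
    cumulative = []
    drop = initial_drop_pct
    total = 0.0
    for _ in range(rounds):
        total += drop             # the fall is reported as news ...
        cumulative.append(total)
        drop *= loop_gain         # ... and the report triggers the next fall
    return cumulative


# Gain above 1: every trip around the loop makes things worse.
print(simulate_feedback(0.5, 1.5, 5))
# Gain below 1: the same shock settles down.
print(simulate_feedback(0.5, 0.5, 5))
```

With a gain of 1.5, a half-point fall grows to more than six points in five rounds, whereas with a gain of 0.5 the total never reaches one point.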

The drop happened in the Asian market, where the UK Pound fell at one point by 10% to US$1.1378, and traders struggled to see why it had dropped so much. Many now worry that automated systems could completely crash a nation’s currency, almost within hours. A nightmare scenario is that malicious agents could prime the news network with good or bad information, and cause chaos.

There has been speculation that the automated trading agents picked up on information coming from senior EU sources, such as the French president, who said that the UK … “would have to ‘suffer’ for its decision to leave”, and that the UK would get restricted access to the European single market, in order that it can control its borders. The word ‘suffer’ scores highly in sentiment analysis and, as it comes from a trusted source (the French president), it can be rated as having a high impact on trading.
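To illustrate how a strongly scored word from a trusted source might end up with a high impact rating, here is a minimal sketch. The word list, the scores, the source categories and the trust weights are all invented for this example, and real sentiment engines are far more sophisticated. Here negative numbers stand for bad news.

```python
import re

# Hypothetical sentiment lexicon: negative values are bad news.
LEXICON = {"suffer": -3.0, "restricted": -1.5, "crash": -3.0,
           "growth": 2.0, "strong": 1.5}

# Hypothetical trust weights for different kinds of source.
SOURCE_TRUST = {"head_of_state": 1.0, "national_newspaper": 0.75,
                "anonymous_blog": 0.25}


def impact_score(text, source_type):
    """Sum the word scores in the text and scale by how trusted the source is."""
    words = re.findall(r"[a-z']+", text.lower())
    sentiment = sum(LEXICON.get(word, 0.0) for word in words)
    return sentiment * SOURCE_TRUST.get(source_type, 0.1)


quote = "would have to suffer for its decision to leave"
print(impact_score(quote, "head_of_state"))   # same words, trusted source
print(impact_score(quote, "anonymous_blog"))  # same words, untrusted source
```

The same sentence gets four times the weight when it comes from the trusted source, which is exactly why veracity matters so much: the agent is trusting the label on the source as much as the words themselves.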

Tricking the bots …

If there’s one thing that motivates people, it is money, so anything that gives dealers an edge will often make them money. If they can pick up a rumor of a change in the operation of a company, they can either buy into the stock or bail out of it.

For good news (for them), such as a take-over, they buy in and cash out when it is announced; for bad news (such as a drop in sales), they bail out before the announcement and buy back when the stock is lower. These are the basic laws that have driven the stock exchange for decades. Unfortunately, in a world of Big Data and targeted spear phishing, we could be seeing the signs of a crack in the news feeds. With this, we have increased volumes of news fragments (V for Volume), along with problems related to the trustworthiness of sources (V for Veracity), and any noise in the system could have damaging effects.

Big Data and Automated Agents

In the past, dealers used respected sources, such as the Financial Times, to get their information, and the sources of take-over stories were checked. In an era of Big Data, however, the news feeds are often automated, and “noise” can easily get into the system, especially if the noise has been primed to cause a disruption. This has now been seen to affect the stock market, with a glitch just yesterday in Twitter’s share price … the rumor was that they were getting bought out … “buy Twitter shares asap” … was the cry!

With Big Data we get the four V’s of:

  • V(olume): the amount of information that is now gathered in real time is massive.
  • V(eracity): the trustworthiness of sources can be a problem, especially when rumors are planted to trigger trades.
  • V(ariety): the different forms of information, such as news items, social media, video feeds, and so on.
  • V(elocity): the speed of information on the Internet keeps increasing, and it is a major challenge to process the amount of relevant information in order to make sense of it.

With stock market manipulation, it is not too difficult to trick the systems by forcing information into the processing agents, or by sending targeted spear phishing emails to dealers. As we can see in Figure 1, crawling agents read fragments of data from blogs, social media, and Web sites. These are processed using Natural Language Processing (NLP) to make sense of them, and then clustered to bind related fragments together and gauge their significance. The clusters are then fed to the stock market analysts/dealers, who judge whether the event is significant for share prices. For example, an earthquake in Japan had a significant effect on the supply of electronic components to a range of customers, all of which can affect the stock price.

Figure 1: News fragment gathering
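The gather, process and cluster steps can be sketched roughly as follows. This is a deliberately crude stand-in for real NLP: the stop-word list, the keyword extraction, the Jaccard similarity measure, the 0.3 threshold and the sample headlines are all assumptions chosen for illustration, not a description of any production system.

```python
import re

# Hypothetical stop-word list; a real system would use a proper NLP toolkit.
STOPWORDS = {"the", "a", "an", "in", "of", "to", "and", "is", "for", "on"}


def keywords(fragment):
    """Very rough 'NLP' step: lower-case the text and keep the non-trivial words."""
    words = re.findall(r"[a-z]+", fragment.lower())
    return {word for word in words if word not in STOPWORDS}


def jaccard(a, b):
    """Overlap between two keyword sets: 0.0 is disjoint, 1.0 is identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_fragments(fragments, threshold=0.3):
    """Greedily bind each fragment to the first cluster it overlaps enough with."""
    clusters = []
    for fragment in fragments:
        kw = keywords(fragment)
        for cluster in clusters:
            if jaccard(kw, cluster["keywords"]) >= threshold:
                cluster["members"].append(fragment)
                cluster["keywords"] |= kw
                break
        else:
            clusters.append({"keywords": set(kw), "members": [fragment]})
    return clusters


fragments = [
    "Earthquake hits Japan electronics factories",
    "Japan earthquake disrupts electronics supply",
    "Retailer reports strong holiday sales",
]
for cluster in cluster_fragments(fragments):
    print(len(cluster["members"]), cluster["members"])
```

The two earthquake fragments are bound together into one cluster, and the size of a cluster gives a crude measure of how significant the event is. Note that nothing here checks whether any of the fragments are true.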

One problem is that a malicious person can feed the crawling agents with incorrect (or misleading) information, which the agents then pick up. Along with this, the scammer can plant the information on other sites in order to boost its credibility (Figure 2).

On the Internet, if enough sources say it is true, then it becomes a fact! For example, a story of a woman with a third breast implant received nearly 200,000 shares, but actually, it related to a three-breast prosthesis found in her luggage.

Figure 2: A rumor spreads!
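A small sketch shows why repetition is such a weak signal. It assumes, hypothetically, that we know which site each copy of a story was taken from (in practice this is the hard part), and the site names are made up. Counting the sites that carry the story gives a very different answer from counting the independent origins behind it.

```python
def root_origin(site, copied_from):
    """Follow the 'copied from' chain back to the site where the story started."""
    seen = set()
    while copied_from.get(site) is not None and site not in seen:
        seen.add(site)
        site = copied_from[site]
    return site


def naive_credibility(copied_from):
    """Count every site that carries the story."""
    return len(copied_from)


def independent_credibility(copied_from):
    """Count only the distinct places where the story originated."""
    return len({root_origin(site, copied_from) for site in copied_from})


# Hypothetical rumor: one planted story, repeated down a chain of sites.
copied_from = {
    "fake-news.example": None,               # the planted original
    "blog-a.example": "fake-news.example",
    "blog-b.example": "blog-a.example",
    "forum.example": "blog-b.example",
}
print(naive_credibility(copied_from))        # sites repeating the story
print(independent_credibility(copied_from))  # independent sources behind it
```

Four sites carry the story, but there is only one source behind all of them, so an agent that scores credibility by counting sites is easy to fool.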

Along with the risk of incorrect processing of news items, phishing emails can be used to trick analysts (Figure 3). In this case, we see that the URL is fake, but the content looks valid, as the scammer has used all the graphics and styles from the proper site.

Figure 3: The World is Flat! — note this is a fake page!

Twitter boost

Questions were asked about the spike in Twitter’s shares on Tuesday 14 July 2015 (Figure 4), which related to a false report of a buy-out. The stock surged by 8%, and ended the day up 2.6%. Something wasn’t quite right in the claims, and it has since been traced to a hoax, where the news item appeared to be from Bloomberg News but had an incorrect URL.

Basically, it was like a phishing email, where the email spoofs the content of a legitimate site. In this case, the page had been mocked up to look like the Bloomberg site, and tricked Internet agents into thinking the content could be trusted. Automated agents then started to push it onto other sites, and each one boosted its credibility.
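One simple defence is for an agent to check where a page really lives before trusting what it looks like. The sketch below compares the host name in a URL against a short allow-list. The list itself and the look-alike URLs are illustrative assumptions, and a real system would need much more than this (certificate checks, redirects, and so on).

```python
from urllib.parse import urlsplit

# Illustrative allow-list of domains the agent is prepared to trust.
TRUSTED_DOMAINS = {"bloomberg.com", "ft.com"}


def is_trusted_source(url):
    """True only if the URL's host is a trusted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain)
               for domain in TRUSTED_DOMAINS)


print(is_trusted_source("https://www.bloomberg.com/news/story"))
# Hypothetical look-alike addresses: the trusted name appears, but not as the real domain.
print(is_trusted_source("http://bloomberg.com.news-update.example/story"))
print(is_trusted_source("http://bloomberg-news.example/story"))
```

Only the first address passes. The other two contain the trusted name but belong to a different domain, which is the same trick that makes a phishing page look convincing to a person in a hurry.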

The spike mirrors other suspicious activities around Avon, Tower Group and Rocky Mountain Chocolate.

Figure 4

Avon, Tower Group, and Rocky Mountain Chocolate

The same pattern of spoofed news articles was seen with a spike in Avon Products Inc., where Nedko Nedev tricked the stock market with a fake news story of a take-over. It is also alleged that he did the same for Tower Group International Ltd. and Rocky Mountain Chocolate Factory Inc., purchasing shares in the companies and then leaking fake stories about takeovers.

On 14 May 2015, Avon saw its trading volume rise from over 21 million the day before to nearly 70 million (448% increase in volume), with a 20% surge in the price. The news item itself was full of typos and misinformation, and bore all the signs of a scam, but the rumor spread quickly, and few people had time to trace its source.

On 13 May 2014, Euroins Insurance Group announced that it was going to purchase Tower Group, which increased share prices by 32%, and Nedev earned over $23,368 on the share spike. Before this, on 28 Dec 2012, PST Capital announced a take-over of Rocky Mountain Chocolate, which saw shares leap 23%.

It must be said that there were official documents submitted for the takeovers, but they were unlikely to be real.

Conclusions

With so much information arriving on the Internet every day, and so much of its gathering automated, it is not too difficult for someone to create false rumors. With many looking to catch the latest news before everyone else, the demand for fast processing of news fragments will increase, and analysts will be looking at minutes, if not seconds, between an event happening and it appearing on their screen. With margins to be made by responding quickly, they may not have enough time to check the credibility of a story, and this could cause serious problems, as the stock market governs our modern economy and its failure could do widespread damage.

Do you know …

The Earth is flat!

It is, because it says it here … (please excuse my editing of a page here … it is just an illustration):