Financial market data is among the most valuable data available today. Analysed correctly, it can turn an organisation's economic fortunes around. Yahoo Finance is one of the few websites that provide free access to this data on stock and commodity prices. In this blog, we are going to implement a simple web crawler in Python to scrape the Yahoo Finance website. Applications of scraped Yahoo Finance data include forecasting stock prices, predicting market sentiment towards a stock, gaining an investing edge and cryptocurrency trading. The process of generating investment plans can also make good use of this data!
Before scraping the Yahoo Finance website, let us first understand more about Yahoo Finance in the next section.
What is Yahoo Finance?
Yahoo Finance is a business media platform from Yahoo that offers comprehensive coverage of the world of business and investment. It carries a plethora of business information such as financial news, stock quotes, press releases and financial reports. Whether you are an investor or are just looking for some business news, Yahoo Finance is the place to go. Its biggest plus is that it provides all of this information for free. Hence, by scraping Yahoo Finance data, you can collect valuable information and analyse trends in stocks and currencies. Moreover, you get real-time information about stock prices along with access to other financial investment and management tools.
Why Scrape Finance websites?
Financial data, if extracted and analysed in real time, can provide a wealth of information for investments, trading, research and sentiment analysis.
1. Stock trading
Online trading is the trading of stocks via an online platform. Online trading portals facilitate the trading of different financial instruments such as stocks, mutual funds and commodities. In online stock trading, owners of a stock meet buyers virtually and sell the stock to them. A sale happens only once a buyer and a seller have agreed on a price.
These prices are market dependent and can be collected by scraping Yahoo Finance. Stock trading organisations can use Yahoo Finance data to keep a record of changing stock prices and market trends. This analysis helps financial and investment companies to predict the market and buy or sell stocks for maximum profit.
2. Sentiment analysis of the market
Organisations can perform sentiment analysis over the blogs, news, tweets and social media posts in business and financial domains to analyse the market trend. Furthermore, scraping Yahoo finance will help them in collecting data for natural language processing algorithms to identify the sentiment of the market. Through this, one can track the emotion towards a particular product, stock, commodity or currency and make the right investment decision.
3. Equity research
Equity research refers to analysing a company’s financial data and deriving recommendations for buying and selling its stock. The main aim of equity research is to provide investors with financial analysis reports and recommendations on buying, holding, or selling a particular investment. Banks and financial investment organisations also often use equity research to serve their investment and sales & trading clients with timely, high-quality information and analysis.
4. Regulatory compliance
Business and financial investment jobs are high-risk jobs. A lot of investment decisions depend directly on government schemes and policies regarding trade. Hence, it is essential to keep track of government sites and other official forums to catch any policy changes related to trading. In particular, risk analysts should crawl news outlets and government sites for real-time updates on events and decisions that directly affect their business.
Approach for scraping Yahoo finance data
Yahoo Finance provides a plethora of information about the stock market and investment. Our primary goal is to fetch the data by scraping Yahoo Finance and store it on our own premises for later analysis. In this blog, we are going to extract data about cryptocurrencies, currencies, world indices, most-active stocks and commodities. These data points can also be scraped from search engine results, but in this blog we will keep the scope to scraping Yahoo Finance only.
We will write simple Python code that visits the website and collects all this data for us. The requests library downloads each page, and the Beautiful Soup library parses the HTML so that we can pull out the values we need!
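The fetch-then-parse flow described above can be sketched as two small functions. This is only a minimal sketch: the names `fetch_page` and `parse_page`, the `User-Agent` value and the ten-second timeout are our own assumptions and not part of the code used in the rest of this blog. The sample HTML at the bottom is made up so that the parsing step can be tried without touching the network.

```python
import requests
from bs4 import BeautifulSoup

# Assumed header value: some sites reject requests that lack a browser-like User-Agent
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}


def parse_page(html):
    """Turn raw HTML text into a BeautifulSoup tree using the built-in parser."""
    return BeautifulSoup(html, "html.parser")


def fetch_page(url, timeout=10):
    """Download a page and parse it; raises an exception on HTTP errors."""
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return parse_page(response.text)


# Made-up HTML used only to demonstrate the parsing step offline
sample = "<html><body><table><tr><td>Gold</td></tr></table></body></html>"
soup = parse_page(sample)
first_cell = soup.find("td").text
print(first_cell)
```

Keeping the download and the parsing apart makes it easy to test the extraction logic on saved pages before running it against the live site.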
Python implementation for scraping Yahoo finance data
We start by importing the required libraries: requests to download the pages, Beautiful Soup to parse the HTML, and pandas to arrange the collected data in the form of tables.
import csv
import requests
import pandas as pd
from bs4 import BeautifulSoup
Scraping Crypto Currencies
A cryptocurrency is a digital currency secured by cryptography. Cryptocurrencies are decentralised systems based on blockchain technology, a distributed ledger maintained by a network of computers. Because of this protection, these currencies are harder to counterfeit.
By now, cryptocurrencies have become a global phenomenon. With significant growth in recent years, investments in cryptocurrencies proved beneficial to a large number of investors.
crypto currencies from scraping yahoo finance
In the code below, we use the Yahoo Finance link for the cryptocurrencies page. The information about the cryptocurrencies is spread over multiple pages, so the code iterates through all the pages and pulls out the relevant information. We locate each value through the HTML tags and attributes present in the source code of the website. We just need to identify those tags and pass them in the attrs argument in the code!
# One list per column of the cryptocurrencies table
names = []
prices = []
changes = []
percentChanges = []
marketCaps = []
totalVolumes = []
circulatingSupplys = []

# Walk through ten result pages of 50 rows each.
# Assumption: "offset" counts rows, so page i starts at row i * 50.
for i in range(0, 10):
    CryptoCurrenciesUrl = "https://in.finance.yahoo.com/cryptocurrencies?offset=" + str(i * 50) + "&count=50"
    r = requests.get(CryptoCurrenciesUrl)
    data = r.text
    soup = BeautifulSoup(data, "html.parser")
    # Each table row is one cryptocurrency; its cells are identified by aria-label
    for listing in soup.find_all('tr', attrs={'class': 'SimpleDataTableRow'}):
        for name in listing.find_all('td', attrs={'aria-label': 'Name'}):
            names.append(name.text)
        for price in listing.find_all('td', attrs={'aria-label': 'Price (intraday)'}):
            prices.append(price.find('span').text)
        for change in listing.find_all('td', attrs={'aria-label': 'Change'}):
            changes.append(change.text)
        for percentChange in listing.find_all('td', attrs={'aria-label': '% change'}):
            percentChanges.append(percentChange.text)
        for marketCap in listing.find_all('td', attrs={'aria-label': 'Market cap'}):
            marketCaps.append(marketCap.text)
        for totalVolume in listing.find_all('td', attrs={'aria-label': 'Total volume all currencies (24 hrs)'}):
            totalVolumes.append(totalVolume.text)
        for circulatingSupply in listing.find_all('td', attrs={'aria-label': 'Circulating supply'}):
            circulatingSupplys.append(circulatingSupply.text)

# Arrange the collected columns in a table
pd.DataFrame({"Names": names, "Prices": prices, "Change": changes, "% Change": percentChanges, "Market Cap": marketCaps, "Total Volume (24 hrs)": totalVolumes, "Circulating Supply": circulatingSupplys})
data of cryptocurrencies
Also, you can find the snapshot of cryptocurrencies data collected after scraping yahoo finance below.
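Collecting each column in its own list works as long as every row has every cell. If a cell is missing, the lists drift out of step. A possible alternative, sketched below, builds one record per row and stores `None` for an absent cell. The helper name `extract_rows` and the sample HTML (including its values) are made up for illustration; only the `SimpleDataTableRow` class and the `aria-label` names are taken from the code above.

```python
from bs4 import BeautifulSoup

# Made-up sample mimicking the structure of the table; the second row lacks a "Change" cell
SAMPLE_HTML = """
<table><tbody>
<tr class="SimpleDataTableRow">
  <td aria-label="Name">Bitcoin USD</td>
  <td aria-label="Price (intraday)"><span>9,000.00</span></td>
  <td aria-label="Change">+10.00</td>
</tr>
<tr class="SimpleDataTableRow">
  <td aria-label="Name">Ethereum USD</td>
  <td aria-label="Price (intraday)"><span>200.00</span></td>
</tr>
</tbody></table>
"""


def extract_rows(html, labels):
    """Return one dict per table row, keyed by the aria-label of each cell."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr", attrs={"class": "SimpleDataTableRow"}):
        record = {}
        for label in labels:
            cell = tr.find("td", attrs={"aria-label": label})
            # Keep the row aligned even when a cell is absent
            record[label] = cell.get_text(strip=True) if cell is not None else None
        rows.append(record)
    return rows


rows = extract_rows(SAMPLE_HTML, ["Name", "Price (intraday)", "Change"])
for row in rows:
    print(row)
```

A list of such records can be passed straight to `pd.DataFrame(rows)` to get the same kind of table.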
Scraping Currencies
The following code will help you in scraping Yahoo Finance for currencies. The approach is more or less the same here. We have to identify the tags which hold the required information.
Currencies data from scraping yahoo finance
On this page, the table cells carry no descriptive attributes in the HTML code, so we pull the data through the data-reactid values present on the rows and cells.
# One list per column of the currencies table
names = []
prices = []
changes = []
percentChanges = []

url = "https://in.finance.yahoo.com/currencies"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# Rows carry data-reactid 40, 54, 68, ...; the cells of a row sit at fixed offsets from its id
for i in range(40, 404, 14):
    for listing in soup.find_all('tr', attrs={'data-reactid': str(i)}):
        for name in listing.find_all('td', attrs={'data-reactid': str(i + 3)}):
            names.append(name.text)
        for price in listing.find_all('td', attrs={'data-reactid': str(i + 4)}):
            prices.append(price.text)
        for change in listing.find_all('td', attrs={'data-reactid': str(i + 5)}):
            changes.append(change.text)
        for percentChange in listing.find_all('td', attrs={'data-reactid': str(i + 7)}):
            percentChanges.append(percentChange.text)

# Arrange the collected columns in a table
pd.DataFrame({"Names": names, "Prices": prices, "Change": changes, "% Change": percentChanges})
Also, you can find the snapshot of currencies data collected after scraping yahoo finance below.
Currency data after scraping yahoo finance
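To make the id arithmetic in the currencies code easier to follow, the sketch below only computes which data-reactid values the loop looks for; it does not touch the website. The function name `reactid_layout` and the dictionary keys are our own. The start, stop and step values and the cell offsets (+3, +4, +5, +7) are the ones used above. Such ids are generated by the page itself, so they may change whenever the page layout changes.

```python
def reactid_layout(first_row=40, stop=404, step=14):
    """List the data-reactid of every row and of the cells read from it."""
    layout = []
    for row_id in range(first_row, stop, step):
        layout.append({
            "row": str(row_id),
            "name": str(row_id + 3),
            "price": str(row_id + 4),
            "change": str(row_id + 5),
            "percent_change": str(row_id + 7),
        })
    return layout


layout = reactid_layout()
print(len(layout))   # number of rows the loop tries to read
print(layout[0])     # ids used for the first row
print(layout[-1])    # ids used for the last row
```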
Scraping World Indices
The MSCI World, for example, is a market-cap-weighted stock market index of 1,649 stocks from companies throughout the world.
Indices and their movements give an insight into the general attitude of the investing public towards companies of all sizes and industries.
World Indices after scraping yahoo finance
By scraping Yahoo Finance, we get access to attributes of the different world indices such as prices, percentage changes and market volume.
# One list per column of the world indices table
names = []
prices = []
changes = []
percentChanges = []

url = "https://in.finance.yahoo.com/world-indices"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# A single pass over the table body is enough; cells are identified by their column class
for row in soup.find_all('tbody'):
    for srow in row.find_all('tr'):
        for name in srow.find_all('td', attrs={'class': 'data-col1'}):
            names.append(name.text)
        for price in srow.find_all('td', attrs={'class': 'data-col2'}):
            prices.append(price.text)
        for change in srow.find_all('td', attrs={'class': 'data-col3'}):
            changes.append(change.text)
        for percentChange in srow.find_all('td', attrs={'class': 'data-col4'}):
            percentChanges.append(percentChange.text)

# Arrange the collected columns in a table
pd.DataFrame({"Names": names, "Prices": prices, "Change": changes, "% Change": percentChanges})
World indices data after scraping yahoo finance
Also, you can find the snapshot of world-indices data collected after scraping yahoo finance below.
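Since the goal is to keep the scraped data for later analysis, the column lists can also be written to a CSV file. Below is a minimal sketch using Python's built-in csv module. The function name `write_rows` is our own, and the two sample rows are placeholder values, not real quotes. An in-memory buffer stands in for a file here; with a real file, open it with `newline=""` as the csv documentation recommends.

```python
import csv
import io


def write_rows(handle, header, columns):
    """Write a header and then one CSV line per row, given one list per column."""
    writer = csv.writer(handle)
    writer.writerow(header)
    # zip(*columns) turns the per-column lists into per-row tuples
    writer.writerows(zip(*columns))


# Placeholder values for illustration only
names = ["S&P 500", "Nikkei 225"]
prices = ["3,000.00", "21,000.00"]

buffer = io.StringIO()
write_rows(buffer, ["Names", "Prices"], [names, prices])
csv_text = buffer.getvalue()
print(csv_text)
```

Values that contain commas, such as the prices here, are quoted automatically by the csv writer, so they survive a round trip through `csv.reader`.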
Scraping most-active stocks
The most active stocks are those with the highest trading volume on an exchange over a given period. Stocks usually trade at higher-than-average volume when significant new information about them reaches the market. This gives investors a strong impetus to buy or sell the stock in pursuit of high profits.
Most active stocks from scraping yahoo finance
The following code helps in scraping Yahoo Finance for the most-active stocks!
# One list per column of the most-active stocks table
names = []
prices = []
changes = []
percentChanges = []
marketCaps = []
avgVolumes = []
volumes = []

# Walk through eleven result pages of 100 rows each.
# Assumption: "offset" counts rows, so page i starts at row i * 100.
for i in range(0, 11):
    url = "https://in.finance.yahoo.com/most-active?offset=" + str(i * 100) + "&count=100"
    r = requests.get(url)
    data = r.text
    soup = BeautifulSoup(data, "html.parser")
    # Each table row is one stock; its cells are identified by aria-label
    for listing in soup.find_all('tr', attrs={'class': 'SimpleDataTableRow'}):
        for name in listing.find_all('td', attrs={'aria-label': 'Name'}):
            names.append(name.text)
        for price in listing.find_all('td', attrs={'aria-label': 'Price (intraday)'}):
            prices.append(price.find('span').text)
        for change in listing.find_all('td', attrs={'aria-label': 'Change'}):
            changes.append(change.text)
        for percentChange in listing.find_all('td', attrs={'aria-label': '% change'}):
            percentChanges.append(percentChange.text)
        for marketCap in listing.find_all('td', attrs={'aria-label': 'Market cap'}):
            marketCaps.append(marketCap.text)
        for avgVolume in listing.find_all('td', attrs={'aria-label': 'Avg vol (3-month)'}):
            avgVolumes.append(avgVolume.text)
        for volume in listing.find_all('td', attrs={'aria-label': 'Volume'}):
            volumes.append(volume.text)

# Arrange the collected columns in a table
pd.DataFrame({"Names": names, "Prices": prices, "Change": changes, "% Change": percentChanges, "Market Cap": marketCaps, "Average Volume": avgVolumes, "Volume": volumes})
Also, you can find the snapshot of most-active stocks data collected after scraping yahoo finance below.
Data of most active stocks collected after scraping yahoo finance
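The paginated URLs used for the most-active stocks can also be generated up front, which makes the paging logic easy to check before any request is sent. The sketch below assumes that the `offset` parameter counts rows, so each page starts `page_size` rows after the previous one; this is our reading of the URL and not something the site documents here. The function name `page_urls` is our own.

```python
from urllib.parse import urlencode


def page_urls(base_url, page_size, pages):
    """Build one URL per result page, assuming 'offset' is a row offset."""
    urls = []
    for page in range(pages):
        query = urlencode({"offset": page * page_size, "count": page_size})
        urls.append(base_url + "?" + query)
    return urls


urls = page_urls("https://in.finance.yahoo.com/most-active", 100, 3)
for url in urls:
    print(url)
```

The same function covers the cryptocurrencies page by passing its URL and a page size of 50.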
Scraping commodities
A commodity is a basic good used in trade that can be exchanged with other goods of the same type.
Commodities are most frequently used as inputs to other goods or services. Many traders trade in commodity markets solely to benefit from volatile price changes. These traders never intend to deliver the actual commodity or take delivery of it when the futures contract expires.
Commodities from scraping yahoo finance
The code below helps in scraping Yahoo Finance for data about different commodities like gold and silver.
# One list per column of the commodities table
names = []
prices = []
marketTimes = []
changes = []
percentChanges = []
totalVolumes = []
openInterests = []

url = "https://in.finance.yahoo.com/commodities"
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html.parser")

# A single pass over the table body is enough; cells are identified by their column class
for row in soup.find_all('tbody'):
    for srow in row.find_all('tr'):
        for name in srow.find_all('td', attrs={'class': 'data-col1'}):
            names.append(name.text)
        for price in srow.find_all('td', attrs={'class': 'data-col2'}):
            prices.append(price.text)
        for marketTime in srow.find_all('td', attrs={'class': 'data-col3'}):
            marketTimes.append(marketTime.text)
        for change in srow.find_all('td', attrs={'class': 'data-col4'}):
            changes.append(change.text)
        for percentChange in srow.find_all('td', attrs={'class': 'data-col5'}):
            percentChanges.append(percentChange.text)
        for volume in srow.find_all('td', attrs={'class': 'data-col6'}):
            totalVolumes.append(volume.text)
        for openInterest in srow.find_all('td', attrs={'class': 'data-col7'}):
            openInterests.append(openInterest.text)

# Arrange the collected columns in a table
pd.DataFrame({"Names": names, "Prices": prices, "Change": changes, "% Change": percentChanges, "Market Time": marketTimes, "Open Interest": openInterests, "Volume": totalVolumes})
Also, you can find the snapshot of commodities data collected after scraping yahoo finance below.
Commodities data after scraping yahoo finance
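Everything scraped so far is text, so values such as prices and percentage changes need converting before any numeric analysis. The helper below is a rough sketch of such a conversion. The name `to_number` is our own, and the formats it handles (thousands separators, a trailing percent sign, K/M/B/T suffixes, and "-" or "N/A" for missing values) are assumptions about how such tables are commonly formatted, so check them against the actual scraped data.

```python
def to_number(text):
    """Convert strings like '1,234.50', '+1.25%' or '2.5B' to a float, or None if empty."""
    multipliers = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
    # Drop surrounding whitespace, thousands separators and a trailing percent sign
    cleaned = text.strip().replace(",", "").rstrip("%")
    if cleaned in ("", "-", "N/A"):
        return None
    factor = 1.0
    suffix = cleaned[-1].upper()
    if suffix in multipliers:
        factor = multipliers[suffix]
        cleaned = cleaned[:-1]
    return float(cleaned) * factor


for raw in ["1,234.50", "+1.25%", "-0.5%", "2.5B", "N/A"]:
    print(raw, to_number(raw))
```

Applied to a pandas column, this could look like `df["Prices"].map(to_number)`.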
Datahut as your reliable scraping partner
There are a lot of tools that can help you scrape data yourself. However, if you need professional assistance with minimal technical know-how, Datahut can help you. We have a well-structured and transparent process for extracting data from the web in real time and delivering it in the desired format. We have helped enterprises across various industrial verticals. From assistance to the recruitment industry to retail solutions, Datahut has designed sophisticated solutions for most of these use-cases.
You should join the bandwagon of using data-scraping in your operations before it is too late. It will help you boost the performance of your organisation. Furthermore, it will help you derive insights that you might not know currently. This will enable informed decision-making in your business processes.
Summary
In this article, we had a look at how simple it can be to scrape Yahoo Finance for stock market data using Python. We collected data about cryptocurrencies, currencies, world indices, most-active stocks and commodities by scraping the Yahoo Finance website. Beautiful Soup is a simple and powerful parsing library in Python which made the task of scraping the Yahoo Finance website really simple. The data collected can be used by financial organisations to predict stock prices or market trends and to generate optimised investment plans. Apart from financial organisations, many industries across different verticals have leveraged the benefits of web scraping.
Start leveraging the benefits of web scraping for your organisation with Datahut as your web-scraping partner.