How To Download Income Statement From Yahoo Finance

Web Scraping Yahoo Finance

Pull financial statements and stock data from any publicly traded company

Randy Macaraeg

The code for this blog can be found on my GitHub.

In the business world, it's important to know the financial health of a company. Looking at the financial statements is a great way to get some insight into how well a company is doing.

In this blog, I'll go over pulling the financial statements from Yahoo Finance for any company in their database in Python. Because Yahoo Finance renders its pages with JavaScript, we use a combination of BeautifulSoup and Selenium.

Import the Libraries

Let's start with the necessary libraries:

    import pandas as pd
    from bs4 import BeautifulSoup
    import re
    from selenium import webdriver
    import chromedriver_binary  # adds the chromedriver binary to PATH
    import string

    # show floats without decimals
    pd.options.display.float_format = '{:.0f}'.format

Set up and run the driver

    is_link = 'https://finance.yahoo.com/quote/AAPL/financials?p=AAPL'
    driver = webdriver.Chrome()
    driver.get(is_link)
    html = driver.execute_script('return document.body.innerHTML;')
    soup = BeautifulSoup(html, 'lxml')

I prefer using Chrome as my web browser, but feel free to use whatever you're most comfortable with (Firefox, Safari, etc.). I'm also using Apple as my example company, but you can change the AAPL ticker in the link to another company's stock ticker to change the data.

The above code will open the page in a dummy browser and pull all of the data within the body of the website. Since Yahoo Finance operates on JavaScript, running the code through this method pulls all of the data and saves it as if it were a static website. This is important for pulling the stock price, as those are dynamic items on the webpage and can refresh/update at regular intervals.

To pull the stock price, we start by finding where it sits on the page. By hovering over the stock price we can use the Inspect tool to find the exact tag and class used for it.

Transferring this information into Python, the code would look like this:

    close_price = [entry.text for entry in soup.find_all('span', {'class':'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)'})]

This code searches for the 'span' tag inside all of the HTML code and looks for the class attribute that matches the one entered. Luckily this pulls only one number, which is the stock price at the close.
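To see how this class matching behaves without opening a browser, here is a minimal sketch that runs the same find_all call against a small hand-written HTML snippet. The snippet and the price in it are made up for illustration, and it assumes Python's built-in html.parser instead of lxml so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the rendered page; the values are made up.
sample_html = '''
<div>
  <span class="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)">1,234.56</span>
  <span class="Trsdu(0.3s) Fw(500)">+1.23 (+0.10%)</span>
</div>
'''
sample_soup = BeautifulSoup(sample_html, 'html.parser')

# A class string containing spaces is compared against the full class
# attribute, so only the first span matches.
close_price = [entry.text for entry in sample_soup.find_all(
    'span', {'class': 'Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)'})]

# find_all returns a list, so take the first hit and convert it to a float.
price = float(close_price[0].replace(',', ''))
print(close_price, price)
```

Because the match is on the exact class string, a change to the page's styling classes would make the list come back empty, so it is worth checking that close_price is not empty before indexing into it.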

Now that we have this we can move on to the financial statements.

Pulling financial statement data

Continuing on with the scraping, we search the page to find all of the div containers, and dive in a bit further to find the features we want to work with. I found that each row of the financial data is stored within a div container with a common class attribute of 'D(tbr)'. In the example below there are additional pieces of data in the class attribute, but as long as 'D(tbr)' is one of the classes, it will pull that data.

Each line on the financial statements is stored in a div
    features = soup.find_all('div', class_='D(tbr)')
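BeautifulSoup treats class as a multi-valued attribute, which is why searching for a single class name still finds rows that carry extra classes. The sketch below illustrates this on hand-written HTML; the extra class names and row labels are placeholders, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hand-written rows; 'fi-row' and 'other-class' are placeholder class names.
sample_html = '''
<div class="D(tbr) fi-row">Total Revenue</div>
<div class="D(tbr) other-class">Cost of Revenue</div>
<div class="D(tbc)">not a row</div>
'''
sample_soup = BeautifulSoup(sample_html, 'html.parser')

# class_ matches any element that has 'D(tbr)' among its classes.
features = sample_soup.find_all('div', class_='D(tbr)')
row_labels = [row.text for row in features]
print(row_labels)
```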

This will pull a lot of messy looking data which confused me for a while. After digging into the data pulled, I used a find function to see where each line of data was that I wanted. After a lot of trial and error, I was able to produce the code to pull only the financial data:

    headers = []
    temp_list = []
    label_list = []
    final = []
    index = 0

    # create headers
    for item in features[0].find_all('div', class_='D(ib)'):
        headers.append(item.text)

    # statement contents
    while index <= len(features) - 1:
        # filter for each line of the statement
        temp = features[index].find_all('div', class_='D(tbc)')
        for line in temp:
            # add each item to a temporary list
            temp_list.append(line.text)
        # temp_list added to final list
        final.append(temp_list)
        # clear temp_list
        temp_list = []
        index += 1

    df = pd.DataFrame(final[1:])
    df.columns = headers

The headers were separated from the rest of the data since they were causing some issues when putting all of the information together. After creating a list of the headers and multiple lists for the data within the financial statements, the code combines everything together to produce a copy!
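The same headers-then-rows idea can be written more compactly with list comprehensions. This is only a sketch of an alternative, run against hand-written HTML that mimics the assumed row and cell classes; the labels and numbers are made up, and the real page nests more markup than this.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hand-written stand-in for the statement table; all values are made up.
sample_html = '''
<div class="D(tbr)"><div class="D(ib)">Breakdown</div><div class="D(ib)">Year 2</div><div class="D(ib)">Year 1</div></div>
<div class="D(tbr)"><div class="D(tbc)">Total Revenue</div><div class="D(tbc)">1,000</div><div class="D(tbc)">900</div></div>
<div class="D(tbr)"><div class="D(tbc)">Net Income</div><div class="D(tbc)">200</div><div class="D(tbc)">-</div></div>
'''
sample_soup = BeautifulSoup(sample_html, 'html.parser')
features = sample_soup.find_all('div', class_='D(tbr)')

# The first row holds the column headers.
headers = [item.text for item in features[0].find_all('div', class_='D(ib)')]

# Every remaining row becomes one list of cell strings.
rows = [[cell.text for cell in row.find_all('div', class_='D(tbc)')]
        for row in features[1:]]

df = pd.DataFrame(rows, columns=headers)
print(df)
```

Passing columns=headers to the DataFrame constructor raises an error if a row has a different number of cells than the header, which makes layout changes on the page easier to notice.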

It looks just like Apple's income statement (minus formatting), but while it looks complete, the next issue is that all of the numbers on the page are saved as strings. If we want to do any future calculations on it, we'd have to change them all to integers:

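    # function to make all values numerical
    def convert_to_numeric(column):
        # remove thousands separators
        first_col = [i.replace(',', '') for i in column]
        # a lone '-' marks a missing value; blank it so it becomes NaN
        # (negative numbers keep their minus sign)
        second_col = ['' if i == '-' else i for i in first_col]
        final_col = pd.to_numeric(second_col)

        return final_col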

With a quick for loop, we can turn all of the number strings into integers:

    for column in headers[1:]:
        df[column] = convert_to_numeric(df[column])

    final_df = df.fillna('-')

This converts all number strings into actual numbers and replaces every NaN with a dash.
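One thing to keep in mind: filling NaNs with a dash puts strings back into otherwise numeric columns, so it may be worth keeping the numeric version around for calculations and using the dash-filled copy only for display. Below is a minimal sketch with made-up values; using errors='coerce' to turn the lone dash into NaN is my assumption of one reasonable approach, not the method used above.

```python
import pandas as pd

# Made-up sample column in the same string format as the scraped data.
raw = pd.Series(['1,000', '-', '-250'])

# Strip thousands separators; the lone '-' fails to parse and becomes NaN
# because of errors='coerce', while '-250' stays negative.
numeric = pd.to_numeric(raw.str.replace(',', '', regex=False), errors='coerce')

# Dash-filled copy for display only; this column now holds mixed types.
display = numeric.fillna('-')

print(numeric.sum())     # NaN is skipped when summing
print(display.tolist())
```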

The final product looks pretty good and can now be used for calculations!

Finalized Income Statement

While this pulls only the income statement from Apple, if you use Yahoo Finance's link for the balance sheet or statement of cash flows, the code should pull everything fine and put the statement together just like the above.
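As a rough sketch of switching between statements and tickers, the helper below builds the link from a ticker and a statement name. The function name statement_url is hypothetical, and the 'balance-sheet' and 'cash-flow' page names are my assumption based on the URL pattern of the income statement link used earlier; they may change if Yahoo Finance restructures its site.

```python
from urllib.parse import quote

# Assumed page names; only 'financials' appears in the link used above.
STATEMENT_PAGES = {
    'income_statement': 'financials',
    'balance_sheet': 'balance-sheet',
    'cash_flow': 'cash-flow',
}

def statement_url(ticker, statement='income_statement'):
    """Build the Yahoo Finance URL for one statement of one ticker."""
    symbol = quote(ticker.strip().upper())
    page = STATEMENT_PAGES[statement]
    return f'https://finance.yahoo.com/quote/{symbol}/{page}?p={symbol}'

for name in STATEMENT_PAGES:
    print(name, statement_url('AAPL', name))
```

The resulting string can be passed to driver.get() in place of is_link.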

I'm sure there's a lot of additional work that can be done to clean this up and I'd love to make it look more like an official statement in the near future. While continuing to work on this, I'd like to go over some deep dives into the data to find basic financial ratios and go over things like company valuation, investment analysis, and other trends/findings I can dig out from this data. Stay tuned!

I hope this code helps, and if you have any feedback or suggestions I'd love to hear it!

Posted by: savoyexcums.blogspot.com