finsets
  1. WRDS
  2. ratios
  • finsets
  • FRED
    • fred
    • fred_api
  • WRDS
    • wrds_api
    • crspm
    • crspd
    • compa
    • compa_ccm
    • compq
    • compq_ccm
    • ratios
    • ibes_ltg
    • bondret
    • mergent
    • linking
  • Papers
    • Dickerson, et al. (2023)
    • Gilchrist and Zakrajsek (2012)
    • Gürkaynak, et al. (2007)
    • Hassan, et al. (2019)
    • Hoberg, Phillips (2010, 2016)
    • Peters and Taylor (2016)

On this page

  • From the manual
  • NOTE:
  • list_all_vars
  • get_raw_data
  • process_raw_data
  • keep_only_ratios
  • Report an issue
  1. WRDS
  2. ratios

ratios

Retrieve and process data from WRDS Financial Ratios Suite.

From the manual

Data Source:

All accounting related data are obtained from Compustat Quarterly and Annual file. Pricing related data, such as Market Capitalization and Price, are obtained from both CRSP and Compustat, and we rely on CRSP as the primary data source for pricing data. Earnings related data are from IBES database.

Data Frequency:

The final outputs for both individual firm and industry-level aggregated value are at monthly frequency. In order to populate the data to monthly frequency, we carry forward the most recent quarterly or annual data item, whichever is most recently available at a given time stamp, to the subsequent months before the next filing data becomes available. In addition, in order to make sure that all data is publicly available at the monthly time stamp, we lag all observations by two months to avoid any look ahead bias.4

Outlier Control:

As ratio metrics often produce unintended extreme outliers, we impose two layers of outliers control before aggregating at the industry level. First, for all the monthly frequency firm level individual ratio results, we impose a winsorization at 1% level for extreme values, and truncate the outliers in the top and bottom percentile to be missing. Secondly, to arrive at the final ratio output, we enforce a 12 month moving average on the monthly frequency financial ratios. The second step serves two purpose: to further smooth the final output, and to fill in the truncated extreme months (from step 1) with firm-specific moving average. Note that the outlier controls are only applied to the ratios fed to the industry-level aggregation. Outputs for firm-level financial ratios are raw ratios without any truncation or smoothing. Hence researchers are advised to censor/smooth the raw ratios to get rid of the extreme outliers before conducting further analysis.


public_date: date on which the information was available to the public

adate: fiscal year to which the information pertains

qdate: fiscal quarter to which the information pertains


NOTE:

  • This dataset has three variables that are in levels (apart from the ID variables): be, mktcap, and price (i.e. book equity, market cap, and stock price).

  • Excludes financials.

  • ID variables are: permno, gvkey, ticker, cusip, public_date, adate, qdate, gsector, gicdesc, and all variables starting with “ffi”


from __future__ import annotations
from typing import List

import pandas as pd

import pandasmore as pdm
from finsets.wrds import wrds_api
PROVIDER = 'Wharton Research Data Services (WRDS)'
URL = 'https://wrds-www.wharton.upenn.edu/pages/get-data/financial-ratios-suite-wrds/financial-ratios-with-ibes-subscription/financial-ratios-firm-level-ibes/'
LIBRARY = 'wrdsapps_finratio_ibes'
TABLE = 'firm_ratio_ibes'
FREQ = 'M'
MIN_YEAR = 1970
MAX_YEAR = None
ENTITY_ID_IN_RAW_DSET = 'permno'
ENTITY_ID_IN_CLEAN_DSET = 'permno'
TIME_VAR_IN_RAW_DSET = 'public_date'
TIME_VAR_IN_CLEAN_DSET = f'{FREQ}date'

source

list_all_vars

 list_all_vars ()

Collects names of all available variables from WRDS f{LIBRARY}.{TABLE}

all_vars = list_all_vars()
all_vars.name.count()
100

source

get_raw_data

 get_raw_data (vars:List[str]=None, nrows:int=None, start_date:str=None,
               end_date:str=None)

Downloads vars from start_date to end_date from WRDS {LIBRARY}.{TABLE} library

Type Default Details
vars List[str] None If None or ’*’, downloads all variables
nrows int None Number of rows to download. If None, full dataset will be downloaded
start_date str None Start date in MM/DD/YYYY format
end_date str None End date in MM/DD/YYYY format
Returns pd.DataFrame
raw = get_raw_data(start_date='01/01/2021', nrows=1000)
raw.head(0)
gvkey permno adate qdate public_date capei be bm evm pe_op_basic ... ffi30_desc ffi30 ffi38_desc ffi38 ffi48_desc ffi48 ffi49_desc ffi49 ticker cusip

0 rows × 100 columns


source

process_raw_data

 process_raw_data (df:pandas.core.frame.DataFrame=None,
                   clean_kwargs:dict={})

Converts some variables to categorical and applies pandasmore.setup_panel to df

Type Default Details
df pd.DataFrame None Must contain permno and datadate columns
clean_kwargs dict {} Params to pass to pdm.setup_panel other than panel_ids, time_var, and freq
Returns pd.DataFrame
df_clean = process_raw_data(raw)
df_clean.head(1)
public_date dtdate gvkey adate qdate capei be bm evm pe_op_basic ... ffi30_desc ffi30 ffi38_desc ffi38 ffi48_desc ffi48 ffi49_desc ffi49 ticker cusip
permno Mdate
10145 2021-01 2021-01-31 2021-01-31 001300 2019-12-31 2020-09-30 28.329023 19548.0 0.16924 17.302839 24.574843 ... NaN NaN GOVT 37 NaN NaN NaN NaN HON 43851610

1 rows × 100 columns


source

keep_only_ratios

 keep_only_ratios (df:pandas.core.frame.DataFrame)
keep_only_ratios(df_clean).head(1)
capei bm evm pe_op_basic pe_op_dil pe_exi pe_inc ps pcf dpr ... rd_sale adv_sale staff_sale accrual ret_crsp ptb peg_trailing divyield peg_1yrforward peg_ltgforward
permno Mdate
10145 2021-01 28.329023 0.16924 17.302839 24.574843 24.856234 28.030129 28.030129 4.125068 22.696753 0.514853 ... 0.046821 0.0 0.0 -0.017443 -0.081476 7.012911 0.958785 0.019041 -2.024123 10.657844

1 rows × 72 columns

  • Report an issue