Bank Holding Companies (BHCs) in the US are required to file quarterly consolidated financial statements by the Federal Reserve System. In this article I use an LLM to examine the FR Y-9C reports for the six largest BHCs.
Background
The Federal Financial Institutions Examination Council (FFIEC) provides access to the National Information Center (NIC) website, and selecting the tab for Large Holding Companies, I see:
Showing the six largest BHCs ranked by Total Assets, in thousands of USD, so JPMorgan is $4.36 trillion.
(Not shown is the list of 149 bank holding companies with total assets > $10 billion).
Clicking on JPMorgan, shows the page with access to six distinct report types.
Selecting the drop down for FR Y-9C, shows the quarterly report history is available in both CSV and PDF format, starting 3/31/2020 and can easily be downloaded by clicking on the links.
Let’s Start
In contrast to my recent Examining Blog in which I examined a single report, today I look at the data across multiple reports for the same and competitor institutions.
I downloaded the 5 most recent quarters for JP Morgan and then did the same for the next five largest holding companies, so 6 times 5 or 30 reports.
The PDF format is 3.6 Mb and the CSV is 230Kb (1,700 rows).
Given the size and number of files, it makes little sense to upload all these into an LLM GUI and try to examine.
It is better to first extract structured data.
Inspecting a PDF and CSV and keeping it simple for this blog, I decided on the Income Statement and within this the specific measures for Trading Revenue.
Extracting Data
Before wide adoption of LLM Apps, I would have used Excel to upload all these CSV files and used Pivot Tables to examine the data. That is still a reasonable approach, but as this blog is about LLMs, which offer new capabilities, let’s explore that.
Given I want to extract specific data from the 30 CSV files, into one file, I need a simple script to do this, ideally python.
So using Claude Desktop with Sonnet 4 model selected, I uploaded a single FRY9C report and entered the below prompt:
A simple 4 lines, having given some thought to what I wanted to achieve and decided on the first concrete step.
I then ran this script and examined the output file produced.
A simple CSV file with 4 columns and 180 rows of data (6 measures, 5 quarters and 6 holding companies) and just 14 KB in size.
Far more efficient in token usage of LLM APIs than uploading all 30 CSVs.
(Note I could further improve the python script to directly download the FRY9C files from the NIC website, and then extract data of interest).
Now we have our data of interest, we can examine it.
Analysing Data
I uploaded my output.csv file and entered:
A bit verbose, but I have learnt to be specific-ish.
The process then kicked off as below:
On using the Dashboard, I thought it could benefit from some more commentary:
This then came back with a nice interactive dashboard, with both Single Institution Trends and Institution Comparison charts and commentary, allowing selection of specific measures
The quarterly figures here are cumulative calendar year to date, which explains the increase size of the bars for Q2, Q3 & Q4.
You can download this output.csv file and try the Interactive Dashboard yourself. (Note this sometimes results in a blank page, if it does please close the tab and try again, it is definitely not production software).
I think you will agree that it is a nice representation and analysis of the data.
The interactive nature of the dashboard is great and and the analysis commentary is helpful.
However, as we know, it is always good to test anything new.
In this case, it was simple enough for me to use Excel and a Pivot table to check the numbers shown in the dashboard.
(I could also have created Pivot Charts to mimic the ones in the Dashboard).
I am pleased to say that everything I tested matched between Excel and the Dashboard and the observations/analysis updated appropriately based on my selections.
Let’s also create a pdf summary.
First section shown below
(Given the cumulative calendar year to date figures, the QoQ change shown above is not meaningful, but the YoY change is. Given time and hindsight, I would have pre-adjusted the figures before loading to be the revenue for that quarter only and not cumulative).
Much like creating a cooking recipe, there was some trial and error in my work above, (which I have not shown) and a bit of frustration at the end.
The extracting data by generating python code worked well.
As did the Interactive Dashboard I could test myself.
However to share this with readers, the output.csv in my context window had to be made available from somewhere else.
Claude presented me with a number of options and I picked URL of the file in Google Drive, which led me up the garden path and a wasted/frustrating 30-40 minutes culminating in context window overflows…
This also included deleting my chat and starting a new one, with the same file and prompt, but getting a different Interactive Dashboard display, not as nice as my first and my trying to improve this and getting errors and then trying to revert to the prior version.
Finally I got back to the version I liked and asked Claude to add a File upload, which worked.
Which does remind me that in giving up control and asking Claude to do the data analysis, is not going to get the exact same result each time.
Unlike Excel, where I can design the Pivot Charts and Tables that I want.
Still the slick interactive Web App and PDF with analysis commentary that I can share on the web are great.
This allows for quick understanding of the Trading Revenue data.
In Summary
Using Claude + Python I extracted data from FR Y-9C reports for six major bank holding companies over five quarters.
Claude created an Interactive dashboard to analyse, compare and reveal trends in the data.
Excel Pivot tables are a mature technology, which I could have used to create similar analysis.
GenAI Apps and LLMs offer a new way of working with greater capability.
Leave a Reply