ChatGPT OpenAI Find Cash Flowing Real Estate Properties

In this video, I'm going to show you how OpenAI can answer questions on investment properties. We will walk through an example of using OpenAI to answer which properties have the highest cash-on-cash return.

ChatGPT allows us to quickly ask questions and receive human-like responses. I've found ChatGPT to be most useful when exploring new datasets. In this video, I'm going to show you how OpenAI can answer questions on investment properties. We will walk through an example of using OpenAI to answer which properties have the highest cash-on-cash return. My name is Ariel Herrera, your fellow data scientist with the Analytics Ariel channel, where we bridge the gap between real estate and technology. Please like this video and subscribe to help us reach a wider audience. All right, let's get started.

Whether you're a new or seasoned investor, one of the main questions that always comes up is: where should I be investing? One of the hottest areas to invest in over the last several years has been the Sun Belt, or smile states. They're called the smile states because they sit toward the bottom part of the US and look almost like a smile. In this case, let's imagine that we know we want to invest in one of these states, and we do a little more research as to which one. This brings us to listening to podcasts, reading books, and going into the BiggerPockets forums, and ultimately we land on Birmingham. We hear that Birmingham, Alabama is one of the ripe cities to invest in. Not only is it home to world-class medical research and a food and art scene, it is also an area for tourism. For me, besides this, I know nothing about Birmingham. I don't know which streets to invest in or which properties are the best for cash flow. But let's see if we can start asking ChatGPT some questions that can help get us there.

Right now, I'm on ChatGPT's page, where we can ask questions and receive human-like responses. Let's imagine that we're newbie investors, and we keep hearing this term, cash-on-cash return, as a metric to use. So let's ask ChatGPT: if we were to invest in Birmingham, Alabama, what is this cash-on-cash return metric?
So I'm going to type at the bottom: what is cash-on-cash return in real estate? ChatGPT uses the massive amount of data it was trained on to provide us an answer. It tells us cash-on-cash return is a financial metric used in real estate that measures the return on the actual cash invested. Basically, for the money that you put in, what is the ratio of the cash flow you get back to that cash? Great.

So at this point, we know we want to focus on a smile state, particularly Alabama. We listened to a podcast, we want to invest in Birmingham, and we're going to use cash-on-cash return as our metric. One of the next steps that I would take is going to a free site like Zillow, for example. However, Zillow is not meant for investors. It's really meant for homebuyers, say a first-time homebuyer. If this were the only tool I was using, I would start trying to scour some of these properties here. However, again, I have no idea where to start and which could be the best properties.

This is where we're going to start using OpenAI with Python to ask questions about a dataset that has property information and listings. With investment metrics provided by Coffee Clozers, I've been able to extract property listings. I'm able to obtain all single-family homes in Birmingham, Alabama that are current listings for sale. I have the information that's provided on sites like Zillow, including the listing price, year built, and square footage, but also information that is even more important for investors, like what the mortgage costs would be, the vacancy rate, and, toward the end, the monthly profit and cash-on-cash return. Although the spreadsheet is loaded with information, it's still a lot of data to sort through. So let's now jump into the Python notebook, where we will read in this data and be able to ask questions of it. Right now, I'm on Google Colab.
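For reference, the cash-on-cash return metric that the spreadsheet reports can be sketched as annual pre-tax cash flow divided by the total cash invested. This is a minimal illustration of the standard definition, not the exact calculation behind the dataset's column; the function name and the deal numbers are made up for the example.

```python
def cash_on_cash_return(annual_cash_flow: float, cash_invested: float) -> float:
    """Return cash-on-cash return as a percentage.

    annual_cash_flow: pre-tax cash flow for the year (rent minus all costs).
    cash_invested: actual cash put into the deal (down payment, closing, repairs).
    """
    if cash_invested <= 0:
        raise ValueError("cash_invested must be positive")
    return annual_cash_flow / cash_invested * 100


# Hypothetical deal: $300 of monthly profit on $40,000 of cash invested.
monthly_profit = 300.0
cash_invested = 40_000.0

coc = cash_on_cash_return(monthly_profit * 12, cash_invested)
print(f"Cash-on-cash return: {coc:.1f}%")
```

Because the denominator is only the cash you put in, not the full purchase price, the same property can show very different cash-on-cash returns under different financing.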
Google Colab is a free notebook environment that you can use to run Python code without having it installed on your machine. What I'm going to do is walk you through how to use this code, and we're going to ask questions using OpenAI. If you use the link below, you can go to File and save a copy in Drive if you'd like to replicate and work with the notebook that I already have.

The first step we're going to take is installing LangChain and OpenAI. This will allow us to ask questions of our dataset. You're going to need an OpenAI API key, which is free to obtain but does have a cost per API call that you make. Use the link below if you'd like to sign up for OpenAI. The next step is importing the packages that we're going to use. I'm running that cell by using Shift+Enter, or you could also press the play button on the left-hand side.

Next, I want to read in my OpenAI API key as a variable. I'm using getpass, which allows me to enter my OpenAI API key while still masking it. Once it's entered, I can see this has completed, and now my key is set to a variable called OPENAI_API_KEY, which we'll reference later.

The next step we're going to take is uploading our file. I'm going to run the cell, which will allow me to upload the file that I have of Birmingham properties, so I'm going to select that here. I've already filtered this down based on certain criteria of what I'm looking for: properties that are within a certain price range, and only single-family homes. I can preview my list by clicking play, and I'm reading in the data that came from the upload using pandas.read_csv. In total, I have 81 rows with 33 columns. This includes property information like the full address, as well as house type, year built, lot size, and description, which we'll detail in a few.
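The read-in step can be sketched as follows. This is a minimal illustration that uses an in-memory stand-in for the uploaded file; the column names (`street`, `city`, `state`, `zip`, `price`, `cash_on_cash`) and the two rows are assumptions made for the example, not the actual headers of the 33-column spreadsheet.

```python
import io

import pandas as pd

# Stand-in for the uploaded CSV. The real file has 81 rows and 33 columns;
# these columns and values are hypothetical.
sample_csv = io.StringIO(
    "street,city,state,zip,price,cash_on_cash\n"
    "1 Example St,Birmingham,AL,35201,150000,7.5\n"
    "2 Example Ave,Birmingham,AL,35202,210000,3.2\n"
)

# Read zip codes as strings so they are not treated as numbers.
df = pd.read_csv(sample_csv, dtype={"zip": str})

print(df.shape)   # (rows, columns)
print(df.head())  # preview the first few listings
```

In a notebook you would pass the uploaded file's name to `pd.read_csv` instead of the `io.StringIO` object.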
And then there are some metrics that are more relevant to those who are looking at this as an investment property, including cash-on-cash return. So let's dive in more and start to add some features. Right now we have our address separated by street, city, state, and zip code, so we can quickly apply a lambda to concatenate these columns. Press play, and now if we go all the way to the right, we can see that we have the full address here, which we'll reference in a few.

Next, we can use df.describe to learn a little more about our dataset. df.describe gives us a little more insight into our data. If we look at price, we can see that there are 81 rows that have price filled in, which makes sense. The mean price is almost 250k, with the lowest being 104k and the highest being 528,500, almost 530k. If we had, say, million-dollar homes that fell into this list, we might want to filter those out.

We can also visualize our data to understand it a little more. If I press play here, we can look at a histogram of cash-on-cash return. It looks like there are some properties that would not really do well as investment properties, but others that have a higher cash-on-cash return. So let's see if we can ask some more detailed questions about our dataset with OpenAI. Going to the docs, you can reference LangChain to learn a little more about their agent toolkit.
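The feature and summary steps described here can be sketched as below. It is a rough illustration on made-up rows with assumed column names (`street`, `city`, `state`, `zip`, `price`, `cash_on_cash`), and it prints bucket counts as a text stand-in for the histogram rather than drawing a chart.

```python
import pandas as pd

# Hypothetical listings; the real dataset has 81 rows.
df = pd.DataFrame({
    "street": ["1 Example St", "2 Example Ave", "3 Example Rd"],
    "city": ["Birmingham"] * 3,
    "state": ["AL"] * 3,
    "zip": ["35201", "35202", "35203"],
    "price": [150000, 210000, 320000],
    "cash_on_cash": [7.5, 3.2, -1.0],
})

# Concatenate the address parts row by row with a lambda.
df["full_address"] = df.apply(
    lambda row: f"{row['street']}, {row['city']}, {row['state']} {row['zip']}",
    axis=1,
)

# Summary statistics (count, mean, min, max, quartiles) for price.
print(df["price"].describe())

# Text stand-in for the histogram: listings per cash-on-cash bucket.
buckets = pd.cut(df["cash_on_cash"], bins=[-5, 0, 5, 10])
print(buckets.value_counts().sort_index())
```

In a notebook, `df["cash_on_cash"].hist()` would draw the chart itself, provided matplotlib is available.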

For our pandas DataFrame, we will be using the create_pandas_dataframe_agent function in order to start asking questions. To create our agent, I'm passing in my OpenAI API key, and then I'm also passing in my DataFrame here. Once I press play, the agent has access to our dataset, so I can ask questions, the first one being: which address has the highest cash-on-cash return? Using agent.run, I can ask this question as a string.

If I press play, we are now entering a new AgentExecutor chain. What's happening here is that the agent is going to query our data using pandas. So let's see what we have here. Thought: I need to find the address with the highest cash on cash. For the action, we see that it's filtering on the cash-on-cash column to look for the max. The observation is that the max value is 18.19. Now, the agent is smart enough to recognize that we asked for the address specifically, not what the highest cash-on-cash value in the dataset was. So its next thought is: I need to find the address associated with this cash on cash. It then uses pandas to filter again on that column, using the max value that you see here, takes the address column, and locates the first value, which is 106 Avenue B, Birmingham, Alabama 35214. Then it says "I now know the final answer" and provides us a final answer in the finished chain, which is a more intuitive way of responding to this. We now know the address with the highest cash on cash is the address we just stated.

Now, I know this is correct because I already sorted this dataset from highest cash on cash to lowest. So if I index to view that first row, I can look in a little more detail at this property. I have the full address, and I can see the description says the property has two houses on the same lot. Whoa, you can live in one while renting the other, which sounds like a house hack, or use one of them as an in-law suite.
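The two pandas steps in the agent's trace can be sketched as follows: take the max of the cash-on-cash column, then filter on that value and read off the address. The column names `full_address` and `cash_on_cash` are assumptions, and only the first row (the address and the 18.19 value) comes from the walkthrough; the other rows are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "full_address": [
        "106 Avenue B, Birmingham, AL 35214",   # top listing from the walkthrough
        "1 Example St, Birmingham, AL 35201",   # hypothetical row
        "2 Example Ave, Birmingham, AL 35202",  # hypothetical row
    ],
    "cash_on_cash": [18.19, 9.4, 2.1],
})

# Step 1: find the highest cash-on-cash value.
max_coc = df["cash_on_cash"].max()

# Step 2: filter on that value and take the first matching address.
best_address = df.loc[df["cash_on_cash"] == max_coc, "full_address"].iloc[0]

# Equivalent one-step lookup using the row label of the maximum.
same_address = df.loc[df["cash_on_cash"].idxmax(), "full_address"]

print(max_coc)
print(best_address)
```

Running the same lines yourself is a cheap way to check an agent's answer without spending another API call.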
The description continues: this property has many possibilities, come see it today, and it will be sold as is. This makes me think this could be a great opportunity to purchase a property with house hacking as a strategy. At the same time, being sold as is does make me a little skeptical, since there could be some major fixes. So I would want to dive deeper, possibly by going with an agent to see the property.

Now, what if this property ends up not working out? That's okay, but we should have a better understanding of what our dataset looks like: how many properties actually have a positive cash flow? In order to ask how many addresses have a positive cash flow, we may want to tweak this a bit to say how many have a positive monthly profit, so that it maps to the column name. Let's press play and see how the agent responds.

I had to stop the agent because it just kept going. It seems like it didn't recognize monthly profit as one of my columns. But interestingly enough, it understood what the calculation would be based on my columns, so it tried to calculate monthly profit from scratch, without recognizing that I already had the column there. It tried to calculate monthly profit by adding rent and revenue and then subtracting the other columns that were related to costs. It's pretty mind-blowing that it was able to identify this to begin with. It then started to convert all of my columns to numeric, even though they were already int and float values. It then imported pandas, started to calculate monthly profit, and kind of got stuck as it continued converting columns to numeric. Since I'm being very conscious of how many API calls are being made, I decided to just pause this process. But as a learning lesson, we can see that OpenAI is smart enough to detect how to calculate real estate metrics from scratch. Now, let's ask ChatGPT one more question.
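For comparison, the positive-monthly-profit question can be answered directly in pandas. The sketch below also rebuilds monthly profit from rent minus costs, roughly the calculation the agent attempted; every column name and value here is a hypothetical stand-in, and in the actual spreadsheet the monthly profit column already exists, so only the final filter would be needed.

```python
import pandas as pd

# Hypothetical listings with made-up rent and cost columns.
df = pd.DataFrame({
    "full_address": ["1 Example St", "2 Example Ave", "3 Example Rd", "4 Example Ct"],
    "rent": [1500, 1200, 1000, 1800],
    "mortgage": [900, 950, 700, 1000],
    "taxes": [150, 150, 100, 200],
    "insurance": [100, 120, 200, 150],
})

# Monthly profit = income minus the sum of the cost columns.
cost_columns = ["mortgage", "taxes", "insurance"]
df["monthly_profit"] = df["rent"] - df[cost_columns].sum(axis=1)

# Keep only listings whose monthly profit is strictly positive.
positive = df[df["monthly_profit"] > 0]

print(len(positive))
print(positive[["full_address", "monthly_profit"]])
```

Wording the question with the exact column name, or passing the column name in the prompt, is one way to steer an agent toward the single filter step instead of a recalculation.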
So we had the top property, the one with the highest cash-on-cash return, and we saw the description on the property. Now, what if we want to focus on properties that are marketed to investors? Maybe it's a recent flip, perhaps. Can ChatGPT parse information from the description column? For our final question to OpenAI, we can ask how many addresses have a description that contains investment property, or in this case, let's put investment opportunity. Let's click run.

Now we're entering a new AgentExecutor chain. The agent states as a thought: I need to find the descriptions that contain the phrase investment opportunity. It starts to filter the DataFrame by going to the description column and using str.contains for investment opportunity, then it takes address and counts it, for a final answer of one. There's a single property that has investment opportunity in its description.

Let's copy this and actually pull that property out so we can make sure that this is true. Instead of taking the count, we're going to remove that last part so that we just filter the DataFrame, and we see a description here. So let's open up a new cell and use df.loc. This is row 15, as we can see in the index here, and 15 will show all columns. Let's try to grab that description so we can look at it fully, so I'll type in description. And let's read this here: this property is currently tenant occupied and has been well maintained, making it a perfect investment opportunity.

So OpenAI was correct. It was able to interpret our question, it provided us a total count of addresses with this description, and we were able to verify it by copying the pandas code that it provided to us. I want to leave you with another thought. What if we wanted to, say, flip properties? What kind of terms would we want to look for in the description?
Maybe we want to see terms like fix and flip, investment opportunity, TLC, terms like those. How might we train a model to identify which properties could be used for a fix and flip or need rehab versus those that don't? Well, in the next several videos, I'm going to show you a project that I previously worked on using NLP to identify which properties require rehab or not, based on the descriptions from property listings. Thanks so much for watching, and I hope to catch you in the next episode.
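As a footnote to the description search in the walkthrough, the phrase filter and a simple multi-term variant can be sketched in pandas. This is a keyword baseline only, not the NLP model mentioned for the next videos. The column name `description` and the term list are assumptions, and only the second description is taken from the walkthrough; the others are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "description": [
        "Charming home with a new roof and updated kitchen.",  # hypothetical
        "This property is currently tenant occupied and has been well "
        "maintained making it a perfect investment opportunity.",
        "Needs some TLC, great fix and flip candidate.",        # hypothetical
        None,                                                   # missing description
    ],
})

# Single phrase, case-insensitive; missing descriptions count as no match.
has_phrase = df["description"].str.contains(
    "investment opportunity", case=False, na=False, regex=False
)
print(int(has_phrase.sum()))

# Several investor-oriented terms at once, matched on word boundaries.
terms_pattern = r"\b(?:fix and flip|investment opportunity|tlc)\b"
has_any_term = df["description"].str.contains(terms_pattern, case=False, na=False)
print(df.loc[has_any_term, "description"])
```

Dropping the `.sum()` and indexing with the boolean mask, as in the last line, is the same move used in the video to go from a count to the matching rows.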
