Implications of “Regression Fishing” over Cogent Modeling.

One of the most recurrent questions in quantitative research is how to assess and rank the relevance of the variables included in a multiple regression model. This kind of uncertainty arises most often when researchers prioritize data mining over well-thought-out theory. Recently, a contact of mine on social media posed the following question: “Does anyone know of a way of demonstrating the importance of a variable in a regression model, apart from using the standard regression coefficient? Has anyone had any experience in using Johnson’s epsilon or alternatives to solve this issue? Any help would be greatly appreciated, thank you in advance for your help”. In this post, I would like to share the answer I offered him, stressing that what he wanted was to justify the inclusion of a given variable beyond its weighted effect on the dependent variable. In the context of science and research, I pointed to the need for sound modeling over the mere “p-hacking” and questionable practice of “regression fishing”.

What I think his concern was really about is modeling in general. If I am right, then the way researchers should tackle such a challenge is by establishing the relative relevance of a regressor beyond the coefficients’ absolute values, which requires a combination of intuition and just a bit of data mining. Thus, I advised my LinkedIn contact to gauge the appropriateness of the variables by examining them on their own terms. The easiest way to proceed is to scrutinize the variables independently first and then jointly. Therefore, assessing the soundness of each variable is the first procedure I suggested he go through.

In other words, for each of the variables I recommended checking the following:

First, data availability and the degree of measurement error;

Second, make sure every variable is consistent with your thinking –your theory;

Third, check the core assumptions;

Fourth, try to include rival models that explain the same outcome your model explains.

Now, for whatever reason, all the variables seemed appropriate to the researcher. He had checked the standards for including variables, and everything looked good. In addition, he believed his model was sound and cogent with respect to the theory he was surveying at the moment. So, I suggested raising the bar for refining the model by analyzing the variables in the context of the model. Here is where the second step begins: a so-called post-mortem analysis.

Post-mortem analysis meant that, after running as many regressions as he could, we would scrutinize the variables for specification errors, measurement errors, or both. Given that specification errors were present in the model, I suggested a test of nested hypotheses, which amounts to asking whether the model omitted a relevant variable (misspecification error) or added an irrelevant one (overfitting error). In this case, the modeling error was the latter.

The bottom line, in this case, was that regardless of the test the researcher decided to run, the critical issue will always be to track and analyze the nuances in the error terms of the competing models.

How I recognized heteroscedasticity by running a flawed regression.

In a previous post, I covered how heteroscedasticity “happened” to me. The anecdote I mentioned mostly pertains to time series data. Given the purpose of the research I was developing back then, change over time played a key role in the variables I analyzed. The fact that the rate of change manifested over time made my post limited to heteroscedasticity in time series analysis. However, we all know heteroscedasticity is also present in cross-sectional data. So, I decided to write something about it, not only because I did not include cross-sectional data, but also because I believe I finally understood what heteroscedasticity was about when I identified it in cross-sectional data. In this post, I will try to depict heteroscedasticity, literally, so that we can share some opinions about it here.

As I mentioned before, my research project at the time was not very sophisticated. I had said that I aimed at identifying the effects of the Great Recession on the Massachusetts economy. So, one of the obvious comparisons was to match U.S. states by employment levels. I use employment levels as an example given that employment by itself creates many econometric troubles, heteroscedasticity being one of them.

The place to start looking for data was the U.S. Bureau of Labor Statistics, which is a nice place to find high-quality economic and employment data. I downloaded job-level statistics for all fifty states. Here in this post, I am going to restrict the number of states to the first seventeen in alphabetical order in the data set below. At first glance, the reader should notice that the variance in the alphabetical array looks close to random. Perhaps, if the researcher has no other information -as I often do- about the states listed in the data set, she may conclude that there could be an association between the alphabetical order of states and their level of employment.

Heteroscedasticity 1

I could take any other variable (check these data sources on the U.S. housing market), set it alongside employment level, and regress on it to explain the effect of the Great Recession on employment levels, or vice versa. I could also fit coefficients for the number of patents per employment level by state, or whatever I could imagine. However, my estimates will always be misleading because of heteroscedasticity. Well, I am going to pick a variable at random. Today, I happen to think that there is a strong correlation between a household’s pounds of meat eaten per month and the level of employment. Please do not take me wrong; I believe that just for today. I have to caution the reader: I may change my mind after I am done with the example. So, please allow me to assume such a relation does exist.

Thus, if you look at the table below, you will find it interesting that employment levels are strongly correlated with the number of pounds of meat eaten per month by households.

Heteroscedasticity 2

Okay, it is clear that when we array the data set in alphabetical order, the correlation between employment level and households’ pounds of meat eaten per month is not as clear as I would like it to be. Then, let me re-array the data set below by employment level, from the lowest to the highest value. When I sort the data by employment level, the correlation becomes self-evident. The reader can now see that employment drives households’ pounds of meat eaten per month up. Thus, the higher the employment level, the greater the number of pounds of meat consumed per month. For those of us who appreciate protein -with all due respect for vegans and vegetarians- it makes sense that when people have access to employment, they also have access to better food and protein, right?

Heteroscedasticity 3

In this case, given that I have a small data set, I can re-array the columns and visually identify the correlation. If you look at the table above, you will see how both variables grow together. It is possible to see the trend clearly, even without a graph.

But let us now be a bit more rigorous. When I regressed employment levels on households’ pounds of meat eaten per month, I got the following results:

Heteroscedasticity 4

After running the regression (Ordinary Least Squares), I found that there is indeed a small, statistically significant association between meat consumption and employment. The regression R-squared is so high (.99) that it becomes suspicious. And, to be honest, there are in fact reasons for the R-squared to be suspicious. All I have done is trick the reader with fake data on meat consumption. The real data behind meat consumption used in the regression is the corresponding state population. The actual effect on the variance of employment levels stems from the fact that states vary in population size. In other words, it is clear that the scale of the states affects the variance of the level of employment. So, if I do not remove the size effect from the data, heteroscedasticity will taint every single regression I could make when comparing different states, cities, households, firms, companies, schools, universities, towns, regions, and so on and so forth. What this example means is that if the researcher does not test for heteroscedasticity, as well as for the other core assumptions, the estimates and their standard errors cannot be trusted.

Heteroscedasticity 5

For some smart people, this is self-explanatory. For others like me, it takes a bit of time before we can grasp the real concept of the variance of the error term. Heteroscedasticity-related mistakes occur most of the time because social scientists look directly at the relation among variables. Regardless of the research topic, we tend to forget to factor in how population size affects the subject of our analysis. So, we tend to believe that it is enough to find the coefficient of the relation between, for instance, milk intake in children and household income, without considering the size effect. A social scientist surveying such a relation would simply regress the number of liters of milk drunk by the household on family income.

Here is the story of how I met heteroscedasticity.

Sometimes it is good to learn about issues such as heteroscedasticity by empirically identifying them. Here is how I detected that heteroscedasticity was present in a time series analysis. I started working on a research project intended to measure how the Massachusetts economy had recovered from the Great Recession of 2009. It was not sophisticated research, nor did its scope go further than describing the way the Massachusetts economy had reallocated resources after the crisis. I knew at the time that descriptive statistics would suffice for my research objectives. So, I picked a bunch of metrics that I thought would mostly depict downward-sloping lines. I remember having chosen Gross Domestic Product by industry. So, I started plotting data in charts and graphs. Then I turned to municipalities. I gathered some data on employment levels, cross-sectional and time series. Once I was done with the exploratory phase of the research, I started to see strange patterns in the graphs. Everything went up and up, even after a recession. Apparently, it did not make sense at all, and I had to research the reason behind upward slopes in a time of economic distress.

It turned out heteroscedasticity was the phenomenon bumping up the lines. I said, it is nice to meet you, Miss, but who the heck are you? Not knowing heteroscedasticity is almost the same thing as ignoring lurking or confounding variables in your regression model. The difference stems from the fact that heteroscedasticity aggregates lurking variables and hides them within the model’s error term. In descriptive statistics of time series, heteroscedasticity manifests as a portion of the area underneath the line, which makes time series lines show a false rate of change. It looks as if the lines had been artificially inflated. Obviously, this is clearest when the measure tracks currency. We all know that nominal figures grow over time as the currency’s value depreciates. Therefore, we all adjust for inflation, right? Although adjusting for inflation was an easy task, the lines kept on showing upward trends. Something was definitely going on, I thought at the moment.

By Catherine De Las Salas

On the other hand, measures like employment levels were also trending upwards. Even though employment is an economic measure, I was not about to confound it with inflation. Perhaps there might be a theory in which employment depreciates over time the way currency does, but I know it behaves differently from price inflation. So, after doing my research, I found that it was population growth that bolstered employment growth after the crisis. Does that count as real job growth? No, it does not. Then, how should I measure such a distorted effect? Once again, heteroscedasticity held the answer.

What is technically heteroscedasticity?

Heteroscedasticity is a data defect that thesis advisors use to make you work harder. No, seriously: what is heteroscedasticity? Technically, heteroscedasticity means that the variance of the model’s error term is not constant across observations, and it often shows up as a correlation between the squared errors and one of the independent variables. In other words, it is an effect caused, most of the time, by the nature of the data, and data collected over time frequently suffer from it. In time series analysis, a related violation is what econometricians call a non-stationary process; hence, one of the main assumptions in linear regression analysis is that the data come from a stationary stochastic process.

By Catherine De Las Salas

What makes heteroscedasticity a problem?

Heteroscedasticity undermines inference in regression analysis: the estimated coefficients may remain unbiased, but their standard errors are wrong, so hypothesis tests and confidence intervals mislead. The data collection technique can generate heteroscedasticity, outliers can trigger it, incorrect data transformations can create it, and skewness in the distribution of the data can produce it.

The first test I use for heteroscedasticity in time series analysis is the graphical method. Yes, it is an informal method, but it gives researchers an idea of what transformation to apply to the data. Finally, if you want to estimate heteroscedasticity with a formal procedure, here is my advice.

Although I mostly use either the White test or the Park test when testing for heteroscedasticity, if you must use Breusch-Pagan for whatever reason, here is what you need to do. The goal in the Breusch-Pagan test is to estimate one-half of the Explained Sum of Squares (ESS) of an auxiliary regression, a statistic that approximately follows a Chi-square distribution. You will have to build an additional regression model based on the model in which you suspect heteroscedasticity is present. The first step is to obtain the residuals of your model through OLS. Then, estimate a rough statistic of their variance by squaring the residuals, summing them, and dividing by the number of observations. Once you have the approximate variance of the residuals, create a new variable by dividing each squared residual by that estimated variance. Let us call this new variable p. Now, regress p on the independent variables of your original model. Obtain the Explained Sum of Squares and divide it by 2. Then compare your ½ESS statistic with the critical values in the Chi-square table.

Let me know if you need help with getting rid of heteroscedasticity!

The Current Need for Mixed Methods in Economics.

Economists and policy analysts continue to wonder what is currently going on in the U.S. economy. Most of the uncertainty stems both from the anemic pace of economic growth and from fears of a new recession. Regarding economic growth, analysts point to sluggish changes in productivity, while fears of a new recession derive from global markets (i.e., Brexit). Unlike fears of a global economic downturn, the former issue drives many hypotheses and passions, given that action relies on fiscal and monetary policy rather than just market events. Hence, productivity and capacity utilization concentrate most of the attention these days in newspapers and op-eds. Much talk needs to undergo public debate before the economists’ community can pinpoint the areas of the economy that require an urgent overhaul; indeed, I would argue that analysts need to get out there and see, through unconventional lenses, how tech firms struggle to realize profits. Mixed methods in research would offer insights into what is keeping economic growth lackluster.

Why do economists sound these days more like political scientists?

Paradoxically enough, politics is playing a key role in unveiling circumstances that economists would otherwise ignore, and it is doing so by striking at the core of the layman’s economic situation. The current political cycle in the U.S. could hold answers to many of the questions economists have not been able to address lately. What does that mean for analysts and economists? Well, the fact that leading economists sound these days more like political scientists than actual economists means that the discipline must make use of interdisciplinary methods to flesh out current economic transformations.

Current economic changes, in both the structure of business and the structure of the economy, demand a combination of research approaches. In the first instance, it is clear that economists have come to realize that traditional data for economic analysis and forecasting have limitations when it comes to measuring the new economy. That is only natural, as most economic measures were designed for the older economic circumstances surrounding the second industrial revolution. Although traditional metrics are still relevant for economic analysis, current progress in technology seems not to be captured by such a set of survey instruments. That is why analysts focusing on economic matters these days should get out and see for themselves what the data cannot capture for them. In spite of the bad press in this regard, no one could argue convincingly that Silicon Valley is not adding to the productivity of the nation’s businesses. Everyone everywhere witnesses how Silicon Valley and tech firms populate the startup scene. Intuitively, it is hard to believe that there are little to no gains from tech innovation nowadays.

Get out there and see how tech firms struggle to realize profits.

So, what is going on in the economy should not be blurred by what is going on with the tools economists use for researching it. One could blame analysts’ inability to understand current changes. In fact, that is usually what happens first when structural changes underlie economic growth. Think of how Adam Smith and David Ricardo fleshed out something that nobody had seen before their time: profit. I would argue that something similar, with a twist, is happening now in America. Analysts need to get out there and see how tech firms struggle to realize profits. Simply put, and despite the generalization, the vast majority of new entrepreneurs do not yet know what, and how much, to charge for new services offered through the internet. Capital most often ventures into innovative tech firms without knowing how to monetize their services. This situation is exacerbated amid a hail of goods and services offered at no charge on the World Wide Web, which could prove that not knowing how to charge for services drives current stagnation. Look at the news industry for a vivid example.

Identifying this situation could shed light on economic growth data as well as current data on productivity. With so much innovation around us, it is hard to believe that productivity is neither improving nor contributing to economic growth in the U.S. Perhaps qualitative approaches to research could yield valuable insights for analysis in this regard. The discipline desperately needs answers for policy design, and different approaches to research may help us all understand actual economic transformations.

U.S. economic slowdown? Look at Real Estate labor market.

One month of weak payroll data does not make a crisis. The U.S. economy appears to have added only 160,000 new jobs during the month of April 2016, the Bureau of Labor Statistics reported on Friday. A similar number was published earlier that week by the payroll firm ADP. Although the slowdown in hiring came from local and federal government (-11,000) as well as from mining (-8,000), the sector that should get more attention is Real Estate and Leasing Services. Indeed, this sector could be revealing what is happening in current economic conditions.

For the last economic quarter, analysts have seen employment growth being incongruent with GDP growth. And now that the employment payroll looks weak, many analysts would like to rush and call an economic recession. However, it is too soon to assert anything akin to a crisis, mainly because the slowdown in hiring came from local and federal government (-11,000) as well as from mining (-8,000). Those two sectors were expected not to grow, given that oil prices are still low and the electoral cycle continues. Retail trade also failed to add jobs at the same pace it had over the past three months, but the -3,000-job slowdown is not alarming since the industry’s previous growth was strong.

Instead, the sector that should get more attention is Real Estate and Leasing Services, as the spring season brings business to its stores. Establishing how busy real estate agents are around this time of the year could shed light on how the economy is actually running, for two reasons: not only does the weather season affect the sector’s business cycle, but its business also depends highly on the interest rate. In fact, the Real Estate labor market seems anemic lately. The sector appears to have added only about 600 new jobs over the month of April, which is certainly poor for what the season should have demanded.

The fact that housing sales depend on interest rates allows for inferences on how expectations about Federal Reserve decisions influence the job market. In other words, the anemic employment growth in Real Estate appears to derive not from sluggish demand for housing but from interest rate expectations. Thus, persistent market speculation about rising interest rates could have affected current consumer expectations on both housing and consumption. Therefore, it seems logical to think that, because of this, companies halted hiring in April, especially the Real Estate ones. Only time will unveil the outcome, though.

The focus right now is on the next meeting of the Federal Open Market Committee, in which monetary policymakers will decide again whether to increase the rates or leave them unchanged.

Unemployment √. Inflation √. So… what is the Fed worrying about?

Although the Federal Open Market Committee’s (hereafter FOMC) March meeting on monetary policy focused on what apparently was a disagreement over the timing for modifying the federal funds interest rate, the minutes indicate that the disagreement is not only about timing but also about exchange rate challenges. Not only does the Fed struggle with when the best moment is to raise the rate, but it also grapples with the extent of what its policy decisions can reach. The FOMC’s current economic outlook and its consensus on the state of the U.S. economy leave no room for doubts on domestic issues, though they do for uncertainties about foreign markets. Thus, the minutes of the meeting held in Washington on March 15th-16th, 2016, unveil an understated intent to influence global markets by stabilizing the U.S. currency. On one hand, both objectives of monetary policy seem accomplished as regards labor markets and inflation. On the other, global deceleration is the one factor that concerns the Fed, since it could have adverse spillovers on America. The most recent monetary policy meeting reveals a subtle attempt to stabilize the U.S. dollar exchange rate at some level, thereby favoring American exports.

Unemployment rate √. Inflation rate √.

The institutional objective of the Federal Reserve Bank seems uncompromised these days. Economic activity is picking up overall, the labor market is at desired levels, and inflation seems somewhat under control. The confidence economists have right now starts with the U.S. household sector. Household spending looks healthy, and officials at the Bank are confident such spending will keep on buoying labor markets. As stated in the minutes, “strong expansion of household demand could result in rapid employment growth and overly tight resource utilization, particularly if productivity gains remained sluggish” (Page 6). Indeed, the labor market is showing strong gains in employment, which pushed the unemployment rate down to 5.0 percent by the end of the first quarter of 2016.

Furthermore, the FOMC understands the high levels of consumer confidence as a guarantee of a sustained path for growth. The Committee also pointed out that low gasoline prices are stimulating not only higher levels of consumption but also motor vehicle sales. Members also note the relatively high household wealth-to-income ratio. At the same time, members of the Committee recognize that regions affected by oil prices are starting to struggle, while business fixed investment shows signs of weakening. Nevertheless, the consensus among members of the Committee reflects an overall optimism about the resilience of the economy rather than a worrisome outlook.

By Catherine De Las Salas

The fear comes from overseas.

The transcripts, which were released on April 6th, 2016, show that Fed officials’ concerns stem from global economic and financial developments. The FOMC “saw foreign economic growth as likely to run at a somewhat slower pace than previously expected, a development that probably would further restrain growth in U.S. exports and tend to damp overall aggregate demand” (Page 8). They also flagged warnings about wider credit spreads on riskier corporate bonds. In sum, policymakers at the FOMC interpret the current lackluster global situation as a threat to the economic growth of the United States.

Ruling out choices.

Therefore, the fact that those two conditions overlap has made the Committee anxious about intervening in an arena that may be out of its reach. By leaving the federal funds interest rate unmoved during March -and perhaps doing so until June- the FOMC does not aim at stimulating investment domestically. Nor does it aim at controlling inflation. In fact, the policy choice reveals a subtle attempt to keep the U.S. dollar exchange rate stable overseas, thereby favoring American exports. The latter statement can be inferred from the minutes based on the Committee’s consensus on the state of the economy. First, U.S. labor markets are strong, and the Fed considers that the actual unemployment rate corresponds to the estimated longer-run rate. Second, inflation -either headline or core- is projected and expected to be on target. And third, domestic conditions are in general satisfactory. The only factor that remains risky is the rest of the world. Therefore, whatever action the Committee took at last March’s meeting can be interpreted as intended to influence global markets.

 

Why is America’s center of gravity shifting South and West?

Ever since Florida surpassed New York as the third most populous state in the nation, journalists have documented the ways in which the South of the United States has been attracting young, sun-loving enthusiasts. Two factors have been identified as drivers of an apparent migration from the north toward the south. On one hand, real estate prices have arguably been one of the major causes of people heading south. On the other, employment growth and better job opportunities allegedly support decisions to move out regionally. This article checks empirical data on those two factors to determine their effect on the population growth of major cities in the United States. The conclusion, in spite of the statistical model’s limits, indicates that employment dynamics seem to exert a slightly stronger influence on population growth than housing costs do.

Is it because of real estate prices?

The first factor some prominent people have identified is real estate prices. Professor Paul Krugman highlighted in his NYTimes commentary of August 24th, 2014, that the most probable reason for people heading south is housing costs, even over employment opportunities. From his perspective, employment has little effect on such a change, given that wages and salaries are substantially lower in southern states when compared to the north, whereas housing costs are significantly lower in the southern regions of the country. Professor Krugman asserts that “America’s center of gravity is shifting South and West,” and he furthers his argument by suggesting that “the places Americans are leaving actually have higher productivity and more job opportunities than the places they’re going.”

By Catherine De Las Salas

Is it because of employment opportunities?

By contrast, Patricia Cohen -also from the NYTimes- stresses the relevance of employment opportunities in cities like Denver, Colorado. In her article, the journalist unfolds the story of promising entrepreneurs immersed in an economically fertile environment. The opposite of that prosperous environment happens to be located in the Northeast of the United States. Cohen writes that not only “in the Mountain West — but also in places as varied as Seattle and Portland, Ore., in the Northwest, and Atlanta and Orlando, Fla., in the Southeast — employers are hiring at a steady clip, housing prices are up, and consumers are spending more freely”. Her article focuses on contrasting the development of urban-like amenities and how those attractions lure entrepreneurs.

A brief statistical analysis of cross-sectional data.

At first glance, both factors seem to affect migration between states. However, although both articles are well documented, neither goes beyond anecdotal facts. So, confirming those very plausible anecdotes deserves a brief statistical analysis of cross-sectional data. To do so, I took data on estimated population growth for the 71 major cities in the U.S. from 2010 to 2015 (U.S. Census Bureau) and regressed it on the average unemployment rate in 2015 (U.S. Bureau of Labor Statistics), the median sale price of existing houses for the same year (National Association of Realtors), and the U.S. Census Bureau’s vacancy rate for the same year and cities (although the latter regressor may be collinear with the sale price of existing houses, its inclusion in the model aims at reinforcing a proxy for housing demand). Statistical significance is assessed at the 90 percent confidence level.

Results.

The results show that, for these data sets and this model, the unemployment rate has a bigger effect on population growth than the vacancy rate and median home sale prices altogether. The regression yielded a significant coefficient of -2.78 on the unemployment rate: the lower the unemployment rate, the greater the population growth. A brief revision of the empirical evidence shows that, once the coefficients are standardized, the unemployment rate has the larger effect on the dependent variable. If we had to decide which of the two factors affects population growth more, we would have to conclude that employment opportunities do.

Regression Results.

Using these data sets and this model, employment dynamics seem to exert a slightly stronger influence on population growth than housing costs do. The unemployment rate has a standardized effect of negative 56 percent. On the other hand, the median sale price of houses has a standardized effect of 23 percent. Likewise, the vacancy rate has a standardized effect of 24 percent in the model. Standardized coefficients are a tool meant to disentangle the combined effect of the variables in a model. Thus, although the model explains only 35 percent of the variation in population growth, the standardized coefficients give insights into both competing factors.

Limits of the analysis.

These estimates are not very reliable, given that the population growth variable covers a five-year span while the regressors cover a single year. In technical terms, the time delta of the regressand is longer than that of the regressors. For this and other reasons, it is hard to conclude that employment is the primary motivation for people moving south and west. Nonetheless, the regression sheds light on a dichotomy that needs to be understood.
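One way to mitigate that time-scale mismatch would be to annualize the five-year growth of the regressand before re-running the regression. The conversion is the standard compound-growth formula; the sketch below is illustrative and was not part of the original analysis.

```python
def annualize(cumulative_growth_pct: float, years: int = 5) -> float:
    """Convert cumulative percentage growth over `years` into the
    equivalent compound annual growth rate, in percent."""
    return ((1 + cumulative_growth_pct / 100) ** (1 / years) - 1) * 100

# A city that grew 10 percent over 2010-2015 grew about 1.92 percent per year.
print(round(annualize(10.0), 2))
```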

Why Is the Homeownership Rate Still Falling? An alternative explanation.

When it comes to loan rates, the one that most concerns ordinary consumers is the mortgage interest rate. In February 2016, the 30-year mortgage rate averaged 3.66 percent according to Freddie Mac, while homeownership stubbornly held at its 63 percent level. So, with the 15-year mortgage rate averaging 3.53 percent, why has homeownership not returned to the 69 percent level it reached before the Great Recession? Some analysts have proposed, cynically, that 69 percent homeownership was an unsustainable level, or that homeownership is no longer attractive. Neither explanation would look rational to a maximizing agent. An alternative analysis leads to a different conclusion: low interest rates are helping investors outbid prospective homeowners rather than helping those homeowners buy a house. The worrisome part is that this situation could push the housing market toward a crisis of inflated home prices, as well as toward higher levels of inequality.

Given that purchasing a house is arguably the biggest investment of a regular person's lifetime, these rates are closely watched by monetary authorities, analysts, and homebuyers. In fact, they have become even more relevant since the Great Recession, which originated ostensibly from failures in the regulation of the housing market.
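The leverage that low rates give buyers can be made concrete with the standard fixed-rate annuity formula. The loan amount and the 6 percent comparison rate below are assumptions for illustration; only the 3.66 percent rate comes from the figures above.

```python
def monthly_payment(principal: float, annual_rate_pct: float, years: int) -> float:
    """Fixed-rate mortgage payment from the standard annuity formula."""
    r = annual_rate_pct / 100 / 12        # monthly interest rate
    n = years * 12                        # total number of payments
    return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

# On an assumed $250,000 loan over 30 years, the February 2016 average rate
# of 3.66 percent implies a payment of roughly $1,145 a month, versus
# roughly $1,499 at a more historically typical 6 percent.
low  = monthly_payment(250_000, 3.66, 30)
high = monthly_payment(250_000, 6.00, 30)
print(round(low), round(high))
```

The same arithmetic works for an investor: cheaper financing raises the maximum bid a buyer can sustain at a given monthly payment, which is the outbidding channel discussed above.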

By Catherine De Las Salas

Homeownership rate has been declining.

A quick review of real estate market indexes shows, first, that the homeownership rate is flat: it has been flat to declining since its pre-Great Recession peak of 69 percent, and it registered 63.7 percent in the last quarter of 2015. Second, both sale prices and rents are up, to the extent that the cost of shelter is among the few factors driving up inflation in the United States; according to the Case-Shiller index and the Federal Housing Finance Agency, home prices have increased at a yearly rate of 6.0 percent. Third, new residential sales, as measured by the U.S. Census Bureau, were up 6.1 percent in January 2016 compared with the same month of 2015, and Pending Home Sales recorded a 3.5 percent increase in January. Fourth, the mortgage rate on a 15-year fixed loan averaged 2.96 percent during February 2016 (find more on housing indicators).

Investors could be outbidding prospective homeowners.

All these indexes raise the question of why homeownership has not increased despite rising sales, the rising cost of shelter, and upturns in home prices. One available answer to this puzzle is that investors are taking over the market: investors could be outbidding prospective homeowners, making it harder for them to attain ownership. Likewise, having investors control the housing market carries the risk that speculative money could again inflate a bubble in the sector, pushing loans underwater at some later point. A housing crisis could repeat today under the same circumstances as the Great Recession.

The counterargument derives from the fact that the housing crisis was only the trigger of the Great Recession. Indeed, defaults on mortgage-backed loans trickled down as multiple spillovers onto the banking system: the securitization of banking products through the bundling of subprime mortgages spread toxic assets throughout the financial system (learn more of this issue here). Recent regulation of the financial system, and of lending practices in particular, therefore makes the rest of the economy less vulnerable. So, even if the housing sector turns out to be a risky position, an eventual crisis should not spread as far as it did before the Great Recession.

The consequence, rising levels of inequality.

So, although a housing sector crisis might be discarded on the arguments herein, the effects on inequality cannot. As of March 2016, there appear to be no worrying signs in the housing market data. Nevertheless, assuming that homeownership has not increased because of an alleged lack of incentives to owning seems naïve, and concluding that pre-Great Recession levels of homeownership were unsustainable does not appear rational either. An alternative explanation points instead at capital competing to seize valuable assets. The consequence: a low homeownership rate alongside rising levels of inequality.

Eight Data Sources for Research on U.S. Housing Market.

The National Association of Realtors reported today that its Pending Home Sales index increased 3.5 percent in February 2016. This indicator offers valuable insight for housing market analysis in the United States. Indeed, the index serves as a leading indicator for the housing market and its forecasts, since it is based on signed real estate contracts for single-family homes, condos, and co-ops. The relevance of tracking this index, and the other metrics listed herein, stems from the fact that the Great Recession originated presumably from failures in the regulation of the housing market.

Although Pending Home Sales moved upward in February, this news contradicts the long-term trend of the homeownership rate, which has been steadily declining since the beginning of the Great Recession. That could point to a fascinating development in the sector. Precisely this type of contradiction is what makes the U.S. housing market so intriguing for researchers, especially since toxic mortgage-backed securities triggered the Great Recession in the United States.

There are several resources at hand for advancing research in U.S. Housing Market. The ones that econometricus.com monitors frequently are the following:

  1. Pending home Sales. Data Source: National Association of Realtors.
  2. Case-Shiller Home Prices Index. Data Source: S&P Dow Jones Indices.
  3. House Price Index. Data source: U.S. Federal Housing Finance Agency.
  4. Existing Home Sales. Data Source: National Association of Realtors.
  5. New Residential Construction. Data Source: U.S. Census Bureau.
  6. Housing Market Index. Data Source: National Association of Home Builders.
  7. Housing Vacancies and Home Ownership. Data Source: U.S. Census Bureau.
  8. Construction Put in Place. Data Source: U.S. Census Bureau.

Moreover, some of the most trusted housing-sector metrics were proposed after the Great Recession (2009). For those who consider that the Great Recession was not exclusively an event of banking leverage, complexity, and liquidity (learn more on this issue here), the following measures may open up valuable research questions and answers. In other words, flaws on the supply side of the housing market (mortgage-lending banks) may have helped spread the Great Recession, but the demand side could have played a more relevant role in triggering the crisis. Thus, these data may help researchers explain when and why mortgages went underwater in the first place.

Finally, Econometricus.com helps clients understand the economic relationship between a specific research question and the United States' housing market environment. Applied analysis can take the form of either "snapshots" of the housing market in the U.S. economy or historical trends (time-series analysis). Clients may simplify or broaden the scope of their research by including these important variables in their models.