Tuesday, May 5, 2020

Dataset to Forecast The Values

Question: Write a report that analyses a dataset to forecast the values of two variables over a span of six years.

Answer: The report analyses a dataset to forecast the values of two variables over a span of six years. The scenario states that a person named Scot Jansen has a daughter who is now twelve years old. Scot wants to pay the tuition fees of the university to which his daughter will be admitted six years from now. The university's fees for the first year will be $20,000. Scot plans to invest $300 per month for the next six years, until his daughter is admitted.

Scot has started depositing money in two mutual funds, both of which charge very low monthly fees. The first fund follows an investment strategy designed to match the return of the S&P 500. The second fund invests in short-term (one-year) Treasury bills. Scot has decided to split each $300 contribution between the two funds in a fixed proportion.

Scot consulted two advisors to guide him on how much to invest in each fund. The first advisor suggested that he put 80% of each contribution into the first fund and the remaining 20% into the second, the Treasury bill fund, on the view that the S&P 500 fund has earned higher returns than the Treasury bill fund. The second advisor suggested exactly the opposite: Scot should deposit more in the Treasury bill fund than in the S&P 500 fund. This advisor argues that stock returns are a risky investment in the short run, and that the investor can avoid such risk over the six-year span.
He also says that if Scot follows his plan, the average return may be lower, but Scot will have accumulated enough money to pay his daughter's university fees in her first year. In this situation, Scot wants guidance on which strategy to follow.

Analysis of the data

The data set that Scot has at hand gives the monthly rates of return for both the S&P 500 fund and the short-term Treasury bill fund, the two funds in which he intends to invest his $300, together with a month index. The rates run from January 1990 to December 2013. The analyst obtains the values for the following two years, January 2014 to December 2015, from the Yahoo Finance website (Yahoo Finance - Business Finance, Stock Market, Quotes, News, 2016). The rates for each month from January 1990 to December 2015 are the actual monthly returns on money invested in the two funds.

The 24 months of data collected from Yahoo Finance are appended to the 288 months of the given data set, and the combined data set is saved as college fund.xlsx. After collecting the full data set in an Excel spreadsheet, the analyst observes that many of the monthly return values seen in the first 288 months repeat themselves in the added period from January 2014 to December 2015. The analyst develops a spreadsheet model and simulates the two investment plans suggested by the two advisors.
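The report does not reproduce the spreadsheet model itself. As a rough sketch of what one row per month might compute, the Python below assumes that each deposit is made at the start of the month and that the whole balance then earns that month's return. The function name `accumulate` and this timing convention are assumptions for illustration, not details taken from the report.

```python
def accumulate(monthly_returns, deposit):
    """Return the balance after each month for a fixed monthly deposit.

    Assumption: the deposit goes in at the start of the month and the
    whole balance then earns that month's return (a decimal, 0.01 = 1%).
    """
    balances = []
    balance = 0.0
    for r in monthly_returns:
        balance = (balance + deposit) * (1.0 + r)
        balances.append(balance)
    return balances


# Sanity check with made-up returns (not the fund data): with a zero
# return in every month, 72 deposits of $300 simply add up.
flat = accumulate([0.0] * 72, 300.0)
print(flat[-1])  # 21600.0
```

With zero returns the 72 contributions alone total $21,600, which already exceeds the $20,000 first-year fee, so under this sketch the two plans differ mainly in how much the returns add and how much the outcome varies.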
The analyst plots the monthly returns of both funds against time, first for the 288 months from January 1990 to December 2013 and then for the following 24 months. The line charts for the S&P 500 fund, for both the 288-month period and the 24-month period, show no trend, seasonality or cyclic variation; both show only irregular variation in the monthly returns. The analyst then examines the line charts for the Treasury bill fund over the same two periods. Both of these charts show that the monthly returns are characterized by a trend.

The four main components of time series data are trend, seasonal variation, cyclic variation and irregular variation. In data collected over a long period, as in this case where the data span 26 years, a long-term movement may appear. This long-term movement is termed a trend, and it may be linear or nonlinear. Another component that may be present is random, or irregular, variation. Its cause is mostly unknown, and it cannot be removed from the data by any calculation. Irregularity may arise from changes in weather conditions, a sudden natural calamity, and so on.

The other two components are cyclic and seasonal variation. Seasonal variation recurs at a fixed interval, with the pattern completing within a year. It can be seen in the sales figures of stores selling seasonal products such as woollen garments, whose sales increase every year during the winter season.
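The components described above can also be checked numerically rather than only by eye. The sketch below is one possible way to do so and is not the analyst's method: a least-squares slope against the month index as a rough trend measure, and the sample autocorrelation at a chosen lag (lag 12 for a yearly pattern in monthly data). The function names and the two sample series are made up for illustration.

```python
def ols_slope(series):
    """Least-squares slope against the month index 0, 1, 2, ...

    Needs at least two observations.
    """
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(series))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (0.0 for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    den = sum((y - mean) ** 2 for y in series)
    if den == 0:
        return 0.0
    num = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return num / den


# Made-up illustrations, not the fund data.
trending = [0.5 - 0.001 * i for i in range(120)]  # steady decline
seasonal = [i % 12 for i in range(120)]           # repeats every 12 months
print(ols_slope(trending))            # close to -0.001
print(autocorrelation(seasonal, 12))  # close to 0.9
```

A slope near zero together with autocorrelations near zero at every lag would be consistent with the purely irregular behaviour described for the S&P 500 returns.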
Cyclic variation repeats after a span of time that may exceed a year (Granger & Newbold, 2014).

If Scot follows the first plan, investing $240 in the S&P 500 fund and $60 in the short-term Treasury bill fund each month, the total accumulated in his account after six years will be $24,359.52. If he follows the second plan, investing $60 in the S&P 500 fund and $240 in the Treasury bill fund, the total accumulated after six years will be $24,156.93. Hence, under both plans the accumulated value would exceed the $20,000 that he intends to build up in six years' time.

The analyst then simulates 100 iterations of the total value that Scot will accumulate under each of the two strategies over the six-year span. Simulation is a method of generating a large amount of data that can then be analysed. The stages of a simulation are:

1. Formulation of the model: looking at the data set, the analyst forms a mathematical model of it.
2. Implementation of the model: the analyst runs programs in statistical software that match the model formulation.
3. Validation of the model: the analyst checks whether the data provided truly fit the model.
4. Experimental design: the analyst performs an experiment in a controlled setup with the validated data.
5. Data analysis: finally, the analyst analyses the data to obtain accurate results.

Simulation frees the analyst from a great load of repetitive work involving the substitution of numbers, and the software now available makes the task easy and quick to perform (Box, et al., 2015).
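The report does not say how the 100 iterations were generated. One common approach, used here purely as an assumption, is to bootstrap: draw 72 months at random, with replacement, from the historical record and accumulate the deposits through them. The sketch below follows that approach; `final_value`, `simulate`, the deposit timing and the sample `history` values are all hypothetical and do not come from the report or the fund data.

```python
import random


def final_value(stock_returns, bill_returns, stock_deposit, bill_deposit):
    """Total balance after depositing monthly into both funds.

    Assumption: each deposit is made at the start of the month and the
    balance then earns that month's return (a decimal, 0.01 = 1%).
    """
    stock_balance = 0.0
    bill_balance = 0.0
    for r_stock, r_bill in zip(stock_returns, bill_returns):
        stock_balance = (stock_balance + stock_deposit) * (1.0 + r_stock)
        bill_balance = (bill_balance + bill_deposit) * (1.0 + r_bill)
    return stock_balance + bill_balance


def simulate(history, stock_deposit, bill_deposit,
             months=72, iterations=100, seed=0):
    """Bootstrap final values from (stock_return, bill_return) pairs.

    Months are resampled as pairs so that each simulated month keeps the
    two funds' returns from the same historical month.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        sample = [rng.choice(history) for _ in range(months)]
        stock = [s for s, _ in sample]
        bills = [b for _, b in sample]
        results.append(final_value(stock, bills, stock_deposit, bill_deposit))
    return results


# Made-up monthly returns for illustration only.
history = [(0.02, 0.003), (-0.03, 0.002), (0.01, 0.004), (0.04, 0.003)]
plan_one = simulate(history, 240.0, 60.0)   # 80% stocks, 20% bills
plan_two = simulate(history, 60.0, 240.0)   # 20% stocks, 80% bills
print(sum(plan_one) / len(plan_one), min(plan_one))
print(sum(plan_two) / len(plan_two), min(plan_two))
```

Printing the minimum alongside the mean shows the trade-off the two advisors disagree about: a plan can have the higher average outcome and still have the worse bad-case outcome.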
After obtaining the simulated values, the analyst draws a histogram of the final values for each of the two scenarios. The histogram for the first plan, in which Scot invests $240 in the S&P 500 fund and $60 in the short-term Treasury bill fund, shows that the highest frequency is observed at a final value of $24,496.80. The histogram for the second plan, in which Scot invests $60 in the S&P 500 fund and the remaining $240 in the Treasury bill fund, shows that the highest frequency is observed at a final value of $20,875.

Based on the simulation results, the histograms and the line charts, the analyst can make certain recommendations to Scot. As discussed above, the monthly returns of the S&P 500 fund show irregularity, whereas the monthly returns of the short-term Treasury bill fund show a trend. The analyst is of the view that a process with irregular variation can generate a higher return, as it does not follow any particular probability law or distribution. Such a process carries more risk, but the returns that can be obtained from it are also high. A process that follows a particular trend would fetch lower returns in line with that trend, and the risk of investing in it is much lower than for a process showing irregularity (Petitjean, et al., 2012).

Hence, the analyst may suggest that Scot follow the first advisor, who advised him to invest $240 in the S&P 500 fund and the remaining $60 in the short-term Treasury bill fund. The histograms of the two plans give the same picture: the most frequent amount in the first scenario is higher than the most frequent amount in the second.
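Reading the most frequent value off a histogram can be reproduced in code. The following is a minimal sketch, assuming fixed-width bins of $500; the bin width, the function names and the sample values are hypothetical and are not the analyst's simulation output. It also reports the share of runs that reach a target amount, which speaks directly to whether the fees can be met.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin, keyed by the bin's left edge."""
    counts = Counter()
    for v in values:
        counts[int(v // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


def modal_bin(values, bin_width):
    """Return (left_edge, right_edge, count) of the fullest bin.

    Ties go to the lowest bin.
    """
    counts = histogram(values, bin_width)
    left_edge = max(counts, key=counts.get)
    return left_edge, left_edge + bin_width, counts[left_edge]


def share_at_least(values, target):
    """Fraction of simulated final values that reach the target."""
    return sum(v >= target for v in values) / len(values)


# Made-up final values for illustration only.
final_values = [20100, 20400, 20900, 21500, 24100, 24300, 24450, 24800]
print(modal_bin(final_values, 500))         # (24000, 24500, 3)
print(share_at_least(final_values, 20000))  # 1.0
```

The modal bin depends on the bin width chosen, so a comparison of two plans by their most frequent value is only meaningful when both histograms use the same bins.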
Moreover, the line charts drawn for the two funds show that the monthly returns of the S&P 500 fund fluctuate over time, whereas the monthly returns of the Treasury bill fund follow a more or less constant trend. Considering all the above observations, the analyst suggests that Scot follow the first plan, investing more money in the S&P 500 fund, whose monthly returns are quite irregular.

If Scot needs to pay $10,000 more, that is, a total of $30,000 in university fees after six years, the analyst would still suggest the same strategy. In this case he needs to accumulate more money in six years while investing the same $300 per month across the two funds. Since the above analysis shows that the first plan generates a higher value than the second plan in six years' time, the analyst would suggest that Scot invest $240 in the S&P 500 fund and $60 in the Treasury bill fund.

The analyst may have to consider some real-world factors that might affect the simulations and the conclusions drawn from them. The analyst can look for the modal value of the monthly returns of each fund, and may check for the presence of seasonality or cyclic variation in the data. The presence of such factors may lead the analyst to consider a different model for analysing the data and predicting the value of the returns in six years' time.

Conclusion

After analyzing the given data, the analyst can guide Scot in the right direction with his investment plans. The findings of the analysis show that the S&P 500 fund is more volatile than the short-term Treasury bill fund.
Hence, the analyst suggests that Scot invest according to the plan proposed by the first advisor, to gain higher returns with which to pay his daughter's university fees in six years' time.

References:

Anderberg, M. R. (2014). Cluster analysis for applications: probability and mathematical statistics: a series of monographs and textbooks (Vol. 19). Academic Press.
Aoki, M. (2013). State space modeling of time series. Springer Science & Business Media.
Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.
Brockwell, P. J., & Davis, R. A. (2013). Time series: theory and methods. Springer Science & Business Media.
Chatfield, C. (2013). The analysis of time series: an introduction. CRC Press.
DeFusco, R. A., McLeavey, D. W., Pinto, J., Runkle, D. E., & Anson, M. J. (2015). Quantitative investment analysis. John Wiley & Sons.
Godsill, S. J., Doucet, A., & West, M. (2012). Monte Carlo smoothing for nonlinear time series. Journal of the American Statistical Association.
Granger, C. W. J., & Newbold, P. (2014). Forecasting economic time series. Academic Press.
Györfi, L., Härdle, W., Sarda, P., & Vieu, P. (2013). Nonparametric curve estimation from time series (Vol. 60). Springer.
Petitjean, F., Inglada, J., & Gançarski, P. (2012). Satellite image time series analysis under time warping. Geoscience and Remote Sensing, IEEE Transactions on, 50(8), 3081-3095.
Robson, C., & McCartan, K. (2016). Real world research. Wiley.
Shumway, R. H., & Stoffer, D. S. (2013). Time series analysis and its applications. Springer Science & Business Media.
Treiman, D. J. (2014). Quantitative data analysis: Doing social research to test ideas. John Wiley & Sons.
Woodward, M. (2013). Epidemiology: study design and data analysis. CRC Press.
Xia, J., Mandal, R., Sinelnikov, I. V., Broadhurst, D., & Wishart, D. S. (2012). MetaboAnalyst 2.0: a comprehensive server for metabolomic data analysis. Nucleic Acids Research, 40(W1), W127-W133.
Xia, L. C., Ai, D., Cram, J., Fuhrman, J. A., & Sun, F. (2013). Efficient statistical significance approximation for local similarity analysis of high-throughput time series data. Bioinformatics, 29(2), 230-237.
Yahoo Finance - Business Finance, Stock Market, Quotes, News. (2016). Yahoo Finance. Retrieved 27 May 2016, from https://finance.yahoo.com/
Zhu, J. (2014). Quantitative models for performance evaluation and benchmarking: data envelopment analysis with spreadsheets (Vol. 213). Springer.
